Saturday, December 25, 2004

Excel is unable to Search & Replace double-byte characters

One thing that has become painfully clear in these last couple weeks of bug translation is that Excel is incredibly bad at Search & Replace.

Every day new Excel sheets full of bug reports trickle in from the customer, and it is our job to turn those Japanese reports into English. This is no small feat, and at best we can each only crank out around 30 translations a day. So among the five of us, the most we could possibly translate is about 150 bugs a day. Unfortunately, not all of us speak English fluently (read: only one of us speaks it), so the actual output is around 80 bugs per day.

I account for about 30 of those, and the others handle the rest. Another guy can do about 25 per day, so it's not like I'm some super-translator or anything. The biggest problem is that we are receiving more than 150 bugs each day, so each day we fall further and further behind. Currently, we are about 180 bugs behind, which is why I'm working on Christmas.

This would all go much faster if we could automate some of this translation. First off, Japanese engineers love their double-byte characters. "->" is not good enough for them; they need to use "→". They use double-byte characters for just about everything, including spaces, brackets, and numbering. If I could just search for each of those and replace it with the appropriate single-byte character, I could cut about 5% off the time it takes to translate.
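For what it's worth, here is a rough sketch of what that replacement could look like outside of Excel, in Python. It assumes the cell text has already been pulled out of the sheet as ordinary strings; how you get it out is not shown. The names `build_table` and `to_single_byte` are made up for this example. It relies on the fact that the full-width forms of the ASCII characters sit in one Unicode block at a fixed offset from their single-byte counterparts, and it adds the full-width space and the arrow by hand. Things like circled numbers (①) are not covered and would need their own entries.

```python
def build_table():
    """Build a translation table from double-byte characters to ASCII.

    Full-width ASCII variants (U+FF01..U+FF5E) are exactly 0xFEE0 above
    their single-byte counterparts (U+0021..U+007E).
    """
    table = {code: code - 0xFEE0 for code in range(0xFF01, 0xFF5F)}
    table[0x3000] = " "       # full-width (ideographic) space
    table[ord("→")] = "->"    # the arrow has no single ASCII equivalent
    return table


TABLE = build_table()


def to_single_byte(text):
    """Replace every character listed in TABLE; leave everything else alone."""
    return text.translate(TABLE)


if __name__ == "__main__":
    print(to_single_byte("（１）　Ａ→Ｂ"))
```

Actual Japanese words pass through untouched, since only the characters in the table are replaced.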

In fact, if I could search and replace on whole words, that would be great too. These are bug reports, not literature, so the recurrence of words is pretty high.
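A whole-word replacement could be sketched along the same lines in Python, again assuming the text is already available as plain strings. The glossary entries below are invented for illustration, and `apply_glossary` is a hypothetical helper, not anything Excel provides. Because Japanese has no spaces between words, the sketch tries longer terms first so that a short term does not clobber part of a longer one.

```python
import re

# Hypothetical glossary of recurring bug-report terms (illustrative only).
GLOSSARY = {
    "再現手順": "Steps to reproduce",
    "再現": "Reproduction",
    "期待結果": "Expected result",
}


def apply_glossary(text, glossary):
    """Replace every glossary term found in text, preferring longer terms."""
    if not glossary:
        return text
    # Longest terms first, so "再現手順" wins over its prefix "再現".
    terms = sorted(glossary, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(term) for term in terms))
    return pattern.sub(lambda match: glossary[match.group(0)], text)


if __name__ == "__main__":
    print(apply_glossary("再現手順:", GLOSSARY))
```

This only swaps in fixed phrases; the surrounding grammar still has to be translated by hand.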

The problem is that Excel can't treat double-byte characters as characters. No, it seems to treat them as numbers, and for that reason it doesn't have the first idea about how to Search & Replace them. I am able to do it cell by cell, actually, but that is so slow that simply typing the characters in by hand would be just as fast. Why can't I do a sheet-wide Search & Replace on double-byte characters?