12.07.2015 Views

Topics in Language Resources for Translation ... - ymerleksi - home


Naotaka Kato and Makoto Arisawa

Figure 3. The statistical data for Windows XP (SP2) PII

vertical axis, the result is the total number of the PII strings with that number of appearances. For example, “Name” appears 333 times. Alternatively, 333 PII keys have the string “Name”. The only string that appears 333 times is the string “Name” and therefore that string is plotted at (333, 1). The string “Align Left” appears three times. There are 3,882 unique strings that appear three times and these strings are plotted at (3, 3882) in the right side of Fig. 2. There are 172,058 keys in CATIA PII and there is a point plotted at (1, 69815). This point means that 69,815 PII keys have only one unique string in the PII files, whereas all of the other strings in the other keys are not unique in the PII files. Therefore those other keys, about 100,000 keys, cannot be identified uniquely when such a string appears in the GUI.

The statistical results for the Microsoft Windows XP (SP2) PII are shown in Fig. 3. We see almost the same characteristics as for CATIA. We used the “Microsoft Glossary” data found on the Internet (Microsoft 2005) and analysed that data. Microsoft calls a collection of “PII strings” a Glossary. We checked 122 applications and OS files for the Microsoft PII. We found that all of the applications have similar characteristics.

2.2.2 Statistical characteristics of the PII strings and difficulties in the translation of the PII strings

Our statistical observations give us two facts. First, most PII strings are short. Consequently translators have to translate short phrases without context. This task is so difficult that many PII strings receive inappropriate translations. Second, the same PII strings are often repeated in PII files. If grep is used to find the source location of those repeated PII strings, there are many candidates for the source
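
The counting behind the plots discussed above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a PII file has already been loaded into a dictionary mapping each PII key to its string; the key names, the toy data, and the function names `appearance_histogram` and `ambiguous_keys` are hypothetical and do not come from the CATIA or Microsoft data.

```python
from collections import Counter


def appearance_histogram(pii):
    """Return a Counter mapping an appearance count to the number of
    distinct strings that appear exactly that many times."""
    per_string = Counter(pii.values())   # string -> number of keys carrying it
    return Counter(per_string.values())  # appearance count -> number of distinct strings


def ambiguous_keys(pii):
    """Return the keys whose string is shared with at least one other key,
    i.e. keys that cannot be identified from the string alone."""
    per_string = Counter(pii.values())
    return [key for key, text in pii.items() if per_string[text] > 1]


# Hypothetical toy data: key -> string shown in the GUI.
pii = {
    "dlg.user.name": "Name",
    "dlg.file.name": "Name",
    "dlg.font.name": "Name",
    "menu.align.left": "Align Left",
    "toolbar.align.left": "Align Left",
    "menu.file.open": "Open File...",
}

hist = appearance_histogram(pii)
for appearances, distinct_strings in sorted(hist.items()):
    # Each pair corresponds to one plotted point (appearances, distinct strings).
    print(appearances, distinct_strings)

print(len(ambiguous_keys(pii)), "of", len(pii), "keys share their string with another key")
```

In this toy data, "Name" is the only string that appears three times, so it contributes the point (3, 1), in the same way that "Name" is plotted at (333, 1) in the text. The keys returned by `ambiguous_keys` are those for which a plain text search on the displayed string would yield more than one candidate.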
