Topics in Language Resources for Translation ... - ymerleksi - home

Chapter 11. Tagging and tracing Program Integrated Information

5.4 ID tag generation program

We developed a Perl program to create a separate set of files with ID tags for each set of PII files. Perl was selected in 2003 because of the maturity of its regular expressions and its Unicode support. As of 2007, other programming languages could be used to prepare such a program. We do not change the original PII files, but only create additional files to replace the original PII files. The additional files have no effect on the tested program and its PII files. These files are only used for the TVT tests. The addid.pl supports the CATNls files and the Java properties files.

The addid.pl program handles one directory of files at a time. A typical directory is a set of English PII files or Japanese PII files. All of the CATIA PII files for a language are located in one directory. The addid.pl program does not support the nested directory structure of PII files, so the program must be run separately for each subdirectory. Care is needed if the PII files have a nested directory structure.

The addid.pl program has the following parameters. The default values are shown in the parentheses.

  -l  Directory name of the PII files (Japanese)
  -f  A filter for the file names that are processed (*)
  -p  Prefix (Null)
  -e  End of the ID tag (no)

If the -e option (Position of the ID tag) is set to “yes”, then the ID tag is put at the end of each PII string instead of at the beginning (though this option is rarely used).

If we use J7d as a prefix, we will copy the Japanese directory of CATIA PII files into the new J7d directory for the generated files.
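
The effect of the -p and -e options on a single file can be illustrated with a short sketch. It is written in Python rather than Perl and is not the addid.pl source. The tag layout (prefix, file number, string number in square brackets), the function names make_tag and add_ids, and the simplified "key=value" parsing are all assumptions made for illustration; this section does not specify them.

```python
import re

# Hypothetical tag layout: "[<prefix><file number>.<string number>]".
# This section of the source does not define the real layout.
def make_tag(prefix: str, file_no: int, string_no: int) -> str:
    return f"[{prefix}{file_no:03d}.{string_no:04d}]"

# Matches a simple "key=value" line as found in a Java properties file.
# Comment lines (# or !), blank lines, ":"-separated entries and
# continuation lines are not handled and pass through unchanged.
ENTRY = re.compile(r"^(\s*[^#!\s=][^=]*?\s*=\s*)(.*)$")

def add_ids(lines, prefix="", file_no=1, at_end=False):
    """Return a tagged copy of the lines; the input list is not modified."""
    tagged = []
    string_no = 0
    for line in lines:
        m = ENTRY.match(line)
        if m is None:
            tagged.append(line)  # not a PII entry, copy as-is
            continue
        string_no += 1
        tag = make_tag(prefix, file_no, string_no)
        head, value = m.groups()
        # at_end=True mimics "-e yes": the tag follows the string.
        tagged.append(head + (value + tag if at_end else tag + value))
    return tagged

source = ["# dialog strings", "title=Open File", "ok = OK"]
for line in add_ids(source, prefix="J7d", file_no=12):
    print(line)
```

Because add_ids returns a new list, the original strings stay untouched, which mirrors the approach of writing tagged copies instead of editing the original PII files.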

The command “addid.pl -l J7d -f *.CATNls -p J7d” generates the following output.

J7dWithID: An output directory of files with ID tags
  This directory contains all of the copied files with the PII strings modified by adding the ID tags.

J7dNumList.txt: A map file from the ID file numbers to the PII file names.
  This contains a CSV text mapping of the ID file numbers to the PII file names.

J7dTextList.txt: A text file that has all of the PII
  This text file contains all of the sets of PII (pairs of a key and a string) for all of the files in the J7d directory. We call this comprehensive index file the “Basic Form” in Section 5.5.

We prepare a J7dIU directory that contains the IU directories copied and separated for TM (TranslationManager) use. If addIUName.pl is executed with