TIPA: A System for Processing Phonetic Symbols in LATEX - TUG
TIPA: A System for Processing Phonetic Symbols in LATEX - TUG
TIPA: A System for Processing Phonetic Symbols in LATEX - TUG
Create successful ePaper yourself
Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.
all the theoretically possible comb<strong>in</strong>ations of the<br />
tone letter system. In the recent publication of the<br />
International <strong>Phonetic</strong> Association tone letters are<br />
admitted as an official way of represent<strong>in</strong>g tones<br />
but the treatment of tone letters is quite <strong>in</strong>sufficient<br />
<strong>in</strong> that only a limited number of comb<strong>in</strong>ation<br />
is allowed. This is apparently due to the fact<br />
that there has been no ‘portable’ way of comb<strong>in</strong><strong>in</strong>g<br />
symbols that can be used across various computer<br />
environments. There<strong>for</strong>e TEX’s productive system<br />
of macro is an ideal tool <strong>for</strong> handl<strong>in</strong>g a system like<br />
tone letters.<br />
In the process of writ<strong>in</strong>g METAFONT source<br />
codes <strong>for</strong> <strong>TIPA</strong> phonetic symbols there have been<br />
many problems besides the one with the selection<br />
of symbols. One of such problems was that sometimes<br />
the exact shape of a symbol was unclear.<br />
For example, the shapes of the symbols such as Â<br />
(Stretched C), J (Curly-tail J) differ accord<strong>in</strong>g to<br />
sources. This is partly due to the fact that the<br />
IPA has been cont<strong>in</strong>uously revised <strong>for</strong> the past few<br />
decades, and partly due to the fact that different<br />
ways of computeriz<strong>in</strong>g phonetic symbols on different<br />
systems have resulted <strong>in</strong> the diversity of the shapes<br />
of phonetic symbols.<br />
Although there is no def<strong>in</strong>ite answer to such a<br />
problem yet, it seems to me that it is a privilege of<br />
those work<strong>in</strong>g with METAFONT to have a systematic<br />
way of controll<strong>in</strong>g the shapes of phonetic symbols.<br />
Encod<strong>in</strong>g The 256 character encod<strong>in</strong>g of <strong>TIPA</strong> is<br />
now officially called the ‘T3’ encod<strong>in</strong>g. 7 In decid<strong>in</strong>g<br />
this new encod<strong>in</strong>g, care is taken to harmonize with<br />
exist<strong>in</strong>g other encod<strong>in</strong>gs, especially with the T1<br />
encod<strong>in</strong>g. Also the eas<strong>in</strong>ess of <strong>in</strong>putt<strong>in</strong>g phonetic<br />
symbols is taken <strong>in</strong>to consideration <strong>in</strong> such a way<br />
that frequently used symbols can be <strong>in</strong>put with<br />
small number of keystrokes.<br />
Table 1 shows the layout of the T3 encod<strong>in</strong>g.<br />
The basic structure of the encod<strong>in</strong>g found <strong>in</strong> the<br />
first half of the table (character codes ’000-’177)<br />
is based on normal text encod<strong>in</strong>gs (ASCII, OT1<br />
and T1) <strong>in</strong> that section<strong>in</strong>g of this area <strong>in</strong>to several<br />
groups such as the section <strong>for</strong> accents and diacritics,<br />
the section <strong>for</strong> punctuation marks, the section <strong>for</strong><br />
numerals, the sections <strong>for</strong> uppercase and lowercase<br />
letters is basically the same with these encod<strong>in</strong>gs.<br />
Note also that the T3 encod<strong>in</strong>g conta<strong>in</strong>s not<br />
only phonetic symbols but also usual punctuation<br />
marks that are used with phonetic symbols, and <strong>in</strong><br />
such cases the same codes are assigned as the normal<br />
7 In a discussion with the <strong>LATEX</strong>2ε team it was suggested<br />
that the 128 character encod<strong>in</strong>g used <strong>in</strong> WSUIPA would be<br />
refered to as the OT3 encod<strong>in</strong>g.<br />
<strong>TIPA</strong>: A <strong>System</strong> <strong>for</strong> <strong>Process<strong>in</strong>g</strong> <strong>Phonetic</strong> <strong>Symbols</strong> <strong>in</strong> L ATEX<br />
’0 ’1 ’2 ’3 ’4 ’5 ’6 ’7<br />
’00x<br />
Accents and diacritics<br />
’04x<br />
’05x Punctuation marks<br />
’06x Basic IPA symbols I (vowels)<br />
’07x<br />
’10x<br />
Diacritics, etc.<br />
Basic IPA symbols II<br />
’13x Diacritics, etc.<br />
’14x Pct.<br />
Basic IPA symbols III<br />
(lowercase letters)<br />
’17x Diacr.<br />
’20x<br />
Tone letters and other<br />
’23x<br />
suprasegmentals<br />
’24x<br />
Old IPA, non-IPA symbols<br />
’27x<br />
’30x<br />
Extended IPA symbols<br />
’33x Gmn.<br />
’34x<br />
Basic IPA symbols IV<br />
’37x Gmn.<br />
Pct. = Punctuation marks, Diacr. = Diacritics, Gmn. =<br />
<strong>Symbols</strong> <strong>for</strong> Germanic languages.<br />
Table 1: Layout of the T3 encod<strong>in</strong>g<br />
text encod<strong>in</strong>gs. However it is a matter of trade-off to<br />
decide which punctuation marks are to be <strong>in</strong>cluded.<br />
For example ‘:’ and ‘;’ might have been preserved <strong>in</strong><br />
T3 but <strong>in</strong> this case ‘:’ has been traditionally used as<br />
a substitute <strong>for</strong> the length mark ‘:’ so that I decided<br />
to exclude ‘:’ <strong>in</strong> favor of the eas<strong>in</strong>ess of <strong>in</strong>putt<strong>in</strong>g the<br />
length mark by a s<strong>in</strong>gle keystroke.<br />
The encod<strong>in</strong>g of the section <strong>for</strong> accents and<br />
diacritics is closely related to T1 <strong>in</strong> that the accents<br />
commonly <strong>in</strong>cluded <strong>in</strong> T1 and T3 have the same<br />
encod<strong>in</strong>g.<br />
The sections <strong>for</strong> numerals and uppercase letters<br />
are filled with phonetic symbols that are used frequently<br />
<strong>in</strong> many languages, because numerals and<br />
uppercase letters are usually not used as phonetic<br />
symbols. And the assignments made here are used<br />
as the ‘shortcut characters’, which will be expla<strong>in</strong>ed<br />
<strong>in</strong> the section entitled “Ord<strong>in</strong>ary phonetic symbols”<br />
(page 105).<br />
<strong>TUG</strong>boat, 17, Number 2 — Proceed<strong>in</strong>gs of the 1996 Annual Meet<strong>in</strong>g 103