Slide 1
4. Russian language: difficulties
Rich morphology → need MORE data
— A set of prefixes, prepositional and adverbial in nature; diminutive, augmentative, and frequentative suffixes and infixes
— Six cases in two numbers (singular and plural)
— Up to ten additional cases are identified in linguistics textbooks
— Strict agreement in grammatical gender (masculine, feminine, and neuter)
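To make the data requirement concrete, here is a minimal sketch that counts the distinct surface forms of a single noun. The declension table for "книга" (book) is typed in by hand for illustration, and the names `KNIGA_FORMS`, `BOOK_FORMS`, and `distinct_forms` are hypothetical, not part of any toolkit mentioned on the slide.

```python
# Hand-written declension of the feminine noun "книга" (book).
# Case order: nominative, genitive, dative, accusative, instrumental, prepositional.
KNIGA_FORMS = {
    "singular": ["книга", "книги", "книге", "книгу", "книгой", "книге"],
    "plural": ["книги", "книг", "книгам", "книги", "книгами", "книгах"],
}

# The English counterpart has only two surface forms.
BOOK_FORMS = ["book", "books"]


def distinct_forms(paradigm):
    """Return the set of distinct surface forms found in a paradigm."""
    return {form for forms in paradigm.values() for form in forms}


russian_types = distinct_forms(KNIGA_FORMS)
english_types = set(BOOK_FORMS)

print(len(russian_types), "Russian forms vs", len(english_types), "English forms")
```

A word-level language model treats each of these forms as a separate vocabulary entry, so the same number of running words covers each entry more thinly than it would in English.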
Derived language varieties (sometimes in articles, always in comments) → need more DIFFERENT text data
— More sources are needed for good training quality, because there are more rules and a larger vocabulary
Cyrillic alphabet → encoding and transliteration must be handled during LM training
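The encoding and transliteration point can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the talk: the `TRANSLIT` table is a simplified, lossy mapping invented for this example (it is not a formal standard such as ISO 9), and `to_unicode` and `transliterate` are hypothetical helper names. The encodings `utf-8`, `cp1251`, and `koi8_r` are standard Python codecs.

```python
# Simplified Cyrillic-to-Latin table (an assumption for illustration only;
# it drops the hard and soft signs, so the mapping is lossy).
TRANSLIT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "j", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "kh", "ц": "ts", "ч": "ch", "ш": "sh", "щ": "shch",
    "ъ": "", "ы": "y", "ь": "", "э": "e", "ю": "yu", "я": "ya",
}


def to_unicode(raw: bytes, encoding: str) -> str:
    """Decode raw corpus bytes into a Unicode string using a known encoding."""
    return raw.decode(encoding)


def transliterate(text: str) -> str:
    """Lowercase the text and map Cyrillic letters to Latin; keep other characters."""
    return "".join(TRANSLIT.get(ch, ch) for ch in text.lower())


word = "язык"
for enc in ("utf-8", "cp1251", "koi8_r"):
    raw = word.encode(enc)
    # The same word has different byte sequences in each encoding,
    # so every source must be decoded with its own codec before training.
    assert to_unicode(raw, enc) == word
    print(enc, raw)

print(transliterate("Русский язык"))
```

Decoding every source to Unicode first keeps the training text consistent. Transliteration is then only needed if the LM toolkit or the lexicon expects Latin symbols.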