Automatic detection of new domain-specific words, using document ...
3.2. Quantitative problems with the approach
Arbitrary significance level (p ≥ 0.99):
• influences the number of types in each domain
Differently sized domain-specific subcorpora:
• cause the corresponding vocabularies to differ in size:
  Folklore domain: 1957 types
  Sport domain: 16022 types
  Average for all domains: 7256 types
→ Take into account the size of these vocabularies
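
One possible way to take vocabulary size into account is sketched below. The slide does not say which normalization is used, so dividing the raw match count by the vocabulary size is an assumption made here for illustration only. The function name normalized_scores and the raw counts are hypothetical; only the two vocabulary sizes come from the figures above.

```python
# Vocabulary sizes (number of types) as reported for two of the domains.
VOCAB_SIZES = {"folklore": 1957, "sport": 16022}


def normalized_scores(match_counts, vocab_sizes):
    """Divide each domain's raw match count by that domain's vocabulary size.

    This is an assumed normalization, not necessarily the one used by the authors.
    """
    return {domain: match_counts[domain] / vocab_sizes[domain]
            for domain in match_counts}


# Hypothetical raw counts: a document shares 40 types with each vocabulary.
raw_counts = {"folklore": 40, "sport": 40}

scores = normalized_scores(raw_counts, VOCAB_SIZES)

# With equal raw counts, the smaller vocabulary yields the higher score.
best_domain = max(scores, key=scores.get)
print(best_domain, scores)
```

Without such a correction, a large vocabulary like Sport would tend to match more words of any document simply because it contains more types.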