Automatic detection of new domain-specific words, using document ...
4.3. To compute a score for a certain domain...
1 count the text tokens that are members of the domain: $t \in D \cap W$
2 weigh the count by the size of the domain-specific vocabulary: $v = \frac{1}{\sqrt{|D|}}$
3 weigh the score by the number of domains the text token is a member of: $w = \frac{1}{d}$, where $d = \sum_i |t \cap D_i|$
4 consider the number of 'unknown' text tokens: $u$ (same as $n - k$)
5 consider the number of 'known' text tokens: $k$ (same as $n - u$)
6 consider the text length (to make the score relative): $n$ (same as $u + k$)
$s_D = \frac{1}{n} \cdot \frac{k}{u} \cdot v$
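
The six steps can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the names `score_domain` and `DomainScore` are hypothetical, each domain vocabulary is modelled as a set of strings, a token counts as 'known' when it occurs in at least one domain vocabulary, and tokens are counted per occurrence. The final score is computed as the product (1/n) · (k/u) · v. The step 1 count and the step 3 weights w are returned separately, since this section does not show how they enter the final score. Returning infinity when u = 0 is also an assumption, as that case is not covered here.

```python
import math
from dataclasses import dataclass


@dataclass
class DomainScore:
    matches: int     # step 1: text tokens that are members of the domain D
    v: float         # step 2: 1 / sqrt(|D|)
    weighted: float  # step 3: sum of w = 1/d over the matching tokens
    u: int           # step 4: number of 'unknown' tokens
    k: int           # step 5: number of 'known' tokens
    n: int           # step 6: text length, n = u + k
    score: float     # (1/n) * (k/u) * v


def score_domain(tokens, domains, name):
    """Score the text `tokens` against the domain vocabulary `domains[name]`."""
    vocab = domains[name]
    n = len(tokens)

    def d(token):
        # Number of domain vocabularies that contain this token.
        return sum(1 for other in domains.values() if token in other)

    # Assumption: 'known' means the token is in at least one domain vocabulary.
    k = sum(1 for t in tokens if d(t) > 0)
    u = n - k

    matching = [t for t in tokens if t in vocab]
    v = 1.0 / math.sqrt(len(vocab)) if vocab else 0.0
    weighted = sum(1.0 / d(t) for t in matching)

    if n == 0:
        score = 0.0        # assumption: an empty text scores zero
    elif u == 0:
        score = math.inf   # assumption: k/u is undefined without unknown tokens
    else:
        score = (1.0 / n) * (k / u) * v

    return DomainScore(len(matching), v, weighted, u, k, n, score)


# Hypothetical toy data, for illustration only.
domains = {
    "med": {"dose", "drug", "cell", "gene"},
    "bio": {"cell", "gene", "leaf"},
}
tokens = ["dose", "cell", "gene", "xyz", "abc", "cell"]
result = score_domain(tokens, domains, "med")
print(result.k, result.u, result.n)  # 4 2 6
print(round(result.score, 4))        # 0.1667
```

In the toy example, four of the six tokens are known and two are unknown, and the "med" vocabulary has four entries, so v = 0.5 and the score is (1/6) · (4/2) · 0.5.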