EE3J2 Data Mining Lecture 3: Zipf's Law, Stemming & Stop Lists
Relating probability and rank
By setting these equal we can get an expression for p_k in
terms of r_k, which is what you need to compare with the
Zipf curve:
$$p_k = \frac{C}{(r_k + B)^{\alpha}}$$
This is of the same form as Zipf's law. The values of B and C
depend on M and α.
The value of α depends on M (see Belew, pages 150-152,
for details).
For M=26, α = 1.012, C = 0.02, B = 0.54<br />
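
The quoted constants can be checked numerically. The sketch below is a minimal illustration and not the lecture's own derivation. It assumes a random-typing model in which M letters and a space are all equally likely, so a particular k-letter word has probability (1/(M+1))^(k+1), and it assumes r_k is the mean rank of the k-letter words when shorter words are ranked first. Under those assumptions the closed forms for α, B and C used in `mandelbrot_constants` follow; the function names are hypothetical and chosen only for this example.

```python
import math


def mandelbrot_constants(M):
    """Return (alpha, B, C) for M letters plus a space.

    Assumption: all M + 1 characters are equally likely.
    """
    alpha = math.log(M + 1) / math.log(M)
    B = (M + 1) / (2 * (M - 1))
    C = B ** alpha / (M + 1)
    return alpha, B, C


def word_probability(M, k):
    """Probability of one particular k-letter word followed by a space."""
    return (1.0 / (M + 1)) ** (k + 1)


def mean_rank(M, k):
    """Mean rank of the k-letter words when shorter words are ranked first."""
    shorter = sum(M ** i for i in range(1, k))  # words with fewer than k letters
    first, last = shorter + 1, shorter + M ** k
    return (first + last) / 2


if __name__ == "__main__":
    M = 26
    alpha, B, C = mandelbrot_constants(M)
    print(f"alpha={alpha:.3f}  B={B:.2f}  C={C:.2f}")
    for k in (1, 2, 3):
        r_k = mean_rank(M, k)
        # The last two columns should agree: exact p_k and C / (r_k + B)^alpha.
        print(k, r_k, word_probability(M, k), C / (r_k + B) ** alpha)
```

For M = 26 the computed constants round to the values quoted on the slide, and the exact word probability matches C / (r_k + B)^α at each word length tested.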
EE3J2 Data Mining 2008 – lecture 3
Slide 18