04.05.2013 Views

EE3J2 Data Mining Lecture 3 Zipf's Law Stemming & Stop Lists



Relating probability and rank

By setting these equal we can get an expression for $p_k$ in terms of $r_k$, which is what you need to compare with the Zipf curve:

$$p_k = \frac{C}{(r_k + B)^{\alpha}}$$

This has the same form as Zipf's law. The values of B and C depend on M and α.

The value of α depends on M (see Belew, pages 150-152, for details).

For M = 26: α = 1.012, C = 0.02, B = 0.54
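To get a feel for these numbers, the expression can be evaluated at a few ranks. The following is a minimal sketch, not part of the lecture: the function names `rank_probability` and `zipf_probability` are hypothetical, and the plain Zipf curve p = C / r is drawn with the same constant C purely as an assumption for the comparison.

```python
def rank_probability(rank, C=0.02, B=0.54, alpha=1.012):
    """Evaluate p_k = C / (r_k + B)**alpha.

    The defaults are the values quoted on the slide for M = 26.
    """
    return C / (rank + B) ** alpha


def zipf_probability(rank, C=0.02):
    """Plain Zipf curve p = C / r.

    Reusing the same C here is an assumption made only for comparison.
    """
    return C / rank


if __name__ == "__main__":
    # Print both curves side by side at a few ranks.
    for rank in (1, 10, 100, 1000):
        p = rank_probability(rank)
        z = zipf_probability(rank)
        print(f"rank={rank:5d}  p_k={p:.6f}  zipf={z:.6f}  ratio={p / z:.3f}")
```

Because B is small and α is close to 1, the two curves differ most at the lowest ranks and come much closer together as the rank grows, which is what "the same form as Zipf" suggests.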

EE3J2 Data Mining 2008 – lecture 3

Slide 18
