31.07.2013 Views

Jure Leskovec, Stanford University - SNAP - Stanford University


Filtering a stream:

Select elements with property x from the stream

Bloom filters
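To make the filtering step concrete, here is a minimal Bloom filter sketch in Python. It is an illustration under stated assumptions, not the lecture's implementation: the class name `BloomFilter`, the helper `filter_stream`, the use of salted SHA-256 as the hash family, and the parameters `n_bits` and `n_hashes` are all hypothetical choices. An item passes only if every bit it hashes to is set, so members of S are never dropped, while non-members may occasionally slip through as false positives.

```python
import hashlib


class BloomFilter:
    """Bit array of n_bits bits, probed by n_hashes hash functions."""

    def __init__(self, n_bits, n_hashes=1):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits)  # one byte per bit, for readability

    def _positions(self, item):
        # Assumed hash family: SHA-256 salted with the hash index.
        for i in range(self.n_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        """Set the bits for an item that belongs to the set S."""
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        """False means surely not in S; True means possibly in S."""
        return all(self.bits[pos] for pos in self._positions(item))


def filter_stream(stream, bloom):
    """Yield items that may be in S; drop items that are surely not."""
    for item in stream:
        if bloom.might_contain(item):
            yield item
```

A usage pattern would be to call `add` once for every key in S, then pass the stream through `filter_stream`. Larger `n_bits` lowers the false-positive rate at the cost of memory.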

Counting distinct elements:

Number of distinct elements in the last k elements of the stream
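For reference, the quantity being asked for can be computed exactly with a sliding window. The sketch below is only a baseline to clarify the problem statement; the function name `distinct_in_last_k` is hypothetical. It stores all k elements of the window, which is the memory cost that streaming estimators try to avoid.

```python
from collections import Counter, deque


def distinct_in_last_k(stream, k):
    """After each arrival, yield the exact number of distinct
    elements among the last k elements of the stream."""
    window = deque()    # the last k elements, oldest first
    counts = Counter()  # multiplicity of each element in the window
    for item in stream:
        window.append(item)
        counts[item] += 1
        if len(window) > k:
            old = window.popleft()
            counts[old] -= 1
            if counts[old] == 0:
                del counts[old]
        yield len(counts)
```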

[Figure (Bloom filter): an item is passed through hash func h to a position in bit array B (0010001011000). Two outcomes are shown: drop the item, or output the item since it may be in S.]

Flajolet-Martin:

For each item a, let r(a) be the number of trailing 0s in h(a)

Record R = the maximum r(a) seen

$R = \max_a r(a)$, over all the items a seen so far

Estimated number of distinct elements = $2^R$
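The Flajolet-Martin steps above can be sketched in a few lines of Python. This is a single-hash illustration under assumptions: `hash64` (the first 64 bits of SHA-256) is a hypothetical stand-in for h, and the names `trailing_zeros` and `fm_estimate` are not from the slides.

```python
import hashlib


def hash64(item):
    """Assumed stand-in for h: the first 64 bits of SHA-256."""
    digest = hashlib.sha256(str(item).encode()).digest()
    return int.from_bytes(digest[:8], "big")


def trailing_zeros(x, width=64):
    """r(a): number of trailing 0 bits in x (width if x == 0)."""
    if x == 0:
        return width
    return (x & -x).bit_length() - 1


def fm_estimate(stream):
    """Return 2**R, where R is the maximum r(a) seen so far."""
    R = 0
    for a in stream:
        R = max(R, trailing_zeros(hash64(a)))
    return 2 ** R
```

Because the estimate depends only on the maximum, repeated items never change it, and the state kept is a single small integer. A single hash function gives a coarse, power-of-two estimate; in practice several independent hash functions are usually combined to reduce the variance.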

3/9/2011 Jure Leskovec, Stanford C246: Mining Massive Datasets 34
