08.10.2016 Views

Foundations of Data Science


[Figure 4.6 labels: "No items"; "At least one occurrence of item in 10% of the graphs"; "For 10% of the graphs, x ≥ 1"; "E(x) ≥ 0.1"]

Figure 4.6: If the expected fraction of the number of graphs in which an item occurs did not go to zero, then $E(x)$, the expected number of items per graph, could not be zero. Suppose 10% of the graphs had at least one occurrence of the item. Then the expected number of occurrences per graph must be at least 0.1. Thus, $E(x) \to 0$ implies the probability that a graph has an occurrence of the item goes to zero. However, the other direction needs more work. If $E(x)$ is large, a second moment argument is needed to conclude that the probability that a graph picked at random has an occurrence of the item is non-negligible, since there could be a large number of occurrences concentrated on a vanishingly small fraction of all graphs. The second moment argument claims that for a nonnegative random variable $x$ with $E(x) > 0$, if $\mathrm{Var}(x)$ is $o\bigl(E^2(x)\bigr)$, or alternatively if $E(x^2) \le E^2(x)\,(1 + o(1))$, then almost surely $x > 0$.
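The caption's warning that many occurrences can be concentrated on a vanishingly small fraction of graphs can be made concrete with a toy random variable. The sketch below is an illustration, not an example from the text: it assumes a hypothetical variable that equals n² with probability 1/n and is zero otherwise, and the function name `spike_moments` is invented for this purpose. Its expectation grows like n while the probability of being positive shrinks like 1/n, and the ratio Var(x)/E²(x) grows instead of vanishing, so the second moment condition fails.

```python
from fractions import Fraction


def spike_moments(n):
    """Hypothetical variable: x = n**2 with probability 1/n, else x = 0."""
    p_positive = Fraction(1, n)            # Prob(x > 0)
    mean = p_positive * n**2               # E(x) = n
    second_moment = p_positive * n**4      # E(x^2) = n^3
    variance = second_moment - mean**2     # Var(x) = n^3 - n^2
    ratio = variance / mean**2             # Var(x) / E^2(x) = n - 1
    return mean, p_positive, ratio


for n in (10, 100, 1000):
    mean, p_positive, ratio = spike_moments(n)
    print(f"n={n}: E(x)={mean}, Prob(x>0)={p_positive}, Var(x)/E^2(x)={ratio}")
```

Exact fractions are used so that the three quantities can be compared without floating-point noise.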

latter case uses what we call the second moment method. The first and second moment methods are broadly used. We describe the second moment method in some generality now.

When the expected value of $x(n)$, the number of occurrences of an item, goes to infinity, we cannot conclude that a graph picked at random will likely have a copy since the items may all appear on a vanishingly small fraction of the graphs. We resort to a technique called the second moment method. It is a simple idea based on Chebyshev's inequality.

Theorem 4.3 (Second Moment method) Let $x(n)$ be a random variable with $E(x) > 0$. If

$$\mathrm{Var}(x) = o\bigl(E^2(x)\bigr),$$

then $x$ is almost surely greater than zero.
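As a quick sanity check of the hypothesis, the sketch below evaluates the ratio Var(x)/E²(x) for a binomial variable and compares it with the exact probability that the variable is zero. The binomial choice and the function name `binomial_second_moment_check` are assumptions made for illustration only; they do not come from the text. For fixed p the ratio equals (1 − p)/(np), which goes to zero as n grows, and Prob(x = 0) = (1 − p)ⁿ stays below it.

```python
def binomial_second_moment_check(n, p):
    """For x ~ Binomial(n, p), return (Prob(x = 0), Var(x) / E^2(x))."""
    mean = n * p                           # E(x)
    variance = n * p * (1 - p)             # Var(x)
    ratio = variance / mean**2             # equals (1 - p) / (n * p)
    prob_zero = (1 - p) ** n               # exact Prob(x = 0)
    return prob_zero, ratio


for n in (10, 100, 1000):
    prob_zero, ratio = binomial_second_moment_check(n, 0.1)
    print(f"n={n}: Prob(x=0)={prob_zero:.3e}, Var(x)/E^2(x)={ratio:.3e}")
```

The ratio is a loose upper bound here; the point is only that it tends to zero, which forces Prob(x = 0) to tend to zero as well.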

Proof: If $E(x) > 0$, then for $x$ to be less than or equal to zero, it must differ from its expected value by at least its expected value. Thus,

$$\mathrm{Prob}(x \le 0) \le \mathrm{Prob}\bigl(|x - E(x)| \ge E(x)\bigr).$$
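One way the bound can be finished is sketched below, assuming Chebyshev's inequality in its usual form $\mathrm{Prob}(|x - E(x)| \ge a) \le \mathrm{Var}(x)/a^2$ and taking $a = E(x)$. This is the standard textbook step, offered as a sketch under that assumption.

```latex
\begin{align*}
\mathrm{Prob}(x \le 0)
  &\le \mathrm{Prob}\bigl(|x - E(x)| \ge E(x)\bigr) \\
  &\le \frac{\mathrm{Var}(x)}{E^2(x)}
     && \text{(Chebyshev with } a = E(x)\text{)} \\
  &= o(1)
     && \text{(since } \mathrm{Var}(x) = o\bigl(E^2(x)\bigr)\text{)}.
\end{align*}
```

Under the hypothesis of the theorem the right-hand side tends to zero, so $x > 0$ almost surely.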
