08.10.2016 Views

Foundations of Data Science


same or are disjoint,

\[
\begin{aligned}
E(x^2) &= E\Big(\Big(\sum_{i=1}^{n} x_i\Big)^{2}\Big) = \sum_{i,j} E(x_i x_j) = \sum_{i} E(x_i^2) + \sum_{i \neq j} E(x_i x_j) \\
&= E(x) + \sum_{i \neq j} \sum_{S} \text{Prob}\big(cc(i) = cc(j) = S\big) + \sum_{i \neq j} \sum_{\substack{S,T \\ \text{disjoint}}} \text{Prob}\big(cc(i) = S;\ cc(j) = T\big) \\
&= E(x) + \sum_{i \neq j} \sum_{S:\, i,j \in S} \text{Prob}(S \text{ is a c.c.}) + \sum_{i \neq j} \sum_{\substack{S,T:\, i \in S,\, j \in T \\ \text{disjoint}}} \text{Prob}(S, T \text{ each a c.c.}) \\
&= E(x) + \sum_{i \neq j} \sum_{S:\, i,j \in S} \text{Prob}(S \text{ is a c.c.}) \\
&\quad + \sum_{i \neq j} \sum_{\substack{S,T:\, i \in S,\, j \in T \\ \text{disjoint}}} \text{Prob}(S \text{ is a c.c.})\, \text{Prob}(T \text{ is a c.c.})\, (1-p)^{-|S||T|} \\
&\le O(n) + (1-p)^{-|S||T|} \Big(\sum_{S} \text{Prob}\big(cc(i) = S\big)\Big) \Big(\sum_{T} \text{Prob}\big(cc(j) = T\big)\Big) \\
&\le O(n) + \big(1 + o(1)\big) E(x) E(x).
\end{aligned}
\]
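The factor (1 − p)^(−|S||T|) in the next-to-last line of the derivation can be sanity-checked by brute force on a tiny graph. The following is a minimal sketch, not part of the original text: it enumerates every graph on n = 5 vertices, weights each by its G(n, p) probability, and compares Prob(S and T are both components) with Prob(S is a component) · Prob(T is a component) · (1 − p)^(−|S||T|). The function names `is_component` and `exact_probabilities`, and the choices n = 5, p = 0.3, S = {0, 1}, T = {2, 3}, are illustrative assumptions.

```python
from itertools import combinations, product


def is_component(S, edges):
    """Return True if vertex set S is exactly one connected component."""
    S = set(S)
    # No edge may join a vertex of S to a vertex outside S.
    for u, v in edges:
        if (u in S) != (v in S):
            return False
    # S must be connected: depth-first search from any vertex of S.
    start = next(iter(S))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for a, b in edges:
            if a == u and b not in seen:
                seen.add(b)
                stack.append(b)
            elif b == u and a not in seen:
                seen.add(a)
                stack.append(a)
    return seen == S


def exact_probabilities(n, p, S, T):
    """Exact Prob(S is a c.c.), Prob(T is a c.c.), Prob(both) in G(n, p)."""
    pairs = list(combinations(range(n), 2))
    prob_S = prob_T = prob_both = 0.0
    # Enumerate all 2^(n choose 2) graphs; feasible only for very small n.
    for mask in product([0, 1], repeat=len(pairs)):
        edges = [e for e, bit in zip(pairs, mask) if bit]
        weight = p ** len(edges) * (1 - p) ** (len(pairs) - len(edges))
        s_ok = is_component(S, edges)
        t_ok = is_component(T, edges)
        prob_S += weight * s_ok
        prob_T += weight * t_ok
        prob_both += weight * (s_ok and t_ok)
    return prob_S, prob_T, prob_both


# Illustrative parameters (assumptions, not taken from the text).
n, p = 5, 0.3
S, T = (0, 1), (2, 3)
prob_S, prob_T, prob_both = exact_probabilities(n, p, S, T)
predicted = prob_S * prob_T * (1 - p) ** (-len(S) * len(T))
print(prob_both, predicted)
```

The two printed numbers should agree up to floating-point rounding. The product Prob(S is a c.c.) · Prob(T is a c.c.) counts the absence of each of the |S||T| edges between S and T twice, and the correction factor removes the double count.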

In the next to last line, if S containing i and T containing j are disjoint sets, then the two events, S is a connected component and T is a connected component, depend on disjoint sets of edges except for the $|S||T|$ edges between S vertices and T vertices. Let $c_4$ be a constant in the interval $(c_3, 1)$. Then, by Chebyshev's inequality,

\[
\text{Prob}(x > c_4 n) \le \frac{\text{Var}(x)}{(c_4 - c_3)^2 n^2} \le \frac{O(n) + o(1)\, c_3^2 n^2}{(c_4 - c_3)^2 n^2} = o(1).
\]
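The numerator in the Chebyshev bound comes from the second-moment estimate in one step. The following is a sketch of that step; it assumes, as the term $o(1) c_3^2 n^2$ suggests, that $E(x) \le c_3 n$ was established earlier in the argument:

```latex
\[
\text{Var}(x) = E(x^2) - E(x)^2
\le O(n) + \big(1 + o(1)\big) E(x)^2 - E(x)^2
= O(n) + o(1)\, E(x)^2
\le O(n) + o(1)\, c_3^2 n^2 .
\]
```

Under the same assumption, $x > c_4 n$ forces $x - E(x) > (c_4 - c_3) n$, which is why the denominator is $(c_4 - c_3)^2 n^2$. Dividing through gives $O(1/n) + o(1) = o(1)$.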

For the proof of (3), suppose a pair of vertices u and v belong to two different connected components, each of size at least $n^{2/3}$. We show that with high probability, they should have merged into one component, producing a contradiction. First, run the breadth first search process starting at v for $\frac{1}{2} n^{2/3}$ steps. Since v is in a connected component of size $n^{2/3}$, there are $\Omega(n^{2/3})$ frontier vertices. The expected size of the frontier continues to grow until some constant times n, and the actual size of the frontier does not differ significantly from the expected size. The size of the component also grows linearly with n. Thus, the frontier is of size $n^{2/3}$. See Exercise 4.22. By the assumption, u does not belong to this connected component. Now, temporarily stop the breadth first search tree of v and begin a breadth first search tree starting at u, again for $\frac{1}{2} n^{2/3}$ steps. It is important to understand that this change of order of building G(n, p) does not change the resulting graph. We can choose edges in any order since the order does not affect independence or conditioning. The breadth first search tree from u also will have $\Omega(n^{2/3})$ frontier vertices with high probability. Now grow the u tree further. The probability that none of the
