The condensation transition in random hypergraph 2-coloring
mutually at Hamming distance at least $0.48n$. (Indeed, inductively choose $S_i$ to be the local cluster $C_{0.01}(\sigma)$ of some 2-coloring $\sigma \notin \bigcup_{j<i} S_j$ [...]

[...] $r > r_{\mathrm{cond}} + o_k(1)$, we have

(1.6) $\exp\left(-\frac{kr}{2^{k-1}-1}\right) \ln 2 > \ln 2 + r \ln(1 - 2^{1-k})$.

In effect, we cannot extrapolate the evolution of the total number $Z$ of 2-colorings and the median cluster size from (1.4) and (1.5) to the regime $r > r_{\mathrm{cond}}$, because then (1.6) would lead to the absurd conclusion that the size of a typical cluster is greater than the total number of 2-colorings. Hence, $\ln Z$ and the median cluster size cannot both trace the functions on the r.h.s. of (1.4) and (1.5) beyond $r_{\mathrm{cond}} + o_k(1)$.

As our next theorem will show, it is indeed the case that $\ln Z$ follows a different trajectory than (1.4) beyond $r_{\mathrm{cond}} + o_k(1)$. This means that for $r > r_{\mathrm{cond}} + o_k(1)$ it will no longer be true that $\ln Z \sim \ln \mathrm{E}[Z]$ w.h.p. In other words, beyond $r_{\mathrm{cond}} + o_k(1)$, the expected number $\mathrm{E}[Z]$ of 2-colorings is indeed driven up excessively by a tiny minority of hypergraphs with an abundance of 2-colorings.

Theorem 1.2. There exist a constant $k_0 \ge 3$ and a sequence $\varepsilon_k \to 0$ such that for any $k \ge k_0$ there are $\delta_k > 0$, $\zeta_k > 0$ such that the following two statements are true.
1. W.h.p. $H_k(n, m)$ is 2-colorable for all $r < r_{\mathrm{cond}} + \varepsilon_k + \delta_k$.
2. For any density $r$ with $r_{\mathrm{cond}} + \varepsilon_k < r < r_{\mathrm{col}}$ we have

(1.7) $\ln Z < \ln \mathrm{E}[Z] - \zeta_k n$ w.h.p.

The second statement asserts that for densities between $r_{\mathrm{cond}} + \varepsilon_k$ and the actual (unknown) 2-colorability threshold $r_{\mathrm{col}}$, the expected number $\mathrm{E}[Z]$ of 2-colorings exceeds the actual number $Z$ by an exponential factor $\exp(\zeta_k n)$ w.h.p.
This contrasts with Theorem 1.1, which shows that below $r_{\mathrm{cond}}$, $Z$ is of the same exponential order as $\mathrm{E}[Z]$ w.h.p. Furthermore, the first part of Theorem 1.2 ensures that the regime of densities where (1.7) holds is non-empty, as the true threshold $r_{\mathrm{col}}$ is indeed strictly greater than $r_{\mathrm{cond}} + \varepsilon_k$.

In mathematical physics, the term 'phase transition' is usually defined as a point where the function

$F(r) = \lim_{n \to \infty} \frac{1}{n} \mathrm{E}[\ln(1 + Z)]$

is non-analytic. With $Z$ the number of 2-colorings of $H_k(n, m)$, it is not currently known whether the limit $F(r)$ exists.² But if it does, then Theorems 1.1 and 1.2 imply that around some $r = r_{\mathrm{cond}} + o_k(1)$, the function $F$ is non-analytic (because for $r < r_{\mathrm{cond}}$, $F(r)$ coincides with the linear function $\lim_{n \to \infty} \frac{1}{n} \ln \mathrm{E}[Z] = \ln 2 + r \ln(1 - 2^{1-k})$). In any case, what we see is that the sequence of functions $r \mapsto \frac{1}{n} \mathrm{E}[\ln(1 + Z)]$ does not converge to an analytic limit in an interval of length $o_k(1)$ around $r_{\mathrm{cond}}$. In this sense, Theorem 1.2 establishes the existence of a phase transition "near" $r_{\mathrm{cond}}$, which we call the condensation transition. Its existence was predicted on the basis of non-rigorous statistical mechanics arguments [10, 18]. We emphasize that the existence of the limit $F(r)$ for $r_{\mathrm{cond}} < r < r_{\mathrm{col}}$ remains an important open problem.

² Bayati, Gamarnik and Tetali [8] proved that the limit exists if $Z$ is the partition function at any fixed positive temperature, but not that it does at zero temperature, which is the case we consider here.

Copyright © SIAM. Unauthorized reproduction of this article is prohibited.
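Both sides of inequality (1.6) are explicit functions of $k$ and $r$, so the density at which the inequality starts to hold can be located numerically. The following Python sketch is an illustration only: the function names, the choice $k = 10$ and the bisection bracket are assumptions made for this example, and the $o_k(1)$ corrections in the text are ignored.

```python
import math


def cluster_exponent(k: int, r: float) -> float:
    """Left-hand side of (1.6): exp(-k*r / (2^(k-1) - 1)) * ln 2."""
    return math.exp(-k * r / (2 ** (k - 1) - 1)) * math.log(2)


def first_moment_exponent(k: int, r: float) -> float:
    """Right-hand side of (1.6): ln 2 + r * ln(1 - 2^(1-k))."""
    return math.log(2) + r * math.log1p(-(2.0 ** (1 - k)))


def crossing_density(k: int) -> float:
    """Bisect for the positive density at which both sides of (1.6) agree.

    Both sides equal ln 2 at r = 0, so the bracket starts at r = 1 (an
    assumption that is checked below) and ends where the right-hand side
    vanishes, at which point the left-hand side is still positive.
    """
    def gap(r: float) -> float:
        return cluster_exponent(k, r) - first_moment_exponent(k, r)

    lo = 1.0
    hi = math.log(2) / -math.log1p(-(2.0 ** (1 - k)))
    assert gap(lo) < 0 < gap(hi), "bracket does not enclose the crossing"
    for _ in range(200):  # fixed iteration count, far beyond float precision
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    k = 10
    r_star = crossing_density(k)
    print(f"k = {k}: (1.6) starts to hold near r = {r_star:.4f}")
    print(f"offset from 2^(k-1) ln 2: {r_star - 2 ** (k - 1) * math.log(2):.4f}")
```

Because the left-hand side is convex in $r$ and the right-hand side is linear, there is exactly one positive crossing, which is why a plain bisection suffices here.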
The term 'condensation' is meant to express that w.h.p. the set $S(H_k(n, m))$ of all 2-colorings has a drastically different shape than in the 'shattered' regime of Corollary 1.1. To spell this out, let us call a 2-coloring $\sigma$ of a hypergraph $H$ on $n$ vertices $(\alpha, \beta, \gamma)$-condensed if

CO1. There is no 2-coloring $\tau \in S(H)$ with $\alpha n < \mathrm{dist}(\sigma, \tau) < \beta n$.
CO2. The set $C_\alpha(\sigma)$ of all 2-colorings $\tau \in S(H)$ with $\mathrm{dist}(\sigma, \tau) \le \alpha n$ has size $|C_\alpha(\sigma)| \ge \exp(-\gamma n) Z(H)$.

(The difference between SH1–SH2 and the above is that CO2 imposes a lower bound on $|C_\alpha(\sigma)|$.)

Corollary 1.2. There exist a constant $k_0 \ge 3$ and a sequence $\varepsilon_k \to 0$ such that for any $k \ge k_0$ there exists a sequence $r(n)$ of densities satisfying $|r(n) - r_{\mathrm{cond}}| \le \varepsilon_k$ such that $H_k(n, m)$ with $m = r(n) \cdot n$ has the following two properties w.h.p.
1. $H_k(n, m)$ is 2-colorable.
2. A random 2-coloring $\sigma \in S(H_k(n, m))$ is $(0.01, 0.49, o(1))$-condensed w.h.p.

This means that at a particular density $r(n)$ the size of the local cluster of a 'typical' 2-coloring $\sigma$ of $H_k(n, m)$ satisfies $\ln |C_{0.01}(\sigma)| \sim \ln Z$ w.h.p. In other words, the size of the cluster of a 'typical' 2-coloring has the same exponential order as the set of all 2-colorings. This contrasts with the 'shattered' scenario of Corollary 1.1, where w.h.p. all clusters only comprise an exponentially small fraction of the entire set $S(H_k(n, m))$. The (non-rigorous) statistical physics work [10, 18] predicts that the conclusions of Corollary 1.2 should hold in the entire regime between the condensation transition and $r_{\mathrm{col}}$.

Discussion. The significance of the slightly better lower bound on the threshold for hypergraph 2-colorability provided by Theorem 1.1 is that it allows us to prove the existence of the condensation transition. Beyond the condensation transition, the combinatorial nature of the problem seems to become more complicated. To see why, consider the following random experiment with $r < r_{\mathrm{col}}$ (so that $H_k(n, m)$ is 2-colorable w.h.p.).

G1. Choose a random hypergraph $H = H_k(n, m)$, conditional on $H$ being 2-colorable.
G2. Choose a 2-coloring $\sigma \in S(H)$ uniformly at random and output $(H, \sigma)$.

The above experiment induces a probability distribution $g_{k,n,m}$ on the set $\Lambda_k(n, m)$ of hypergraph/2-coloring pairs that we call the Gibbs distribution. For $r < r_{\mathrm{col}}$ the experiment corresponds to sampling a random 2-coloring of a random hypergraph, and thus understanding the above experiment is key to studying the combinatorial nature of the hypergraph 2-colorability problem.

But the experiment seems genuinely difficult to analyze. In fact, even for densities $r = O(2^{k-1}/k)$ far below the threshold for 2-colorability, it is not currently known how to efficiently construct, let alone sample, a 2-coloring of a random hypergraph [3].

But there is a related experiment called the planted model that is rather easy to implement and to study.

P1. Choose $\sigma \in \{0, 1\}^n$ uniformly at random.
P2. Choose a hypergraph $H = H_k(n, m, \sigma)$ with $m$ edges uniformly at random among all hypergraphs for which $\sigma$ is a 2-coloring, and output $(H, \sigma)$.

Let $p_{k,n,m}$ denote the distribution on $\Lambda_k(n, m)$ induced by P1–P2. It is not difficult to show that prior to the condensation phase, the distributions induced by the two experiments are 'close'.

Proposition 1.1. ([1]) Let $\mathcal{Z}$ be the event $\{\ln Z \sim \ln \mathrm{E}[Z]\}$. If $r < r_{\mathrm{first}}$ is a density such that $H_k(n, m) \in \mathcal{Z}$ w.h.p., then

(1.8) $\ln(g_{k,n,m}[B \mid \mathcal{Z}]) \le \ln(p_{k,n,m}[B]) + o(n)$

for any event $B$.

The relationship (1.8) allows us to bound the probability of some 'bad' event $B$ in the Gibbs distribution by estimating its probability in the planted distribution. Indeed, Proposition 1.1 was used in [1] to study various properties of 'typical' 2-colorings of $H_k(n, m)$. In combination with Theorem 1.1 and the methods of [1], Proposition 1.1 can be used to get a pretty good idea of what a 2-coloring of the random hypergraph $H_k(n, m)$ "typically looks like" before the condensation transition.

But beyond the condensation transition, matters appear to be more complicated. As Theorem 1.2 shows, in the condensed regime we have $\ln Z < \ln \mathrm{E}[Z] - \Omega(n)$ w.h.p., i.e., the assumption of Proposition 1.1 is violated. Roughly speaking, the gap $\ln Z < \ln \mathrm{E}[Z] - \Omega(n)$ implies that a pair chosen from the planted distribution P1–P2 corresponds to a pair chosen from the Gibbs distribution only with exponentially small probability. In fact, for densities beyond the condensation transition our proof of Theorem 1.2 exhibits an event $B$ for which (1.8) is violated, i.e., the planted model is no longer a good approximation to the Gibbs distribution. Furthermore, the statistical mechanics cavity technique suggests that getting a handle on the Gibbs measure (or other related measures) is far more complicated in
the condensation phase. Overcoming this obstacle appears to be the remaining challenge to obtain the precise threshold for hypergraph 2-colorability. The statistical mechanics reasoning [10, 18] suggests

Conjecture 1. There is $\varepsilon_k \to 0$ such that $r_{\mathrm{col}} \sim 2^{k-1} \ln 2 - \left(\frac{\ln 2}{2} + \frac{1}{4}\right) + \varepsilon_k$.

One limitation of our approach is that we need to assume that $k \ge k_0$ is sufficiently big (whereas the standard second moment argument [4] applies to any $k \ge 3$). We need the lower bound on $k$ to carry out a sufficiently accurate analysis of the combinatorial structure of the solution space $S(H_k(n, m))$. No attempt has been made to compute (let alone optimize) $k_0$ or the various other constants.

2 Related work

The two inequalities in (1.2) state the best previous bounds on the threshold for hypergraph 2-colorability from the paper of Achlioptas and Moore [4], which provided the prototype for the second moment analyses in other sparse random CSPs (e.g., [5, 6]). Since the second moment method is non-constructive, there is the separate algorithmic question: for what densities can a 2-coloring of a random hypergraph be constructed in polynomial time w.h.p.? The best current algorithm is known to succeed up to $r = c \cdot 2^{k-1}/k$ for some constant $c > 0$, i.e., up to a factor of about $k$ below the 2-colorability threshold [3].

In [1] the geometry of the set $S(H_k(n, m))$ of 2-colorings of the random hypergraph was investigated (among other things).
It was shown that $S(H_k(n, m))$ shatters into exponentially small well-separated 'clusters' for densities $(1 + \varepsilon_k) 2^{k-1} \ln(k)/k < r < r_{\mathrm{second}}$. Corollary 1.1 extends this picture up to $r_{\mathrm{cond}}$. In addition, [1] also proved that in the regime $(1 + \varepsilon_k) 2^{k-1} \ln(k)/k < r < r_{\mathrm{second}}$ a typical 2-coloring $\sigma$ of $H_k(n, m)$ is rigid w.h.p. in the sense that for most vertices $v$ any 2-coloring $\tau$ with $\sigma(v) \ne \tau(v)$ has Hamming distance $\Omega(n)$ from $\sigma$. Our analysis, most notably the study of the structure of a typical 'local cluster' in Section 4, builds substantially on the concepts of shattering and rigidity from [1], but we elaborate them in considerably more detail to get close quantitative estimates.

In many random CSPs other than random hypergraph 2-coloring the best current bounds on the thresholds for the existence of solutions derive from the second moment method as well. The most prominent examples are random graph $k$-coloring [5] and random $k$-SAT [6]. But the second moment argument extends naturally to a range of 'symmetric' random CSPs [19]. It would be interesting to see if/how our techniques can be generalized in order to prove the existence of a condensation phase in these other problems, particularly random graph $k$-coloring.
However, since even the standard second moment analysis is quite involved in random graph $k$-coloring, such a generalization will be technically challenging.

The random $k$-SAT problem is conceptually different because it is not 'symmetric'. More precisely, in random hypergraph 2-coloring the inverse $1 - \sigma$ of a 2-coloring $\sigma$ is a 2-coloring as well. This symmetry, which greatly simplifies the second moment argument, is absent in random $k$-SAT. As a consequence, as elaborated in [4, 6], in $k$-SAT the bound $\mathrm{E}[Z^2] = O(\mathrm{E}[Z]^2)$ does not hold for any density. Roughly speaking, to overcome this problem [6] focuses on a special type of satisfying assignments ("balanced" ones), whose number $Z_*$ satisfies $\mathrm{E}[Z_*^2] = O(\mathrm{E}[Z_*]^2)$. Technically, this is accomplished by weighting satisfying assignments cleverly. While our techniques can be extended easily to establish the existence of a condensation phase for these balanced satisfying assignments in random $k$-SAT, this does not imply that condensation occurs with respect to the bigger set of all satisfying assignments.
This would require a new approach for the direct analysis of the total number of satisfying assignments in random $k$-SAT.

We emphasize that our techniques are quite different from the 'weighted' second moment method in [6]. Indeed, the 'asymmetry' that motivated the weighting scheme in [6] is absent in random hypergraph 2-coloring. Instead of weighting, we employ a new idea that exploits the combinatorial structure of the 'clusters' into which the set $S(H_k(n, m))$ of 2-colorings decomposes.

An example of a random CSP in which the precise threshold for the existence of solutions is known is random $k$-XORSAT. In this problem a second moment argument yields the precise thresholds (after 'pruning' the underlying hypergraph) [12, 21]. The explanation for this success is that random $k$-XORSAT does not have a condensation phase due to the algebraic nature of the problem. Similarly, in random $k$-SAT with $k > \log_2 n$ (i.e., the clause length is growing with $n$) there is no condensation phase and, in effect, the second moment method yields the precise satisfiability threshold [9, 15]. A further class of problems where the condensed phase is conjectured to be empty are the 'locked' problems of [24].

In statistical mechanics the condensation transition was first predicted (using non-rigorous techniques) for the random $k$-SAT and the random graph $k$-coloring problems [18].
For random hypergraph 2-coloring the statistical mechanics prediction for the condensation threshold was derived in [10]. The structure of the condensed phase is described using a non-rigorous framework called one-step replica symmetry breaking. Interestingly, it was also conjectured that the structure of the condensed phase for large $k$ is very similar to the structure of the random subcube model [20].³ Our proofs verify this for random hypergraph 2-coloring; see Section 4 for more details.

Random CSPs, including random hypergraph 2-coloring, have been studied in statistical mechanics as models of disordered systems (such as glasses) under the name 'diluted mean field models'. In this context the condensation transition corresponds to the so-called Kauzmann transition [17]. The present paper provides the first rigorous proof that this phase transition actually exists in a 'diluted mean field model'.

3 Reaching the condensation transition: proof of Theorem 1.1

In the rest of this paper, we assume that $k \ge k_0$ for some large enough constant $k_0$. Moreover, to avoid floor and ceiling signs, we assume that $n$ is even.

The vanilla second moment argument. We begin by briefly reviewing the 'vanilla' second moment method from [4]. This will provide the background for the enhanced second moment argument that yields Theorem 1.1. As a first step, we need to work out the expected number $\mathrm{E}[Z]$ of 2-colorings. Using the linearity of the expectation, it is easy to obtain

Lemma 3.1. We have $\mathrm{E}[Z] \sim 2^n (1 - 2^{1-k})^m$.

Our goal is to identify the regime of densities $r$ where $\mathrm{E}[Z^2] = O(\mathrm{E}[Z]^2)$, i.e., where the second moment method 'works'.
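The factor $1 - 2^{1-k}$ in Lemma 3.1 can be read as the probability that a single random edge is bichromatic under a fixed coloring. The Python sketch below illustrates this under two assumptions that are made only for this example: each edge is a uniformly random $k$-element subset of the $n$ vertices, and the coloring has exactly $n/2$ vertices of each color. The function names are hypothetical.

```python
from math import comb, log


def bichromatic_probability(n: int, k: int) -> float:
    """Probability that a uniformly random k-subset of n vertices contains
    both colors, for a coloring with n/2 vertices of each color (n even)."""
    half = n // 2
    monochromatic = 2 * comb(half, k)  # all k vertices inside one color class
    return 1 - monochromatic / comb(n, k)


def first_moment_exponent(k: int, r: float) -> float:
    """Per-vertex exponent (1/n) ln(2^n (1 - 2^(1-k))^m) with m = r * n."""
    return log(2) + r * log(1 - 2.0 ** (1 - k))


if __name__ == "__main__":
    k = 5
    for n in (40, 400, 4000, 10 ** 6):
        print(n, bichromatic_probability(n, k), 1 - 2.0 ** (1 - k))
    print(first_moment_exponent(k, 5.0))
```

For small $n$ the exact probability is noticeably larger than $1 - 2^{1-k}$, because the $k$ vertices of an edge are drawn without repetition; the two values agree in the limit $n \to \infty$ with $k$ fixed, which is the regime of the lemma.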
A technical issue is that $Z$ includes 2-colorings $\sigma$ whose color classes have (very) different sizes. To simplify our calculations we are going to confine ourselves to colorings $\sigma$ whose color classes $\sigma^{-1}(0)$, $\sigma^{-1}(1)$ have the same size. More precisely, let us call $\sigma : V \to \{0, 1\}$ equitable if $|\sigma^{-1}(0)| = |\sigma^{-1}(1)| = \frac{n}{2}$, and let $Z_e$ be the number of equitable 2-colorings of $H_k(n, m)$. Using Stirling's formula and, once more, the linearity of the expectation, it is not difficult to compute $\mathrm{E}[Z_e]$: we have

(3.9) $\mathrm{E}[Z_e] \sim \sqrt{\frac{2}{\pi}} \cdot \frac{2^n}{\sqrt{n}} (1 - 2^{1-k})^m = \Theta(1/\sqrt{n}) \cdot \mathrm{E}[Z]$.

Now, for what $r$ do we have $\mathrm{E}[Z_e^2] = O(\mathrm{E}[Z_e]^2)$? We use the following elementary relation.

Fact 3.1. For any equitable $\sigma : V \to \{0, 1\}$ we have $\mathrm{E}[Z_e^2] = \mathrm{E}[Z_e] \cdot \mathrm{E}[Z_e \mid \sigma \text{ is a 2-coloring}]$.

Thus, we need to compute $\mathrm{E}[Z_e \mid \sigma \text{ is a 2-coloring}]$. In other words, for a fixed equitable $\sigma \in \{0, 1\}^n$ we need to study the random hypergraph $H_k(n, m)$ given that $\sigma$ is a 2-coloring. This conditional distribution can be expressed easily: just choose a set of $m$ edges uniformly at random from all edges that are bichromatic under $\sigma$ (cf. step P2 of the 'planted model' above). Let $H_k(n, m, \sigma)$ denote the resulting random hypergraph. Furthermore, given $\sigma$, let $Z_e(d)$ be the number of equitable 2-colorings $\tau$ with Hamming distance $\mathrm{dist}(\sigma, \tau) = d$. Similarly, let $Z(d)$ be the total number of 2-colorings $\tau$ with $\mathrm{dist}(\sigma, \tau) = d$. Then

(3.10) $\mathrm{E}[Z_e \mid \sigma \text{ is a 2-coloring}] = \sum_{d=0}^{n} \mathrm{E}_{H_k(n,m,\sigma)}[Z_e(d)]$.

Fact 3.2. ([4]) For any $0 < d < n$ we have

$\mathrm{E}_{H_k(n,m,\sigma)}[Z(d)] = \Theta\left(\sqrt{n/(d \cdot (n - d))}\right) \cdot \exp(n \psi(d/n))$,
$\mathrm{E}_{H_k(n,m,\sigma)}[Z_e(d)] = \Theta\left(n/(d \cdot (n - d))\right) \cdot \exp(n \psi(d/n))$,

with

$\psi = \psi_{k,r} : (0, 1) \to \mathbb{R}, \quad x \mapsto -x \ln(x) - (1 - x) \ln(1 - x) + r \cdot \ln\left[1 - \frac{1 - x^k - (1 - x)^k}{2^{k-1} - 1}\right]$.

Fact 3.2 and (3.10) reduce the problem of computing $\mathrm{E}[Z_e \mid \sigma \text{ is a 2-coloring}]$ (and thus $\mathrm{E}[Z_e^2]$) to an exercise in calculus: we just need to study the function $\psi$.

Lemma 3.2. ([4]) The function $\psi$ satisfies $\psi(1/2) \sim \frac{1}{n} \ln \mathrm{E}[Z]$, $\psi(1 - x) = \psi(x)$, $\psi'(1/2) = 0$, and $\psi''(1/2) < 0$. Furthermore,
1. if $\psi(1/2) > \psi(x)$ for all $x \in (0, 1/2)$, then $\mathrm{E}[Z_e \mid \sigma \text{ is a 2-coloring}] \le O(\mathrm{E}[Z_e])$;
2. if there is some $x \in (0, 1)$ with $\psi(x) > \psi(1/2)$, then $\mathrm{E}[Z_e \mid \sigma \text{ is a 2-coloring}] > \mathrm{E}[Z] \cdot \exp(\Omega(n))$.

Lemma 3.2 shows that the second moment method 'works' if and only if $r$ is such that the function $\psi$ takes its global maximum at $\frac{1}{2}$. Thus, let $r_{\mathrm{second}}$ be the supremum of all $r > 0$ with this property. Using basic calculus, one verifies that $r_{\mathrm{second}} = 2^{k-1} \ln 2 - \frac{1}{2}(1 + \ln 2) + o_k(1)$ (see [4, Section 7]), and that for $r > r_{\mathrm{second}}$ the function $\psi$ attains its maximum, strictly greater than $\psi(1/2)$, in the interval $(0, 2^{-k/2})$. In effect, the second part of Lemma 3.2 shows that [...]

³ Similar conjectures were proposed by D. Achlioptas and A. Montanari (unpublished).
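Lemma 3.2 turns the second moment question into a statement about where $\psi$ attains its maximum, and this can be probed numerically. The Python sketch below evaluates $\psi$ from Fact 3.2 on a grid and reports the maximizer for one density below and one above the leading-order value of $r_{\mathrm{second}}$. It is an illustration under stated assumptions: $k = 10$, the grid resolution and the two offsets are arbitrary choices, the $o_k(1)$ term is dropped, and the function names are hypothetical.

```python
import math


def psi(x: float, k: int, r: float) -> float:
    """The function psi_{k,r} from Fact 3.2, for 0 < x < 1."""
    entropy = -x * math.log(x) - (1 - x) * math.log(1 - x)
    fraction = (1 - x ** k - (1 - x) ** k) / (2 ** (k - 1) - 1)
    return entropy + r * math.log(1 - fraction)


def argmax_psi(k: int, r: float, steps: int = 20000) -> float:
    """Grid point in (0, 1/2] that maximizes psi.

    Since psi(1 - x) = psi(x), searching (0, 1/2] covers all of (0, 1).
    """
    grid = [i / (2 * steps) for i in range(1, steps + 1)]
    return max(grid, key=lambda x: psi(x, k, r))


k = 10
# Leading-order value of r_second quoted in the text, without the o_k(1) term.
r_second_estimate = 2 ** (k - 1) * math.log(2) - 0.5 * (1 + math.log(2))

below = argmax_psi(k, r_second_estimate - 1.0)  # maximizer for a smaller density
above = argmax_psi(k, r_second_estimate + 0.3)  # maximizer for a larger density

if __name__ == "__main__":
    print("maximizer below r_second estimate:", below)
    print("maximizer above r_second estimate:", above)
    print("2^(-k/2) =", 2 ** (-k / 2))
```

A grid search is used rather than a root finder because the competing maximum sits very close to $0$, where the derivative of the entropy term is unbounded.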
The last inequality is crucial. It holds because if $\sigma$ is good, then $|C(\sigma)| \le \mathrm{E}[Z_e]$ with certainty, by the very definition of 'good'. Furthermore,

(3.14) $\sum_{\alpha n < \dots} \mathrm{E}[Z_g(d) \mid \sigma \text{ is good}]$ [...]
Lemma 4.1. W.h.p. we have $|\mathrm{core}(H)| \ge n(1 - \lambda^2 \exp(-\lambda))$.

The vertices $v \in \mathrm{core}(H)$ are difficult to recolor. Indeed, flipping the color of $v$ necessitates recoloring at least one more vertex in each of the $\ge 3$ edges that $v$ supports and that consist of vertices in $\mathrm{core}(H)$ only. This triggers an avalanche of recolorings that will only stop after a substantial fraction of all vertices have been recolored. More precisely, using expansion properties of the random hypergraph, we can show

Lemma 4.2. W.h.p. $H = H_k(n, m, \sigma)$ does not have a 2-coloring $\tau$ such that $0 < |\{v \in \mathrm{core}(H) : \sigma(v) \ne \tau(v)\}| < n/k^3$. In particular, w.h.p. there is no 2-coloring $\tau \in C(\sigma)$ that disagrees with $\sigma$ on a vertex in $\mathrm{core}(H)$.

To complete the proof of (4.18), we call a vertex $v$ attached if $v$ supports an edge $e$ such that all other vertices of $e$ belong to $\mathrm{core}(H)$. (In particular, all vertices in the core are attached.) Thus, to change the color of an attached vertex it is necessary to also alter the color of a vertex in the core. Hence, Lemma 4.2 entails that w.h.p. for all $\tau \in C(\sigma)$ and all attached $v$ we have $\tau(v) = \sigma(v)$.

Lemma 4.3. W.h.p. the number of non-attached vertices is $\le (1 + o_k(1)) \exp(-\lambda) n$.

Lemmas 4.1 and 4.3 directly imply that $|C(\sigma)| \le 2^{(1 + o_k(1)) n / \exp(\lambda)}$ w.h.p. An asymptotically matching lower bound can be obtained by estimating the number of edges that contain two non-attached vertices and whose remaining vertices have the same color under $\sigma$.

Remark 1. The above sketch leaves out many details of the proof, which will be contained in the full version of this work.
Since some of these details may be of independent interest, we give a brief summary. If we let $Q$ be the set of vertices that are neither in the core nor attached to it, then our methods allow us
• to obtain the asymptotic size $|Q|$ (up to $o(n)$ vertices) by tracing the construction of the core and the attachment property via the method of differential equations [23], and
• to describe the distribution of the non-uniform hypergraph $H_Q$ with vertex set $Q$ and edge set $\{e \cap Q : e \in H\}$. In combination with results from [22], this description shows that w.h.p. all but $o(n)$ vertices of $H_Q$ form a hyperforest with all components of size $O(1)$.
As a consequence, we can show that the local cluster size $|C(\sigma)|$ is tightly concentrated.

5 Small clusters and the condensation phase: proof of Theorem 1.2

The goal in this section is to establish Theorem 1.2, i.e., to prove that there is a non-empty regime of densities $r_{\mathrm{cond}} < r < r_{\mathrm{col}}$ in which $H_k(n, m)$ is 2-colorable w.h.p. but $\ln Z < \ln \mathrm{E}[Z] - \Omega(n)$ w.h.p.

Why would it happen that $\ln Z < \ln \mathrm{E}[Z] - \Omega(n)$? Let $r_{\mathrm{crit}}$ be the smallest density for which the lower bound on the typical size of $C(\sigma)$ from Proposition 4.1 exceeds the expected number $\mathrm{E}[Z]$ of 2-colorings of $H_k(n, m)$. From Lemma 3.1 and Proposition 4.1 one can easily derive that $r_{\mathrm{crit}} = r_{\mathrm{cond}} + o_k(1)$, and that indeed $|C(\sigma)| > \mathrm{E}[Z] \cdot \exp(\Omega(n))$ w.h.p. However, if it were true that for such $r$ we have $\ln Z \sim \ln \mathrm{E}[Z]$ w.h.p., then Proposition 1.1 would imply that the planted model and the Gibbs distribution G1–G2 (first choose $H_k(n, m)$, then choose $\sigma \in S(H_k(n, m))$ randomly) are essentially equivalent.
In particular, in a pair $(H, \sigma)$ chosen from the Gibbs distribution the local cluster $C(\sigma)$ would have size $\mathrm{E}[Z] \exp(\Omega(n))$ w.h.p., leading to the absurd conclusion that $H_k(n, m)$ is w.h.p. such that $Z \ge |C(\sigma)| \ge \mathrm{E}[Z] \exp(\Omega(n))$. Hence, intuitively the condensation transition occurs because the typical size of the local cluster in the planted model $H_k(n, m, \sigma)$ surpasses the expected number $\mathrm{E}[Z]$ of 2-colorings of $H_k(n, m)$. Indeed, it is not difficult to turn this intuition into a proof of part 2 of Theorem 1.2.

But the above still allows for the possibility that the condensation phase may just be empty. The following proposition shows that it is not.

Proposition 5.1. There exists $r > r_{\mathrm{crit}}$ such that $H_k(n, m)$ is 2-colorable w.h.p.

To prove Proposition 5.1, we need to show that w.h.p. $H_k(n, m)$ has a 2-coloring $\sigma$ whose local cluster $C(\sigma)$ is smaller than $\mathrm{E}[Z]$, i.e., much smaller than the local cluster in the planted model $H_k(n, m, \sigma)$. As we saw in Section 4, the local cluster $C(\sigma)$ is the smaller, the more vertices are 'blocked' by critical edges. Therefore, it seems natural to expect that 2-colorings with a particularly high number of critical edges have small local clusters.

Thus, we say that a 2-coloring $\sigma$ of $H_k(n, m)$ is $(1 + \beta)$-critical if $\sigma$ is equitable and the total number of critical edges equals $\lambda_\beta n$ with $\lambda_\beta = (1 + \beta) k r / (2^{k-1} - 1)$ (cf. (4.18)). Hence, the average number of edges that each vertex supports equals $\lambda_\beta$. Let $Z_{1+\beta}$ be the number of $(1 + \beta)$-critical 2-colorings. We call a $(1 + \beta)$-critical $\sigma$ good if $|C(\sigma)| \le \mathrm{E}[Z_{1+\beta}]$. Let $Z_{g,1+\beta}$ be the number of good $(1 + \beta)$-critical 2-colorings.
Extending the analysis of the local cluster size from Section 4 to $(1 + \beta)$-critical 2-colorings yields