The condensation transition in random hypergraph 2-coloring

mutually at Hamming distance at least $0.48n$. (Indeed, inductively choose $S_i$ to be the local cluster $C_{0.01}(\sigma)$ of some 2-coloring $\sigma \notin \bigcup_{j<i} S_j$ [...] $r > r_{\mathrm{cond}} + o_k(1)$, we have

(1.6) $\exp\left(-\frac{kr}{2^{k-1}-1}\right) \ln 2 > \ln 2 + r \ln(1 - 2^{1-k}).$

In effect, we cannot extrapolate the evolution of the total number $Z$ of 2-colorings and the median cluster size from (1.4) and (1.5) to the regime $r > r_{\mathrm{cond}}$, because then (1.6) would lead to the absurd conclusion that the size of a typical cluster is greater than the total number of 2-colorings. Hence, it cannot be that $\ln Z$ and the median cluster size trace the functions on the r.h.s. of (1.4) and (1.5) beyond $r_{\mathrm{cond}} + o_k(1)$.

As our next theorem will show, it is indeed the case that $\ln Z$ follows a different trajectory than (1.4) beyond $r_{\mathrm{cond}} + o_k(1)$. This means that for $r > r_{\mathrm{cond}} + o_k(1)$ it will not be true anymore that $\ln Z \sim \ln \mathrm{E}[Z]$ w.h.p. In other words, beyond $r_{\mathrm{cond}} + o_k(1)$, the expected number $\mathrm{E}[Z]$ of 2-colorings is indeed driven up excessively by a tiny minority of hypergraphs with an abundance of 2-colorings.

Theorem 1.2. There exist a constant $k_0 \geq 3$ and a sequence $\varepsilon_k \to 0$ such that for any $k \geq k_0$ there are $\delta_k > 0$, $\zeta_k > 0$ such that the following two statements are true.

1. W.h.p. $H_k(n,m)$ is 2-colorable for all $r < r_{\mathrm{cond}} + \varepsilon_k + \delta_k$.
2. For any density $r$ with $r_{\mathrm{cond}} + \varepsilon_k < r < r_{\mathrm{col}}$ we have

(1.7) $\ln Z < \ln \mathrm{E}[Z] - \zeta_k n$ w.h.p.

The second statement asserts that for densities between $r_{\mathrm{cond}} + \varepsilon_k$ and the actual (unknown) 2-colorability threshold $r_{\mathrm{col}}$, the expected number $\mathrm{E}[Z]$ of 2-colorings exceeds the actual number $Z$ by an exponential factor $\exp(\zeta_k n)$ w.h.p. This contrasts with Theorem 1.1, which shows that below $r_{\mathrm{cond}}$, $Z$ is of the same exponential order as $\mathrm{E}[Z]$ w.h.p.
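To make the role of inequality (1.6) concrete, the following minimal sketch locates numerically the density at which its two sides cross for a fixed $k$. It is an illustration only, not part of the paper's argument: the function names are hypothetical, bisection is chosen here purely for simplicity, and the bracket used (from half the first-moment density up to the density where $\ln 2 + r\ln(1-2^{1-k})$ vanishes) is an assumption that is checked at run time.

```python
import math

def cluster_exponent(k, r):
    """Left-hand side of (1.6): exp(-k r / (2^(k-1) - 1)) * ln 2."""
    return math.exp(-k * r / (2 ** (k - 1) - 1)) * math.log(2)

def first_moment_exponent(k, r):
    """Right-hand side of (1.6): ln 2 + r * ln(1 - 2^(1-k))."""
    return math.log(2) + r * math.log1p(-2.0 ** (1 - k))

def first_moment_density(k):
    """Density at which the right-hand side of (1.6) equals zero."""
    return -math.log(2) / math.log1p(-2.0 ** (1 - k))

def crossing_density(k, tol=1e-10):
    """Bisection for the density where both sides of (1.6) coincide."""
    def gap(r):
        return cluster_exponent(k, r) - first_moment_exponent(k, r)

    hi = first_moment_density(k)  # right-hand side is ~0 here, left-hand side is positive
    lo = 0.5 * hi                 # assumed bracket: left-hand side still smaller here
    if not (gap(lo) < 0 < gap(hi)):
        raise ValueError("bracket does not contain a sign change")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    k = 10
    print("crossing density for k = 10:", crossing_density(k))
    print("first-moment density for k = 10:", first_moment_density(k))
```

For densities above the value returned by `crossing_density`, the left-hand side of (1.6) exceeds the right-hand side, which is the regime in which the extrapolation discussed above breaks down.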
Furthermore, the first part of Theorem 1.2 ensures that the regime of densities where (1.7) holds is non-empty, as the true threshold $r_{\mathrm{col}}$ is indeed strictly greater than $r_{\mathrm{cond}} + \varepsilon_k$.

In mathematical physics, the term ‘phase transition’ is usually defined as a point where the function

$F(r) = \lim_{n\to\infty} \frac{1}{n} \mathrm{E}[\ln(1+Z)]$

is non-analytic. With $Z$ the number of 2-colorings of $H_k(n,m)$, it is not currently known whether the limit $F(r)$ exists.² But if it does, then Theorems 1.1 and 1.2 imply that around some $r = r_{\mathrm{cond}} + o_k(1)$, the function $F$ is non-analytic (because for $r < r_{\mathrm{cond}}$, $F(r)$ coincides with the linear function $\lim_{n\to\infty} \frac{1}{n} \ln \mathrm{E}[Z] = \ln 2 + r \ln(1 - 2^{1-k})$). In any case, what we see is that the sequence of functions $r \mapsto \frac{1}{n} \mathrm{E}[\ln(1+Z)]$ does not converge to an analytic limit in an interval of length $o_k(1)$ around $r_{\mathrm{cond}}$. In this sense, Theorem 1.2 establishes the existence of a phase transition “near” $r_{\mathrm{cond}}$, which we call the condensation transition. Its existence was predicted on the basis of non-rigorous statistical mechanics arguments [10, 18]. We emphasize that the existence of the limit $F(r)$ for $r_{\mathrm{cond}} < r < r_{\mathrm{col}}$ remains an important open problem.

² Bayati, Gamarnik and Tetali [8] proved that the limit exists if $Z$ is the partition function at any fixed positive temperature, but not that it does at zero temperature, which is the case we consider here.

Copyright © SIAM. Unauthorized reproduction of this article is prohibited.

The term ‘condensation’ is meant to express that w.h.p. the set $S(H_k(n,m))$ of all 2-colorings has a drastically different shape than in the ‘shattered’ regime of Corollary 1.1. To spell this out, let us call a 2-coloring $\sigma$ of a hypergraph $H$ on $n$ vertices $(\alpha, \beta, \gamma)$-condensed if

CO1. There is no 2-coloring $\tau \in S(H)$ with $\alpha n < \mathrm{dist}(\sigma, \tau) < \beta n$.
CO2. The set $C_\alpha(\sigma)$ of all 2-colorings $\tau \in S(H)$ with $\mathrm{dist}(\sigma, \tau) \leq \alpha n$ has size $|C_\alpha(\sigma)| \geq \exp(-\gamma n) Z(H)$.

(The difference between SH1–SH2 and the above is that CO2 imposes a lower bound on $|C_\alpha(\sigma)|$.)

Corollary 1.2. There exist a constant $k_0 \geq 3$ and a sequence $\varepsilon_k \to 0$ such that for any $k \geq k_0$ there exists a sequence $r(n)$ of densities satisfying $|r(n) - r_{\mathrm{cond}}| \leq \varepsilon_k$ such that $H_k(n,m)$ with $m = r(n) \cdot n$ has the following two properties w.h.p.

1. $H_k(n,m)$ is 2-colorable.
2. A random 2-coloring $\sigma \in S(H_k(n,m))$ is $(0.01, 0.49, o(1))$-condensed w.h.p.

This means that at a particular density $r(n)$ the size of the local cluster of a ‘typical’ 2-coloring $\sigma$ of $H_k(n,m)$ satisfies $\ln |C_{0.01}(\sigma)| \sim \ln Z$ w.h.p. In other words, the size of the cluster of a ‘typical’ 2-coloring has the same exponential order as the set of all 2-colorings. This contrasts with the ‘shattered’ scenario of Corollary 1.1, where w.h.p. all clusters only comprise an exponentially small fraction of the entire set $S(H_k(n,m))$. The (non-rigorous) statistical physics work [10, 18] predicts that the conclusions of Corollary 1.2 should hold in the entire regime between the condensation transition and $r_{\mathrm{col}}$.

Discussion. The significance of the slightly better lower bound on the threshold for hypergraph 2-colorability provided by Theorem 1.1 is that it allows us to prove the existence of the condensation transition. Beyond the condensation transition, the combinatorial nature of the problem seems to become more complicated. To see why, consider the following random experiment with $r < r_{\mathrm{col}}$ (so that $H_k(n,m)$ is 2-colorable w.h.p.).

G1. Choose a random hypergraph $H = H_k(n,m)$, conditional on $H$ being 2-colorable.
G2. Choose a 2-coloring $\sigma \in S(H)$ uniformly at random and output $(H, \sigma)$.

The above experiment induces a probability distribution $g_{k,n,m}$ on the set $\Lambda_k(n,m)$ of hypergraph/2-coloring pairs that we call the Gibbs distribution. For $r < r_{\mathrm{col}}$ the experiment corresponds to sampling a random 2-coloring of a random hypergraph, and thus understanding the above experiment is key to studying the combinatorial nature of the hypergraph 2-colorability problem.

But the experiment seems genuinely difficult to analyze. In fact, even for densities $r = O(2^{k-1}/k)$ far below the threshold for 2-colorability, it is not currently known how to efficiently construct, let alone sample, a 2-coloring of a random hypergraph [3].

But there is a related experiment called the planted model that is rather easy to implement and to study.

P1. Choose $\sigma \in \{0,1\}^n$ uniformly at random.
P2. Choose a hypergraph $H = H_k(n,m,\sigma)$ with $m$ edges uniformly at random among all hypergraphs for which $\sigma$ is a 2-coloring, and output $(H, \sigma)$.

Let $p_{k,n,m}$ denote the distribution on $\Lambda_k(n,m)$ induced by P1–P2. It is not difficult to show that prior to the condensation phase, the distributions induced by the two experiments are ‘close’.

Proposition 1.1. ([1]) Let $\mathcal{Z}$ be the event $\{\ln Z \sim \ln \mathrm{E}[Z]\}$. If $r < r_{\mathrm{first}}$ is a density such that $H_k(n,m) \in \mathcal{Z}$ w.h.p., then

(1.8) $\ln(g_{k,n,m}[B \mid \mathcal{Z}]) \leq \ln(p_{k,n,m}[B]) + o(n)$

for any event $B$.

The relationship (1.8) allows us to bound the probability of some ‘bad’ event $B$ in the Gibbs distribution by estimating its probability in the planted distribution. Indeed, Proposition 1.1 was used in [1] to study various properties of ‘typical’ 2-colorings of $H_k(n,m)$. In combination with Theorem 1.1 and the methods of [1], Proposition 1.1 can be used to get a pretty good idea what a 2-coloring of the random hypergraph $H_k(n,m)$ “typically looks like” before the condensation transition.

But beyond the condensation transition, matters appear to be more complicated. As Theorem 1.2 shows, in the condensed regime we have $\ln Z < \ln \mathrm{E}[Z] - \Omega(n)$ w.h.p., i.e., the assumption of Proposition 1.1 is violated. Roughly speaking, the gap $\ln Z < \ln \mathrm{E}[Z] - \Omega(n)$ implies that a pair chosen from the planted distribution P1–P2 corresponds to a pair chosen from the Gibbs distribution only with exponentially small probability. In fact, for densities beyond the condensation transition our proof of Theorem 1.2 exhibits an event $B$ for which (1.8) is violated, i.e., the planted model is no longer a good approximation to the Gibbs distribution. Furthermore, the statistical mechanics cavity technique suggests that getting a handle on the Gibbs measure (or other related measures) is far more complicated in

the condensation phase. Overcoming this obstacle appears to be the remaining challenge to obtain the precise threshold for hypergraph 2-colorability. The statistical mechanics reasoning [10, 18] suggests

Conjecture 1. There is $\varepsilon_k \to 0$ such that $r_{\mathrm{col}} \sim 2^{k-1} \ln 2 - \left(\frac{\ln 2}{2} + \frac{1}{4}\right) + \varepsilon_k$.

One limitation of our approach is that we need to assume that $k \geq k_0$ is sufficiently big (whereas the standard second moment argument [4] applies to any $k \geq 3$). We need the lower bound on $k$ to carry out a sufficiently accurate analysis of the combinatorial structure of the solution space $S(H_k(n,m))$. No attempt has been made to compute (let alone optimize) $k_0$ or the various other constants.

2 Related work

The two inequalities in (1.2) state the best previous bounds on the threshold for hypergraph 2-colorability from the paper of Achlioptas and Moore [4], which provided the prototype for the second moment analyses in other sparse random CSPs (e.g., [5, 6]). Since the second moment method is non-constructive, there is the separate algorithmic question: for what densities can a 2-coloring of a random hypergraph be constructed in polynomial time w.h.p.? The best current algorithm is known to succeed up to $r = c \cdot 2^{k-1}/k$ for some constant $c > 0$, i.e., up to a factor of about $k$ below the 2-colorability threshold [3].

In [1] the geometry of the set $S(H_k(n,m))$ of 2-colorings of the random hypergraph was investigated (among other things). It was shown that $S(H_k(n,m))$ shatters into exponentially small well-separated ‘clusters’ for densities $(1+\varepsilon_k) 2^{k-1} \ln(k)/k < r < r_{\mathrm{second}}$. Corollary 1.1 extends this picture up to $r_{\mathrm{cond}}$. In addition, [1] also proved that in the regime $(1+\varepsilon_k) 2^{k-1} \ln(k)/k < r < r_{\mathrm{second}}$ a typical 2-coloring $\sigma$ of $H_k(n,m)$ is rigid w.h.p. in the sense that for most vertices $v$ any 2-coloring $\tau$ with $\sigma(v) \neq \tau(v)$ has Hamming distance $\Omega(n)$ from $\sigma$.
Our analysis, most notably the study of the structure of a typical ‘local cluster’ in Section 4, builds substantially on the concepts of shattering and rigidity from [1], but we elaborate them in considerably more detail to get close quantitative estimates.

In many random CSPs other than random hypergraph 2-coloring the best current bounds on the thresholds for the existence of solutions derive from the second moment method as well. The most prominent examples are random graph $k$-coloring [5] and random $k$-SAT [6]. But the second moment argument extends naturally to a range of ‘symmetric’ random CSPs [19]. It would be interesting to see if/how our techniques can be generalized in order to prove the existence of a condensation phase in these other problems, particularly random graph $k$-coloring. However, since even the standard second moment analysis is quite involved in random graph $k$-coloring, such a generalization will be technically challenging.

The random $k$-SAT problem is conceptually different because it is not ‘symmetric’. More precisely, in random hypergraph 2-coloring the inverse $1 - \sigma$ of a 2-coloring $\sigma$ is a 2-coloring as well. This symmetry, which greatly simplifies the second moment argument, is absent in random $k$-SAT. As a consequence, as elaborated in [4, 6], in $k$-SAT the bound $\mathrm{E}[Z^2] = O(\mathrm{E}[Z]^2)$ does not hold for any density. Roughly speaking, to overcome this problem [6] focuses on a special type of satisfying assignments (“balanced” ones), whose number $Z_*$ satisfies $\mathrm{E}[Z_*^2] = O(\mathrm{E}[Z_*]^2)$. Technically, this is accomplished by weighting satisfying assignments cleverly. While our techniques can be extended easily to establish the existence of a condensation phase for these balanced satisfying assignments in random $k$-SAT, this does not imply that condensation occurs with respect to the bigger set of all satisfying assignments.
This would require a new approach for the direct analysis of the total number of satisfying assignments in random $k$-SAT.

We emphasize that our techniques are quite different from the ‘weighted’ second moment method in [6]. Indeed, the ‘asymmetry’ that motivated the weighting scheme in [6] is absent in random hypergraph 2-coloring. Instead of weighting, we employ a new idea that exploits the combinatorial structure of the ‘clusters’ into which the set $S(H_k(n,m))$ of 2-colorings decomposes.

An example of a random CSP in which the precise threshold for the existence of solutions is known is random $k$-XORSAT. In this problem a second moment argument yields the precise thresholds (after ‘pruning’ the underlying hypergraph) [12, 21]. The explanation for this success is that random $k$-XORSAT does not have a condensation phase due to the algebraic nature of the problem. Similarly, in random $k$-SAT with $k > \log_2 n$ (i.e., the clause length is growing with $n$) there is no condensation phase and, in effect, the second moment method yields the precise satisfiability threshold [9, 15]. A further class of problems where the condensed phase is conjectured to be empty are the ‘locked’ problems of [24].

In statistical mechanics the condensation transition was first predicted (using non-rigorous techniques) for the random $k$-SAT and the random graph $k$-coloring problems [18]. For random hypergraph 2-coloring the statistical mechanics prediction for the condensation threshold was derived in [10]. The structure of the condensed phase is described using a non-rigorous framework called one-step replica symmetry breaking. Interestingly, it was also conjectured that the structure of the condensed phase for large $k$ is very similar to the structure of the random subcube model [20].³ Our proofs verify this for random hypergraph 2-coloring, see Section 4 for more details.

³ Similar conjectures were proposed by D. Achlioptas and A. Montanari (unpublished).

Random CSPs, including random hypergraph 2-coloring, have been studied in statistical mechanics as models of disordered systems (such as glasses) under the name ‘diluted mean field models’. In this context the condensation transition corresponds to the so-called Kauzmann transition [17]. The present paper provides the first rigorous proof that this phase transition actually exists in a ‘diluted mean field model’.

3 Reaching the condensation transition: proof of Theorem 1.1

In the rest of this paper, we assume that $k \geq k_0$ for some large enough constant $k_0$. Moreover, to avoid floor and ceiling signs, we assume that $n$ is even.

The vanilla second moment argument. We begin by briefly reviewing the ‘vanilla’ second moment method from [4]. This will provide the background for the enhanced second moment argument that yields Theorem 1.1. As a first step, we need to work out the expected number $\mathrm{E}[Z]$ of 2-colorings. Using the linearity of the expectation, it is easy to obtain

Lemma 3.1. We have $\mathrm{E}[Z] \sim 2^n (1 - 2^{1-k})^m$.

Our goal is to identify the regime of densities $r$ where $\mathrm{E}[Z^2] = O(\mathrm{E}[Z]^2)$, i.e., where the second moment method ‘works’. A technical issue is that $Z$ includes 2-colorings $\sigma$ whose color classes have (very) different sizes. To simplify our calculations we are going to confine ourselves to colorings $\sigma$ whose color classes $\sigma^{-1}(0)$, $\sigma^{-1}(1)$ have the same size. More precisely, let us call $\sigma : V \to \{0,1\}$ equitable if $|\sigma^{-1}(0)| = |\sigma^{-1}(1)| = \frac{n}{2}$, and let $Z_e$ be the number of equitable 2-colorings of $H_k(n,m)$. Using Stirling’s formula and, once more, the linearity of the expectation, it is not difficult to compute $\mathrm{E}[Z_e]$: we have

(3.9) $\mathrm{E}[Z_e] \sim \sqrt{\frac{2}{\pi}} \cdot \frac{2^n}{\sqrt{n}} (1 - 2^{1-k})^m = \Theta(1/\sqrt{n}) \cdot \mathrm{E}[Z].$

Now, for what $r$ do we have $\mathrm{E}[Z_e^2] = O(\mathrm{E}[Z_e]^2)$? We use the following elementary relation.

Fact 3.1. For any equitable $\sigma : V \to \{0,1\}$ we have $\mathrm{E}[Z_e^2] = \mathrm{E}[Z_e] \cdot \mathrm{E}[Z_e \mid \sigma \text{ is a 2-coloring}]$.

Thus, we need to compute $\mathrm{E}[Z_e \mid \sigma \text{ is a 2-coloring}]$. In other words, for a fixed equitable $\sigma \in \{0,1\}^n$ we need to study the random hypergraph $H_k(n,m)$ given that $\sigma$ is a 2-coloring. This conditional distribution can be expressed easily: just choose a set of $m$ edges uniformly at random from all edges that are bichromatic under $\sigma$ (cf. step P2 of the ‘planted model’ above). Let $H_k(n,m,\sigma)$ denote the resulting random hypergraph. Furthermore, given $\sigma$, let $Z_e(d)$ be the number of equitable 2-colorings $\tau$ with Hamming distance $\mathrm{dist}(\sigma, \tau) = d$. Similarly, let $Z(d)$ be the total number of 2-colorings $\tau$ with $\mathrm{dist}(\sigma, \tau) = d$. Then

(3.10) $\mathrm{E}[Z_e \mid \sigma \text{ is a 2-coloring}] = \sum_{d=0}^{n} \mathrm{E}_{H_k(n,m,\sigma)}[Z_e(d)].$

Fact 3.2. ([4]) For any $0 < d < n$ we have

$\mathrm{E}_{H_k(n,m,\sigma)}[Z(d)] = \Theta\left(\sqrt{n/(d \cdot (n-d))}\right) \cdot \exp(n\,\psi(d/n)),$
$\mathrm{E}_{H_k(n,m,\sigma)}[Z_e(d)] = \Theta\left(n/(d \cdot (n-d))\right) \cdot \exp(n\,\psi(d/n)),$

with

$\psi = \psi_{k,r} : (0,1) \to \mathbb{R}, \quad x \mapsto -x \ln(x) - (1-x) \ln(1-x) + r \cdot \ln\left[1 - \frac{1 - x^k - (1-x)^k}{2^{k-1} - 1}\right].$

Fact 3.2 and (3.10) reduce the problem of computing $\mathrm{E}[Z_e \mid \sigma \text{ is a 2-coloring}]$ (and thus $\mathrm{E}[Z_e^2]$) to an exercise in calculus: we just need to study the function $\psi$.

Lemma 3.2. ([4]) The function $\psi$ satisfies $\psi(1/2) \sim \frac{1}{n} \ln \mathrm{E}[Z]$, $\psi(1-x) = \psi(x)$, $\psi'(1/2) = 0$, and $\psi''(1/2) <$ [...]

1. if $\psi(1/2) > \psi(x)$ for all $x \in (0, 1/2)$, then $\mathrm{E}[Z_e \mid \sigma \text{ is a 2-coloring}] \leq O(\mathrm{E}[Z_e])$.
2. if there is some $x \in (0,1)$ with $\psi(x) > \psi(1/2)$, then $\mathrm{E}[Z_e \mid \sigma \text{ is a 2-coloring}] > \mathrm{E}[Z] \cdot \exp(\Omega(n))$.

Lemma 3.2 shows that the second moment method ‘works’ if and only if $r$ is such that the function $\psi$ takes its global maximum at $\frac{1}{2}$. Thus, let $r_{\mathrm{second}}$ be the supremum of all $r > 0$ with this property.
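The criterion of Lemma 3.2 can be explored numerically. The sketch below evaluates $\psi_{k,r}$ as defined in Fact 3.2 and scans a grid on $(0, 1/2)$ for the largest value of $\psi(x) - \psi(1/2)$. It is only an illustration under stated assumptions: the function names and the grid resolution are hypothetical choices made here, and a finite grid can only suggest, not prove, where the global maximum lies.

```python
import math

def psi(x, k, r):
    """psi_{k,r}(x) from Fact 3.2, for 0 < x < 1."""
    entropy = -x * math.log(x) - (1 - x) * math.log(1 - x)
    # Probability that an edge bichromatic under sigma is monochromatic under tau,
    # where tau is at Hamming distance x * n from sigma.
    p_mono = (1 - x ** k - (1 - x) ** k) / (2 ** (k - 1) - 1)
    return entropy + r * math.log(1 - p_mono)

def max_excess(k, r, grid=20000):
    """Largest psi(x) - psi(1/2) over the grid x = i / (2 * grid), 0 < x < 1/2."""
    base = psi(0.5, k, r)
    return max(psi(i / (2 * grid), k, r) - base for i in range(1, grid))

if __name__ == "__main__":
    k = 10
    for r in (350.0, 354.5):
        excess = max_excess(k, r)
        verdict = "maximum at 1/2 on the grid" if excess < 0 else "psi exceeds psi(1/2) somewhere"
        print(f"k={k}, r={r}: max excess {excess:.3e} ({verdict})")
```

A negative excess corresponds to case 1 of Lemma 3.2, a positive excess to case 2. The symmetry $\psi(1-x) = \psi(x)$ is the reason the scan is restricted to $(0, 1/2)$.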
Using basic calculus, one verifies that $r_{\mathrm{second}} = 2^{k-1} \ln 2 - \frac{1}{2}(1 + \ln 2) + o_k(1)$ (see [4, Section 7]), and that for $r > r_{\mathrm{second}}$ the function $\psi$ attains its maximum, strictly greater than $\psi(1/2)$, in the interval $(0, 2^{-k/2})$. In effect, the second part of Lemma 3.2 shows that

The last inequality is crucial. It holds because if $\sigma$ is good, then $|C(\sigma)| \leq \mathrm{E}[Z_e]$ with certainty, by the very definition of ‘good’. Furthermore,

(3.14) $\sum_{\alpha n \, \ldots} \mathrm{E}[Z_g(d) \mid \sigma \text{ is good}]$ [...]

Lemma 4.1. W.h.p. we have $|\mathrm{core}(H)| \geq n(1 - \lambda^2 \exp(-\lambda))$.

The vertices $v \in \mathrm{core}(H)$ are difficult to recolor. Indeed, flipping the color of $v$ necessitates recoloring at least one more vertex in each of the $\geq 3$ edges that $v$ supports and that consist of vertices in $\mathrm{core}(H)$ only. This triggers an avalanche of recolorings that will only stop after a substantial fraction of all vertices have been recolored. More precisely, using expansion properties of the random hypergraph, we can show

Lemma 4.2. W.h.p. $H = H_k(n,m,\sigma)$ does not have a 2-coloring $\tau$ such that

$0 < |\{v \in \mathrm{core}(H) : \sigma(v) \neq \tau(v)\}| < n/k^3.$

In particular, w.h.p. there is no 2-coloring $\tau \in C(\sigma)$ that disagrees with $\sigma$ on a vertex in $\mathrm{core}(H)$.

To complete the proof of (4.18), we call a vertex $v$ attached if $v$ supports an edge $e$ such that all other vertices of $e$ belong to $\mathrm{core}(H)$. (In particular, all vertices in the core are attached.) Thus, to change the color of an attached vertex it is necessary to also alter the color of a vertex in the core. Hence, Lemma 4.2 entails that w.h.p. for all $\tau \in C(\sigma)$ and all attached $v$ we have $\tau(v) = \sigma(v)$.

Lemma 4.3. W.h.p. the number of non-attached vertices is $\leq (1 + o_k(1)) \exp(-\lambda) n$.

Lemmas 4.1 and 4.3 directly imply that $|C(\sigma)| \leq 2^{(1+o_k(1)) n/\exp(\lambda)}$ w.h.p. An asymptotically matching lower bound can be obtained by estimating the number of edges that contain two non-attached vertices and whose remaining vertices have the same color under $\sigma$.

Remark 1. The above sketch leaves out many details of the proof, which will be contained in the full version of this work. Since some of these details may be of independent interest, we give a brief summary.
If we let $Q$ be the set of vertices that are neither in the core nor attached to it, then our methods allow us

• to obtain the asymptotic size $|Q|$ (up to $o(n)$ vertices) by tracing the construction of the core and the attachment property via the method of differential equations [23], and
• to describe the distribution of the non-uniform hypergraph $H_Q$ with vertex set $Q$ and edge set $\{e \cap Q : e \in H\}$. In combination with results from [22], this description shows that w.h.p. all but $o(n)$ vertices of $H_Q$ form a hyperforest with all components of size $O(1)$.

As a consequence, we can show that the local cluster size $|C(\sigma)|$ is tightly concentrated.

5 Small clusters and the condensation phase: proof of Theorem 1.2

The goal in this section is to establish Theorem 1.2, i.e., to prove that there is a non-empty regime of densities $r_{\mathrm{cond}} < r < r_{\mathrm{col}}$ in which $H_k(n,m)$ is 2-colorable w.h.p. but $\ln Z < \ln \mathrm{E}[Z] - \Omega(n)$ w.h.p.

Why would it happen that $\ln Z < \ln \mathrm{E}[Z] - \Omega(n)$? Let $r_{\mathrm{crit}}$ be the smallest density for which the lower bound on the typical size of $C(\sigma)$ from Proposition 4.1 exceeds the expected number $\mathrm{E}[Z]$ of 2-colorings of $H_k(n,m)$. From Lemma 3.1 and Proposition 4.1 one can easily derive that $r_{\mathrm{crit}} = r_{\mathrm{cond}} + o_k(1)$, and that indeed $|C(\sigma)| > \mathrm{E}[Z] \cdot \exp(\Omega(n))$ w.h.p. However, if it were true that for such $r$ we have $\ln Z \sim \ln \mathrm{E}[Z]$ w.h.p., then Proposition 1.1 would imply that the planted model and the Gibbs distribution G1–G2 (first choose $H_k(n,m)$, then choose $\sigma \in S(H_k(n,m))$ randomly) are essentially equivalent. In particular, in a pair $(H, \sigma)$ chosen from the Gibbs distribution the local cluster $C(\sigma)$ would have size $\mathrm{E}[Z] \exp(\Omega(n))$ w.h.p., leading to the absurd conclusion that $H_k(n,m)$ is w.h.p. such that $Z \geq |C(\sigma)| \geq \mathrm{E}[Z] \exp(\Omega(n))$. Hence, intuitively the condensation transition occurs because the typical size of the local cluster in the planted model $H_k(n,m,\sigma)$ surpasses the expected number $\mathrm{E}[Z]$ of 2-colorings of $H_k(n,m)$.
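The planted model $H_k(n,m,\sigma)$ that this intuition relies on is easy to simulate for small parameters. The following is a minimal sketch of steps P1–P2, not code from the paper: the function names are hypothetical, edges are represented as sorted $k$-tuples of vertex indices, and rejection sampling is used here as one simple way to draw $m$ distinct bichromatic edges uniformly at random.

```python
import math
import random

def sample_planted(n, m, k, seed=None):
    """Return (edges, sigma): sigma uniform in {0,1}^n (P1) and m distinct
    k-element edges, each bichromatic under sigma, drawn uniformly (P2)."""
    rng = random.Random(seed)
    sigma = [rng.randrange(2) for _ in range(n)]

    # Count the edges that are bichromatic under sigma to avoid an endless loop.
    ones = sum(sigma)
    available = math.comb(n, k) - math.comb(ones, k) - math.comb(n - ones, k)
    if m > available:
        raise ValueError("not enough bichromatic edges for this sigma")

    edges = set()
    while len(edges) < m:
        edge = tuple(sorted(rng.sample(range(n), k)))
        if len({sigma[v] for v in edge}) == 2:  # reject monochromatic edges
            edges.add(edge)                     # the set discards repeated edges
    return sorted(edges), sigma

def is_two_coloring(edges, sigma):
    """Check that no edge is monochromatic under sigma."""
    return all(len({sigma[v] for v in edge}) == 2 for edge in edges)

if __name__ == "__main__":
    edges, sigma = sample_planted(n=30, m=40, k=3, seed=1)
    print(len(edges), is_two_coloring(edges, sigma))
```

Sampling from the Gibbs distribution G1–G2 has no comparably simple procedure, which is why the comparison between the two distributions matters in the argument above.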
Indeed, it is not difficult to turn this intuition into a proof of part 2 of Theorem 1.2.

But the above still allows for the possibility that the condensation phase may just be empty. The following proposition shows that it is not.

Proposition 5.1. There exists $r > r_{\mathrm{crit}}$ such that $H_k(n,m)$ is 2-colorable w.h.p.

To prove Proposition 5.1, we need to show that w.h.p. $H_k(n,m)$ has a 2-coloring $\sigma$ whose local cluster $C(\sigma)$ is smaller than $\mathrm{E}[Z]$, i.e., much smaller than the local cluster in the planted model $H_k(n,m,\sigma)$. As we saw in Section 4, the local cluster $C(\sigma)$ is smaller the more vertices are ‘blocked’ by critical edges. Therefore, it seems natural to expect that 2-colorings with a particularly high number of critical edges have small local clusters.

Thus, we say that a 2-coloring $\sigma$ of $H_k(n,m)$ is $(1+\beta)$-critical if $\sigma$ is equitable and the total number of critical edges equals $\lambda_\beta n$ with $\lambda_\beta = (1+\beta) k r/(2^{k-1} - 1)$ (cf. (4.18)). Hence, the average number of edges that each vertex supports equals $\lambda_\beta$. Let $Z_{1+\beta}$ be the number of $(1+\beta)$-critical 2-colorings. We call a $(1+\beta)$-critical $\sigma$ good if $|C(\sigma)| \leq \mathrm{E}[Z_{1+\beta}]$. Let $Z_{g,1+\beta}$ be the number of good $(1+\beta)$-critical 2-colorings. Extending the analysis of the local cluster size from Section 4 to $(1+\beta)$-critical 2-colorings yields
