
Learning model structures based on marginal model structures of undirected graphs

by

Sung-Ho Kim

BK21 Research Report 09-04

March 25, 2009

DEPARTMENT OF MATHEMATICAL SCIENCES




Figure 1: Two model structures of graphical models.

& Kim (1999)]. The relationship becomes more explicit when the distribution is multivariate normal. Let X be a normal random vector. The precision matrix of the conditional distribution of a subvector X_1 given the remaining part of X is the same as the X_1 part of the precision matrix of X [section 5.7, Whittaker (1990)]. Marginals of a joint probability distribution are not in general represented as parts of the joint distribution. However, there is a way to express explicitly the relationship between joint and marginal distributions under the assumption that the joint (as opposed to marginal) probability model is graphical and decomposable (Kim, 2006b).

In addressing the issue of information reuse in the form of combining graphical model structures, we cannot help using independence graphs and related theories to derive the desired results with more clarity and refinement.
The conditional independence embedded in a distribution can be expressed to some level of satisfaction by a graph in the form of graph-separateness [see, for example, the separation theorem on p. 67 of Whittaker (1990)]. We instrument the notion of conditional independence with some particular sets of random variables in a model, where the sets form a basis of the model structure so that the Markov property among the variables of the model may be preserved between the model and its marginals. These sets are called prime separators for decomposable graphs and self-connected (SC) separators for non-decomposable graphs; they are defined in sections 2 and 4, respectively.

It is shown that if we are given a graphical model with its independence graph, G, and some of its marginal models, then we can find, under the assumption that the graphical model is decomposable, a graph, say H, which is not smaller than G and in which the graph-separateness in the given marginal models is preserved. This graph-separateness is substantiated by the prime separators and SC-separators which are found in the graphs of the marginal models.
In combining marginal models into H, we see to it that these prime separators appear as the only prime separators in H when the graphs are decomposable (Kim & Lee, 2008). In this paper, we extend the graphical combination of marginal models to the case where the marginal models are not necessarily decomposable.

There have been a number of papers applying marginal models to data analysis during the last 15 years or so. Some of the applications are for parameter estimation of a model based on medical data from crossover experiments (Balagtas, Becker & Lang, 1995), for estimating joint probabilities by applying the iterative proportional fitting technique (Molenberghs & Lesaffre, 1999), for analyzing sociological data (Becker, 1994; Becker, Minick & Yang, 1998), and for analyzing contingency table data with ordinal response variables (Colombi & Forcina, 2001). In most of these applications of marginal models to multivariate statistical problems, structural restrictions are imposed on certain subsets of the variables involved in a given data set (Liang, Zeger & Qaqish, 1992; Glonek & McCullagh, 1995; Bergsma, 1997; Bartolucci & Forcina, 2002; Bergsma & Rudas, 2002; Rudas & Bergsma, 2004).
There have also been remarkable improvements in learning graphical models in the form of a Bayesian network (Pearl, 1986, 1988; Meek, 1995; Spirtes, Glymour & Scheines, 2000; Neapolitan, 2004) from data. This learning, however, is mainly instrumented by heuristic searching algorithms, since the model search is usually NP-hard (Chickering, 1996). In our proposed method of structural learning, we will assume that only the information which is embedded in a given set of marginal model structures is available.

This paper is organized in 7 sections. After introducing graphical terminology and notation in section 2, we derive, in section 3, a result which shows how two sets of graphs, where the graphs in one set are some type of subgraphs of the graphs in the other set, are related in a stochastic context, and we introduce a type of graph, called a combined model structure (CMS), with regard to the relationship of the two sets of graphs. In section 4, we consider some types of separators of undirected graphs, called a self-connected separator and a prime separator, and use them to further investigate the relationship between the two sets of graphs. We then consider the notion of graphical compatibility (Dawid & Studeny, 1999) in section 5 as a necessary relationship between graphs and show the existence of a CMS of a set of graphs when the compatibility condition is satisfied among the graphs.
In section 6, we propose a combination method of graphs as a way of reusing information from a given set of marginal models. Finally, in section 7, we close the paper with some discussion and concluding remarks.

2 NOTATION AND PRELIMINARIES

We will consider only undirected graphs in this paper. We denote a graph by G = (V, E), where V is the set of the indexes of the variables involved in G and E is a collection of ordered pairs, each pair representing that the nodes of the pair are connected by an edge. Since G is undirected, (u, v) ∈ E is the same as (v, u) ∈ E. If (u, v) ∈ E, we say that u is a neighbor node of, or adjacent to, v, and vice versa. We say that a set of nodes of G forms a complete subgraph of G if every pair of nodes in the set is adjacent to each other. If every node in A is adjacent to all the nodes in B, we will say that A is adjacent to B. The set of all the neighbor nodes of a node v in G is denoted by bd_G(v); if v becomes a set, say A, we define bd_G(A) = (∪_{v∈A} bd_G(v)) \ A. We define the closure of a set A as cl_G(A) = bd_G(A) ∪ A. We denote by C(G) the set of cliques of G.

A path of length n is a sequence of nodes u = v_0, · · · , v_n = v such that (v_i, v_{i+1}) ∈ E for i = 0, 1, · · · , n − 1 and u ≠ v. If u = v, the path is called an n-cycle. If u ≠ v and u and v are connected by a path, we write u ⇋ v.
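For concreteness, the boundary and closure operations can be sketched in a few lines for a graph stored as a dict of neighbour sets (the helper names here are illustrative, not from the paper):

```python
# Sketch of bd_G and cl_G for a graph stored as a dict mapping each node
# to its set of neighbours; function names are illustrative only.

def bd(G, A):
    """bd_G(A): all neighbours of nodes of A that lie outside A."""
    A = set(A)
    return set().union(*(G[v] for v in A)) - A

def cl(G, A):
    """cl_G(A) = bd_G(A) ∪ A."""
    return bd(G, A) | set(A)

# the 4-cycle 1-2-4-3-1
G = {1: {2, 3}, 2: {1, 4}, 3: {1, 4}, 4: {2, 3}}
print(bd(G, {1}))      # {2, 3}
print(cl(G, {1, 2}))   # {1, 2, 3, 4}
```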
We define the connectivity component of u as

[u] = {v ∈ V ; v ⇋ u} ∪ {u}.

We say that a path, v_1, · · · , v_n, with v_1 ≠ v_n, is intersected by A if A ∩ {v_1, · · · , v_n} ≠ ∅ and neither of the end nodes of the path is in A. We say that nodes u and v are separated by A if all the paths from u to v are intersected by A. In the same context, we say that, for three disjoint sets A, B, and C, A is separated from B by C if all the paths from A to B are intersected by C, and we write 〈A|C|B〉_G. The complement of a set A is denoted by A^c and the cardinality of a set A by |A|. For two collections of sets, A and B, we write A ≼ B if, for every set a in A, there exists a set b in B such that a ⊆ b.

For A ⊂ V, we define the induced subgraph of G confined to A as G_A^ind = (A, E ∩ (A × A)). We also define a graph, called the Markovian subgraph of G confined to A, which is formed from G_A^ind by completing the boundaries in G of the connectivity components of the complement of A, and we denote it by G_A. If G′ is a Markovian subgraph of G, we write G′ ⊆_M G.

If G = (V, E), G′ = (V, E′), and E′ ⊆ E, then we say that G′ is an edge-subgraph of G and write G′ ⊆_e G. If G′ is a subgraph of G, we call G a supergraph of G′. For a graph G, we will denote the set of nodes of G by V(G).

The cliques are elementary graphical components, and so we will call the intersection of neighboring cliques a prime separator of the decomposable graph G. The prime separators in a decomposable graph may be extended to separators of prime graphs in some graphs, where the prime graphs are defined as the maximal subgraphs without a complete separator in Cox & Wermuth (1999). If G is not decomposable, separators are not obtained as intersections of neighboring cliques.

3 MARGINAL MODELS AND MARKOVIAN SUBGRAPHS

Suppose that we are given a probability model, P, with its interaction graph, G, and that some of its marginal models are also given whose interaction graphs are Markovian subgraphs of G. In this section, we will look into the relationship between P and the marginal models through the relationship between G and the Markovian subgraphs.

A distribution P is said to be globally Markov with respect to a graph G if, for a triple (A, B, C) of disjoint subsets A, B, C of V, the random vectors X_A and X_B are conditionally independent given an outcome of the random vector X_C whenever A is separated from B by C in G.

In addition to the global Markov property, we will consider another property for a probability distribution.
A distribution P with probability function f is said to be factorized (FA) according to G [section 3.2, Lauritzen (1996)] if for all c ∈ C(G) there exist non-negative functions ψ_c that depend on x through x_c only such that

f(x) = ∏_{c∈C(G)} ψ_c(x).

We will denote the collection of the distributions that are globally Markov with respect to G by M_G(G).

For a probability distribution P of X_V, let the logarithm of the density of P be expanded into interaction terms and let the set of the maximal domain sets of these interaction terms be denoted by Γ(P), where maximality is in the sense of set-inclusion. We will call the set Γ(P) the generating class of P and denote by G(Γ(P)) = (V, E) the interaction graph of P, which satisfies, under the hierarchy assumption for probability models,

(u, v) ∈ E ⇐⇒ {u, v} ⊆ a for some a ∈ Γ(P).

It is well known in the literature (Pearl & Paz, 1987) that if a probability distribution on X_V is positive, then the three types of Markov property, the pairwise Markov (PM), locally Markov (LM), and globally Markov (GM) properties relative to an undirected graph, are equivalent. Furthermore, for any probability distribution, it holds that

(FA) =⇒ (GM) =⇒ (LM) =⇒ (PM)   (1)

[see Proposition 3.8 in Lauritzen (1996)]. Under the positivity condition on the probability distribution, we have (FA) ⇐⇒ (PM) by Hammersley and Clifford (1971).
From this and expression (1), it follows, under the positivity condition, that

(FA) ⇐⇒ (GM).

For notational convenience, we will write M(G) instead of M_G(G), and we will simply say that a distribution P is Markov with respect to G when P ∈ M_G(G). For A ⊂ V, we denote by J_A the collection of the connectivity components in G_{A^c}^ind and let

β(J_A) = {bd(B); B ∈ J_A}.
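The objects J_A and β(J_A), together with the Markovian subgraph G_A built from them, can be sketched as follows (a minimal illustration with hypothetical helper names; graphs are stored as dicts of neighbour sets):

```python
# Sketch of J_A (connectivity components of the complement of A), the
# boundaries β(J_A), and the Markovian subgraph G_A built from them.
# All function names are illustrative, not from the paper.
from itertools import combinations

def complement_components(G, A):
    """J_A: connectivity components of the induced subgraph on V \\ A."""
    rest, comps = set(G) - set(A), []
    while rest:
        comp, stack = set(), [next(iter(rest))]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend((G[v] - set(A)) - comp)
        rest -= comp
        comps.append(comp)
    return comps

def boundary(G, B):
    """bd_G(B): neighbours of B outside B."""
    return set().union(*(G[v] for v in B)) - set(B)

def markovian_subgraph(G, A):
    """G_A: induced subgraph on A with each bd_G(B), B in J_A, completed."""
    A = set(A)
    H = {v: G[v] & A for v in A}
    for B in complement_components(G, A):
        for u, w in combinations(boundary(G, B), 2):
            H[u].add(w); H[w].add(u)
    return H

# the path 1-2-3-4-5, marginalized onto A = {1, 3, 5}
G = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
print(complement_components(G, {1, 3, 5}))  # [{2}, {4}] (in some order)
print(markovian_subgraph(G, {1, 3, 5}))     # -> graph with edges 1-3 and 3-5
```

Completing the boundaries {1, 3} = bd_G({2}) and {3, 5} = bd_G({4}) is exactly what turns the induced subgraph on {1, 3, 5} (which has no edges) into G_{1,3,5}.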


Let P_A be the marginal of P on X_A. We then define Γ̄(P_A) as

Γ̄(P_A) = (Γ(P) ∩ A) ∪ β(J_A).   (2)

From this, it follows that

β(J_A) ≼ Γ̄(P_A) ≼ C(G(Γ̄(P_A))).

The second ≼ holds since it is possible that, for some B ∈ J_A, bd(B) is a strict subset of a clique in G(Γ̄(P_A)).

The following result is immediate from (2).

THEOREM 1. For a distribution P of X_V and A ⊆ V,

G(Γ̄(P_A)) = G(P)_A.

Proof. By definition, the interaction graph corresponding to the right-hand side of (2) is G(P)_A. Thus the result follows.

From this theorem and the fact that Γ(P_A) ≼ Γ̄(P_A), we have

COROLLARY 1. For a distribution P of X_V and A ⊆ V,

P_A ∈ M(G(P)_A).

From Theorem 1, we can also derive a result concerning both the relationship between a distribution P and a graph G and the relationship between P_A and G_A.

COROLLARY 2. For a distribution P of X_V and A ⊆ V, suppose that P ∈ M(G) for a graph G. Then

P_A ∈ M(G_A).

Proof. Since P ∈ M(G), we have G(P) ⊆_e G. This implies that G(P)_A ⊆_e G_A.
So, by Corollary 1, we have the desired result.

We call G_A a Markovian subgraph of G in the context of Corollary 2.

For A ⊆ V, we define M(G)_A and L(G_A) as

M(G)_A = {P_A ; P ∈ M(G)}

and

L(G_A) = {P ; P_A ∈ M(G_A)}.

M(G)_A is the set of the marginal distributions on X_A of the distributions P which are Markov with respect to G; L(G_A) is the set of the distributions of X_V whose marginal P_A on X_A is Markov with respect to G_A.

By definition and Corollary 2, we have the following:

L(G) = M(G),

M(G) ⊆ L(G_A),   (by Corollary 2)   (3)

P ∈ L(G_A) ⇐⇒ P_A ∈ M(G_A)


and

M(G)_A ⊆ M(G_A).

The last expression holds since, if a distribution Q is in M(G)_A, then Q = P_A for some distribution P in M(G), and so, by Corollary 2, it follows that Q ∈ M(G_A).

It follows from (3) that, for A, B ⊆ V,

M(G) ⊆ L(G_A) ∩ L(G_B).

We will derive a generalized version of this result below.

Let 𝒱 be a set of subsets of V. We will define another set of distributions,

L̃(G_A, A ∈ 𝒱) = {P ; P_A ∈ M(G_A), A ∈ 𝒱}.

L̃(G_A, A ∈ 𝒱) is the set of the distributions each of whose marginals is Markov with respect to its corresponding Markovian subgraph of G.

THEOREM 2. For a collection 𝒱 of subsets of V with a graph G,

M(G) ⊆ L̃(G_A, A ∈ 𝒱).

Proof. Let P ∈ M(G). Then, by (3), P ∈ L(G_A) for A ∈ 𝒱. By definition, P_A ∈ M(G_A). Since this holds for all A ∈ 𝒱, it follows that P ∈ L̃(G_A, A ∈ 𝒱). This completes the proof.

Theorem 2 shows the relationship between a graphical model with its graph G and a set of Markovian subgraphs of G. The set M(G) of the probability distributions each of which is Markov with respect to G is contained in the set L̃(G_A, A ∈ 𝒱) of the distributions each of which has its marginals Markov with respect to their corresponding Markovian subgraphs G_A, A ∈ 𝒱.
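Corollary 2 can be illustrated numerically on a toy example (the code and potentials below are illustrative assumptions, not from the paper): for a distribution factorizing according to the path graph 1-2-3-4-5, the marginal on A = {1, 3, 5} should be Markov with respect to G_A, in which node 3 separates nodes 1 and 5.

```python
# Numeric sketch of Corollary 2 (illustrative code, not from the paper):
# P factorizes according to the path graph 1-2-3-4-5; its marginal on
# A = {1, 3, 5} should then be Markov w.r.t. G_A, i.e. X1 ⊥ X5 | X3.
from itertools import product
import random

random.seed(0)
# one positive potential per edge (clique) of the path
psi = {e: {xy: random.uniform(0.5, 2.0) for xy in product((0, 1), repeat=2)}
       for e in [(1, 2), (2, 3), (3, 4), (4, 5)]}

p = {}
for x in product((0, 1), repeat=5):     # x[i-1] is the value of X_i
    v = 1.0
    for (i, j), f in psi.items():
        v *= f[x[i - 1], x[j - 1]]
    p[x] = v
Z = sum(p.values())
p = {x: v / Z for x, v in p.items()}

def marg(p, nodes):
    """Marginal of p on the given nodes (1-based)."""
    out = {}
    for x, v in p.items():
        k = tuple(x[i - 1] for i in nodes)
        out[k] = out.get(k, 0.0) + v
    return out

p135, p13, p35, p3 = (marg(p, (1, 3, 5)), marg(p, (1, 3)),
                      marg(p, (3, 5)), marg(p, (3,)))
for (x1, x3, x5), v in p135.items():
    lhs = v / p3[(x3,)]                 # p(x1, x5 | x3)
    rhs = (p13[x1, x3] / p3[(x3,)]) * (p35[x3, x5] / p3[(x3,)])
    assert abs(lhs - rhs) < 1e-12
print("P_A is Markov w.r.t. G_A: X1 and X5 are independent given X3")
```

The check succeeds exactly (up to floating-point error) because summing out nodes 2 and 4 leaves a density that factorizes over the cliques {1, 3} and {3, 5} of G_A.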
This result sheds light on our efforts in searching for M(G), since it can be found as a subset of L̃(G_A, A ∈ 𝒱).

Let G = (V, E) be the graph of a graphical model and let V_1, V_2, · · · , V_m be subsets of V. The m Markovian subgraphs, G_{V_1}, G_{V_2}, · · · , G_{V_m}, may be regarded as the structures of m submodels of the graphical model. In this context, we may refer to a Markovian subgraph as a marginal model structure. For simplicity, we write G_i = G_{V_i}.

DEFINITION 1. Suppose there are m Markovian subgraphs, G_1, · · · , G_m. Then we say that a graph H of a set of variables V is a combined model structure (CMS) of G_1, · · · , G_m if the following conditions hold:

(i) ∪_{i=1}^m V_i = V.

(ii) H_{V_i} = G_i for i = 1, · · · , m. That is, the G_i are Markovian subgraphs of H.

We will call H a maximal CMS of G_1, · · · , G_m if adding any edge to H invalidates condition (ii) for at least one i = 1, · · · , m.

Let M be the collection of G(P)_A, A ∈ 𝒱. We can construct a maximal CMS, say H*, by adding edges, if any, to G in such a way that condition (ii) of Definition 1 is satisfied. Since

H*_A = G(P)_A,

if we put G = G(P) in Theorem 2, we end up with the summarizing expression

M(G(P)) ⊆ M(H*) ⊆ L̃(G(P)_A, A ∈ 𝒱),   (4)


where the first inclusion follows since G(P) ⊆_e H*. Since P ∈ M(G(P)), expression (4) implies that P is also Markov relative to H*.

If two nodes u and v are separated in a Markovian subgraph of a graph G, then so are they in G, by the property of a graph. We can extend this result to disjoint sets. Let C_G(A) denote the collection of the cliques which include nodes of A in G.

THEOREM 3. (Theorem 4.2, Kim & Lee (2008)) Let G′ = (V′, E′) be a Markovian subgraph of G and suppose that, for three disjoint subsets A, B, C of V′, 〈A|B|C〉_{G′}. Then

(i) 〈A|B|C〉_G;

(ii) for W ∈ C_G(A) and W′ ∈ C_G(C), 〈W|B|W′〉_G.

4 MARKOVIAN SUBGRAPHS OF UNDIRECTED GRAPHS

Consider two Markovian subgraphs of G on A and B, G_A and G_B. Then, by the transitivity property of the Markovian marginalization (Kim, 2006b), (G_A)_B is also a Markovian subgraph of G, and similarly for (G_B)_A. Furthermore, we can see, by definition, that

(G_A)_B = (G_B)_A = G_{A∩B}.   (5)

DEFINITION 2.
For three disjoint and exhaustive subsets A, B, and C of V = V(G), we will call C a self-connected (SC) separator in G if the following conditions hold:

(i) 〈A|C|B〉_G.

(ii) G_C^ind is connected.

(iii) G_C^ind does not contain any n-cycle (n > 3) nor a clique of G which consists of more than two nodes.

(iv) G_{A∪C}^ind and G_{B∪C}^ind each consist of n-cycles (n > 3) or cliques of G only.

According to the definition, we can see that a SC-separator is given as a union of some intersections of n-cycles (n > 3) or cliques. For example, in Figure 2, the SC-separator, {1, 2, 4, 5, 6}, is the union of the intersections of the following two pairs of cycles:

Figure 2: An undirected graph of 8 nodes.


pair 1: {1, 2, 4, 5, 6, 8} and {1, 2, 3, 4}

pair 2: {1, 2, 4, 5, 6, 8} and {3, 4, 5, 6}

In this respect, it follows that, if G is a decomposable graph, then all of its SC-separators are prime separators.

Figure 3: Undirected graphs and their Markovian subgraphs. The slant “=” on node v means that v is removed from the graph.

Note that condition (iv) in the definition does not imply condition (ii). For example, if, in Figure 2, we let A = {3, 4}, B = {7, 8}, and C = {1, 2, 5, 6}, then conditions (i), (iii), and (iv) are satisfied for the three sets, but (ii) is not.

If G is Markovian-marginalized over a node v which is included in a SC-separator of the graph G, then new SC-separators are created in G_{V \ {v}}. For example, in column (b) of Figure 3, node 3 is removed from the graph at the top and the resultant Markovian subgraph is given at the bottom, where {1, 2, 5} and {2, 4, 5} are new SC-separators. In column (c), node 2 is removed, and the removal yields a new SC-separator, {3, 5}, in the Markovian subgraph. In column (d), the set {3, 5} is a SC-separator and removal of node 3 creates a new SC-separator, {1, 5}. Note that {1, 3, 5, 8, 9} forms a 5-cycle and {2, 3, 4, 5, 6, 7} forms a clique, and that the removal creates a new clique, {1, 2, 4, 5, 6, 7}, and a new SC-separator, {1, 5}. On the other hand, removal of a node which is not a member of a SC-separator does not create any new SC-separator, as we see in column (a) of Figure 3.

Let M and S be, respectively, a set of nodes to be removed and a set of nodes which form SC-separators.
Since Markovian subgraphs do not depend upon the order of node-removal, we can begin node-removal with the nodes in M ∩ S or with the nodes in M \ S. The only difference is that the removal of a node in M \ S simply reduces the size of a cycle or a clique, while the removal of a node in M ∩ S creates new SC-separators.

THEOREM 4. Let G′ be a Markovian subgraph of an undirected graph G. If A is a SC-separator in G′, then there exists a SC-separator, S, in G such that A ∩ S ≠ ∅.

Proof. Since A is a SC-separator in G′, we can find disjoint sets, B and C, in V(G′) \ A such that A ∪ B ∪ C = V(G′) and 〈B|A|C〉_{G′}. Then, by Theorem 3, it follows that 〈B|A|C〉_G. Let D = V(G) \ V(G′). Then, by the property of an undirected graph, we have 〈B|A ∪ D|C〉_G. Now, we have only to show (i) that the set A ∪ D is itself a SC-separator or (ii) that A ∪ D contains a SC-separator as a subset in G.

In case (i), we have the desired result. In case (ii), there are two possibilities. One possibility is that there is a SC-separator A′ in A ∪ D such that A ⊆ A′, and the other is that A is itself a SC-separator in G. In the former situation, at least one node is removed from A′ in the marginalization of G, and in the latter situation the removal takes place outside the neighborhood of A in G. In the latter situation, A itself is a SC-separator in G; in the former situation, if node v ∈ A′ is removed from G, all of its neighbor nodes become adjacent to each other, which means that new SC-separators are created in G_{v^c}, as in panels (b), (c), and (d) of Figure 3, where v^c = V(G) \ {v}. If multiple nodes, v_1, · · · , v_r, are removed from A′ \ A, we can see by the same argument that we have new SC-separators in G_R, where R = V(G) \ {v_1, · · · , v_r}. This completes the proof.

From this theorem, we can see that a SC-separator, say S, in a Markovian subgraph of G implies that there is a SC-separator in G which shares at least one node with S. An analogous but more tangible result holds when G is decomposable. In the theorem below, χ(G) is the set of all the prime separators of a decomposable graph G.

THEOREM 5. (Theorem 4 of Kim (2006b)) Let there be Markovian subgraphs G_i, i = 1, 2, · · · , m, of a decomposable graph G.
Then

(i) ∪_{i=1}^m χ(G_i) ⊆ χ(G);

(ii) for any maximal CMS H, ∪_{i=1}^m χ(G_i) = χ(H).

The above two theorems say that

(a) when G is decomposable, every prime separator that is found in a Markovian subgraph of G is also found in G; but

(b) when G is not decomposable, for every SC-separator, A say, that is found in a Markovian subgraph of G, there is at least one SC-separator in G which shares at least one node with A.

There is another noteworthy difference between the two types of graphs. In a decomposable graph, if a node which is included in a prime separator is removed, then a new clique is formed by the nodes of the cliques that share the prime separator. This means that the prime separator disappears with no trace left. On the other hand, if a node which is included in an SC-separator is removed from a non-decomposable graph, then new SC-separators are created as shown in Figure 3, unless the SC-separator is shared by neighboring cliques only. From this, we can see that prime separators in a decomposable graph may easily be lost in its Markovian subgraphs when at least one of the nodes in a prime separator is removed. On the other hand, node removal from an SC-separator, S say, in a non-decomposable graph creates new SC-separators in the Markovian subgraph which share nodes with S, provided the number of nodes removed from S is less than |S|.

5 GRAPHICAL COMPATIBILITY AND EXISTENCE OF CMS'S

For C ⊆ V(G) ∩ V(H), G and H are said to be C-compatible (Dawid & Studeny, 1999) if G_C = H_C. For graphs G_1, ..., G_k and sets of nodes A_1, ..., A_{k-1}, if G_i and G_{i+1} are A_i-compatible for i = 1, 2, ..., k − 1, then we say that G_1 and G_k are compatible with regard to G_i, i = 2, ..., k − 1.
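This definition can be made concrete in code. The following is a minimal pure-Python sketch (the function names and the example 4-cycle are ours, not from the paper): the marginal graph G_C joins two nodes of C exactly when G connects them by a path whose interior nodes lie outside C, so that removing a node makes all of its neighbors mutually adjacent, and C-compatibility then reduces to comparing two marginal graphs.

```python
from itertools import combinations

def marginal_graph(V, E, C):
    """Marginal (Markovian sub-)graph G_C: u, v in C are adjacent
    iff G has a u-v path whose interior nodes lie outside C."""
    C = set(C)
    adj = {v: set() for v in V}
    for u, v in E:
        adj[u].add(v); adj[v].add(u)

    def connected_outside(u, v):
        # search from u toward v through nodes not in C
        stack, seen = [u], {u}
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y == v:
                    return True
                if y not in seen and y not in C:
                    seen.add(y); stack.append(y)
        return False

    return {frozenset((u, v)) for u, v in combinations(sorted(C), 2)
            if connected_outside(u, v)}

def c_compatible(G, H, C):
    """G and H are C-compatible iff G_C = H_C (Dawid & Studeny, 1999)."""
    return marginal_graph(*G, C) == marginal_graph(*H, C)

# Hypothetical example: a 4-cycle marginalized onto C = {1, 2, 3}
G = ([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4), (4, 1)])
print(marginal_graph(*G, [1, 2, 3]))   # removing 4 joins its neighbors 1 and 3
```

Here marginalizing the 4-cycle over node 4 yields the triangle on {1, 2, 3}, matching the node-removal description above: the neighbors of a removed node become adjacent to each other.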


Denote V(G_i) by V_i and suppose that (G_1)_{A_1} = (G_2)_{A_1} and (G_2)_{A_2} = (G_3)_{A_2}. If there exist graphs G′ and G′′ such that

G′_{V_1} = G_1 and G′_{V_2} = G_2

and

G′′_{V_2} = G_2 and G′′_{V_3} = G_3,

then it follows that G′_{A_2} = G′′_{A_2}, since

G′_{A_2} = (G′_{V_2})_{A_2} = (G_2)_{A_2} = (G_3)_{A_2} = (G′′_{V_3})_{A_2} = G′′_{A_2},

where the first and the last equalities hold by (5) and the inclusion A_2 ⊆ V_2 ∩ V_3.

If we assume that G_1, ..., G_k are Markovian subgraphs of an undirected graph G, then there must exist such graphs as G′ and G′′ for every pair of Markovian subgraphs that share at least one node.

Let 〈i〉 = {1, 2, ..., i}. Suppose that V_i ∩ V_{i+1} ≠ ∅ for i = 1, 2, ..., k − 1 and that we have a graph G_〈j〉 for 1 < j < k whose Markovian subgraphs are G_i, i = 1, 2, ..., j. Then there must exist a graph G_〈j+1〉 of which G_〈j〉 and G_{j+1} are Markovian subgraphs. Otherwise, the assumption for G becomes invalid. We state this in a formal manner below.

THEOREM 6. If two graphs, G and H, are C-compatible for C = V(G) ∩ V(H), then there exists a CMS of G and H.

Proof. When G = H, the result is trivial since a graph is a CMS of itself. Suppose that |V(G) \ V(H)| = 1. Then we can construct a graph H_1 of which G and H are Markovian subgraphs, as described below.

Let {α} = V(G) \ V(H). Then we can think of the following three cases:

(i) bd_G(α) = C and there exists a connectivity component g in H for which C ⊆ cl_H(g).

(ii) bd_G(α) = C and there does not exist any connectivity component as in (i) but a connectivity component g′ for which ∅ ⊂ cl_H(g′) ∩ C ⊂ C.

(iii) bd_G(α) ⊂ C, i.e., 〈α|bd_G(α)|C \ cl_G(α)〉_G.

In case (i): In this case, C ⊆ cl_H(g).
So node α may be attached to any clique, h say, in H that is connected to C in H, in such a way that h ∪ {α} forms a new clique in H_1.

In case (ii): In this case, α is attached to H to form H_1 such that bd_{H_1}(α) = C.

In case (iii): If there exists a connectivity component g in H such that bd_G(α) ⊆ cl_H(g) and C \ bd_G(α) = C \ cl_H(g), then α can be attached to any clique in H that is connected to bd_G(α) to form a new clique in H_1. If there is no such connectivity component in H, then we attach α to H such that bd_{H_1}(α) = C.

Now suppose that |V(G) \ V(H)| > 1. Let G_0 = G_C and V(G) \ V(H) = {α_1, ..., α_k}. Let C_i = C ∪ {α_1, ..., α_i} and G_i = G_{C_i} for i = 1, ..., k, where G_i ⊆_M G_{i+1}, i = 0, 1, ..., k − 1. Then, by the transitivity property of Markovian subgraphs (Theorem 6 in Kim (2006b)), we have G_i ⊆_M G, i = 0, 1, ..., k.

By applying the above argument, we can obtain a graph H_i of which H_{i−1} and G_i are Markovian subgraphs for i = 1, ..., k, where H_0 = H. By the transitivity property of Markovian subgraphs, we have H ⊆_M H_k. Therefore, H and G = G_k are Markovian subgraphs of H_k. This completes the proof.
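The separateness relation 〈B|A|C〉_G used throughout these proofs is mechanical to check: delete the nodes of A and test whether any node of C is still reachable from B. A minimal sketch (function name and the path example are ours; graphs are given as node and edge lists):

```python
def separates(V, E, A, B, C):
    """True iff <B|A|C>_G: every path in G from B to C meets A,
    i.e. B and C are disconnected once the nodes of A are deleted."""
    A, B, C = set(A), set(B), set(C)
    adj = {v: set() for v in V}
    for u, v in E:
        adj[u].add(v); adj[v].add(u)
    # traverse the subgraph induced by V \ A, starting from B
    stack = [v for v in B if v not in A]
    seen = set(stack)
    while stack:
        x = stack.pop()
        if x in C:
            return False          # reached C without passing through A
        for y in adj[x]:
            if y not in seen and y not in A:
                seen.add(y); stack.append(y)
    return True

# Hypothetical 5-node path 1-2-3-4-5: {3} separates {1, 2} from {4, 5}
V, E = [1, 2, 3, 4, 5], [(1, 2), (2, 3), (3, 4), (4, 5)]
print(separates(V, E, {3}, {1, 2}, {4, 5}))   # True
print(separates(V, E, {4}, {1, 2}, {5}))      # True
print(separates(V, E, set(), {1}, {5}))       # False
```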


This theorem can be extended to a set of graphs where each graph is compatible with at least one of the other graphs of the set, as shown in the following corollary.

COROLLARY 3. For graphs G_i, i = 1, 2, ..., m, let G_〈i〉 be a graph of which G_j is a Markovian subgraph, j ≤ i. If G_〈i〉 and G_{i+1} are C_i-compatible with C_i = V(G_〈i〉) ∩ V(G_{i+1}), i = 1, 2, ..., m − 1, then there exists a CMS of G_i, i = 1, 2, ..., m.

Proof. Since G_〈m−1〉 and G_m are C_{m−1}-compatible by the condition of the corollary, there exists, by Theorem 6, a CMS, I, of G_〈m−1〉 and G_m. By the transitivity property of Markovian subgraphs, I is a CMS of G_i, i = 1, 2, ..., m.

6 MARKOVIAN COMBINATION OF MARGINAL MODELS

In the proof of Theorem 6, we considered, to show existence of a CMS, how we can add an edge between a node in V(G) \ V(H) and another node in H without conflicting with the node-separateness that is found in at least one of the graphs. The two graphs in Figure 1 are {1, 3}-compatible and their CMS's are as in Figure 4. As for the two graphs in Figure 1, consider adding edges between node 4 in V(G_2) \ V(G_1) and some nodes in G_1. Because of the node-separateness in G_1, node 4 can only be adjacent to nodes 1 and 2 or to nodes 2 and 3, as in Figure 4.

Since a CMS, H say, of a pair of compatible graphs, G′ and G′′ say, is obtained by attaching the nodes in V(G′) \ V(G′′) (or V(G′′) \ V(G′)) to G′′ (or G′), it may be regarded as combining the two graphs together.
We will call this combination a Markovian combination in the sense that

M(H) ⊆ L̃(G′, G′′);

in other words, a probability model P which is globally Markov with respect to H has its marginals, P_{V(G′)} and P_{V(G′′)}, globally Markov with respect to G′ and G′′, respectively.

Since a maximal CMS has a better property than CMS's in the context of Theorem 5, we will propose a combination method for maximal CMS's based on a set of marginal model structures. In the combination, it is imperative that node-separateness be preserved between a graph and its Markovian subgraph. This is reflected in the combination process in such a way that the following condition is satisfied:

[Separateness condition] Let M be a set of Markovian subgraphs of G and H a maximal CMS of M. If two nodes are in a graph in M and they are not adjacent in the graph, then neither are they adjacent in H. Otherwise, adjacency of the nodes in H is determined by checking separateness of the nodes in M.

Two main rules of Markovian combination are 'union' and 'check of separateness.' We will describe each of them below.

Figure 4: Two CMS's of the graphs in Figure 1.


Figure 5: Markovian combination of graphs. The Markovian subgraphs G_1 and G_2 of G are combined in two steps, union (panel (a)) and check of separateness (panel (b)). Different colors are used for G_1 (in blue) and G_2 (in red). When an edge appears in both of the graphs, it is in black; an edge is colored green when its two nodes are not in the same graph of G_1 or G_2.

Union. Suppose we have two Markovian subgraphs, G_1 and G_2, of a graph G. If nodes u and v are not separated in any of G_1 and G_2, we put an edge between the two nodes. We do the same for all the pairs of nodes that are not separated in any of the subgraphs. If two nodes are not in the same subgraph, then we put an edge between them. If two nodes are shared by G_1 and G_2 and they are connected by an edge in one subgraph but not in the other, we leave them separated. We denote the graph resulting from this operation by G*.

Check of separateness. We check whether the separateness that is found in G_1 and G_2 holds in G* also. If an edge in G* is in conflict with the separateness of some pair of nodes, we remove the edge from G*. We denote the graph resulting from this operation by G**.

This combining process is illustrated in Figure 5. Note that in panel (a), the edges (3,4), (3,7), (4,5), and (5,7) are created since the nodes in each of these pairs are not in the same graph of G_1 or G_2.
Two of the edges are removed in panel (b) since their existence is in conflict with the node-separateness that is embedded in G_1 and G_2. The combined result contains two edges more than the true graph G in Figure 5. It is interesting to note in this figure that G_1 and G_2 are decomposable while neither G nor the combined graph is. This is an example showing that the Markovian combination of decomposable graphs does not necessarily produce a decomposable combined graph.

Another illustration is given in Figure 6, where the graph G is not a chain of cycles as in Figure 5 but a more general form of undirected graph. A 4-cycle {3, 4, 7, 8} is surrounded by a 7-cycle {1, 2, 5, 6, 9, 10, 11} in the graph. The combined graph, which appears in panel (b), contains all the edges in the graph G in addition to the edges (1,4), (1,7), (5,7), and (7,10). These four edges appeared in G_1 or G_2 and are not in conflict with any node-separateness that is found in G_1 and G_2. Note, in Figures 5 and 6, that the black edges in panel (a), which appear in both of the graphs G_1 and G_2 in each of the figures, are preserved in the combined graph in panel (b). This is a consequence of the fact that the adjacency of a pair of nodes in both of G_1 and G_2 is in no conflict with the node-separateness in both of the graphs.

Figure 6: Markovian combination of graphs. In panel (a), there are 12 green edges for the pairs of nodes that do not appear in the same graph of G_1 or G_2. Three of the green edges remain in panel (b).

The combined graphs which are obtained through the two operations are maximal CMS's of a given set of Markovian subgraphs, as shown in the theorem below.

THEOREM 7. The combination process by the two operations of union and check of separateness produces a maximal CMS.

Proof. Let M be a set of Markovian subgraphs of a graph. The "Union" operation puts an edge between a pair of nodes, u and v say, unless u and v are both in a graph in M and separated therein. Denote the graph from this operation by G*. It is obvious that G′ ⊆_e (G*)_{V(G′)} for every G′ ∈ M. The "Check of separateness" operation removes edges from G* in such a way that the following condition is satisfied for every G′ in M:

For any pair of non-adjacent nodes u and v in G′ and a set C in G′ which is disjoint with {u, v},
〈u|C|v〉_G′ if and only if 〈u|C|v〉_G*.   (6)

Denote a graph obtained from this check of separateness by G**.
Then, for any pair of non-adjacent nodes, u and v say, in G**, either (i) they are non-adjacent in at least one of the graphs in M, or (ii) they belong to different graphs and putting an edge between them would conflict with the node-separateness in some of the graphs in M. Therefore, adding any edge to G** yields another graph, G′′ say, which is disqualified as a CMS of the graphs in M. This means that G** is a maximal CMS of the graphs in M.
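The two operations can be sketched in code. This is a simplified reading of the procedure, not the author's implementation: for each input graph it enforces only the separations 〈u|V_i \ {u,v}|v〉 of that graph's non-adjacent pairs, and it removes conflicting "green" edges (edges appearing in no input graph) greedily, the removal order being a design choice the paper leaves open.

```python
from itertools import combinations

def _adj(V, E):
    a = {v: set() for v in V}
    for e in E:
        u, v = tuple(e)
        a[u].add(v); a[v].add(u)
    return a

def union_step(graphs):
    """'Union': join u and v unless some input graph contains both
    and leaves them non-adjacent there."""
    nodes = set().union(*(set(V) for V, _ in graphs))
    data = [(set(V), {frozenset(e) for e in E}) for V, E in graphs]
    E_star = set()
    for u, v in combinations(sorted(nodes), 2):
        pair = frozenset((u, v))
        if not any(u in Vi and v in Vi and pair not in Ei for Vi, Ei in data):
            E_star.add(pair)
    return sorted(nodes), E_star

def _path_avoiding(V, E, u, C, v):
    """A u-v path avoiding C, as a node list, or None if <u|C|v> holds."""
    a, stack, parent = _adj(V, E), [u], {u: None}
    while stack:
        x = stack.pop()
        for y in a[x]:
            if y in parent or y in C:
                continue
            parent[y] = x
            if y == v:
                path = [v]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            stack.append(y)
    return None

def check_separateness(V, E_star, graphs):
    """'Check of separateness': drop edges of G* that conflict with a
    separateness holding in an input graph (greedy removal of edges
    that appear in no input graph)."""
    input_edges = {frozenset(e) for _, E in graphs for e in E}
    E = set(E_star)
    for Vi, Ei in graphs:
        Eset = {frozenset(e) for e in Ei}
        for u, v in combinations(sorted(Vi), 2):
            if frozenset((u, v)) in Eset:
                continue
            C = set(Vi) - {u, v}   # separates u and v in this input graph
            path = _path_avoiding(V, E, u, C, v)
            while path is not None:
                green = [frozenset(p) for p in zip(path, path[1:])
                         if frozenset(p) not in input_edges]
                if not green:
                    break          # conflict not repairable by green edges
                E.discard(green[0])
                path = _path_avoiding(V, E, u, C, v)
    return E

# Hypothetical example: G_1 and G_2 are the Markovian subgraphs of the
# path 1-2-3-4 over {1,2,3} and {2,3,4}.
G1 = ([1, 2, 3], [(1, 2), (2, 3)])
G2 = ([2, 3, 4], [(2, 3), (3, 4)])
V, E_star = union_step([G1, G2])
E_final = check_separateness(V, E_star, [G1, G2])
print(sorted(sorted(e) for e in E_final))   # [[1, 2], [2, 3], [3, 4]]
```

On this toy input the union step adds the green edge (1,4), and the check step removes it because it conflicts with 〈1|2|3〉 in G_1; the result is exactly the path 1-2-3-4, a maximal CMS of the two subgraphs, consistent with Theorem 7.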


Figure 7: Some simple examples where each of the graphs in the right column is not a Markovian subgraph of any of the graphs on the left-hand side of ⇐⇒. Top row: interaction graphs (G_1) of X_1, X_2, X_3, panels (1a) and (1b), and G′; bottom row: interaction graphs (G_2) of X_1, ..., X_4, panels (2a), (2a′), and (2b), and G′′.

7 FURTHER DISCUSSION

In Theorem 2, we are given a set of Markovian subgraphs of G. But in reality, we are often given a set of marginal model structures that are assumed to be interaction graphs of the marginal models. The interaction graphs may not be Markovian subgraphs of the unknown G. In this case, maximal CMS's may not contain G as an edge-subgraph. Simple examples of this situation are displayed in Figure 7. In the first row of the figure are two interaction graphs (G_1) for X_1, X_2, X_3 and a subgraph G′ which is not Markovian with respect to G_1, and similarly in the second row for X_1, ..., X_4. Under the hierarchy assumption for contingency tables, none of the graphical log-linear models (1a), (2a), and (2a′) is compatible with the graphical submodels at the right ends of the corresponding rows, by Theorem 2.3 of Asmussen and Edwards (1983).
The model G′ in Figure 7 is possible with the graphical log-linear model (1b) in the figure when

E[P(X_{1,3} = x_{1,3} | X_2)] = P(X_1 = x_1) P(X_3 = x_3) for all x_{1,3} ∈ X_{1,3},   (7)

where X_i is the support of X_i and X_a = ∏_{i∈a} X_i. The graphical log-linear model G′′ in Figure 7 is also possible from the graphical model (2b) in the figure. Instances of this phenomenon follow.

Example 1. Probability distributions corresponding to some of the graphs in Figure 7. We will present contingency tables for which the pair of models, (1b) and G′ in Figure 7, are possible, and so are the pair of models, (2b) and G′′. When X_i and X_j are conditionally independent given X_k, we will simply write i ⊥ j | k.

(a) Concerning models (1b) and G′:

x_2  x_1  x_3  P(X = x)
0    0    0    1/24
0    0    1    3/24
0    1    0    2/24
0    1    1    6/24
1    0    0    2/24
1    0    1    6/24
1    1    0    1/24
1    1    1    3/24

This distribution satisfies 1 ⊥ 3 | 2 and 1 ⊥ 3.

(b) Concerning models (2b) and G′′:


x_2  x_3  x_1  x_4  P(X = x)
0    0    0    0    1/42
0    0    0    1    2/42
0    0    1    0    2/42
0    0    1    1    4/42
0    1    0    0    2/42
0    1    0    1    4/42
0    1    1    0    1/42
0    1    1    1    2/42
1    0    0    0    3/42
1    0    0    1    1/42
1    0    1    0    6/42
1    0    1    1    2/42
1    1    0    0    6/42
1    1    0    1    2/42
1    1    1    0    3/42
1    1    1    1    1/42

This distribution satisfies the conditional independencies displayed in graph (2b) in Figure 7. The marginal for X_{1,3,4} satisfies the conditional independence 1 ⊥ 4 | 3.

Although we have seen examples where subgraphs of graphical log-linear models are not Markovian, Markovian subgraphs are the usual situation under the hierarchy assumption for models. As indicated in (7), in order for a subgraph to be non-Markovian, a certain set of equations must be satisfied between the set of parameters of a joint model and that of the non-Markovian subgraph of interest. This implies that non-Markovian subgraphs are a rare situation under the hierarchy assumption as far as interaction graphs are concerned. Furthermore, when the distribution is Normal, we can see from its density function that the subgraphs are Markovian.
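The two independencies claimed for table (a) of Example 1, namely 1 ⊥ 3 | 2 and 1 ⊥ 3, can be verified numerically. A small pure-Python check (the helper name `marg` and the dictionary layout are ours):

```python
from itertools import product

# Table (a) of Example 1, cells keyed as (x2, x1, x3), in units of 1/24
p = {(0, 0, 0): 1, (0, 0, 1): 3, (0, 1, 0): 2, (0, 1, 1): 6,
     (1, 0, 0): 2, (1, 0, 1): 6, (1, 1, 0): 1, (1, 1, 1): 3}
p = {cell: w / 24 for cell, w in p.items()}

def marg(keep):
    """Marginal over the listed coordinates (0 = x2, 1 = x1, 2 = x3)."""
    out = {}
    for cell, pr in p.items():
        key = tuple(cell[i] for i in keep)
        out[key] = out.get(key, 0.0) + pr
    return out

p2, p12, p23, p13, p1, p3 = (marg(k) for k in
                             [(0,), (0, 1), (0, 2), (1, 2), (1,), (2,)])

# 1 ⊥ 3 | 2 : P(x1, x3 | x2) = P(x1 | x2) P(x3 | x2) in every cell
ci = all(abs(p[(a, b, c)] / p2[(a,)]
             - p12[(a, b)] / p2[(a,)] * p23[(a, c)] / p2[(a,)]) < 1e-12
         for a, b, c in product((0, 1), repeat=3))

# 1 ⊥ 3 : P(x1, x3) = P(x1) P(x3)
mi = all(abs(p13[(b, c)] - p1[(b,)] * p3[(c,)]) < 1e-12
         for b, c in product((0, 1), repeat=2))

print(ci, mi)   # True True
```

Both checks hold exactly (up to floating-point rounding), confirming that the table realizes the marginal independence required by G′ together with the conditional independence of model (1b).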
Based on this point of view on Markovian subgraphs, we have assumed in this paper that all the interaction graphs of subsets V_i of random variables are Markovian.

The combination of model structures is in two steps, Union and Check of separateness. Suppose we combine the graphs in M. At the 'Union' step, we put an edge between every pair of nodes unless there exists at least one graph in M where both of the nodes appear and are not adjacent; at the 'Check of separateness' step, we then remove an edge when its existence is in conflict with the node-separateness in the graphs in M. In this process, we need no data but only the model structures. In this sense, the proposed method reuses the information that is embedded in the marginal model structures for learning the structure of a larger set of random variables, namely those involved in at least one of the graphs in M.

REFERENCES

Balagtas, C.C., Becker, M.P. & Lang, J.B. (1995). Marginal modelling of categorical data from crossover experiments, Appl. Statist. 44, 63-77.

Bartolucci, F. & Forcina, A. (2002).
Extended RC association models allowing for order restrictions and marginal modeling, J. Am. Statist. Assoc. 97, 1192-9.

Becker, M.P. (1994). Analysis of repeated categorical measurements using models for marginal distributions: an application to trends in attitudes on legalized abortion. In Sociological Methodology, Ed. P.V. Marsden, pp. 229-65. Oxford: Blackwell.

Becker, M.P., Minick, S. & Yang, I. (1998). Specifications of models for cross-classified counts: comparisons of the log-linear model and marginal model perspectives, Sociological Methods and Research 26, 511-29.


Bergsma, W.P. (1997). Marginal Models for Categorical Data. Tilburg: Tilburg University Press.

Bergsma, W.P. & Rudas, T. (2002). Marginal models for categorical data, Ann. Statist. 30, 140-59.

Chickering, D. (1996). Learning Bayesian networks is NP-complete. In Learning from Data, Eds. D. Fisher & H. Lenz, pp. 121-130. Springer-Verlag.

Colombi, R. & Forcina, A. (2001). Marginal regression models for the analysis of positive association of ordinal response variables, Biometrika 88, 1007-19.

Cox, D.R. & Wermuth, N. (1999). Likelihood factorizations for mixed discrete and continuous variables, Scand. J. Statist. 26, 209-220.

Dawid, A.P. & Studeny, M. (1999). Conditional products: an alternative approach to conditional independence. In Artificial Intelligence and Statistics 99, Eds. D. Heckerman & J. Whittaker, pp. 32-40. Morgan Kaufmann.

Fienberg, S.E. & Kim, S.-H. (1999). Combining conditional log-linear structures, J. Am. Statist. Assoc. 94(445), 229-239.

Glonek, G.J.N. & McCullagh, P. (1995). Multivariate logistic models, J. R. Statist. Soc. B 57, 533-46.

Hammersley, J.M. & Clifford, P.E. (1971). Markov fields on finite graphs and lattices. Unpublished manuscript.

Kim, S.-H. (2006a).
Conditional log-linear structures for log-linear modelling, Computational Statistics and Data Analysis 50(8), 2044-2064.

Kim, S.-H. (2006b). Properties of Markovian subgraphs of a decomposable graph, Lecture Notes in Artificial Intelligence, LNAI 4293, 15-26.

Kim, S.-H. & Lee, S. (2008). Searching model structures based on marginal model structures. In New Developments in Robotics, Automation and Control, Ed. A. Lazinica, pp. 355-376. Vienna, Austria: In-Tech Education and Publishing.

Lauritzen, S.L. (1996). Graphical Models. Oxford: Oxford University Press.

Lauritzen, S.L. & Spiegelhalter, D.J. (1988). Local computations with probabilities on graphical structures and their application to expert systems, J. R. Statist. Soc. B 50(2), 157-224.

Liang, K.-Y., Zeger, S.L. & Qaqish, B. (1992). Multivariate regression analyses for categorical data, J. R. Statist. Soc. B 54, 3-40.

Meek, C. (1995). Causal influence and causal explanation with background knowledge, Uncertainty in Artificial Intelligence 11, 403-410.

Molenberghs, G. & Lesaffre, E. (1999). Marginal modeling of multivariate categorical data, Statistics in Medicine 18, 2237-2255.

Neapolitan, R.E. (2004).
Learning Bayesian Networks. Upper Saddle River, NJ: Pearson Prentice Hall.


Pearl, J. (1986). Fusion, propagation and structuring in belief networks, Artificial Intelligence 29, 241-288.

Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Mateo, CA: Morgan Kaufmann.

Pearl, J. & Paz, A. (1987). Graphoids: a graph-based logic for reasoning about relevancy relations. In Advances in Artificial Intelligence II, Eds. B.D. Boulay, D. Hogg & L. Steel, pp. 357-363. Amsterdam: North-Holland.

Spirtes, P., Glymour, C. & Scheines, R. (2000). Causation, Prediction, and Search, 2nd ed. New York: Springer-Verlag.

Whittaker, J. (1990). Graphical Models in Applied Multivariate Statistics. New York: Wiley.
