12.07.2015 Views

Stat 5101 Lecture Notes - School of Statistics


5.4. THE MULTINOMIAL DISTRIBUTION

The last equality records the same kind of simplification we saw in deriving the binomial density. The product from 1 to $n$ in the next to last line may repeat some of the $p$'s. How often are they repeated? There is one $p_j$ for each $X_{ij}$ that is equal to one, and there are $Y_j = \sum_i X_{ij}$ of them.

We are not done, however, because more than one outcome can lead to the same right hand side here. How many ways are there to get exactly $y_j$ of the $X_{ij}$ equal to one? This is the same as asking how many ways there are to assign the numbers $i = 1, \ldots, n$ to one of $k$ categories, so that there are $y_i$ in the $i$-th category, and the answer is the multinomial coefficient

$$\binom{n}{y_1, \ldots, y_k} = \frac{n!}{y_1! \cdots y_k!}$$

Thus the density is

$$f(y) = \binom{n}{y_1, \ldots, y_k} \prod_{j=1}^{k} p_j^{y_j}, \qquad y \in S$$

where the sample space $S$ is defined by

$$S = \{\, y \in \mathbb{N}^k : y_1 + \cdots + y_k = n \,\}$$

where $\mathbb{N}$ denotes the “natural numbers” 0, 1, 2, .... In other words, the sample space $S$ consists of vectors $y$ having nonnegative integer coordinates that sum to $n$.

5.4.5 Marginals and “Sort Of” Marginals

The univariate marginals are obvious. Since the univariate marginals of $\text{Ber}(p)$ are $\text{Ber}(p_i)$, the univariate marginals of $\text{Multi}(n, p)$ are $\text{Bin}(n, p_i)$.

Strictly speaking, the multivariate marginals do not have a brand name distribution. Lindgren (Theorem 8 of Chapter 6) says the marginals of a multinomial are multinomial, but this is, strictly speaking, complete rubbish, given the way he (and we) defined “marginal” and “multinomial.” It is obviously wrong. If $X = (X_1, \ldots, X_k)$ is multinomial, then it is degenerate. But $(X_1, \ldots, X_{k-1})$ is not degenerate, hence not multinomial (all multinomial distributions are degenerate). The same goes for any other subvector, $(X_2, X_5, X_{10})$, for example. Of course, Lindgren knows this as well as I do.
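The density, the binomial univariate marginals, and the degeneracy discussed in this section can be checked numerically on a small case. The following Python sketch is only an illustration, not part of the notes: the function names (`multinomial_coefficient`, `multinomial_pmf`, `sample_space`) and the values `n = 4`, `p = (0.2, 0.3, 0.5)` are assumptions chosen for the example.

```python
from itertools import product
from math import comb, factorial, isclose, prod


def multinomial_coefficient(counts):
    """n! / (y_1! ... y_k!) where n is the sum of the counts."""
    n = sum(counts)
    return factorial(n) // prod(factorial(c) for c in counts)


def multinomial_pmf(y, p):
    """Multinomial density f(y) for a count vector y and probability vector p."""
    return multinomial_coefficient(y) * prod(pj ** yj for pj, yj in zip(p, y))


def sample_space(n, k):
    """All vectors of k nonnegative integers that sum to n (brute force)."""
    return [y for y in product(range(n + 1), repeat=k) if sum(y) == n]


# Illustrative values only (assumed for this example).
n = 4
p = (0.2, 0.3, 0.5)
S = sample_space(n, len(p))

# The density should sum to one over the sample space S.
total = sum(multinomial_pmf(y, p) for y in S)

# Marginal of the first coordinate, obtained by summing out the others,
# compared with the Bin(n, p_1) density.
marginal_first = [
    sum(multinomial_pmf(y, p) for y in S if y[0] == m) for m in range(n + 1)
]
binomial_first = [
    comb(n, m) * p[0] ** m * (1 - p[0]) ** (n - m) for m in range(n + 1)
]

# Degeneracy: the full vector always sums to n, but the subvector
# (y_1, y_2) has a sum that varies over the sample space.
full_sums = {sum(y) for y in S}
sub_sums = {y[0] + y[1] for y in S}

print("total probability:", total)
print("marginal of Y_1:  ", marginal_first)
print("Bin(n, p_1):      ", binomial_first)
print("sums of full vector:", full_sums)
print("sums of (Y_1, Y_2): ", sub_sums)
```

The brute-force enumeration in `sample_space` is fine for a small $n$ and $k$ but grows quickly, so it is meant as a sanity check rather than a practical way to compute marginals.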
He is just being sloppy about terminology. What he means is clear from his discussion leading up to the “theorem” (really a non-theorem). Here’s the correct statement.

Theorem 5.16. Suppose $Y \sim \text{Multi}_k(n, p)$ and $Z$ is a random vector formed by collapsing some of the categories for $Y$, that is, each component of $Z$ has the form

$$Z_j = Y_{i_1} + \cdots + Y_{i_{m_j}}$$

where each $Y_i$ contributes to exactly one $Z_j$ so that

$$Z_1 + \cdots + Z_l = Y_1 + \cdots + Y_k = n,$$
