*3.101 a We observe a sequence of independent identical trials with two possible outcomes on each trial, S and F, and with P(S) = p. The number of the trial on which we observe the fifth success, Y, has a negative binomial distribution with parameters r = 5 and p. Suppose that we observe the fifth success on the eleventh trial. Find the value of p that maximizes P(Y = 11).
b Generalize the result from part (a) to find the value of p that maximizes $P(Y = y_0)$ when Y has a negative binomial distribution with parameters r (known) and p.

3.7 The Hypergeometric Probability Distribution

In Example 3.6 we considered a population of voters, 40% of whom favored candidate Jones. A sample of voters was selected, and Y (the number favoring Jones) was to be observed. We concluded that if the sample size n was small relative to the population size N, the distribution of Y could be approximated by a binomial distribution. We also determined that if n was large relative to N, the conditional probability of selecting a supporter of Jones on a later draw would be significantly affected by the observed preferences of persons selected on earlier draws. Thus the trials were not independent and the probability distribution for Y could not be approximated adequately by a binomial probability distribution. Consequently, we need to develop the probability distribution for Y when n is large relative to N.

Suppose that a population contains a finite number N of elements that possess one of two characteristics. Thus, r of the elements might be red and b = N − r, black. A sample of n elements is randomly selected from the population, and the random variable of interest is Y, the number of red elements in the sample. This random variable has what is known as the hypergeometric probability distribution.
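The sampling experiment just described can be sketched as a small simulation. The Python sketch below is illustrative only: the function names `draw_red_count` and `empirical_distribution`, the default number of trials, and the numbers in the usage lines (N = 20, r = 8, n = 10) are assumptions chosen for illustration and do not come from the text. It draws n elements without replacement from a population of r red and N − r black elements and tabulates the observed relative frequencies of Y.

```python
import random
from collections import Counter


def draw_red_count(N, r, n, rng):
    """Draw n elements without replacement from r red and N - r black
    elements and return the number of red elements in the sample."""
    population = [1] * r + [0] * (N - r)  # 1 marks red, 0 marks black
    return sum(rng.sample(population, n))


def empirical_distribution(N, r, n, trials=20_000, seed=0):
    """Estimate P(Y = y) by repeating the sampling experiment `trials` times."""
    rng = random.Random(seed)
    counts = Counter(draw_red_count(N, r, n, rng) for _ in range(trials))
    return {y: counts[y] / trials for y in sorted(counts)}


if __name__ == "__main__":
    # Hypothetical population: 20 elements, 8 of them red (40%), sample of 10.
    for y, freq in empirical_distribution(N=20, r=8, n=10).items():
        print(y, round(freq, 3))
```

Because the draws are made without replacement, Y can never exceed r or fall below n − (N − r), and the simulated frequencies respect those bounds.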
For example, the number of workers who are women, Y, in Example 3.1 has the hypergeometric distribution.

The hypergeometric probability distribution can be derived by using the combinatorial theorems given in Section 2.6 and the sample-point approach. A sample point in the sample space S will correspond to a unique selection of n elements, some red and the remainder black. As in the binomial experiment, each sample point can be characterized by an n-tuple whose elements correspond to a selection of n elements from the total of N. If each element in the population were numbered from 1 to N, the sample point indicating the selection of items 5, 7, 8, 64, 17, ..., 87 would appear as the n-tuple

$$\underbrace{(5, 7, 8, 64, 17, \ldots, 87)}_{n \text{ positions}}.$$

The total number of sample points in S, therefore, will equal the number of ways of selecting a subset of n elements from a population of N, or $\binom{N}{n}$. Because random selection implies that all sample points are equiprobable, the probability of a sample point in S is

$$P(E_i) = \frac{1}{\binom{N}{n}}, \qquad \text{all } E_i \in S.$$

The total number of sample points in the numerical event Y = y is the number of sample points in S that contain y red and (n − y) black elements. This number can be obtained by applying the mn rule (Section 2.6). The number of ways of selecting y red elements to fill y positions in the n-tuple representing a sample point is the number of ways of selecting y from a total of r, or $\binom{r}{y}$. [We use the convention $\binom{a}{b} = 0$ if b > a.] The total number of ways of selecting (n − y) black elements to fill the remaining (n − y) positions in the n-tuple is the number of ways of selecting (n − y) black elements from a possible (N − r), or $\binom{N-r}{n-y}$. Then the number of sample points in the numerical event Y = y is the number of ways of combining a set of y red and (n − y) black elements. By the mn rule, this is the product $\binom{r}{y} \times \binom{N-r}{n-y}$. Summing the probabilities of the sample points in the numerical event Y = y (multiplying the number of sample points by the common probability per sample point), we obtain the hypergeometric probability function.

DEFINITION 3.10
A random variable Y is said to have a hypergeometric probability distribution if and only if

$$p(y) = \frac{\binom{r}{y}\binom{N-r}{n-y}}{\binom{N}{n}},$$

where y is an integer 0, 1, 2, ..., n, subject to the restrictions y ≤ r and n − y ≤ N − r.

With the convention $\binom{a}{b} = 0$ if b > a, it is clear that p(y) ≥ 0 for the hypergeometric probabilities. The fact that the hypergeometric probabilities sum to 1 follows from the fact that

$$\sum_{i=0}^{n} \binom{r}{i}\binom{N-r}{n-i} = \binom{N}{n}.$$

A sketch of the proof of this result is outlined in Exercise 3.216.

EXAMPLE 3.16
An important problem encountered by personnel directors and others faced with the selection of the best in a finite set of elements is exemplified by the following scenario. From a group of 20 Ph.D.
engineers, 10 are randomly selected for employment. What is the probability that the 10 selected include all the 5 best engineers in the group of 20?

Solution
For this example N = 20, n = 10, and r = 5. That is, there are only 5 in the set of 5 best engineers, and we seek the probability that Y = 5, where Y denotes the number

