an introduction to generalized linear models - GDM@FUDAN ...

$|\mathbf{A}_n| = \det \mathbf{A}$ are all positive.

4. The rank of the matrix $\mathbf{A}$ is also called the degrees of freedom of the quadratic form $Q = \mathbf{y}^T \mathbf{A} \mathbf{y}$.

5. Suppose $Y_1, \ldots, Y_n$ are independent random variables each with the Normal distribution $N(0, \sigma^2)$. Let $Q = \sum_{i=1}^{n} Y_i^2$ and let $Q_1, \ldots, Q_k$ be quadratic forms in the $Y_i$'s such that

$$Q = Q_1 + \ldots + Q_k,$$

where $Q_i$ has $m_i$ degrees of freedom ($i = 1, \ldots, k$). Then $Q_1, \ldots, Q_k$ are independent random variables and $Q_1/\sigma^2 \sim \chi^2(m_1)$, $Q_2/\sigma^2 \sim \chi^2(m_2)$, $\ldots$ and $Q_k/\sigma^2 \sim \chi^2(m_k)$, if and only if

$$m_1 + m_2 + \ldots + m_k = n.$$

This is Cochran's theorem; for a proof see, for example, Hogg and Craig (1995). A similar result holds for non-central distributions; see Chapter 3 of Rao (1973).

6. A consequence of Cochran's theorem is that the difference of two independent random variables, $X_1^2 \sim \chi^2(m)$ and $X_2^2 \sim \chi^2(k)$, also has a chi-squared distribution

$$X^2 = X_1^2 - X_2^2 \sim \chi^2(m - k),$$

provided that $X^2 \ge 0$ and $m > k$.

1.6 Estimation

1.6.1 Maximum likelihood estimation

Let $\mathbf{y} = [Y_1, \ldots, Y_n]^T$ denote a random vector and let the joint probability density function of the $Y_i$'s be $f(\mathbf{y}; \boldsymbol{\theta})$, which depends on the vector of parameters $\boldsymbol{\theta} = [\theta_1, \ldots, \theta_p]^T$.

The likelihood function $L(\boldsymbol{\theta}; \mathbf{y})$ is algebraically the same as the joint probability density function $f(\mathbf{y}; \boldsymbol{\theta})$, but the change in notation reflects a shift of emphasis from the random variables $\mathbf{y}$, with $\boldsymbol{\theta}$ fixed, to the parameters $\boldsymbol{\theta}$, with $\mathbf{y}$ fixed. Since $L$ is defined in terms of the random vector $\mathbf{y}$, it is itself a random variable. Let $\Omega$ denote the set of all possible values of the parameter vector $\boldsymbol{\theta}$; $\Omega$ is called the parameter space. The maximum likelihood estimator of $\boldsymbol{\theta}$ is the value $\hat{\boldsymbol{\theta}}$ which maximizes the likelihood function, that is

$$L(\hat{\boldsymbol{\theta}}; \mathbf{y}) \ge L(\boldsymbol{\theta}; \mathbf{y}) \quad \text{for all } \boldsymbol{\theta} \text{ in } \Omega.$$

Equivalently, $\hat{\boldsymbol{\theta}}$ is the value which maximizes the log-likelihood function $l(\boldsymbol{\theta}; \mathbf{y}) = \log L(\boldsymbol{\theta}; \mathbf{y})$, since the logarithmic function is monotonic. Thus

$$l(\hat{\boldsymbol{\theta}}; \mathbf{y}) \ge l(\boldsymbol{\theta}; \mathbf{y}) \quad \text{for all } \boldsymbol{\theta} \text{ in } \Omega.$$

Often it is easier to work with the log-likelihood function than with the likelihood function itself.

Usually the estimator $\hat{\boldsymbol{\theta}}$ is obtained by differentiating the log-likelihood function with respect to each element $\theta_j$ of $\boldsymbol{\theta}$ and solving the simultaneous equations

$$\frac{\partial l(\boldsymbol{\theta}; \mathbf{y})}{\partial \theta_j} = 0 \quad \text{for } j = 1, \ldots, p. \qquad (1.9)$$

It is necessary to check that the solutions do correspond to maxima of $l(\boldsymbol{\theta}; \mathbf{y})$ by verifying that the matrix of second derivatives

$$\frac{\partial^2 l(\boldsymbol{\theta}; \mathbf{y})}{\partial \theta_j \, \partial \theta_k}$$

evaluated at $\boldsymbol{\theta} = \hat{\boldsymbol{\theta}}$ is negative definite. For example, if $\boldsymbol{\theta}$ has only one element $\theta$ this means it is necessary to check that

$$\left[ \frac{\partial^2 l(\theta; \mathbf{y})}{\partial \theta^2} \right]_{\theta = \hat{\theta}} < 0.$$

It is also necessary to check if there are any values of $\boldsymbol{\theta}$ at the edges of the parameter space $\Omega$ that give local maxima of $l(\boldsymbol{\theta}; \mathbf{y})$. When all local maxima have been identified, the value of $\hat{\boldsymbol{\theta}}$ corresponding to the largest one is the maximum likelihood estimator. (For most of the models considered in this book there is only one maximum and it corresponds to the solution of the equations $\partial l / \partial \theta_j = 0$, $j = 1, \ldots, p$.)

An important property of maximum likelihood estimators is that if $g(\boldsymbol{\theta})$ is any function of the parameters $\boldsymbol{\theta}$, then the maximum likelihood estimator of $g(\boldsymbol{\theta})$ is $g(\hat{\boldsymbol{\theta}})$. This follows from the definition of $\hat{\boldsymbol{\theta}}$. It is sometimes called the invariance property of maximum likelihood estimators. A consequence is that we can work with a function of the parameters that is convenient for maximum likelihood estimation and then use the invariance property to obtain maximum likelihood estimates for the required parameters.

In principle, it is not necessary to be able to find the derivatives of the likelihood or log-likelihood functions or to solve equation (1.9) if $\hat{\boldsymbol{\theta}}$ can be found numerically. In practice, numerical approximations are very important for generalized linear models.

© 2002 by Chapman & Hall/CRC
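The steps described above (solve the score equation, check the second derivative, then apply the invariance property) can be illustrated with a minimal Python sketch. It is not taken from the text: it assumes independent Poisson observations with a single parameter $\theta$, uses made-up counts, and solves equation (1.9) by Newton-Raphson iteration. The function names `score`, `second_derivative` and `newton_mle` are hypothetical.

```python
import math

# Assumed model (not from the text): Y_1, ..., Y_n independent Poisson(theta).
# Log-likelihood: l(theta; y) = sum(y) * log(theta) - n * theta - sum(log(y_i!)).

def score(theta, y):
    """First derivative of the log-likelihood with respect to theta."""
    return sum(y) / theta - len(y)

def second_derivative(theta, y):
    """Second derivative of the log-likelihood with respect to theta."""
    return -sum(y) / theta ** 2

def newton_mle(y, theta0=1.0, tol=1e-10, max_iter=50):
    """Solve the score equation numerically by Newton-Raphson iteration."""
    theta = theta0
    for _ in range(max_iter):
        step = score(theta, y) / second_derivative(theta, y)
        theta -= step
        if abs(step) < tol:
            break
    return theta

# Hypothetical counts, for illustration only.
y = [3, 1, 4, 2, 0, 5, 2, 3]

theta_hat = newton_mle(y)

# For this model the score equation also has the closed-form solution mean(y),
# which gives a check on the numerical answer.
closed_form = sum(y) / len(y)

# The second derivative at theta_hat should be negative for a maximum.
curvature = second_derivative(theta_hat, y)

# Invariance: the estimate of g(theta) = P(Y = 0) = exp(-theta) is g(theta_hat).
p_zero_hat = math.exp(-theta_hat)

print(theta_hat, closed_form, curvature, p_zero_hat)
```

For models where the score equation has no closed-form solution, the same iteration still applies, although the starting value `theta0` may then need more care.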
Other properties of maximum likelihood estimators include consistency, sufficiency, asymptotic efficiency and asymptotic normality. These are discussed in books such as Cox and Hinkley (1974) or Kalbfleisch (1985, Chapters 1 and 2).
