an introduction to generalized linear models - GDM@FUDAN ...

Recommendations

Info

there are only two categories: male, female; dead, alive; smooth leaves, serrated leaves. Ifthere are more than two categories the variable is called polychotomous, polytomous or multinomial. 2. Ordinal classifications in which there is some natural order or ranking between the categories: e.g., young, middle aged, old; diastolic blood pressures grouped as ≤ 70, 71-90, 91-110, 111-130, ≥131mm Hg. 3. Continuous measurements where observations may, at least in theory, fall anywhere on a continuum: e.g., weight, length or time. This scale includes both interval scale and ratio scale measurements – the latter have a well-defined zero. A particular example ofa continuous measurement is the time until a specific event occurs, such as the failure of an electronic component; the length oftime from a known starting point is called the failure time. Nominal and ordinal data are sometimes called categorical or discrete variables and the numbers ofobservations, counts or frequencies in each category are usually recorded. For continuous data the individual measurements are recorded. The term quantitative is often used for a variable measured on a continuous scale and the term qualitative for nominal and sometimes for ordinal measurements. A qualitative, explanatory variable is called a factor and its categories are called the levels for the factor. A quantitative explanatory variable is sometimes called a covariate. Methods ofstatistical analysis depend on the measurement scales ofthe response and explanatory variables. This book is mainly concerned with those statistical methods which are relevant when there is just one response variable, although there will usually be several explanatory variables. The responses measured on different subjects are usually assumed to be statistically independent random variables although this requirement is dropped in the final chapter which is about correlated data. Table 1.1 shows the main methods ofstatistical analysis for various combinations ofresponse and explanatory variables and the chapters in which these are described. The present chapter summarizes some ofthe statistical theory used throughout the book. Chapters 2 to 5 cover the theoretical framework that is common to the subsequent chapters. Later chapters focus on methods for analyzing particular kinds ofdata. Chapter 2 develops the main ideas ofstatistical modelling. The modelling process involves four steps: 1. Specifying models in two parts: equations linking the response and explanatory variables, and the probability distribution ofthe response variable. 2. Estimating parameters used in the models. 3. Checking how well the models fit the actual data. 4. Making inferences; for example, calculating confidence intervals and testing hypotheses about the parameters. © 2002 by Chapman & Hall/CRC
Table 1.1 Major methods of statistical analysis for response and explanatory variables measured on various scales and chapter references for this book. Response (chapter) Explanatory variables Methods Continuous Binary t-test (Chapter 6) Nominal, >2 categories Analysis ofvariance Ordinal Analysis ofvariance Continuous Multiple regression Nominal & some Analysis of continuous covariance Categorical & continuous Multiple regression Binary Categorical Contingency tables (Chapter 7) Logistic regression Continuous Logistic, probit & other dose-response models Categorical & continuous Logistic regression Nominal with Nominal Contingency tables >2 categories (Chapter 8 & 9) Categorical & continuous Nominal logistic regression Ordinal Categorical & continuous Ordinal logistic (Chapter 8) regression Counts Categorical Log-linear models (Chapter 9) Categorical & continuous Poisson regression Failure times Categorical & continuous Survival analysis (Chapter 10) (parametric) Correlated Categorical & continuous Generalized responses estimating equations (Chapter 11) Multilevel models © 2002 by Chapman & Hall/CRC
Page 1 and 2: CHAPMAN & HALL/CRC Texts in Statist
Page 3 and 4: AN INTRODUCTION TO GENERALIZED LINE
Page 5 and 6: Preface Contents 1 Introduction 1.1
Page 7 and 8: 10 Survival Analysis 10.1 Introduct
Page 9: 1 Introduction 1.1 Background This
Page 13 and 14: ofgeneralized linear models althoug
Page 15 and 16: 3. Let Y1, ..., Yn denote Normally
Page 17 and 18: divided by its degrees offreedom, F
Page 19 and 20: l(θ; y) = log L(θ; y), since the
Page 21 and 22: (i.e., the matrix ofsecond derivati
Page 23 and 24: Table 1.3 Successive approximations
Page 25 and 26: 2 Model Fitting 2.1 Introduction Th
Page 27 and 28: IfH1is true, then the log-likelihoo
Page 29 and 30: estimated in order to calculate to
Page 31 and 32: where xjk is the gestational age of
Page 33 and 34: Table 2.4 Summary of data on birthw
Page 35 and 36: Residuals Residuals Percent 2 1 0 -
Page 37 and 38: sampling distributions ofthe corres
Page 39 and 40: is categorical how many categories
Page 41 and 42: Cox and Snell, 1968; Prigibon, 1981
Page 43 and 44: 2.4Notation and coding for explanat
Page 45 and 46: and the rows of X are as follows Gr
Page 47 and 48: (a) Conduct an exploratory analysis
Page 49 and 50: (d) List the assumptions made for (
Page 51 and 52: putation involving numerical optimi
Page 53 and 54: worthwhile trying to identify a tra
Page 55 and 56: We also need expressions for the ex
Page 57 and 58: 1. Response variables Y1,... ,YN wh
Page 59 and 60: Table 3.2 Numbers of deaths from co
Page 61 and 62:
3.4 Use results (3.9) and (3.12) to
Page 63 and 64:
4 Estimation 4.1 Introduction This
Page 65 and 66:
x (m - 1) x (m) Figure 4.3 Newton-R
Page 67 and 68:
Table 4.2 Details of Newton-Raphson
Page 69 and 70:
y differentiating (4.13) and substi
Page 71 and 72:
Table 4.3 Data for Poisson regressi
Page 73 and 74:
Table 4.4 Successive approximations
Page 75 and 76:
5 Inference 5.1 Introduction The tw
Page 77 and 78:
consistent with the general result
Page 79 and 80:
approximated by its expected value
Page 81 and 82:
Hence E � (b − β)(b − β) T
Page 83 and 84:
For Yi’s with other distributions
Page 85 and 86:
8, D has a chi-squared distribution
Page 87 and 88:
Consider the null hypothesis ⎡
Page 89 and 90:
(a) Find the Wald statistic (�π
Page 91 and 92:
6.2.2 Least squares estimation IfE(
Page 93 and 94:
Table 6.2 Multiple hypothesis tests
Page 95 and 96:
the minimum value ofthe sum ofsquar
Page 97 and 98:
and ⎡ X T ⎢ X = ⎢ ⎣ 20 923
Page 99 and 100:
or ‘worst possible’ value of S.
Page 101 and 102:
Table 6.6 Dried weights yi of plant
Page 103 and 104:
The first row (or column) ofthe (J
Page 105 and 106:
so For the plant weight data and Y
Page 107 and 108:
4. The model formed by omitting eff
Page 109 and 110:
Finally for the model with only a m
Page 111 and 112:
6.5 Analysis of covariance Analysis
Page 113 and 114:
For the reduced model (6.14) �
Page 115 and 116:
6.7 Exercises 6.1 Table 6.15 shows
Page 117 and 118:
Table 6.17 Cholesterol (CHOL), age
Page 119 and 120:
6.8 Table 6.20 shows the data from
Page 121 and 122:
Table 7.1 Frequencies for N binomia
Page 123 and 124:
x x Figure 7.2 Normal distribution:
Page 125 and 126:
and log(1 − πi) =− log [1 + ex
Page 127 and 128:
Table 7.4 Comparison of observed nu
Page 129 and 130:
Proportion germinated 0.7 0.6 0.5 4
Page 131 and 132:
which is asymptotically equivalent
Page 133 and 134:
From equation (7.5), �m residuals
Page 135 and 136:
Proportion with symptoms of senilit
Page 137 and 138:
Table 7.10 Hosmer-Lemeshow test for
Page 139 and 140:
(a) Are the proportions of graduate
Page 141 and 142:
category 2, and so on, then let ⎡
Page 143 and 144:
(iii) Likelihood ratio chi-squared
Page 145 and 146:
Women: preference for air condition
Page 147 and 148:
Table 8.3 Results from fitting the
Page 149 and 150:
π 1 π 2 π 3 π 4 C1 C2 C3 Figure
Page 151 and 152:
The adjacent category logit model i
Page 153 and 154:
Table 8.4 Results of proportional o
Page 155 and 156:
(c) Use a Wald statistic to test th
Page 157 and 158:
variables. The study design may mea
Page 159 and 160:
series expansion given in Section 7
Page 161 and 162:
for smokers and zero for non-smoker
Page 163 and 164:
column totals. It appears that Hutc
Page 165 and 166:
similar to the ulcer patients with
Page 167 and 168:
� k θ.k =1. This hypothesis can
Page 169 and 170:
9.6 Inference for log-linear models
Page 171 and 172:
Table 9.10 Log-linear models for th
Page 173 and 174:
9.9 Exercises 9.1 Let Y1, ..., YN b
Page 175 and 176:
Table 9.14 Contingency table with 2
Page 177 and 178:
1 2 D 3 A D 4 L 5 D D TO TL TC time
Page 179 and 180:
ofthe distribution. The median surv
Page 181 and 182:
10.2.3 Weibull distribution Another
Page 183 and 184:
Table 10.1 Remission times of leuke
Page 185 and 186:
log H(y) 1 0 -1 -2 0 1 2 3 log (y)
Page 187 and 188:
As there are r uncensored observati
Page 189 and 190:
small number ofcategorical explanat
Page 191 and 192:
Cox Snell residuals 3 2 1 0 Devianc
Page 193 and 194:
is sometimes used for modelling sur
Page 195 and 196:
11 Clustered and Longitudinal Data
Page 197 and 198:
data from the stroke example in Sec
Page 199 and 200:
score 100 80 60 40 20 0 2 4 6 8 wee
Page 201 and 202:
Table11.3 Results of naive analyses
Page 203 and 204:
Table 11.6 Analysis of variance of
Page 205 and 206:
1. All the off-diagonal elements ar
Page 207 and 208:
These are also called the quasi-sco
Page 209 and 210:
andom effect.This is an example ofa
Page 211 and 212:
Table 11.7 Comparison of analyses o
Page 213 and 214:
Table 11.8 Measurements of left ven
Page 215 and 216:
Table 11.9 Numbers of ears clear of
Page 217 and 218:
References Aitkin, M., Anderson, D.
Page 219 and 220:
Diggle, P. J., Liang, K.-Y. and Zeg
Page 221:
Roberts, G., Martyn, A. L., Dobson,
show all

an introduction to generalized linear models - GDM@FUDAN ...

Create successful ePaper yourself

Delete template?

Save as template?