09.07.2015 Views

Robust Estimation for Zero-Inflated Poisson Regression - Franklin ...

Robust Estimation for Zero-Inflated Poisson Regression - Franklin ...

Robust Estimation for Zero-Inflated Poisson Regression - Franklin ...

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

6 D. B. Hall and J. Shen Scand J StatistTable 1. Relative efficiencies <strong>for</strong> zero-inflated <strong>Poisson</strong> (ZIP) model parameters <strong>for</strong> differentvalues of the tuning quantile cZIP with γ = 1.0 ZIP with γ = 4.0c γ = 1.0 β 1 = 0 β 2 = 1 β 3 = 0.1 γ = 4.0 β 1 = 0 β 2 = 1 β 3 = 0.1c = 0.005 0.998 0.993 0.996 0.996 1.000 0.995 0.996 0.996c = 0.010 0.997 0.987 0.993 0.993 1.000 0.990 0.993 0.993c = 0.015 0.996 0.985 0.990 0.990 1.000 0.988 0.990 0.990c = 0.020 0.995 0.982 0.988 0.988 1.000 0.986 0.988 0.988c = 0.025 0.995 0.981 0.986 0.986 1.000 0.985 0.986 0.986c = 0.030 0.992 0.967 0.981 0.981 0.999 0.975 0.982 0.981c = 0.040 0.990 0.955 0.972 0.971 0.999 0.965 0.973 0.972c = 0.050 0.988 0.947 0.969 0.968 0.999 0.959 0.969 0.969c = 0.075 0.985 0.930 0.946 0.945 0.999 0.944 0.947 0.946c = 0.100 0.980 0.924 0.938 0.937 0.998 0.940 0.940 0.939particular, we consider ZIP models with non-constant mixing probability p = {1 + exp(−γg)} −1and <strong>Poisson</strong> mean μ = e Bβ . Specifically, we set g = j T 10⊗(−1.5, −1.4, −1.3, −1.2, −1.1, − 1.0,− 0.9, − 0.8, − 0.7, − 0.6) and⎛1 1 1 1 1 0 0 0 0⎞0B T = j T 10 ⊗ ⎝ 0 0 0 0 0 1 1 1 1 1 ⎠.1 2 3 4 5 6 7 8 9 10Parameters γ and β were chosen to yield some overlap between the mixture components andare listed in Table 1. Two values, 1 and 4, were chosen <strong>for</strong> γ, which give values of p in therange (0.18, 0.35) and (0.0025, 0.083), respectively.Based on these models, we report the ratios of asymptotic variances <strong>for</strong> values of c rangingfrom 0.005 to 0.10 in Table 1. Because no high leverage points exist in the data, the ratiosof asymptotic variance corresponding to the mixing probability are close to 1. Ratios <strong>for</strong> theregression parameters decrease as expected with c, but the relative efficiency <strong>for</strong> the RES estimatorsis reasonably high even <strong>for</strong> c = 0.10. In the simulation studies and example of sections4 and 5, we take c = 0.01 <strong>for</strong> the ZIP models we consider. In the simple models we considerhere, this choice leads to relative efficiencies <strong>for</strong> β of 98.7 per cent or better. Of course, therelative efficiency of the RES approach will differ across models, but the results we presenthere give some sense of the efficiency loss that one might expect in practice.3. MHD estimation in ZI regression models3.1. MHD methodThe literature on MHD estimation concentrates primarily on the i.i.d. setting. In that contextthe Hellinger distance is defined as H 2 (f n , f θ ) = ∫ {f n (y) 1/2 − f θ (y) 1/2 } 2 dy where f θ is themarginal density of a response variable Y based on a sample y 1 , ..., y n according to theparametric model under consideration and f n is a corresponding non-parametric density estimate.For discrete data, the natural choice <strong>for</strong> f n is the empirical frequency function defined byf n (y) = N y /n, y ∈ Y, (8)where N y is the number of observations having value y, and Y is the sample space <strong>for</strong> Y.In the context of <strong>Poisson</strong> mixture regression models, Lu et al. (2003) presented MHD estimatorsdefined in terms of the classical Hellinger distance between the empirical frequencyfunction f n (·) and the marginal density <strong>for</strong> Y implied by the model. Because ZIP regressionfalls in the class of models these authors considered, here we treat their approach as an© 2009 Board of the Foundation of the Scandinavian Journal of Statistics.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!