Controlling for confounding in time series studies of air ... - IMAGe

image.ucar.edu
  • No tags were found...

Controlling for confounding in time series studies of air ... - IMAGe

Outline• The Social, Political, and RegulatoryContext• Model Choice in Time Series Studies– Adjustment for Confounding bias– Model Uncertainty– Model Averaging• Time Series and Cohort Studies• EStimating Chronic and Acute PollutionEffects Study (ESCAPES): a new Spatio-Temporal epidemiological design


Time series and cohort studies• Time Series studiesestimate associationbetween probability ofdeath and the level ofair pollution shortlybefore death (shortertermeffects)• Exposure: day-to-dayvariations in air pollution• temporal variability isused to estimate acuteeffects associated withshorter-term exposure• Cohort studies estimateassociation betweentime-to-death andexposure to air pollutionin a lifetime (longertermeffects)• Exposure: city-to-citylong-term average• spatial variability isused to estimatechronic effectsassociated with longertermexposure


Social, Political, and RegulatoryContext• EPA’s process for review of the NationalAmbient Air Quality Standard (NAAQS)creates a sensitive political context• Approx $100 million annually are spent inthe United States alone to addressuncertainties in the understanding of thehealth effects of particulate matter• Several expert committees are createdsuch as: Clean Air Act Scientific AdvisoryBoard (CASAC) of the EPA, committees ofNational Academy of Sciences, WorldHealth Organization and many others


Science to reduce uncertainty


The National MortalityMorbidity Air Pollution Study


The National Morbidity MortalityAir Pollution Study (NMMAPS)NMMAPS is a multi-site time series studyassessing short-term effects of airpollution on mortality/morbiditycomprising:1. a national data base of air pollution andmortality;2. statistical methods for estimatingassociations between air pollution andmortality for the 90 largest US cities, andon average for the entire nation.


Daily time series of air pollution, mortalityand weather in Baltimore 1987-1994


90 Largest Locations in the USA


Statistical Methods• Within city. Semi-parametric regressionsfor estimating associations between dayto-dayvariations in air pollution andmortality controlling for confoundingfactors• Across cities. Hierarchical Models forestimating:– national-average relative rate– national-average exposure-responserelationship– exploring heterogeneity of air pollution effectsacross the country


Confounding• The association between air pollutionand mortality is potentially confoundedby:– Weather: mortality is higher at low and hightemperature– Seasonality: mortality generally peaks inwinter because of influenza epidemics– Long-term trend: improvement in medicalpractice lower mortality over time• All these phenomena cannot beattributed to air pollution


Estimating city-specific relative rates


Estimating a national-average relative rateDominici, Zeger, Samet RSSA 2000Samet, Dominici, Zeger et al. NEJM 2000


Maximum likelihood and Bayesianestimates of air pollution effectsUse only city-specific informationBorrow strength across cities


National-average estimates for Cvdresp,Total and Other causes mortalityDominici McDermott Zeger Samet 2002 EHP


Software sensitivity versusmodel uncertainty• We discovered a sensitivity of the NMMAPSnational average estimate to the defaultparameters in the Splus GAM software• This captured the attention of the industryand the press• However, in the environmentalepidemiological community not enoughattention is generally paid to the sensitivity oftime series results to model choice


NMMAPS in ScienceJuly 2002“(A)lthough manyquestions remainabout how fineparticles killpeople, the HEIstudy showsthere’s nomistaking that PMis the culprit…?


Science June 10 2002GAMdefGAM-strict


The Devil is in the Details!October 31 2002Revised GAM software with asymptotically exact standarderrors and stricter convergence parameters posted on myweb site


GAM consequences to regulatoryprocess• Revision of the NAAQS have been delayedfor one year at least• Initiated a major effort in a peer-reviewed reanalysisof all major time series studies in theworld that used GAM• Conclusion of the HEI Special Review Panelwas: These revised analyses have renewedthe awareness of the uncertainties present inthe estimates of short-term air pollutioneffects…


Model Uncertainty:Selecting Degrees of Freedomto Reduce Confounding Bias


Uncertainty in Model Choice• How to reduce confounding bias in theestimated pollution effect is among the mostdiscussed statistical issues in time seriesanalyses of pollution and health• In particular, the choice of the number ofdegrees of freedom in the smooth function oftime is critical• This choice determines the residual temporalvariability in the daily deaths and pollutionlevels used to estimate the pollution effect


Measured and UnmeasuredConfounding• Measured confounders: time-varying covariates thatare associated with pollution and mortality, as forexample weather variables• Unmeasured confounders: other time-varyingfactors, such us influenza epidemics and trends insurvival that affect mortality and are temporallyassociated with variations in air pollution• Goal: estimate associations between day-to-dayvariations in air pollution and day-to-day variations inmortality taking into account measured andunmeasured confounders


Pittsburgh NMMAPS dataAdjusting for unmeasured confoundingPM10 (1987-88)Mortality (1987-88)df = 4ydf = 4ydf = 16ydf = 16y


Relative rates estimates as function of the degree ofadjustment for confounding factors(Pittsburgh 1987-1994)GAMGLMaαα multiplies the NMMAPS default choice of 8 df per yearSo α=2 implies that d = 16 df per year


Choosing the number of df inthe smooth function of time• Choosing too small a df– Over-smoothing– Leaves temporal cycles in the residual in theresidual– Confounding bias might occur• Choosing too large a df– under-smoothing– Removes all temporal variability in residuals– Wash out the pollution effect


Assessing Model Uncertaintywith calibration• Take a baseline choice for df in thesmooth functions of time, temperature,dew-point temperature..• Multiply these df for a parameter α• Fit the NMMAPS hierarchical model for20 values of α between 1/3 and 3• Report the NMMAPS national averageestimate as function of α


Dominici McDermott Hastie Technical Report (2003)


National average estimates versus degrees ofadjustment for confoundersNational-average estimatesBayesian city-specific estimates


Regulatory Policy amidstmodel uncertainty• EPA needs a single number for the entire country• NMMAPS national-average estimate used formortality impact estimates such:– the excess number of deaths attributable toshort-term exposure to air pollution• Looking for an important evidence for regulatoryassessment and environmental policy• To meet environmental policy expectations inpresence of statistical uncertainty is difficult


Model Uncertainty• There is no gold standard for estimating anoptimal number of degrees of freedom thatremove confounding bias• Because of the complexity and non-verifiableidentifiability inherent in selecting anappropriate model to adjust for unmeasuredconfounders, account for model uncertainty isa necessary step in air pollution riskestimation


Model Uncertainty and ModelChoice• Our strategy for assessing model uncertaintytreats measured and unmeasuredconfounders differently• We approach adjustment for measuredconfounders as model selection• We approach adjustment for unmeasuredconfounders as assessment of modeluncertainty in the context of prior opinions


Bayesian Model Averaging• BMA as a natural approach for adjusting forunmeasured confounders:– Easy to incorporate prior information about cyclicvariations in pollution and mortality whereconfounding is more likely to occur– Small number of competing models– Easy to implement– Produce a policy-relevant air pollution effectsestimate that takes into account model uncertaintyand biological knowledge on confounding


Dominici Louis Parmigiani Samet (work in progress)Clyde (1999): Previous work on BMA in time series studiesof air pollution


BMA with respect to confoundingadjustmentPrior Model ProbabilitiesBMA Pooled estimates


Is BMA a good idea?• Accounting for model uncertainty is of centralimportance when the goal is to estimate apolicy relevant quantity (such as a nationalaverage pollution effect) which can besensitive to model choice• Properly informed policy decisions shoulddepend on the point estimate and on modelselectionuncertainties• BMA as a general approach for removingconfounding bias in environmentalepidemiology


Epidemiological findings and numberof deaths attributable to air pollution• Time series and cohort studies providemortality risks estimates from shorter-termand longer-term air pollution exposure• Currently, regulators rely upon estimates ofpopulation burden of illness and prematuredeaths to set pollution standardsHow should we use these two relative riskestimates to determine the attributablenumber of deaths?


Need to reconcile estimates fromtime-series and cohort studiesLong term effect (ACS)Short term effect (NMMAPS)


EStimating Chronic AcutePollution Effects Study(ESCAPES)an ongoing project atJohns HopkinsBloomberg School of Public Health


• ESCAPES is a joint time-series andcohort study aimed at estimatinghealth effects associated with acuteand chronic air pollution exposurefor the National Medicare Cohort


ESCAPES: Data Sources• National Medicare Cohort: 1999-2001 followup of approx 40 million people for whom wehave:– Zip code of residence– time-of-entry into the cohort– time-of-event of hospitalization and deaths– Morbidity history– Socio-demographic covariates• National Air Pollution Monitoring Network:1999-2001 daily time series of:– Several pollutants, including PM2.5 for over athousand monitoring stations– Weather variables


Number of People enrolled in Medicare by County in 2000


National Medicare Cohort andNational Air Pollution MonitoringNetworkThese two data sources can be linked by:1. Counties: 327 already identified2. Zip codes and air pollutionmonitoring stations:• More than 1,000 monitoring stationsare currently available


Summary Statistics:ESCAPES Data: 2000 onlyAll Counties in USANumber ofPeopleenrolled inMedicare41 millionApproximateNumber ofDeaths2 millionCounties with PM2.5data18 million800 thousand


ESCAPES (1999-2001)Exposures:• daily time series and yearly averages for PM2.5,PM10, other pollutants, and weatherOutcomes:• diagnosis for any morbidity indicator• death for any cause-specific mortalityConfounders:• individual level socio-economic variables• individual level medical history• individual level risk factors (smoking included)for a 20% sample of the Medicare cohort• location-specific characteristics (data from USCensus)


Number of counties with PM2.5 and numberof deaths in the Medicare population90 counties with PM2.5 dataavailable 1 every 3 days. In these90 counties there are 4.5 millionpeople enrolled in Medicare and209,000 deaths


327 US counties with individual level data and daily airpollution levels for 2000


Model Uncertainty inEnvironmental Policy• Air pollution effect estimates are sensitive toconfounding adjustment and model choice• Estimation of policy-relevant quantities should relyupon the integration of– Good-ness of fit measures– Prior knowledge– Assessment of model uncertainty• NMMAPS has provided a national-average estimateof the short term effect of particulate matter onmortality, a key piece of evidence for regulatorypolicy• ESCAPES is a new epidemiological design aimed atjointly estimating health effects associated with shorttermand long-term exposure to fine particles andother pollutants


NMMAPS on the web


Papers, NMMAPS data, andsoftware posted on the web• http://www.biostat.jhsph.edu/~fdominic• IHAPPS (Internet-based Health Air PollutionSurveillance System, Dr Aidan McDermott)• http://www.ihapss.jhsph.edu/• Environmental Biostatistics Working Group(EBWG)• http://www.ihapss.jhsph.edu/~Ebwg

More magazines by this user
Similar magazines