11.07.2015 Views

2DkcTXceO


542 NP-hard inference

In the strictest sense, bias results whenever real information remains in the residual variation (e.g., the $\epsilon$ term in the linear model). However, statisticians have chosen to further categorize bias in this strict sense depending on whether it occurs above or below/at the resolution level $r$. When the information in the residual variation resides in resolutions higher than $r$, we use the term “variance” for the price of failing to include that information. When the residual information resides in resolutions lower than or at $r$, we keep the designation “bias.” This categorization, just as the mathematician’s $O$ notation, serves many useful purposes, but we should not forget that it is ultimately artificial.

This point is most clear when we apply (45.1) in a telescopic fashion (by first making $s = r + 1$ and then summing over $r$) and when $R = \infty$:

$$\sigma^2_r = E(\sigma^2_\infty \mid F_r) + \sum_{i=r}^{\infty} E\{(\mu_{i+1} - \mu_i)^2 \mid F_r\}. \tag{45.2}$$

The use of $R = \infty$ is a mathematical idealization of the situations where our specifications can go on indefinitely, such as with individualized medicine, where we have height, weight, age, gender, race, education, habit, all sorts of medical test results, family history, genetic compositions, environmental factors, etc. That is, we switch from the hopeless $n = 1$ (i.e., a single individual) case to the hopeful $R = \infty$ scenario. The $\sigma^2_\infty$ term captures the variation of the population at infinite resolution. Whether $\sigma^2_\infty$ should be set to zero or not reflects whether we believe the world is fundamentally stochastic or appears to be stochastic because of our human limitation in learning every mechanism responsible for variations, as captured by $F_\infty$. In that sense $\sigma^2_\infty$ can be viewed as the intrinsic variance with respect to a given filtration. Everything else in the variance at resolution $r$ is merely bias (e.g., from using $\mu_i$ to estimate $\mu_{i+1}$) accumulated at higher resolutions.

45.2.2 Resolution model estimation and selection

When $\sigma^2_\infty = 0$, the infinite-resolution setup essentially is the same as a potential outcome model (Rubin, 2005), because the resulting population is of size one and hence comparisons on treatment effects must be counterfactual. This is exactly the right causal question for individualized treatments: what would be my (health, test) outcome if I receive one treatment versus another?

In order to estimate such an effect, however, we must lower the resolution to a finite and often small degree, making it possible to estimate average treatment effects by averaging over a population that permits some degree of replication. We then hope that the attributes (i.e., predictors) left in the “noise” do not contain enough real signal to alter our quantitative results, relative to what we would obtain with enough data to model those attributes as signals, to a degree that would change our qualitative conclusions, such as choosing one treatment versus another.
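To make the decomposition (45.2) concrete at a finite resolution, the following minimal sketch checks its finite-$R$ analogue numerically on a small simulated population. Everything in it is an illustrative assumption rather than part of the text: the helper names `cond_mean` and `decompose`, the simulated outcomes, and the choice of nested integer cell labels standing in for the filtration $F_0 \subset F_1 \subset \cdots \subset F_R$. Conditional expectations are taken as empirical cell averages.

```python
import numpy as np


def cond_mean(values, labels):
    """Empirical E(values | cell) per individual; cells are integer labels."""
    _, inverse = np.unique(labels, return_inverse=True)
    cell_means = np.bincount(inverse, weights=values) / np.bincount(inverse)
    return cell_means[inverse]


def decompose(y, levels, r):
    """Return (sigma2_r, intrinsic, bias_terms), each evaluated per individual.

    levels[i] labels every individual's cell at resolution i; each level is
    assumed to refine the previous one (a nested, filtration-like structure).
    """
    R = len(levels) - 1
    mu = [cond_mean(y, lab) for lab in levels]          # mu_i = E(Y | F_i)
    sigma2_r = cond_mean((y - mu[r]) ** 2, levels[r])   # left-hand side
    sigma2_R = cond_mean((y - mu[R]) ** 2, levels[R])   # finest-level variance
    intrinsic = cond_mean(sigma2_R, levels[r])          # E(sigma2_R | F_r)
    bias_terms = [
        cond_mean((mu[i + 1] - mu[i]) ** 2, levels[r])  # E{(mu_{i+1}-mu_i)^2 | F_r}
        for i in range(r, R)
    ]
    return sigma2_r, intrinsic, bias_terms


# Hypothetical population: 24 individuals, four nested resolution levels.
rng = np.random.default_rng(0)
n = 24
idx = np.arange(n)
levels = [np.zeros(n, dtype=int), idx // 12, idx // 4, idx // 2]
y = rng.normal(size=n) + 0.5 * (idx // 4)

sigma2_r, intrinsic, bias_terms = decompose(y, levels, r=1)
rhs = intrinsic + sum(bias_terms)
print(np.allclose(sigma2_r, rhs))
```

Because the cells are nested, the cross terms between successive differences $\mu_{i+1} - \mu_i$ average to zero within each cell at level $r$, so the two sides should agree up to floating-point error. Replacing the finest level with one cell per individual (for example `idx` itself) makes the intrinsic term vanish, which mimics the $\sigma^2_\infty = 0$ case discussed above.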
