The model P_Y(Y | θ, ξ) of (45.7) is typically unknown to the pre-processor. At the very best, the pre-processor may have a working model ˜P_X(X | η), where η may not even live on the same space as θ. Consequently, the pre-processor may produce T(Y) as a (minimal) sufficient statistic with respect to

\[
\tilde P_Y(y \mid \eta, \xi) \;=\; \int P_Y(y \mid x;\, \xi)\, \tilde P_X(x \mid \eta)\, \mu(dx). \tag{45.8}
\]

A natural question, then, is what sufficient and necessary conditions on the pre-processor's working model guarantee that a T(Y) that is (minimally) sufficient for (45.8) is also (minimally) sufficient for (45.7). Or, to use computer science jargon, when is T(Y) a lossless compression (in terms of statistical efficiency)? Evidently, we do not need the multi-phase framework to obtain trivial and useless answers, such as setting T(Y) = Y (which is sufficient for any model of Y alone) or requiring the working model to be the same as the scientific model (which tells us nothing new). The multi-phase framework allows us to formulate and obtain theoretically insightful and practically relevant results that are unavailable in the single-phase framework. For example, in Blocker and Meng (2013), we obtained a non-trivial sufficient condition as well as a necessary condition (but they are not the same) for preserving sufficiency under a more general setting involving multiple (parallel) pre-processors during the pre-processing phase. The sufficient condition is in the same spirit as the condition for consistency of Rubin's variance rule under uncongeniality. That is, in essence, sufficiency under (45.8) implies sufficiency under (45.7) when the working model is more saturated than the scientific model. This is rather intuitive from a multi-phase perspective, because the fewer assumptions we make in earlier phases, the more flexibility the later phases inherit, and consequently, the better the chances that these procedures preserve information or other desirable properties.

There is, however, no free lunch. The more saturated our model is, the less compression it achieves through statistical sufficiency. Therefore, in order to make our results as practically relevant as possible, we must find ways to incorporate computational efficiency into our formulation. Establishing a general theory for balancing statistical and computational efficiency, however, is an extremely challenging problem. The central difficulty is well known: statistical efficiency is an inherent property of a procedure, whereas computational efficiency can vary tremendously across computational architectures and over time.

For necessary conditions, the challenge is of a different kind. Preserving sufficiency is a much weaker requirement than preserving a model, even for minimal sufficiency. For example, N(µ, 1) and Poisson(λ) do not even share the same state space, yet the sample mean is a minimal sufficient statistic for both models (a factorization sketch is given below). Therefore, a pre-processing model could be seriously flawed yet still lead to the best possible pre-processing (this could be viewed as a case of action consistency; see Section 45.5). This type of possibility makes building a multi-phase inference theory both intellectually demanding and intriguing.
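To see concretely why the sample mean can serve both of these otherwise incompatible models, one may check the Fisher–Neyman factorization for an i.i.d. sample y = (y_1, ..., y_n) of fixed, known size n. The following is only a sketch; the factors g and h are notational conveniences introduced here for illustration, not notation from the chapter itself.

% Factorization of the two likelihoods: in each case the parameter-dependent
% factor g involves the data only through the sample mean \bar y.
\begin{align*}
\text{Normal: }\; \prod_{i=1}^{n} \tfrac{1}{\sqrt{2\pi}}\, e^{-(y_i-\mu)^2/2}
  &= \underbrace{e^{\,n\bar y \mu \,-\, n\mu^2/2}}_{g(\bar y;\,\mu)}\;
     \underbrace{(2\pi)^{-n/2}\, e^{-\sum_i y_i^2/2}}_{h(y)},\\[4pt]
\text{Poisson: }\; \prod_{i=1}^{n} \frac{\lambda^{y_i} e^{-\lambda}}{y_i!}
  &= \underbrace{\lambda^{\,n\bar y}\, e^{-n\lambda}}_{g(\bar y;\,\lambda)}\;
     \underbrace{\Big(\textstyle\prod_{i=1}^{n} y_i!\Big)^{-1}}_{h(y)}.
\end{align*}

In both cases the parameter enters the likelihood only through ȳ, and the likelihood ratio between two samples is parameter-free exactly when their sample means agree; hence ȳ is minimal sufficient for each model, even though the two models do not share a state space.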
