23.06.2015 Views

Package 'Hmisc' - R


aregImpute

that only monotonic transformations of left- and right-side variables are allowed, every bootstrap resample will give predicted values of the target variable that are monotonically related to predicted values from every other bootstrap resample. The same is true for Bayesian predicted values. This causes predictive mean matching to always match on the same donor observation.

When the missingness mechanism for a variable is so systematic that the distribution of observed values is truncated, predictive mean matching does not work. It will only yield imputed values that are near observed values, so intervals in which no values are observed will not be populated by imputed values. For this case, the only hope is to make regression assumptions and use extrapolation.
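
The limitation described above can be illustrated with a minimal base R sketch. It is not the algorithm used by aregImpute: it uses a single linear model, no bootstrap, and always takes the closest donor. The simulated data, the truncation point of 2, and all variable names are assumptions made for the illustration.

```r
set.seed(42)
n <- 500
x <- rnorm(n)
y <- 2 * x + rnorm(n)

# Systematic missingness (assumed for illustration): every y above 2 is unobserved
y_obs <- ifelse(y > 2, NA, y)
miss  <- is.na(y_obs)

# Simplified single-imputation predictive mean matching:
# fit on the complete cases, predict for all rows, then borrow the observed y
# of the donor whose predicted value is closest to the recipient's prediction
fit  <- lm(y_obs ~ x)                       # rows with NA are dropped by default
pred <- predict(fit, newdata = data.frame(x = x))
donor_pool <- which(!miss)
donors <- sapply(which(miss), function(i) {
  donor_pool[which.min(abs(pred[donor_pool] - pred[i]))]
})

y_pmm <- y_obs
y_pmm[miss] <- y_obs[donors]

# Every imputed value is an observed value, so none can exceed the truncation point
range(y_pmm[miss])
range(y[miss])   # the true (unobserved) values all lie above 2
```

Because each imputed value is copied from a donor, the interval above the truncation point stays empty no matter how the donor is chosen.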

With type="regression", aregImpute will use linear extrapolation to obtain a (hopefully) reasonable distribution of imputed values. The "regression" option causes aregImpute to impute missing values by adding a random sample of residuals (with replacement if there are more NAs than measured values) on the transformed scale of the target variable. After random residuals are added, predicted random draws are obtained on the original untransformed scale using reverse linear interpolation on the table of original and transformed target values (linear extrapolation when a random residual is large enough to put the random draw prediction outside the range of observed values). The bootstrap is used as with type="pmm" to factor in the uncertainty of the imputation model.
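
The residual-sampling and back-transformation steps described above can be sketched in base R as follows. This is a rough illustration under stated assumptions, not the code inside aregImpute: log() stands in for the estimated transformation of the target, a single lm() fit replaces the bootstrap, and interp_extrap() is a hypothetical helper written for this example.

```r
# Hypothetical helper: linear interpolation inside the range of x and
# linear extrapolation from the two outermost points outside it.
# Assumes at least two distinct x values.
interp_extrap <- function(x, y, xout) {
  o <- order(x); x <- x[o]; y <- y[o]
  k <- length(x)
  out <- approx(x, y, xout = xout, rule = 2)$y
  lo <- xout < x[1]
  hi <- xout > x[k]
  out[lo] <- y[1] + (xout[lo] - x[1]) * (y[2] - y[1]) / (x[2] - x[1])
  out[hi] <- y[k] + (xout[hi] - x[k]) * (y[k] - y[k - 1]) / (x[k] - x[k - 1])
  out
}

set.seed(7)
n <- 300
x <- rnorm(n)
y <- exp(1 + 0.5 * x + 0.3 * rnorm(n))
miss <- y > quantile(y, 0.8)          # assumed: the largest 20% are unobserved

# Table of original and transformed target values for the observed cases
obs <- data.frame(x = x[!miss], y = y[!miss])
obs$t <- log(obs$y)                   # stand-in for the estimated transformation

# Predict on the transformed scale and add randomly sampled residuals,
# sampling with replacement only if there are more NAs than measured values
fit    <- lm(t ~ x, data = obs)
pred_t <- predict(fit, newdata = data.frame(x = x[miss]))
res    <- residuals(fit)
n_mis  <- sum(miss)
draw_t <- pred_t + sample(res, n_mis, replace = n_mis > length(res))

# Reverse interpolation back to the original scale; draws beyond the observed
# range of t are extrapolated linearly instead of being clamped
y_imp <- interp_extrap(obs$t, obs$y, draw_t)
summary(y_imp)
max(obs$y)
```

Unlike donor-based matching, the back-transformed draws are not restricted to values that were actually observed, which is what allows a truncated region to be populated.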

As model uncertainty is high when the transformation of a target variable is unknown, tlinear defaults to TRUE to limit the variance in predicted values when nk is positive.

Value

a list of class "aregImpute" containing the following elements:

call        the function call expression
formula     the formula specified to aregImpute
match       the match argument
fweighted   the fweighted argument
n           total number of observations in input dataset
p           number of variables
na          list of subscripts of observations for which values were
            originally missing
nna         named vector containing the numbers of missing values in the data
type        vector of types of transformations used for each variable
            ("s", "l", "c" for smooth spline, linear, or categorical with
            dummy variables)
tlinear     value of tlinear parameter
nk          number of knots used for smooth transformations
cat.levels  list containing character vectors specifying the levels of
            categorical variables
df          degrees of freedom (number of parameters estimated) for each
            variable
n.impute    number of multiple imputations per missing value
imputed     a list containing matrices of imputed values in the same format
            as those created by transcan. Categorical variables are coded
            using their integer codes. Variables having no missing values
            will have NULL matrices in the list.
x           if x is TRUE, the original data matrix with integer codes for
            categorical variables
rsq         for the last round of imputations, a vector containing the
            R-squares with which each sometimes-missing variable could be
            predicted from the others by ace or avas.
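
As a hedged usage sketch, the elements listed above can be inspected on a fitted object as shown below. It assumes the Hmisc package is installed. The simulated data frame d and its variable names are invented for the example, and the argument values (n.impute = 5, nk = 3) are arbitrary choices.

```r
library(Hmisc)

set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n)
y  <- x1 + x2 + rnorm(n)
x1[sample(n, 30)] <- NA                 # make 30 values of x1 missing
d <- data.frame(x1, x2, y)

f <- aregImpute(~ x1 + x2 + y, data = d, n.impute = 5, nk = 3, pr = FALSE)

f$n.impute          # number of multiple imputations per missing value
f$nna               # named vector of missing-value counts per variable
f$type              # transformation type used for each variable
dim(f$imputed$x1)   # one row per originally missing x1, one column per imputation
f$imputed$x2        # NULL, since x2 has no missing values
f$rsq               # R-squares from the last round of imputations
```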
