10.07.2015 Views

The np Package - NexTag Supports Open Source Initiatives

The np Package - NexTag Supports Open Source Initiatives

The np Package - NexTag Supports Open Source Initiatives

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

<strong>np</strong>regbw 103ckerorderukertypeokertypenmultireminitmaxftoltolsmallnumeric value specifying kernel order (one of (2,4,6,8)). Kernel order specifiedalong with a uniform continuous kernel type will be ignored. Defaults to2.character string used to specify the unordered categorical kernel type. Can beset as aitchisonaitken or liracine. Defaults to aitchisonaitken.character string used to specify the ordered categorical kernel type. Can be setas wangvanryzin or liracine. Defaults to wangvanryzin.integer number of times to restart the process of finding extrema of the crossvalidationfunction from different (random) initial points. Defaults to min(5,ncol(xdat)).a logical value which when set as TRUE the search routine restarts from locatedminima for a minor gain in accuracy. Defaults to TRUE.integer number of iterations before failure in the numerical optimization routine.Defaults to 10000.tolerance on the value of the cross-validation function evaluated at located minima.Defaults to 1.19e-07 (FLT_EPSILON).tolerance on the position of located minima of the cross-validation function.Defaults to 1.49e-08 (sqrt(DBL_EPSILON)).a small number, at about the precision of the data type used. Defaults to 2.22e-16 (DBL_EPSILON).Details<strong>np</strong>regbw implements a variety of methods for choosing bandwidths for multivariate (p-variate)regression data defined over a set of possibly continuous and/or discrete (unordered, ordered) data.<strong>The</strong> approach is based on Li and Racine (2003) who employ ‘generalized product kernels’ thatadmit a mix of continuous and discrete datatypes.<strong>The</strong> cross-validation methods employ multivariate numerical search algorithms (direction set (Powell’s)methods in multidimensions).Bandwidths can (and will) differ for each variable which is, of course, desirable.Three classes of kernel estimators for the continuous datatypes are available: fixed, adaptive nearestneighbor,and generalized nearest-neighbor. Adaptive nearest-neighbor bandwidths change witheach sample realization in the set, x i , when estimating the density at the point x. Generalizednearest-neighbor bandwidths change with the point at which the density is estimated, x. Fixedbandwidths are constant over the support of x.<strong>np</strong>regbw may be invoked either with a formula-like symbolic description of variables on whichbandwidth selection is to be performed or through a simpler interface whereby data is passed directlyto the function via the xdat and ydat parameters. Use of these two interfaces is mutuallyexclusive.Data contained in the data frame xdat may be a mix of continuous (default), unordered discrete(to be specified in the data frame xdat using factor), and ordered discrete (to be specified in thedata frame xdat using ordered). Data can be entered in an arbitrary order and data types will bedetected automatically by the routine (see <strong>np</strong> for details).Data for which bandwidths are to be estimated may be specified symbolically. A typical descriptionhas the form dependent data ~ explanatory data, where dependent data is

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!