10.07.2015 Views

The np Package - NexTag Supports Open Source Initiatives


npindexbw

Value

npindexbw may be invoked either with a formula-like symbolic description of variables on which bandwidth selection is to be performed or through a simpler interface whereby data is passed directly to the function via the xdat and ydat parameters. Use of these two interfaces is mutually exclusive.

Note that, unlike most other bandwidth methods in the np package, this implementation uses the R optim nonlinear minimization routines and npksum. We have implemented multistarting and strongly encourage its use in practice. For exploratory purposes, you may wish to override the default search tolerances (say, by setting optim.reltol=.1) and conduct multistarting (the default is to restart min(5, ncol(xdat)) times), as is done for a number of examples.

Data for which bandwidths are to be estimated may be specified symbolically. A typical description has the form dependent data ~ explanatory data, where dependent data is a univariate response, and explanatory data is a series of variables specified by name, separated by the separation character '+'.
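To make the two interfaces concrete, here is a minimal sketch that calls npindexbw once through each of them. The simulated data, the variable names y1, x1 and x2, and the choice nmulti = 2 are illustrative assumptions rather than anything prescribed by the manual; the loosened optim.reltol is the exploratory setting mentioned above.

```r
# Simulated data. The names y1, x1, x2 and the data-generating process are
# assumptions made for illustration only.
set.seed(42)
n   <- 100
dat <- data.frame(x1 = runif(n, -1, 1), x2 = runif(n, -1, 1))
dat$y1 <- dat$x1 - dat$x2 + rnorm(n, sd = 0.25)

# The np calls run only if the package is installed.
if (requireNamespace("np", quietly = TRUE)) {
  library(np)

  # Interface 1: formula-like symbolic description of the variables.
  bw.formula <- npindexbw(formula = y1 ~ x1 + x2, data = dat,
                          optim.reltol = 0.1, nmulti = 2)

  # Interface 2: data passed directly via xdat and ydat.
  # Use one interface or the other in a given call, never both.
  bw.data <- npindexbw(xdat = dat[, c("x1", "x2")], ydat = dat$y1,
                       optim.reltol = 0.1, nmulti = 2)

  summary(bw.formula)
}
```

Both calls describe the same model, so the choice between them is mostly one of convenience: the formula interface keeps variable names attached to the result, while xdat/ydat avoids building a formula when the data are already split into regressors and response.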
For example, y1 ~ x1 + x2 specifies that the bandwidth object for the regression of response y1 on semiparametric regressors x1 and x2 is to be estimated. See below for further examples.

npindexbw returns a sibandwidth object, with the following components:

bw      bandwidth(s), scale factor(s) or nearest neighbours for the data, xdat
beta    coefficients of the model
fval    objective function value at minimum

If bwtype is set to fixed, an object containing a scalar bandwidth for the function G(Xβ) and an estimate of the parameter vector β is returned.

If bwtype is set to generalized_nn or adaptive_nn, then instead the scalar kth nearest neighbor is returned.

The functions coef, predict, summary, and plot support objects of this class.

Usage Issues

If you are using data of mixed types, then it is advisable to use the data.frame function to construct your input data and not cbind, since cbind will typically not work as intended on mixed data types and will coerce the data to the same type.

Caution: multivariate data-driven bandwidth selection methods are, by their nature, computationally intensive. Virtually all methods require dropping the ith observation from the data set, computing an object, repeating this for all observations in the sample, then averaging each of these leave-one-out estimates for a given value of the bandwidth vector, and only then repeating this a large number of times in order to conduct multivariate numerical minimization/maximization. Furthermore, due to the potential for local minima/maxima, restarting this procedure a large number of times may often be necessary. This can be frustrating for users possessing large datasets. For exploratory purposes, you may wish to override the default search tolerances (say, by setting optim.reltol=.1) and conduct multistarting (the default is to restart min(5, ncol(xdat)) times).
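Two of the usage points in this section lend themselves to a short sketch: the coercion that cbind performs on mixed data, and a coarse exploratory search whose result seeds a second search at default tolerances through the bws argument. All data and object names below (x.num, x.fac, X, y, bw.coarse, bw.fine) are hypothetical, and the second part assumes that passing bws= behaves as this manual page describes.

```r
# Part 1 (base R only): cbind() coerces mixed types, data.frame() does not.
x.num <- c(1.5, 2.0, 3.25)
x.fac <- factor(c("a", "b", "a"))

m <- cbind(x.num, x.fac)       # numeric matrix: the factor becomes integer codes
d <- data.frame(x.num, x.fac)  # each column keeps its own class

class(m[, "x.fac"])            # "numeric"
class(d$x.fac)                 # "factor"

# Part 2: coarse search first, then a refined search started from its result.
# The simulated design is an assumption made for illustration.
set.seed(42)
n <- 100
X <- data.frame(x1 = runif(n, -1, 1), x2 = runif(n, -1, 1))
y <- X$x1 - X$x2 + rnorm(n, sd = 0.25)

# The np calls run only if the package is installed.
if (requireNamespace("np", quietly = TRUE)) {
  library(np)

  # Exploratory pass: loosened tolerance, a small number of restarts.
  bw.coarse <- npindexbw(xdat = X, ydat = y, optim.reltol = 0.1, nmulti = 2)

  # Second pass: default tolerances, seeded with the coarse bandwidth object.
  bw.fine <- npindexbw(xdat = X, ydat = y, bws = bw.coarse)

  # Components of the returned sibandwidth object named in the text.
  print(bw.fine$bw)    # bandwidth
  print(bw.fine$beta)  # coefficients of the model
  print(bw.fine$fval)  # objective function value at minimum
}
```

The two-pass pattern trades some rigor in the first pass for speed; whether the refined search improves on the coarse one depends on the data, so comparing the two fval values is a reasonable sanity check rather than a guarantee.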
Once the procedure terminates, you can restart the search with default tolerances using those bandwidths obtained from the less rigorous search (i.e., set bws=bw on subsequent calls to this routine, where bw is the initial bandwidth object). A version of this package using the Rmpi wrapper is under development that
