104 <strong>np</strong>regbwa univariate response, and explanatory data is a series of variables specified by name, separatedby the separation character ’+’. For example, y1 ~ x1 + x2 specifies that the bandwidthsfor the regression of response y1 and no<strong>np</strong>arametric regressors x1 and x2 are to be estimated.See below for further examples.A variety of kernels may be specified by the user. Kernels implemented for continuous datatypesinclude the second, fourth, sixth, and eighth order Gaussian and Epanechnikov kernels, and theuniform kernel. Unordered discrete datatypes use a variation on Aitchison and Aitken’s (1976)kernel, while ordered datatypes use a variation of the Wang and van Ryzin (1981) kernel.Value<strong>np</strong>regbw returns a rbandwidth object, with the following components:bwfvalbandwidth(s), scale factor(s) or nearest neighbours for the data, xdatobjective function value at minimumif bwtype is set to fixed, an object containing bandwidths (or scale factors if bwscaling =TRUE) is returned. If it is set to generalized_nn or adaptive_nn, then instead the kthnearest neighbors are returned for the continuous variables while the discrete kernel bandwidths arereturned for the discrete variables. Bandwidths are stored under the component name bw, with eachelement i corresponding to column i of i<strong>np</strong>ut data xdat.<strong>The</strong> functions predict, summary, and plot support objects of this class.Usage IssuesIf you are using data of mixed types, then it is advisable to use the data.frame function toconstruct your i<strong>np</strong>ut data and not cbind, since cbind will typically not work as intended onmixed data types and will coerce the data to the same type.Caution: multivariate data-driven bandwidth selection methods are, by their nature, computationallyintensive. Virtually all methods require dropping the ith observation from the data set, computingan object, repeating this for all observations in the sample, then averaging each of these leave-oneoutestimates for a given value of the bandwidth vector, and only then repeating this a large numberof times in order to conduct multivariate numerical minimization/maximization. Furthermore, dueto the potential for local minima/maxima, restarting this procedure a large number of times mayoften be necessary. This can be frustrating for users possessing large datasets. For exploratorypurposes, you may wish to override the default search tolerances, say, setting ftol=.01 and tol=.01and conduct multistarting (the default is to restart min(5, ncol(xdat)) times) as is done for a numberof examples. Once the procedure terminates, you can restart search with default tolerances usingthose bandwidths obtained from the less rigorous search (i.e., set bws=bw on subsequent calls tothis routine where bw is the initial bandwidth object). A version of this package using the Rmpiwrapper is under development that allows one to deploy this software in a clustered computingenvironment to facilitate computation involving large datasets.Author(s)Tristen Hayfield 〈hayfield@phys.ethz.ch〉, Jeffrey S. Racine 〈racinej@mcmaster.ca〉
<strong>np</strong>regbw 105ReferencesAitchison, J. and C.G.G. Aitken (1976), “Multivariate binary discrimination by the kernel method,”Biometrika, 63, 413-420.Hall, P. and Q. Li and J.S. Racine (forthcoming), “No<strong>np</strong>arametric estimation of regression functionsin the presence of irrelevant regressors,” <strong>The</strong> Review of Economics and Statistics.Hurvich, C.M. and J.S. Simonoff and C.L. Tsai (1998), “Smoothing parameter selection in no<strong>np</strong>arametricregression using an improved Akaike information criterion,” Journal of the Royal StatisticalSociety B, 60, 271-293.Li, Q. and J.S. Racine (2007), No<strong>np</strong>arametric Econometrics: <strong>The</strong>ory and Practice, Princeton UniversityPress.Li, Q. and J.S. Racine (2004), “Cross-validated local linear no<strong>np</strong>arametric regression,” StatisticaSinica, 14, 485-512.Pagan, A. and A. Ullah (1999), No<strong>np</strong>arametric Econometrics, Cambridge University Press.Racine, J.S. and Q. Li (2004), “No<strong>np</strong>arametric estimation of regression functions with both categoricaland continuous Data,” Journal of Econometrics, 119, 99-130.Wang, M.C. and J. van Ryzin (1981), “A class of smooth estimators for discrete distributions,”Biometrika, 68, 301-309.See Also<strong>np</strong>regExamples# EXAMPLE 1 (INTERFACE=FORMULA): For this example, we compute a# Bivariate no<strong>np</strong>arametric regression estimate for Giovanni Baiocchi's# Italian income panel (see Italy for details)data("Italy")attach(Italy)# Compute the least-squares cross-validated bandwidths for the local# constant estimator (default)bw
- Page 1 and 2:
The np PackageFebruary 16, 2008Vers
- Page 3 and 4:
Italy 3Examplesdata("cps71")attach(
- Page 5 and 6:
wage1 5Examplesdata("oecdpanel")att
- Page 7 and 8:
gradients 7## S3 method for class '
- Page 9 and 10:
np 9A variety of bandwidth methods
- Page 11 and 12:
npcmstest 11npcmstestKernel Consist
- Page 13 and 14:
npcmstest 13ReferencesAitchison, J.
- Page 15 and 16:
npcdens 15npcdensKernel Conditional
- Page 17 and 18:
npcdens 17Valuenpcdens returns a co
- Page 19 and 20:
npcdens 19# Gaussian kernel (defaul
- Page 21 and 22:
npcdens 21# (1993) (see their descr
- Page 23 and 24:
npcdensbw 23fit
- Page 25 and 26:
npcdensbw 25na.actionxdatydatbwspro
- Page 27 and 28:
npcdensbw 27data. The approach is b
- Page 29 and 30:
npconmode 29# depending on the spee
- Page 31 and 32:
npconmode 31tydatexdateydata one (1
- Page 33 and 34:
npconmode 33lwt,family=binomial(lin
- Page 35 and 36:
npudens 35npudensKernel Density and
- Page 37 and 38:
npudens 37Author(s)Tristen Hayfield
- Page 39 and 40:
npudens 39# EXAMPLE 1 (INTERFACE=DA
- Page 41 and 42:
npudensbw 41library("datasets")data
- Page 43 and 44:
npudensbw 43bwsa bandwidth specific
- Page 45 and 46:
npudensbw 45fvalobjective function
- Page 47 and 48:
npudensbw 47# previous examples.bw
- Page 49 and 50:
npudensbw 49# previous examples.bw
- Page 51 and 52:
npksum 51Argumentsformuladatanewdat
- Page 53 and 54: npksum 53Usage IssuesIf you are usi
- Page 55 and 56: npksum 55# the bandwidth object its
- Page 57 and 58: npksum 57ss
- Page 59 and 60: npplot 59plot.behavior = c("plot","
- Page 61 and 62: npplot 61xtrim = 0.0,neval = 50,com
- Page 63 and 64: npplot 63xdatydatzdatxqyqzqxtrimytr
- Page 65 and 66: npplot 65DetailsValuenpplot is a ge
- Page 67 and 68: npplot 67year.seq
- Page 69 and 70: npplot 69# npplot(). When npplot()
- Page 71 and 72: npplreg 71## S3 method for class 'c
- Page 73 and 74: npplreg 73residR2MSEMAEMAPECORRSIGN
- Page 75 and 76: npplreg 75# Plot the regression sur
- Page 77 and 78: npplregbw 77and dependent data), an
- Page 79 and 80: npplregbw 79Detailsnpplregbw implem
- Page 81 and 82: npplregbw 81x2
- Page 83 and 84: npqcmstest 83npqcmstestKernel Consi
- Page 85 and 86: npqcmstest 85Author(s)Tristen Hayfi
- Page 87 and 88: npqreg 87ftol = 1.19209e-07,tol = 1
- Page 89 and 90: npqreg 89Li, Q. and J.S. Racine (20
- Page 91 and 92: npreg 91Usagenpreg(bws, ...)## S3 m
- Page 93 and 94: npreg 93residR2MSEMAEMAPECORRSIGNif
- Page 95 and 96: npreg 95summary(model)# Use npplot(
- Page 97 and 98: npreg 97# - this may take a few min
- Page 99 and 100: npreg 99# then a noisy samplen
- Page 101 and 102: npregbw 101## S3 method for class '
- Page 103: npregbw 103ckerorderukertypeokertyp
- Page 107 and 108: npsigtest 107bw
- Page 109 and 110: npsigtest 109Author(s)Tristen Hayfi
- Page 111 and 112: npindex 111Usagenpindex(bws, ...)##
- Page 113 and 114: npindex 113MAEMAPECORRSIGNif method
- Page 115 and 116: npindex 115x2
- Page 117 and 118: npindex 117# x1 is chi-squared havi
- Page 119 and 120: npindexbw 119# plotting via persp()
- Page 121 and 122: npindexbw 121methodnmultithe single
- Page 123 and 124: npindexbw 123allows one to deploy t
- Page 125 and 126: npscoef 125x1
- Page 127 and 128: npscoef 127Valueeydatezdaterrorsres
- Page 129 and 130: npscoef 129# We could manually plot
- Page 131 and 132: npscoefbw 131optim.abstol,optim.max
- Page 133 and 134: npscoefbw 133optim.maxattemptsmaxim
- Page 135 and 136: npscoefbw 135ReferencesAitchison, J
- Page 137 and 138: uocquantile 137uocquantileCompute Q
- Page 139: INDEX 139npconmode, 29npindex, 110n