The np Package - NexTag Supports Open Source Initiatives

More documents

Recommendations

Info

8 npnpNonparametric Kernel Smoothing Methods for Mixed DatatypesDescriptionThis package provides a variety of nonparametric and semiparametric kernel methods that seamlesslyhandle a mix of continuous, unordered, and ordered factor datatypes (unordered and orderedfactors are often referred to as ‘nominal’ and ‘ordinal’ categorical variables respectively).Bandwidth selection is a key aspect of sound nonparametric and semiparametric kernel estimation.np is designed from the ground up to make bandwidth selection the focus of attention. To this end,one typically begins by creating a ‘bandwidth object’ which embodies all aspects of the method,including specific kernel functions, data names, datatypes, and the like. One then passes thesebandwidth objects to other functions, and those functions can grab the specifics from the bandwidthobject thereby removing potential inconsistencies and unnecessary repetition.There are two ways in which you can interact with functions in np, either i) using dataframes, or ii)using a formula interface, where appropriate.To some, it may be natural to use the dataframe interface. The R data.frame function preservesa variable’s type once it has been cast (unlike cbind, which we avoid for this reason). If you findthis most natural for your project, you first create a dataframe casting data according to their type(i.e., one of continuous (default), factor, ordered) Then you would simply pass this dataframeto the appropriate np function, for example npudensbw(dat=data).To others, however, it may be natural to use the formula interface that is used for the regression examples,among others. For nonparametric regression functions such as npreg, you would proceedas you would using lm (e.g., bw
np 9A variety of bandwidth methods are implemented including fixed, nearest-neighbor, and adaptivenearest-neighbor.A variety of data-driven methods of bandwidth selection are implemented, while the user can specifytheir own bandwidths should they so choose (either a raw bandwidth or scaling factor).A flexible plotting utility, npplot, facilitates graphing of multivariate objects. An example forcreating postscript graphs using the npplot utility and pulling this into a LaTeX document isprovided.The function npksum allows users to create or implement their own kernel estimators or testsshould they so desire.The underlying functions are written in C for computational efficiency. Despite this, due to theirnature, data-driven bandwidth selection methods involving multivariate numerical search can betime-consuming, particularly for large datasets. A version of this package using the Rmpi wrapperis under development that allows one to deploy this software in a clustered computing environmentto facilitate computation involving large datasets.To cite the np package, type citation("np") from within R for details.DetailsThe kernel methods in np employ the so-called ‘generalized product kernels’ found in Hall, Racine,and Li (2004), Li and Racine (2003), Li and Racine (2004), Li and Racine (2007), Ouyang, Li, andRacine (2006), and Racine and Li (2004), among others. For details on a particular method, kindlyrefer to the original references listed above.We briefly describe the particulars of various univariate kernels used to generate the generalizedproduct kernels that underlie the kernel estimators implemented in the np package. In a nutshell,the generalized kernel functions that underlie the kernel estimators in np are formed by taking theproduct of univariate kernels such as those listed below. When you cast your data as a particular type(continuous, factor, or ordered factor) in a data frame or formula, the routines will automaticallyrecognize the type of variable being modelled and use the appropriate kernel type for each variablein the resulting estimator.Second Order Gaussian (x is continuous) k(z) = exp(−z 2 /2)/ √ 2π, where z = (x i − x)/h,and h > 0.Second Order Epanechnikov (x is continuous) k(z) = 3 ( 1 − z 2 /5 ) /(4 √ 5) if z 2 < 5, 0 otherwise,where z = (x i − x)/h, and h > 0.Uniform (x is continuous) k(z) = 1/2 if |z| < 1, 0 otherwise, where z = (x i − x)/h, and h > 0.Aitchison and Aitken (x is a (discrete) factor) l(x i , x, λ) = 1 − λ if x i = x, and λ/(c − 1) ifx i ≠ x, where c is the number of (discrete) outcomes assumed by the factor x. Note that λmust lie between 0 and (c − 1)/c.Wang and van Ryzin (x is a (discrete) ordered factor) l(x i , x, λ) = 1 − λ if |x i − x| = 0, and(1 − λ)λ |xi−x| /2 if |x i − x| ≥ 1. Note that λ must lie between 0 and 1.Li and Racine (x is a (discrete) factor) l(x i , x, λ) = 1 if x i = x, and λ if x i ≠ x. Note that λmust lie between 0 and 1.Li and Racine (x is a (discrete) ordered factor) l(x i , x, λ) = 1 if |x i − x| = 0, and λ |xi−x| if|x i − x| ≥ 1. Note that λ must lie between 0 and 1.
Page 1 and 2: The np PackageFebruary 16, 2008Vers
Page 3 and 4: Italy 3Examplesdata("cps71")attach(
Page 5 and 6: wage1 5Examplesdata("oecdpanel")att
Page 7: gradients 7## S3 method for class '
Page 11 and 12: npcmstest 11npcmstestKernel Consist
Page 13 and 14: npcmstest 13ReferencesAitchison, J.
Page 15 and 16: npcdens 15npcdensKernel Conditional
Page 17 and 18: npcdens 17Valuenpcdens returns a co
Page 19 and 20: npcdens 19# Gaussian kernel (defaul
Page 21 and 22: npcdens 21# (1993) (see their descr
Page 23 and 24: npcdensbw 23fit
Page 25 and 26: npcdensbw 25na.actionxdatydatbwspro
Page 27 and 28: npcdensbw 27data. The approach is b
Page 29 and 30: npconmode 29# depending on the spee
Page 31 and 32: npconmode 31tydatexdateydata one (1
Page 33 and 34: npconmode 33lwt,family=binomial(lin
Page 35 and 36: npudens 35npudensKernel Density and
Page 37 and 38: npudens 37Author(s)Tristen Hayfield
Page 39 and 40: npudens 39# EXAMPLE 1 (INTERFACE=DA
Page 41 and 42: npudensbw 41library("datasets")data
Page 43 and 44: npudensbw 43bwsa bandwidth specific
Page 45 and 46: npudensbw 45fvalobjective function
Page 47 and 48: npudensbw 47# previous examples.bw
Page 49 and 50: npudensbw 49# previous examples.bw
Page 51 and 52: npksum 51Argumentsformuladatanewdat
Page 53 and 54: npksum 53Usage IssuesIf you are usi
Page 55 and 56: npksum 55# the bandwidth object its
Page 57 and 58: npksum 57ss
Page 59 and 60:
npplot 59plot.behavior = c("plot","
Page 61 and 62:
npplot 61xtrim = 0.0,neval = 50,com
Page 63 and 64:
npplot 63xdatydatzdatxqyqzqxtrimytr
Page 65 and 66:
npplot 65DetailsValuenpplot is a ge
Page 67 and 68:
npplot 67year.seq
Page 69 and 70:
npplot 69# npplot(). When npplot()
Page 71 and 72:
npplreg 71## S3 method for class 'c
Page 73 and 74:
npplreg 73residR2MSEMAEMAPECORRSIGN
Page 75 and 76:
npplreg 75# Plot the regression sur
Page 77 and 78:
npplregbw 77and dependent data), an
Page 79 and 80:
npplregbw 79Detailsnpplregbw implem
Page 81 and 82:
npplregbw 81x2
Page 83 and 84:
npqcmstest 83npqcmstestKernel Consi
Page 85 and 86:
npqcmstest 85Author(s)Tristen Hayfi
Page 87 and 88:
npqreg 87ftol = 1.19209e-07,tol = 1
Page 89 and 90:
npqreg 89Li, Q. and J.S. Racine (20
Page 91 and 92:
npreg 91Usagenpreg(bws, ...)## S3 m
Page 93 and 94:
npreg 93residR2MSEMAEMAPECORRSIGNif
Page 95 and 96:
npreg 95summary(model)# Use npplot(
Page 97 and 98:
npreg 97# - this may take a few min
Page 99 and 100:
npreg 99# then a noisy samplen
Page 101 and 102:
npregbw 101## S3 method for class '
Page 103 and 104:
npregbw 103ckerorderukertypeokertyp
Page 105 and 106:
npregbw 105ReferencesAitchison, J.
Page 107 and 108:
npsigtest 107bw
Page 109 and 110:
npsigtest 109Author(s)Tristen Hayfi
Page 111 and 112:
npindex 111Usagenpindex(bws, ...)##
Page 113 and 114:
npindex 113MAEMAPECORRSIGNif method
Page 115 and 116:
npindex 115x2
Page 117 and 118:
npindex 117# x1 is chi-squared havi
Page 119 and 120:
npindexbw 119# plotting via persp()
Page 121 and 122:
npindexbw 121methodnmultithe single
Page 123 and 124:
npindexbw 123allows one to deploy t
Page 125 and 126:
npscoef 125x1
Page 127 and 128:
npscoef 127Valueeydatezdaterrorsres
Page 129 and 130:
npscoef 129# We could manually plot
Page 131 and 132:
npscoefbw 131optim.abstol,optim.max
Page 133 and 134:
npscoefbw 133optim.maxattemptsmaxim
Page 135 and 136:
npscoefbw 135ReferencesAitchison, J
Page 137 and 138:
uocquantile 137uocquantileCompute Q
Page 139:
INDEX 139npconmode, 29npindex, 110n
show all

The np Package - NexTag Supports Open Source Initiatives

You also want an ePaper? Increase the reach of your titles

Delete template?

Save as template?