16.04.2014

Embedding R in Windows applications, and executing R remotely


An Adventure in randomForest – A Set of Machine Learning Tools

Andy Liaw

Biometrics Research, Merck Research Laboratories

Rahway, New Jersey, USA

Abstract:

Random Forest is a relatively new method in machine learning. It is built from an ensemble of classification or regression trees that are grown with some element of randomness. While algorithmically it is a relatively simple modification of bagging, conceptually it is quite a radical step forward. The algorithm has been shown to have desirable properties, such as convergence of generalization errors (Breiman, 2001). Empirically, it has also been demonstrated to have performance competitive with the current leading algorithms such as support vector machines and boosting. We will briefly introduce the algorithm and intuitions on why it works, as well as the extra features implemented in the R package randomForest, such as variable importance and proximity measures. Some examples of application in drug discovery will also be discussed.
