ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R

About

We introduce the C++ application and R package ranger. The software is a fast implementation of random forests for high dimensional data. Ensembles of classification, regression and survival trees are supported. We describe the implementation, provide examples, validate the package with a reference implementation, and compare runtime and memory usage with other implementations. The new software proves to scale best with the number of features, samples, trees, and features tried for splitting. Finally, we show that ranger is the fastest and most memory efficient implementation of random forests to analyze data on the scale of a genome-wide association study.

Marvin N. Wright, Andreas Ziegler • 2015

Related benchmarks

Task        Dataset                   Result                                            Rank
Regression  Infrared Therm. Temp.     Mean Relative Improvement over Lasso (%): 6       5
Regression  liver-disorders           Mean Relative Improvement (%): 370                5
Regression  Real Estate Valuation     Mean Relative Improvement over Lasso (%): 29.9    5
Regression  Auto MPG                  Mean Relative Improvement (%): 26.4               5
Regression  automobile                Mean Relative Improvement over Lasso: 33          5
Regression  Concrete Comp. Strength   Mean Relative Improvement (%): 64.1               5
Regression  Facebook Metrics          Mean Relative Improvement (Lasso): 93.4           5
Regression  Servo                     Mean Relative Improvement (%): 42.2               5
Regression  Airfoil Self-Noise        Mean Relative Improvement (%): 69.3               5
Regression  Forest Fires              Mean Rel. Improvement over Lasso (%): -2.11e+3    5
Showing 10 of 17 rows
