Boruta (algorithm)

Boruta is a feature-selection algorithm used in machine learning. As presented in the original paper, its aim is to find all features that are relevant to the outcome, in contrast to finding a minimal-optimal feature set. Boruta is not a stand-alone algorithm; it is implemented as a wrapper around the random forest classification algorithm. It works iteratively: in each iteration it removes the features that a statistical test shows to be less relevant than what the authors call a random probe. A fundamental component of Boruta is the use of shadow attributes. These are pseudo-features added to the information system, produced by copying existing features from the original data set and shuffling the values of each copy across the samples (data points), which breaks any association between the copy and the response. After generating the shadow attributes, the procedure builds a random forest and compares the Z-scores obtained by the original features with the Z-scores obtained by the shadow attributes. This comparison is the basis on which Boruta decides whether a feature is important.
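The construction of shadow attributes can be illustrated with a minimal sketch. It assumes the data set is held as a dictionary mapping feature names to lists of values; the function name `add_shadow_attributes`, the `shadow_` prefix, and the example columns are hypothetical and are not taken from the Boruta package.

```python
import random


def add_shadow_attributes(columns, seed=0):
    """Return the original columns plus one shuffled copy of each.

    `columns` maps a feature name to a list of values, one per sample.
    The layout and the "shadow_" prefix are assumptions for illustration.
    """
    rng = random.Random(seed)
    extended = dict(columns)  # original features are kept unchanged
    for name, values in columns.items():
        shadow = list(values)  # copy, so the original column is not modified
        rng.shuffle(shadow)    # permute the values across samples
        extended["shadow_" + name] = shadow
    return extended


# Hypothetical toy data: two features measured on four samples.
data = {"height": [1.6, 1.7, 1.8, 1.9], "weight": [55, 70, 82, 90]}
extended = add_shadow_attributes(data)
```

Each shadow column holds the same values as its source feature, only in a different sample order, so whatever importance it receives from the random forest can arise only by chance.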

High-level pseudo-code:

1.  Extend the information system with a copy of every variable (feature)
2.  Shuffle the values in each copied feature to create the shadow attributes
3.  Run random forest on the extended system (original plus shadow features) and gather the Z-scores
4.  Find the MZSA (the maximum Z-score among shadow attributes)
5.  Assign each original feature a hit if its Z-score > MZSA
6.  For each feature of undetermined importance, perform a two-sided test of equality against the MZSA
7.  If the Z-score is significantly lower than the MZSA, drop the feature as unimportant
8.  If the Z-score is significantly higher than the MZSA, keep the feature as important
9.  Remove all shadow attributes
10. Repeat from step 1 until the importance of every feature is determined or the maximum number of random forest runs has been reached
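
The hit counting and the significance decision in the steps above can be sketched as follows. This is a simplified illustration, not the package's implementation: it assumes the Z-scores of every run are already available, models the two-sided test as a binomial test with success probability 0.5 and each tail compared against `alpha / 2`, and omits both the multiple-testing correction and the removal of rejected features between runs. All function names and the example scores are hypothetical.

```python
from math import comb


def count_hits(z_runs, shadow_z_runs):
    """Count, per feature, the runs in which it beat the best shadow attribute."""
    hits = {name: 0 for name in z_runs[0]}
    for z_scores, shadow_scores in zip(z_runs, shadow_z_runs):
        max_shadow_z = max(shadow_scores)  # maximum Z-score among shadow attributes
        for name, score in z_scores.items():
            if score > max_shadow_z:
                hits[name] += 1
    return hits


def upper_tail(k, n):
    """P(X >= k) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


def decide(hit_count, n_runs, alpha=0.01):
    """Classify a feature from its hit count (assumed two-sided binomial test)."""
    if upper_tail(hit_count, n_runs) < alpha / 2:
        return "important"    # hits significantly more often than chance
    # By symmetry of Binomial(n, 0.5), P(X <= k) equals P(X >= n - k).
    if upper_tail(n_runs - hit_count, n_runs) < alpha / 2:
        return "unimportant"  # hits significantly less often than chance
    return "tentative"        # undetermined; more runs would be needed


# Hypothetical Z-scores from 20 runs: "a" always beats the shadows,
# "b" never does, and "c" does so in every second run.
n_runs = 20
z_runs = [{"a": 5.0, "b": 0.5, "c": 3.0 if i % 2 == 0 else 1.0} for i in range(n_runs)]
shadow_z_runs = [[1.0, 2.0] for _ in range(n_runs)]

hits = count_hits(z_runs, shadow_z_runs)
decisions = {name: decide(count, n_runs) for name, count in hits.items()}
```

Features that remain tentative are the ones for which further random forest runs are required, up to the configured maximum.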

External links

Boruta R package description (CRAN)

🪦 Wikipedia History

Age: 1 year

Editors: 1

Edits: 1

Archive Provenance

Created: September 28, 2014

Deleted: February 17, 2016

Article size: 2.6 KB

Technical Metadata

Wikipedia page ID: 43959130

Metadata captured: May 10, 2026 5:59 AM

Metadata updated: May 10, 2026 5:59 AM

Subject Tags

Classification algorithms, Decision trees, Ensemble learning

Why Deleted

AfD

by MBisanz

Articles for deletion/Boruta (algorithm) closed as delete

Sources

cran.r-project.org/...

http://www.jstatsoft.org/v36/i11/paper

Outgoing Wikipedia links (10)

Algorithm, Feature (machine learning), Feature selection, Machine learning, Random forest, Statistical classification, Statistical significance, Statistical test, Wrapper (data mining), Z-scores

Templates (3)

Cite journal, Ref improve, Reflist
