hyperimpute.plugins.prediction.classifiers.plugin_random_forest module
- class RandomForestPlugin(n_estimators: int = 100, criterion: int = 0, max_features: int = 0, min_samples_split: int = 2, min_samples_leaf: int = 1, max_depth: Optional[int] = 3, random_state: int = 0, hyperparam_search_iterations: Optional[int] = None, **kwargs: Any)
Bases: ClassifierPlugin
Classification plugin based on Random forests.
- Method:
A random forest is a meta estimator that fits a number of decision tree classifiers on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting.
- Parameters:
n_estimators – int The number of trees in the forest.
criterion – int Index selecting the function to measure the quality of a split: 0 for “gini” (Gini impurity), 1 for “entropy” (information gain).
max_features – int Index selecting the number of features to consider when looking for the best split: 0 for “sqrt”, 1 for “log2”, 2 for None (all features).
min_samples_split – int The minimum number of samples required to split an internal node.
bootstrap – bool Whether bootstrap samples are used when building trees. If False, the whole dataset is used to build each tree.
min_samples_leaf – int The minimum number of samples required to be at a leaf node.
Example
>>> from hyperimpute.plugins.prediction import Predictions
>>> plugin = Predictions(category="classifiers").get("random_forest")
>>> from sklearn.datasets import load_iris
>>> X, y = load_iris(return_X_y=True)
>>> plugin.fit_predict(X, y)
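The plugin is built on scikit-learn's RandomForestClassifier. As a minimal sketch (assuming the plugin forwards its defaults to the underlying estimator, with the integer indices resolved to "gini" and "sqrt"), the example above is roughly equivalent to fitting the estimator directly:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

# Defaults mirroring the plugin signature; the index-to-string mapping
# (criterion=0 -> "gini", max_features=0 -> "sqrt") is an assumption here.
clf = RandomForestClassifier(
    n_estimators=100,
    criterion="gini",
    max_features="sqrt",
    min_samples_split=2,
    min_samples_leaf=1,
    max_depth=3,
    random_state=0,
)
clf.fit(X, y)
preds = clf.predict(X)  # one predicted class label per sample
```

This sketch omits the plugin's DataFrame wrapping; `_predict` and `_predict_proba` return pandas DataFrames rather than raw arrays.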
- _fit(X: DataFrame, *args: Any, **kwargs: Any) RandomForestPlugin
- _predict(X: DataFrame, *args: Any, **kwargs: Any) DataFrame
- _predict_proba(X: DataFrame, *args: Any, **kwargs: Any) DataFrame
- criterions = ['gini', 'entropy']
- features = ['sqrt', 'log2', None]
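The integer-valued criterion and max_features arguments index into these class attributes before being passed to the underlying estimator. A small sketch of that mapping (the helper name resolve_hyperparams is hypothetical, for illustration only):

```python
criterions = ["gini", "entropy"]
features = ["sqrt", "log2", None]

def resolve_hyperparams(criterion: int = 0, max_features: int = 0) -> dict:
    # Translate the plugin's integer indices into the string values
    # that sklearn's RandomForestClassifier expects.
    return {
        "criterion": criterions[criterion],
        "max_features": features[max_features],
    }

print(resolve_hyperparams(1, 2))  # {'criterion': 'entropy', 'max_features': None}
```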
- module_relative_path: Optional[Path]
- static name() str
- plugin
alias of RandomForestPlugin