hyperimpute.plugins.prediction.regression.plugin_random_forest_regressor module

class RandomForestRegressionPlugin(n_estimators: int = 100, criterion: int = 0, max_features: int = 0, min_samples_split: int = 2, min_samples_leaf: int = 1, max_depth: Optional[int] = 3, hyperparam_search_iterations: Optional[int] = None, random_state: int = 0, **kwargs: Any)

Bases: RegressionPlugin

Regression plugin based on Random forests.

Method:

A random forest is a meta estimator that fits a number of decision tree regressors on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting.

Parameters:
  • n_estimators – int The number of trees in the forest.

  • criterion – int Index into criterions, selecting the function that measures the quality of a split. Supported criteria are “squared_error” for the mean squared error, “absolute_error” for the mean absolute error, and “poisson” for the Poisson deviance.

  • max_features – int Index into features, selecting the number of features to consider when looking for the best split (“sqrt”, “log2”, or None).

  • min_samples_split – int The minimum number of samples required to split an internal node.

  • bootstrap – bool Whether bootstrap samples are used when building trees. If False, the whole dataset is used to build each tree.

  • min_samples_leaf – int The minimum number of samples required to be at a leaf node.

Example

>>> from hyperimpute.plugins.prediction import Predictions
>>> plugin = Predictions(category="regression").get("random_forest")
>>> from sklearn.datasets import load_diabetes
>>> X, y = load_diabetes(return_X_y=True)
>>> plugin.fit_predict(X, y)
_fit(X: DataFrame, *args: Any, **kwargs: Any) RandomForestRegressionPlugin
_predict(X: DataFrame, *args: Any, **kwargs: Any) DataFrame
criterions = ['squared_error', 'absolute_error', 'poisson']
features = ['sqrt', 'log2', None]
static hyperparameter_space(*args: Any, **kwargs: Any) List[Params]
module_relative_path: Optional[Path]
static name() str
plugin

alias of RandomForestRegressionPlugin
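The integer criterion and max_features constructor arguments appear to be indices into the criterions and features class attributes listed above; a minimal sketch of that mapping, under that assumption:

```python
# Class attributes as documented for RandomForestRegressionPlugin.
criterions = ['squared_error', 'absolute_error', 'poisson']
features = ['sqrt', 'log2', None]

# The defaults criterion=0 and max_features=0 would then resolve to:
criterion = criterions[0]    # 'squared_error'
max_features = features[0]   # 'sqrt'
```

Encoding these choices as integers keeps the hyperparameter space numeric, which is convenient for the samplers behind hyperparameter_space().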