Python-package Introduction

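This page documents the parameters and methods of the classes in the BBF Python package: BFClassifier, BFRegressor, BBFClassifier, and BBFRegressor.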

BBF.BFClassifier

class BBF.BFClassifier(max_iterations=10, active_function='relu', n_nodes_H=100, reg_alpha=0.001, verbose=False, boosting_model='ridge', batch_size=256, learning_rate=0.05, initLearner=None, random_state=0)

Bases: ClassifierMixin, BF

TrBF classifier. Construct a TrBF model to fine-tune the initial model.

Parameters:

max_iterations (int, default=10) – Controls the number of boosting iterations.
active_function ({'relu', 'tanh', 'sigmoid', 'linear'}, default='relu') – Controls the activation function of the enhancement nodes.
n_nodes_H (int, default=100) – Controls the number of enhancement nodes.
reg_alpha (float, default=0.001) – Regularization strength; must be a positive float. Regularization improves the conditioning of the problem and reduces the variance of the estimates. Larger values specify stronger regularization.
verbose (bool, default=False) – Controls whether to show the boosting process.
boosting_model (str, default='ridge') – Controls the base learner used in boosting.
batch_size (int, default=256) – Controls the batch size.
learning_rate (float, default=0.05) – Controls the learning rate.
initLearner (obj, default=None) – Controls the initial model.
random_state (int, default=0) – Controls the randomness of the estimator.
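
To illustrate how a ridge penalty such as reg_alpha behaves, the sketch below fits a one-feature ridge regression without intercept in closed form. It is a generic illustration, not BBF's implementation; ridge_slope and the toy data are hypothetical.

```python
def ridge_slope(xs, ys, alpha):
    """Closed-form ridge solution for y ~ w * x (one feature, no intercept).

    Minimizes sum((y - w * x) ** 2) + alpha * w ** 2, which gives
    w = sum(x * y) / (sum(x * x) + alpha).
    """
    if alpha <= 0:
        raise ValueError("alpha must be a positive float")
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + alpha)


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # exactly y = 2x

weak = ridge_slope(xs, ys, alpha=0.001)   # stays close to the unregularized slope 2
strong = ridge_slope(xs, ys, alpha=10.0)  # shrunk toward 0 by the larger penalty
```

Larger alpha values pull the fitted slope toward zero, which is the sense in which larger values specify stronger regularization.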

property classes_: Classes labels

fit(X=None, y=None, eval_data=None)

Build a TrBF model.

Parameters:

X ({ndarray, sparse matrix} of shape (n_samples, n_features) or dict) – Training data.
y (ndarray of shape (n_samples,)) – Target values.
eval_data (tuple (X_test, y_test)) – Evaluation data used to monitor the boosting process.

Returns:

self – Instance of the estimator.

Return type:

object

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict
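
The get_params/set_params pair follows the scikit-learn convention. A minimal sketch of that convention, using a hypothetical TinyEstimator rather than the BBF classes themselves, could look like this:

```python
import inspect


class TinyEstimator:
    """Hypothetical estimator that mimics the scikit-learn parameter convention."""

    def __init__(self, max_iterations=10, reg_alpha=0.001, learning_rate=0.05):
        # Every constructor argument is stored unchanged under the same name.
        self.max_iterations = max_iterations
        self.reg_alpha = reg_alpha
        self.learning_rate = learning_rate

    def get_params(self, deep=True):
        # Parameter names are read from the constructor signature.
        # `deep` is accepted for compatibility; this toy has no sub-estimators.
        names = list(inspect.signature(self.__init__).parameters)
        return {name: getattr(self, name) for name in names}

    def set_params(self, **params):
        valid = self.get_params()
        for name, value in params.items():
            if name not in valid:
                raise ValueError(f"Invalid parameter {name!r}")
            setattr(self, name, value)
        return self  # returning self allows call chaining


est = TinyEstimator(reg_alpha=0.01)
params = est.get_params()
est.set_params(max_iterations=20)
```

Because parameters are plain attributes named after the constructor arguments, the dictionary returned by get_params can be passed back to the constructor to build an equivalent estimator.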

predict(X, iter=None)

Predict class labels for samples in X.

Parameters:

X (array_like or sparse matrix, shape (n_samples, n_features)) – Samples.
iter (int) – Total number of iterations used in the prediction.

Returns:

C – Predicted class label per sample.

Return type:

array, shape [n_samples]

predict_proba(X, iter=None)

Predict class probabilities for samples in X.

Parameters:

X (array_like or sparse matrix, shape (n_samples, n_features)) – Samples.
iter (int) – Total number of iterations used in the prediction.

Returns:

p – Predicted class probability per sample.

Return type:

array, shape [n_samples, n_classes]

save_model(file)

Parameters: file (str) – Controls the filename.

score(X, y, sample_weight=None)

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.

Parameters:

X (array-like of shape (n_samples, n_features)) – Test samples.
y (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True labels for X.
sample_weight (array-like of shape (n_samples,), default=None) – Sample weights.

Returns:

score – Mean accuracy of self.predict(X) wrt. y.

Return type:

float
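
As a plain-Python illustration of the metric described above (not the code that score actually runs), mean accuracy and its multi-label variant, subset accuracy, can be sketched as follows. The helper name mean_accuracy is hypothetical.

```python
def mean_accuracy(y_true, y_pred):
    """Fraction of samples whose prediction equals the true label.

    For multi-label targets each label is a sequence; a sample only counts
    as correct when its whole label set matches (subset accuracy).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


# Single-label case: 3 of 4 predictions are correct.
single = mean_accuracy([0, 1, 1, 0], [0, 1, 0, 0])

# Multi-label case: the second sample gets one of its two labels wrong,
# so the whole sample counts as a miss.
multi = mean_accuracy([(1, 0), (1, 1)], [(1, 0), (1, 0)])
```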

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance
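
The <component>__<parameter> naming can be illustrated with a small helper that groups keys by component. This is only a sketch of the naming scheme; split_nested and the component name clf are hypothetical.

```python
def split_nested(params):
    """Group 'component__parameter' keys by component.

    Keys without '__' belong to the outer object and are stored under
    the empty string.
    """
    grouped = {}
    for key, value in params.items():
        component, sep, sub_key = key.partition("__")
        if sep:
            grouped.setdefault(component, {})[sub_key] = value
        else:
            grouped.setdefault("", {})[key] = value
    return grouped


grouped = split_nested(
    {"n_estimators": 20, "clf__reg_alpha": 0.01, "clf__learning_rate": 0.1}
)
```

Here n_estimators would be applied to the outer object, while reg_alpha and learning_rate would be forwarded to the component named clf.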

BBF.BFRegressor

class BBF.BFRegressor(max_iterations=10, active_function='relu', n_nodes_H=100, reg_alpha=0.001, verbose=False, boosting_model='ridge', batch_size=256, learning_rate=0.05, initLearner=None, random_state=0)

Bases: RegressorMixin, BF

TrBF regressor. Construct a TrBF model to fine-tune the initial model.

Parameters:

max_iterations (int, default=10) – Controls the number of boosting iterations.
active_function ({'relu', 'tanh', 'sigmoid', 'linear'}, default='relu') – Controls the activation function of the enhancement nodes.
n_nodes_H (int, default=100) – Controls the number of enhancement nodes.
reg_alpha (float, default=0.001) – Regularization strength; must be a positive float. Regularization improves the conditioning of the problem and reduces the variance of the estimates. Larger values specify stronger regularization.
verbose (bool, default=False) – Controls whether to show the boosting process.
boosting_model (str, default='ridge') – Controls the base learner used in boosting.
batch_size (int, default=256) – Controls the batch size.
learning_rate (float, default=0.05) – Controls the learning rate.
initLearner (obj, default=None) – Controls the initial model.
random_state (int, default=0) – Controls the randomness of the estimator.
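
For reference, the four values accepted by active_function correspond to standard activation functions. The sketch below gives their usual mathematical definitions in plain Python; it assumes the names carry their conventional meaning and does not reproduce BBF's own code. ACTIVATIONS and activate are hypothetical names.

```python
import math


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    # Numerically stable form that avoids overflow for large negative z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def linear(z):
    return z


ACTIVATIONS = {"relu": relu, "tanh": math.tanh, "sigmoid": sigmoid, "linear": linear}


def activate(name, values):
    """Apply the named activation element-wise to a list of floats."""
    try:
        fn = ACTIVATIONS[name]
    except KeyError:
        raise ValueError(f"unknown activation {name!r}") from None
    return [fn(v) for v in values]
```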

fit(X=None, y=None, eval_data=None)

Build a TrBF model.

Parameters:

X ({ndarray, sparse matrix} of shape (n_samples, n_features) or dict) – Training data.
y (ndarray of shape (n_samples,)) – Target values.
eval_data (tuple (X_test, y_test)) – Evaluation data used to monitor the boosting process.

Returns:

self – Instance of the estimator.

Return type:

object

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

predict(X, iter=None)

Return the predicted value for each sample.

Parameters:

X (array_like or sparse matrix, shape (n_samples, n_features)) – Samples.
iter (int) – Total number of iterations used in the prediction.

Returns:

C – Returns predicted values.

Return type:

array, shape (n_samples,)
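
One way to picture the iter argument is as a cut-off on the number of boosting stages that contribute to the prediction. The sketch below assumes the usual additive boosting form (an initial value plus learning-rate-scaled stage outputs), which this page does not confirm for BBF; staged_predict and the toy stages are hypothetical.

```python
def staged_predict(x, init_value, stages, learning_rate=0.05, iter=None):
    """Additive prediction using only the first `iter` boosting stages.

    `stages` is a list of callables, one per boosting iteration.
    With iter=None every stage is used.
    """
    n_used = len(stages) if iter is None else iter
    if not 0 <= n_used <= len(stages):
        raise ValueError("iter must be between 0 and the number of fitted stages")
    prediction = init_value
    for stage in stages[:n_used]:
        prediction += learning_rate * stage(x)
    return prediction


# Three toy stages standing in for fitted base learners.
stages = [lambda x: 2.0 * x, lambda x: x + 1.0, lambda x: -x]

full = staged_predict(1.0, 0.5, stages, learning_rate=0.5)
first_two = staged_predict(1.0, 0.5, stages, learning_rate=0.5, iter=2)
```

Under this assumption, comparing predictions at different iter values shows how much each additional boosting iteration changes the output.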

save_model(file)

Parameters: file (str) – Controls the filename.

score(X, y, sample_weight=None)

Return the coefficient of determination of the prediction.

The coefficient of determination \(R^2\) is defined as \((1 - \frac{u}{v})\), where \(u\) is the residual sum of squares ((y_true - y_pred) ** 2).sum() and \(v\) is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0, and the score can be negative because a model can be arbitrarily worse than the constant baseline. A constant model that always predicts the expected value of y, disregarding the input features, would get an \(R^2\) score of 0.0.

Parameters:

X (array-like of shape (n_samples, n_features)) – Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.
y (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True values for X.
sample_weight (array-like of shape (n_samples,), default=None) – Sample weights.

Returns:

score – \(R^2\) of self.predict(X) wrt. y.

Return type:

float

Notes

The \(R^2\) score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of r2_score(). This influences the score method of all the multioutput regressors (except for MultiOutputRegressor).
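
The definition of \(R^2\) above translates directly into a few lines of plain Python. This is a sketch of the formula for a single output, not the library routine; the helper name r2 is hypothetical.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - u / v.

    u is the residual sum of squares and v is the total sum of squares
    around the mean of y_true.
    """
    mean_true = sum(y_true) / len(y_true)
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    v = sum((t - mean_true) ** 2 for t in y_true)
    if v == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - u / v


y_true = [1.0, 2.0, 3.0, 4.0]

perfect = r2(y_true, y_true)                 # exact predictions
baseline = r2(y_true, [2.5, 2.5, 2.5, 2.5])  # always predicts the mean of y_true
reversed_fit = r2(y_true, [4.0, 3.0, 2.0, 1.0])  # worse than the baseline
```

The three cases correspond to the best possible score, the constant-model score, and a negative score.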

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance

BBF.BBFClassifier

class BBF.BBFClassifier(max_iterations=10, active_function='relu', n_nodes_H=100, reg_alpha=0.001, boosting_model='ridge', batch_size=256, learning_rate=0.05, initLearner=None, **kwargs)

Bases: BaggingClassifier

Construct a BoostForestClassifier, based on sklearn.ensemble.BaggingClassifier.

Parameters:

max_iterations (int, default=10) – Controls the number of boosting iterations.
active_function ({'relu', 'tanh', 'sigmoid', 'linear'}, default='relu') – Controls the activation function of the enhancement nodes.
n_nodes_H (int, default=100) – Controls the number of enhancement nodes.
reg_alpha (float, default=0.001) – Regularization strength; must be a positive float. Regularization improves the conditioning of the problem and reduces the variance of the estimates. Larger values specify stronger regularization.
verbose (int, default=0) – Controls whether to show the boosting process.
boosting_model (str, default='ridge') – Controls the base learner used in boosting.
batch_size (int, default=256) – Controls the batch size.
learning_rate (float, default=0.05) – Controls the learning rate.
initLearner (obj, default=None) – Controls the initial model.
n_estimators (int, default=10) – The number of base estimators in the ensemble.
max_samples (int or float, default=1.0) –
The number of samples to draw from X to train each base estimator (with replacement by default, see bootstrap for more details).
- If int, then draw max_samples samples.
- If float, then draw max_samples * X.shape[0] samples.
max_features (int or float, default=1.0) –
The number of features to draw from X to train each base estimator (without replacement by default, see bootstrap_features for more details).
- If int, then draw max_features features.
- If float, then draw max_features * X.shape[1] features.
bootstrap (bool, default=True) – Whether samples are drawn with replacement. If False, sampling without replacement is performed.
bootstrap_features (bool, default=False) – Whether features are drawn with replacement.
oob_score (bool, default=False) – Whether to use out-of-bag samples to estimate the generalization error.
warm_start (bool, default=False) – When set to True, reuse the solution of the previous call to fit and add more estimators to the ensemble, otherwise, just fit a whole new ensemble.
n_jobs (int, default=None) – The number of jobs to run in parallel for both fit() and predict(). None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.
random_state (int or RandomState, default=None) – Controls the random resampling of the original dataset (sample wise and feature wise). If the base estimator accepts a random_state attribute, a different seed is generated for each instance in the ensemble. Pass an int for reproducible output across multiple function calls.
verbose – Controls the verbosity when fitting and predicting.
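
The int-or-float rule shared by max_samples and max_features can be sketched as a small helper. The truncation used for float inputs is an assumption, since the exact rounding rule is not stated on this page; resolve_count is a hypothetical name.

```python
def resolve_count(max_value, total):
    """Turn an int-or-float max_samples / max_features setting into a count.

    An int is used as-is; a float is interpreted as a fraction of `total`.
    """
    if isinstance(max_value, bool):
        raise TypeError("expected int or float, got bool")
    if isinstance(max_value, int):
        count = max_value
    elif isinstance(max_value, float):
        # Assumption: truncate toward zero; the library may round differently.
        count = int(max_value * total)
    else:
        raise TypeError("expected int or float")
    if not 1 <= count <= total:
        raise ValueError("resulting count must be between 1 and total")
    return count


n_samples = 200
as_int = resolve_count(50, n_samples)        # draw exactly 50 samples
as_fraction = resolve_count(0.5, n_samples)  # draw half of the samples
default = resolve_count(1.0, n_samples)      # default=1.0 draws n_samples
```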

decision_function(X)

Average of the decision functions of the base classifiers.

Parameters: X ({array-like, sparse matrix} of shape (n_samples, n_features)) – The training input samples. Sparse matrices are accepted only if they are supported by the base estimator.
Returns: score – The decision function of the input samples. The columns correspond to the classes in sorted order, as they appear in the attribute classes_. Regression and binary classification are special cases with k == 1, otherwise k == n_classes.
Return type: ndarray of shape (n_samples, k)

property estimators_samples_

The subset of drawn samples for each base estimator.

Returns a dynamically generated list of indices identifying the samples used for fitting each member of the ensemble, i.e., the in-bag samples.

Note: the list is re-created at each call to the property in order to reduce the object memory footprint by not storing the sampling data. Thus fetching the property may be slower than expected.
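
The idea of regenerating in-bag indices on demand, instead of storing them, can be sketched with a seeded random generator. This is an illustration of the concept under simplified assumptions (one integer seed per estimator), not the ensemble's actual sampling code; draw_indices and out_of_bag are hypothetical helpers.

```python
import random


def draw_indices(n_samples, max_samples, bootstrap=True, seed=0):
    """Draw sample indices for one base estimator from a fixed seed."""
    rng = random.Random(seed)
    if bootstrap:
        # With replacement: the same index may appear several times.
        return [rng.randrange(n_samples) for _ in range(max_samples)]
    # Without replacement: every index appears at most once.
    return rng.sample(range(n_samples), max_samples)


def out_of_bag(n_samples, in_bag):
    """Indices that were never drawn, usable for an out-of-bag estimate."""
    drawn = set(in_bag)
    return [i for i in range(n_samples) if i not in drawn]


in_bag = draw_indices(10, 10, bootstrap=True, seed=42)
oob = out_of_bag(10, in_bag)
no_replacement = draw_indices(10, 10, bootstrap=False, seed=42)
```

Because only the seed needs to be kept, calling draw_indices again with the same arguments reproduces the same list, at the cost of recomputing it on every access.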

fit(X, y, sample_weight=None)

Build a Bagging ensemble of estimators from the training set (X, y).

Parameters:

X ({array-like, sparse matrix} of shape (n_samples, n_features)) – The training input samples. Sparse matrices are accepted only if they are supported by the base estimator.
y (array-like of shape (n_samples,)) – The target values (class labels in classification, real numbers in regression).
sample_weight (array-like of shape (n_samples,), default=None) – Sample weights. If None, then samples are equally weighted. Note that this is supported only if the base estimator supports sample weighting.

Returns:

self – Fitted estimator.

Return type:

object

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

property n_features_

DEPRECATED: Attribute n_features_ was deprecated in version 1.0 and will be removed in 1.2. Use n_features_in_ instead.

predict(X)

Predict class for X.

The predicted class of an input sample is computed as the class with the highest mean predicted probability. If base estimators do not implement a predict_proba method, then it resorts to voting.

Parameters: X ({array-like, sparse matrix} of shape (n_samples, n_features)) – The training input samples. Sparse matrices are accepted only if they are supported by the base estimator.
Returns: y – The predicted classes.
Return type: ndarray of shape (n_samples,)
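
The rule "class with the highest mean predicted probability" can be written out for a single sample as follows. The helper names and the class labels are hypothetical, and the sketch ignores the voting fallback.

```python
def mean_probabilities(per_estimator):
    """Average the class probabilities of all estimators for one sample."""
    n = len(per_estimator)
    n_classes = len(per_estimator[0])
    return [sum(p[k] for p in per_estimator) / n for k in range(n_classes)]


def predict_label(per_estimator, classes):
    """Return the class whose mean predicted probability is highest."""
    mean = mean_probabilities(per_estimator)
    best = max(range(len(classes)), key=mean.__getitem__)
    return classes[best]


# Three base estimators, two classes; one row per estimator.
probas = [[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]]
label = predict_label(probas, classes=["cat", "dog"])
```

Although the first estimator prefers the first class, the averaged probabilities favor the second, so the ensemble predicts "dog".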

predict_log_proba(X)

Predict class log-probabilities for X.

The predicted class log-probabilities of an input sample is computed as the log of the mean predicted class probabilities of the base estimators in the ensemble.

Parameters: X ({array-like, sparse matrix} of shape (n_samples, n_features)) – The training input samples. Sparse matrices are accepted only if they are supported by the base estimator.
Returns: p – The class log-probabilities of the input samples. The order of the classes corresponds to that in the attribute classes_.
Return type: ndarray of shape (n_samples, n_classes)
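
Note that the log is taken after averaging, which is not the same as averaging per-estimator log-probabilities. A small sketch for one sample, with hypothetical names and toy numbers, makes the difference visible:

```python
import math


def log_of_mean(per_estimator):
    """Log of the mean class probabilities for one sample."""
    n = len(per_estimator)
    n_classes = len(per_estimator[0])
    means = [sum(p[k] for p in per_estimator) / n for k in range(n_classes)]
    # A mean of exactly zero maps to negative infinity.
    return [math.log(m) if m > 0 else float("-inf") for m in means]


probas = [[0.9, 0.1], [0.5, 0.5]]  # two estimators, two classes
log_p = log_of_mean(probas)

# For comparison only: averaging the logs gives smaller (or equal) values.
mean_of_logs = [
    (math.log(0.9) + math.log(0.5)) / 2,
    (math.log(0.1) + math.log(0.5)) / 2,
]
```

Exponentiating log_p recovers probabilities that sum to one, whereas exponentiating mean_of_logs in general does not.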

predict_proba(X)

Predict class probabilities for X.

The predicted class probabilities of an input sample is computed as the mean predicted class probabilities of the base estimators in the ensemble. If base estimators do not implement a predict_proba method, then it resorts to voting and the predicted class probabilities of an input sample represents the proportion of estimators predicting each class.

Parameters: X ({array-like, sparse matrix} of shape (n_samples, n_features)) – The training input samples. Sparse matrices are accepted only if they are supported by the base estimator.
Returns: p – The class probabilities of the input samples. The order of the classes corresponds to that in the attribute classes_.
Return type: ndarray of shape (n_samples, n_classes)
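
When base estimators only produce hard labels, the voting fallback described above reports the share of estimators that chose each class. A minimal sketch for one sample, with a hypothetical helper name and toy votes:

```python
from collections import Counter


def vote_proportions(votes, classes):
    """Proportion of estimators predicting each class for one sample."""
    counts = Counter(votes)
    return [counts[c] / len(votes) for c in classes]


# Hard predictions of four base estimators for the same sample.
votes = ["dog", "cat", "dog", "dog"]
proportions = vote_proportions(votes, classes=["cat", "dog"])
```

The result is ordered like the classes argument, mirroring how the returned columns follow classes_.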

score(X, y, sample_weight=None)

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.

Parameters:

X (array-like of shape (n_samples, n_features)) – Test samples.
y (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True labels for X.
sample_weight (array-like of shape (n_samples,), default=None) – Sample weights.

Returns:

score – Mean accuracy of self.predict(X) wrt. y.

Return type:

float

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance

BBF.BBFRegressor

class BBF.BBFRegressor(max_iterations=10, active_function='relu', n_nodes_H=100, reg_alpha=0.001, boosting_model='ridge', batch_size=256, learning_rate=0.05, initLearner=None, **kwargs)

Bases: BaggingRegressor

Construct a BoostForest regressor, based on sklearn.ensemble.BaggingRegressor.

Parameters:

max_iterations (int, default=10) – Controls the number of boosting iterations.
active_function ({'relu', 'tanh', 'sigmoid', 'linear'}, default='relu') – Controls the activation function of the enhancement nodes.
n_nodes_H (int, default=100) – Controls the number of enhancement nodes.
reg_alpha (float, default=0.001) – Regularization strength; must be a positive float. Regularization improves the conditioning of the problem and reduces the variance of the estimates. Larger values specify stronger regularization.
verbose (int, default=0) – Controls whether to show the boosting process.
boosting_model (str, default='ridge') – Controls the base learner used in boosting.
batch_size (int, default=256) – Controls the batch size.
learning_rate (float, default=0.05) – Controls the learning rate.
initLearner (obj, default=None) – Controls the initial model.
n_estimators (int, default=10) – The number of base estimators in the ensemble.
max_samples (int or float, default=1.0) –
The number of samples to draw from X to train each base estimator (with replacement by default, see bootstrap for more details).
- If int, then draw max_samples samples.
- If float, then draw max_samples * X.shape[0] samples.
max_features (int or float, default=1.0) –
The number of features to draw from X to train each base estimator (without replacement by default, see bootstrap_features for more details).
- If int, then draw max_features features.
- If float, then draw max_features * X.shape[1] features.
bootstrap (bool, default=True) – Whether samples are drawn with replacement. If False, sampling without replacement is performed.
bootstrap_features (bool, default=False) – Whether features are drawn with replacement.
oob_score (bool, default=False) – Whether to use out-of-bag samples to estimate the generalization error.
warm_start (bool, default=False) – When set to True, reuse the solution of the previous call to fit and add more estimators to the ensemble, otherwise, just fit a whole new ensemble.
n_jobs (int, default=None) – The number of jobs to run in parallel for both fit() and predict(). None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.
random_state (int or RandomState, default=None) – Controls the random resampling of the original dataset (sample wise and feature wise). If the base estimator accepts a random_state attribute, a different seed is generated for each instance in the ensemble.
verbose – Controls the verbosity when fitting and predicting.

property estimators_samples_

The subset of drawn samples for each base estimator.

Returns a dynamically generated list of indices identifying the samples used for fitting each member of the ensemble, i.e., the in-bag samples.

Note: the list is re-created at each call to the property in order to reduce the object memory footprint by not storing the sampling data. Thus fetching the property may be slower than expected.

fit(X, y, sample_weight=None)

Build a Bagging ensemble of estimators from the training set (X, y).

Parameters:

X ({array-like, sparse matrix} of shape (n_samples, n_features)) – The training input samples. Sparse matrices are accepted only if they are supported by the base estimator.
y (array-like of shape (n_samples,)) – The target values (class labels in classification, real numbers in regression).
sample_weight (array-like of shape (n_samples,), default=None) – Sample weights. If None, then samples are equally weighted. Note that this is supported only if the base estimator supports sample weighting.

Returns:

self – Fitted estimator.

Return type:

object

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: dict

property n_features_

DEPRECATED: Attribute n_features_ was deprecated in version 1.0 and will be removed in 1.2. Use n_features_in_ instead.

predict(X)

Predict regression target for X.

The predicted regression target of an input sample is computed as the mean predicted regression targets of the estimators in the ensemble.

Parameters: X ({array-like, sparse matrix} of shape (n_samples, n_features)) – The training input samples. Sparse matrices are accepted only if they are supported by the base estimator.
Returns: y – The predicted values.
Return type: ndarray of shape (n_samples,)
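
Averaging the regression targets of the ensemble members amounts to a column-wise mean over their predictions. A minimal sketch, with a hypothetical helper name and toy values:

```python
def bagged_prediction(per_estimator):
    """Average per-estimator predictions sample by sample.

    per_estimator[i][j] is the prediction of estimator i for sample j.
    """
    n = len(per_estimator)
    return [sum(column) / n for column in zip(*per_estimator)]


# Three estimators, two samples.
preds = [[1.0, 4.0], [2.0, 6.0], [3.0, 8.0]]
y = bagged_prediction(preds)
```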

score(X, y, sample_weight=None)

Return the coefficient of determination of the prediction.

The coefficient of determination \(R^2\) is defined as \((1 - \frac{u}{v})\), where \(u\) is the residual sum of squares ((y_true - y_pred) ** 2).sum() and \(v\) is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0, and the score can be negative because a model can be arbitrarily worse than the constant baseline. A constant model that always predicts the expected value of y, disregarding the input features, would get an \(R^2\) score of 0.0.

Parameters:

X (array-like of shape (n_samples, n_features)) – Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.
y (array-like of shape (n_samples,) or (n_samples, n_outputs)) – True values for X.
sample_weight (array-like of shape (n_samples,), default=None) – Sample weights.

Returns:

score – \(R^2\) of self.predict(X) wrt. y.

Return type:

float

Notes

The \(R^2\) score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of r2_score(). This influences the score method of all the multioutput regressors (except for MultiOutputRegressor).

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: estimator instance