Hive_ML.evaluation.model_evaluation module

Hive_ML.evaluation.model_evaluation.YB_Visualizer(clf, visualizer, x_train, y_train, x_test, y_test, kwargs)

Creates and finalizes a YellowBrick visualizer for the given classifier, using the train features and labels for fitting and the test features and labels for scoring.

Parameters:

clf (ClassifierMixin) – Classifier used by the Visualizer.
visualizer (str) – Name of the visualizer to create. Must match a value in YB_VISUALIZERS.
x_train (ndarray) – Train feature set used to fit the classifier.
y_train (ndarray) – Train label set used to fit the classifier.
x_test (ndarray) – Test feature set used to score the classifier.
y_test (ndarray) – Test label set used to score the classifier.
kwargs (Dict) – Dictionary of kwargs for the YellowBrick Visualizer.

Return type:

Visualizer

Returns:

The finalized YellowBrick Visualizer.

Hive_ML.evaluation.model_evaluation.evaluate_classifiers(ensemble_configuration_df, classifier_kwargs_list, train_feature_set, train_label_set, test_feature_set, test_label_set, aggregation, feature_selection, visualizers=None, output_file=None, plot_title='', random_state=None)

Evaluates the ensemble classification performance of the provided classifiers by weighting and combining the individual classifier predictions. If a list of YellowBrick Visualizers is provided, a single multi-plot report file is generated.

Parameters:

ensemble_configuration_df (DataFrame) – DataFrame containing the ensemble configuration. Each row should include Classifier, N_Features (number of features to select), and weight (weight of the classifier's prediction in the ensemble).
classifier_kwargs_list (List[Dict]) – List of kwargs dictionaries used to configure the classifiers.
train_feature_set (ndarray) – Train feature set used to fit the classifiers.
train_label_set (ndarray) – Train label set used to fit the classifiers.
test_feature_set (ndarray) – Test feature set used to evaluate the classifiers.
test_label_set (ndarray) – Test label set used to evaluate the classifiers.
feature_selection (str) – Type of feature selection to perform (SFFS or PCA).
aggregation (str) – Type of feature aggregation.
visualizers (List[Dict]) – List of YellowBrick Visualizers to use when generating the report plot.
output_file (Union[str, PathLike]) – File location where the YellowBrick plot report is saved.
plot_title (str) – Title used in the YellowBrick plots.

Return type:

Dict

Returns:

Dictionary with the ensemble classifier report (including the classification metrics).
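
The weighting-and-combining step described above can be pictured with a small, library-free sketch. It is not Hive_ML's implementation: it assumes that each classifier yields a positive-class probability per test sample, that the combination is a weighted average normalized by the sum of the weights, and that a 0.5 cutoff turns the result into labels. The function names weighted_ensemble_proba and to_labels are hypothetical.

```python
from typing import List, Sequence


def weighted_ensemble_proba(
    probas: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Weighted average of per-classifier positive-class probabilities.

    Hypothetical helper: `probas[i][j]` is the probability that classifier i
    assigns to sample j; `weights[i]` mirrors the `weight` column of the
    ensemble configuration (assumed usage, not taken from Hive_ML).
    """
    if len(probas) != len(weights):
        raise ValueError("one weight is required per classifier")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_samples = len(probas[0])
    # Combine the classifiers sample by sample, normalizing by the weight sum.
    return [
        sum(w * p[j] for p, w in zip(probas, weights)) / total
        for j in range(n_samples)
    ]


def to_labels(probs: Sequence[float], cutoff: float = 0.5) -> List[int]:
    """Threshold combined probabilities into binary labels (assumed cutoff)."""
    return [1 if p >= cutoff else 0 for p in probs]


# Two classifiers, three test samples.
combined = weighted_ensemble_proba(
    probas=[[0.9, 0.2, 0.6], [0.7, 0.4, 0.3]],
    weights=[0.75, 0.25],
)
labels = to_labels(combined)
```

In this sketch the first classifier dominates because of its larger weight, so the third sample ends up on the positive side of the cutoff even though the second classifier scores it at 0.3.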

Hive_ML.evaluation.model_evaluation.select_best_classifiers(df_summary, metric, reduction, k=1)

Given a DataFrame containing validation scores for different classifiers and numbers of selected features, returns the k best combinations and their respective reduced scores (mean or median over the validation splits).

Parameters:

df_summary (DataFrame) – Summary DataFrame of the validation scores.
metric (str) – Metric used to select the best performance.
reduction (str) – Reduction applied across the validation splits (mean or median) before selecting the best performance.
k (int) – Number of best combinations to select.

Return type:

Tuple[List[Tuple[str, str]], List[float]]

Returns:

Selected best combinations [(N_Features, Classifier), (N_Features, Classifier), … ] and corresponding reduced validation scores.
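
The reduce-then-rank idea behind this function can be illustrated without pandas. The sketch below is an assumption-laden stand-in, not the library's code: it takes a plain dictionary mapping (N_Features, Classifier) to per-split scores for a single metric, assumes that higher scores are better, and uses the hypothetical name k_best. The classifier labels in the sample data are placeholders.

```python
from statistics import mean, median
from typing import Dict, List, Tuple

# Supported reductions over the validation splits (as named in the docstring).
REDUCTIONS = {"mean": mean, "median": median}


def k_best(
    split_scores: Dict[Tuple[str, str], List[float]],
    reduction: str,
    k: int = 1,
) -> Tuple[List[Tuple[str, str]], List[float]]:
    """Return the k best (N_Features, Classifier) pairs and their reduced scores.

    Hypothetical helper: assumes a higher score means better performance.
    """
    reduce_fn = REDUCTIONS[reduction]
    # Collapse the per-split scores of each combination into one number.
    reduced = {combo: reduce_fn(scores) for combo, scores in split_scores.items()}
    # Rank the combinations from best to worst and keep the first k.
    ranked = sorted(reduced.items(), key=lambda item: item[1], reverse=True)[:k]
    return [combo for combo, _ in ranked], [score for _, score in ranked]


# Placeholder validation scores over three splits.
scores = {
    ("10", "SVM"): [0.80, 0.90, 0.70],
    ("5", "RF"): [0.60, 0.95, 0.94],
    ("20", "LR"): [0.85, 0.86, 0.84],
}
best_by_mean, mean_scores = k_best(scores, reduction="mean", k=2)
best_by_median, median_scores = k_best(scores, reduction="median", k=2)
```

With this data the two reductions disagree on the winner: the mean favors the stable ("20", "LR") combination, while the median favors ("5", "RF"), whose single weak split is ignored. This is the kind of difference the reduction parameter controls.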