Hive_ML.evaluation.model_evaluation module
- Hive_ML.evaluation.model_evaluation.YB_Visualizer(clf, visualizer, x_train, y_train, x_test, y_test, kwargs)
Creates and finalizes a YellowBrick visualizer, given the classifier and the train/test features and corresponding labels used for fitting and scoring.
- Parameters:
  - clf (ClassifierMixin) – Classifier used by the Visualizer.
  - visualizer (str) – Name of the visualizer to create. Must match a value in YB_VISUALIZERS.
  - x_train (ndarray) – Training feature set used to fit the classifier.
  - y_train (ndarray) – Training label set used to fit the classifier.
  - x_test (ndarray) – Test feature set used to score the classifier.
  - y_test (ndarray) – Test label set used to score the classifier.
  - kwargs (Dict) – Dictionary of keyword arguments for the YellowBrick Visualizer.
- Return type:
Visualizer
- Returns:
The finalized YellowBrick Visualizer.
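The create, fit, score, and finalize sequence described above can be sketched as follows. This is a minimal illustration, not the Hive_ML implementation: the helper name `build_visualizer` and the `registry` argument (a name-to-class mapping standing in for YB_VISUALIZERS, whose real layout is not documented here) are assumptions. The `fit`, `score`, and `finalize` calls are the standard YellowBrick visualizer workflow.

```python
from typing import Any, Dict, Optional


def build_visualizer(
    registry: Dict[str, Any],
    name: str,
    clf: Any,
    x_train: Any,
    y_train: Any,
    x_test: Any,
    y_test: Any,
    kwargs: Optional[Dict[str, Any]] = None,
) -> Any:
    """Look up a visualizer class by name, then fit, score, and finalize it.

    `registry` is a hypothetical stand-in for YB_VISUALIZERS.
    """
    if name not in registry:
        raise ValueError(
            f"Unknown visualizer {name!r}; expected one of {sorted(registry)}"
        )
    viz = registry[name](clf, **(kwargs or {}))
    viz.fit(x_train, y_train)   # fit the wrapped classifier on the training data
    viz.score(x_test, y_test)   # score on the test data, which draws the plot
    viz.finalize()              # add titles, legends, and axis labels
    return viz


# With YellowBrick installed, a registry could look like this (assumed layout):
#   from yellowbrick.classifier import ROCAUC, ConfusionMatrix
#   registry = {"ROCAUC": ROCAUC, "ConfusionMatrix": ConfusionMatrix}
```

Passing the registry explicitly keeps the sketch independent of how Hive_ML actually stores its visualizer names.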
- Hive_ML.evaluation.model_evaluation.evaluate_classifiers(ensemble_configuration_df, classifier_kwargs_list, train_feature_set, train_label_set, test_feature_set, test_label_set, aggregation, feature_selection, visualizers=None, output_file=None, plot_title='', random_state=None)
Evaluates the ensemble classification performance of the provided classifiers, weighting and combining the individual classifier predictions. If a list of YellowBrick Visualizers is provided, a single multi-plot report file is generated.
- Parameters:
  - ensemble_configuration_df (DataFrame) – DataFrame containing the ensemble configuration. Each row should include Classifier, N_Features (number of features to select), and weight (weighting of the classifier prediction in the ensemble).
  - classifier_kwargs_list (List[Dict]) – List of keyword-argument dictionaries used to configure the classifiers.
  - train_feature_set (ndarray) – Training feature set used to fit the classifiers.
  - train_label_set (ndarray) – Training label set used to fit the classifiers.
  - test_feature_set (ndarray) – Test feature set used to evaluate the classifiers.
  - test_label_set (ndarray) – Test label set used to evaluate the classifiers.
  - feature_selection (str) – Type of feature selection to perform (SFFS or PCA).
  - aggregation (str) – Type of feature aggregation.
  - visualizers (List[Dict]) – List of YellowBrick Visualizers to use when generating the report plot.
  - output_file (Union[str, PathLike]) – File location where the YellowBrick plot report is saved.
  - plot_title (str) – String used as the title of the YellowBrick plots.
- Return type:
Dict
- Returns:
Dictionary with the ensemble classifier report (including the classification metrics).
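The description above says only that the single classifier predictions are weighted and combined; it does not state the combination rule. As one plausible reading, the sketch below uses weighted soft voting: each classifier's class probabilities are multiplied by its normalized weight and summed. The function name `weighted_soft_vote` and the array layout are assumptions made for illustration.

```python
import numpy as np


def weighted_soft_vote(probabilities, weights):
    """Combine per-classifier class probabilities with a weighted average.

    probabilities: array-like of shape (n_classifiers, n_samples, n_classes)
    weights: array-like of shape (n_classifiers,), e.g. the `weight` column
             of the ensemble configuration (assumed usage).
    Returns the combined probabilities and the predicted class indices.
    """
    probs = np.asarray(probabilities, dtype=float)
    w = np.asarray(weights, dtype=float)
    if probs.ndim != 3 or w.shape != (probs.shape[0],):
        raise ValueError("expected one weight per classifier")
    w = w / w.sum()                            # normalize weights to sum to 1
    combined = np.tensordot(w, probs, axes=1)  # shape: (n_samples, n_classes)
    return combined, combined.argmax(axis=1)


# Two classifiers, three samples, two classes; the second classifier
# is trusted three times as much as the first.
probs_a = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8]]
probs_b = [[0.6, 0.4], [0.7, 0.3], [0.1, 0.9]]
combined, labels = weighted_soft_vote([probs_a, probs_b], weights=[1, 3])
```

In this example the two classifiers disagree on the second sample, and the more heavily weighted one decides the ensemble label.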
- Hive_ML.evaluation.model_evaluation.select_best_classifiers(df_summary, metric, reduction, k=1)
Given a DataFrame containing Validation scores for different Classifiers and Number of Selected Features, returns the k-best combinations and their respective reduced score (mean or median over the validation splits).
- Parameters:
  - df_summary (DataFrame) – Validation DataFrame summary.
  - metric (str) – Metric used to select the best performance.
  - reduction (str) – Reduction to apply over the validation splits to select the best performance.
  - k (int) – Number of best combinations to select.
- Return type:
Tuple[List[Tuple[str, str]], List[float]]
- Returns:
Selected best combinations [(N_Features, Classifier), (N_Features, Classifier), … ] and corresponding reduced validation scores.
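The selection described above (reduce each combination's scores over the validation splits, then keep the k highest) can be sketched with pandas. The layout of the summary table is an assumption: the real df_summary format is not documented here, so this sketch uses a hypothetical long-format table with one row per classifier, number of features, and validation split. The helper name `k_best_combinations` is likewise illustrative.

```python
import pandas as pd


def k_best_combinations(df_summary, metric, reduction="mean", k=1):
    """Return the k best (N_Features, Classifier) pairs and their reduced scores.

    Assumes `df_summary` has `N_Features` and `Classifier` columns plus one
    column per metric, with one row per validation split (hypothetical layout).
    """
    if reduction not in ("mean", "median"):
        raise ValueError("reduction must be 'mean' or 'median'")
    reduced = (
        df_summary.groupby(["N_Features", "Classifier"])[metric]
        .agg(reduction)                # reduce over the validation splits
        .sort_values(ascending=False)  # higher score is better
        .head(k)
    )
    return list(reduced.index), reduced.tolist()


# Hypothetical validation summary: three combinations, two splits each.
summary = pd.DataFrame(
    {
        "Classifier": ["SVM", "SVM", "RF", "RF", "SVM", "SVM"],
        "N_Features": ["5", "5", "10", "10", "10", "10"],
        "roc_auc": [0.80, 0.90, 0.70, 0.72, 0.88, 0.86],
    }
)
best, scores = k_best_combinations(summary, metric="roc_auc", reduction="mean", k=2)
```

The sketch assumes a metric where higher is better; an error-style metric would need the sort order reversed.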