:py:mod:`kwcoco.metrics.clf_report`
===================================

.. py:module:: kwcoco.metrics.clf_report


Module Contents
---------------


Functions
~~~~~~~~~

.. autoapisummary::

   kwcoco.metrics.clf_report.classification_report
   kwcoco.metrics.clf_report.ovr_classification_report


Attributes
~~~~~~~~~~

.. autoapisummary::

   kwcoco.metrics.clf_report.ASCII_ONLY


.. py:data:: ASCII_ONLY


.. py:function:: classification_report(y_true, y_pred, target_names=None, sample_weight=None, verbose=False, remove_unsupported=False, log=None, ascii_only=False)

   Computes a classification report, which is a collection of various
   metrics commonly used to evaluate classification quality. This can handle
   binary and multiclass settings.

   Note that this function does not accept probabilities or scores and must
   instead act on final decisions. See ``ovr_classification_report`` for a
   probability-based report function using a one-vs-rest strategy.

   This emulates the bm(cm) Matlab script written by David Powers that is
   used for computing bookmaker, markedness, and various other scores.

   References:
       https://csem.flinders.edu.au/research/techreps/SIE07001.pdf

       https://www.mathworks.com/matlabcentral/fileexchange/5648-bm-cm-?requestedDomain=www.mathworks.com

       Jurman, Riccadonna, Furlanello, (2012). A Comparison of MCC and CEN
       Error Measures in Multi-Class Prediction

   Args:
       y_true (array): true labels for each item

       y_pred (array): predicted labels for each item

       target_names (List): mapping from label to category name

       sample_weight (ndarray): weight for each item

       verbose (bool, default=False): print if True

       log (callable): print or logging function

       remove_unsupported (bool, default=False): removes categories that
           have no support.

       ascii_only (bool, default=False): if True, don't use unicode
           characters. If the ``ASCII_ONLY`` environment variable is
           present, this is forced to True and cannot be undone.
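
   The returned ``report`` is a dictionary whose ``'confusion'`` and
   ``'metrics'`` entries are pandas objects, as the examples below show. A
   minimal sketch of that access pattern (assuming sklearn and pandas are
   installed; the label values and variable names here are illustrative):

   >>> # Sketch: final label decisions in, tabular report out
   >>> from kwcoco.metrics.clf_report import classification_report
   >>> y_true = [0, 0, 0, 1, 1, 1]
   >>> y_pred = [0, 1, 0, 1, 1, 0]
   >>> report = classification_report(y_true, y_pred, verbose=0)
   >>> confusion = report['confusion']  # confusion matrix with margin sums
   >>> metrics = report['metrics']      # per-class and combined metric rows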
   Example:
       >>> # xdoctest: +IGNORE_WANT
       >>> # xdoctest: +REQUIRES(module:sklearn)
       >>> # xdoctest: +REQUIRES(module:pandas)
       >>> y_true = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3]
       >>> y_pred = [1, 2, 1, 3, 1, 2, 2, 3, 2, 2, 3, 3, 2, 3, 3, 3, 1, 3]
       >>> target_names = None
       >>> sample_weight = None
       >>> report = classification_report(y_true, y_pred, verbose=0, ascii_only=1)
       >>> print(report['confusion'])
       pred  1  2  3  Σr
       real
       1     3  1  1   5
       2     0  4  1   5
       3     1  1  6   8
       Σp    4  6  8  18
       >>> print(report['metrics'])
       metric    precision  recall     fpr  markedness  bookmaker     mcc  support
       class
       1            0.7500  0.6000  0.0769      0.6071     0.5231  0.5635        5
       2            0.6667  0.8000  0.1538      0.5833     0.6462  0.6139        5
       3            0.7500  0.7500  0.2000      0.5500     0.5500  0.5500        8
       combined     0.7269  0.7222  0.1530      0.5751     0.5761  0.5758       18

   Example:
       >>> # xdoctest: +IGNORE_WANT
       >>> # xdoctest: +REQUIRES(module:sklearn)
       >>> # xdoctest: +REQUIRES(module:pandas)
       >>> from kwcoco.metrics.clf_report import *  # NOQA
       >>> y_true = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3]
       >>> y_pred = [1, 2, 1, 3, 1, 2, 2, 3, 2, 2, 3, 3, 2, 3, 3, 3, 1, 3]
       >>> target_names = None
       >>> sample_weight = None
       >>> logs = []
       >>> report = classification_report(y_true, y_pred, verbose=1, ascii_only=True, log=logs.append)
       >>> print(' '.join(logs))

   Ignore:
       >>> import numpy as np
       >>> size = 100
       >>> rng = np.random.RandomState(0)
       >>> p_classes = np.array([.90, .05, .05][0:2])
       >>> p_classes = p_classes / p_classes.sum()
       >>> p_wrong = np.array([.03, .01, .02][0:2])
       >>> y_true = testdata_ytrue(p_classes, p_wrong, size, rng)
       >>> rs = []
       >>> for x in range(17):
       >>>     p_wrong += .05
       >>>     y_pred = testdata_ypred(y_true, p_wrong, rng)
       >>>     report = classification_report(y_true, y_pred, verbose='hack')
       >>>     rs.append(report)
       >>> # xdoctest: +REQUIRES(--show)
       >>> import kwplot
       >>> kwplot.autompl()
       >>> import pandas as pd
       >>> df = pd.DataFrame(rs).drop(['raw'], axis=1)
       >>> delta = df.subtract(df['target'], axis=0)
       >>> sqrd_error = np.sqrt((delta ** 2).sum(axis=0))
       >>> print('Error')
       >>> print(sqrd_error.sort_values())
       >>> ys = df.to_dict(orient='list')
       >>> kwplot.multi_plot(ydata_list=ys)


.. py:function:: ovr_classification_report(mc_y_true, mc_probs, target_names=None, sample_weight=None, metrics=None, verbose=0, remove_unsupported=False, log=None)

   One-vs-rest classification report.

   :Parameters: * **mc_y_true** (*ndarray[int]*) -- multiclass truth labels
                  (integer label format). Shape [N].
                * **mc_probs** (*ndarray*) -- multiclass probabilities for
                  each class. Shape [N x C].
                * **target_names** (*Dict[int, str]*) -- mapping from int
                  label to string name
                * **sample_weight** (*ndarray*) -- weight for each item.
                  Shape [N].
                * **metrics** (*List[str]*) -- names of metrics to compute

   .. rubric:: Example

   >>> # xdoctest: +IGNORE_WANT
   >>> # xdoctest: +REQUIRES(module:sklearn)
   >>> # xdoctest: +REQUIRES(module:pandas)
   >>> from kwcoco.metrics.clf_report import *  # NOQA
   >>> y_true = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0]
   >>> y_probs = np.random.rand(len(y_true), max(y_true) + 1)
   >>> target_names = None
   >>> sample_weight = None
   >>> verbose = True
   >>> report = ovr_classification_report(y_true, y_probs)
   >>> print(report['ave'])
   auc     0.6541
   ap      0.6824
   kappa   0.0963
   mcc     0.1002
   brier   0.2214
   dtype: float64
   >>> print(report['ovr'])
         auc      ap   kappa     mcc   brier  support  weight
   0  0.6062  0.6161  0.0526  0.0598  0.2608        8  0.4444
   1  0.5846  0.6014  0.0000  0.0000  0.2195        5  0.2778
   2  0.8000  0.8693  0.2623  0.2652  0.1602        5  0.2778
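
   The ``metrics`` argument accepts a list of metric names to restrict which
   columns are computed; the names below come from the table above. A
   minimal sketch of that usage (assuming sklearn and pandas are installed;
   the data here is illustrative):

   >>> # Sketch: request only a subset of the one-vs-rest metrics
   >>> import numpy as np
   >>> from kwcoco.metrics.clf_report import ovr_classification_report
   >>> y_true = np.array([0] * 8 + [1] * 5 + [2] * 5)
   >>> rng = np.random.RandomState(0)
   >>> y_probs = rng.rand(len(y_true), 3)
   >>> y_probs /= y_probs.sum(axis=1, keepdims=True)  # rows sum to one
   >>> report = ovr_classification_report(y_true, y_probs, metrics=['auc', 'ap'])
   >>> ovr_table = report['ovr']  # per-class one-vs-rest scores
   >>> averages = report['ave']   # average weighted by the ``weight`` column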