:py:mod:`kwcoco.metrics.sklearn_alts`
=====================================

.. py:module:: kwcoco.metrics.sklearn_alts

.. autoapi-nested-parse::

   Faster pure-python versions of sklearn functions that avoid expensive checks
   and label rectifications. It is assumed that all labels are consecutive
   non-negative integers.


Module Contents
---------------

Functions
~~~~~~~~~

.. autoapisummary::

   kwcoco.metrics.sklearn_alts.confusion_matrix
   kwcoco.metrics.sklearn_alts.global_accuracy_from_confusion
   kwcoco.metrics.sklearn_alts.class_accuracy_from_confusion
   kwcoco.metrics.sklearn_alts._binary_clf_curve2


Attributes
~~~~~~~~~~

.. autoapisummary::

   kwcoco.metrics.sklearn_alts.profile


.. py:data:: profile


.. py:function:: confusion_matrix(y_true, y_pred, n_labels=None, labels=None, sample_weight=None)

   Faster version of the sklearn confusion matrix that avoids the expensive
   checks and label rectification.

   Runs in about 0.7ms.

   :returns: matrix whose rows represent true labels and whose columns
             represent predicted labels
   :rtype: ndarray

   .. rubric:: Example

   >>> import numpy as np
   >>> from kwcoco.metrics.sklearn_alts import confusion_matrix
   >>> y_true = np.array([0, 0, 0, 0, 1, 1, 1, 0, 0, 1])
   >>> y_pred = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])
   >>> confusion_matrix(y_true, y_pred, 2)
   array([[4, 2],
          [3, 1]])
   >>> confusion_matrix(y_true, y_pred, 2).ravel()
   array([4, 2, 3, 1])

   Benchmarks::

       import numpy as np
       import ubelt as ub
       y_true = np.random.randint(0, 2, 10000)
       y_pred = np.random.randint(0, 2, 10000)

       n = 1000
       # Time the call with a plain Python list of weights
       for timer in ub.Timerit(n, bestof=10, label='py-time'):
           sample_weight = [1] * len(y_true)
           confusion_matrix(y_true, y_pred, 2, sample_weight=sample_weight)

       # Time the call with a NumPy array of weights
       for timer in ub.Timerit(n, bestof=10, label='np-time'):
           sample_weight = np.ones(len(y_true), dtype=int)
           confusion_matrix(y_true, y_pred, 2, sample_weight=sample_weight)


.. py:function:: global_accuracy_from_confusion(cfsn)


.. py:function:: class_accuracy_from_confusion(cfsn)


.. py:function:: _binary_clf_curve2(y_true, y_score, pos_label=None, sample_weight=None)

   MODIFIED VERSION OF SCIKIT-LEARN API

   Calculate true and false positives per binary classification threshold.

   :Parameters:
       * **y_true** (*array, shape = [n_samples]*) -- True targets of binary classification.
       * **y_score** (*array, shape = [n_samples]*) -- Estimated probabilities or decision function.
       * **pos_label** (*int or str, default=None*) -- The label of the positive class.
       * **sample_weight** (*array-like of shape (n_samples,), default=None*) -- Sample weights.

   :returns:
       * **fps** (*array, shape = [n_thresholds]*) -- A count of false positives, at index i being the number
         of negative samples assigned a score >= thresholds[i]. The total
         number of negative samples is equal to fps[-1] (thus true negatives
         are given by fps[-1] - fps).
       * **tps** (*array, shape = [n_thresholds <= len(np.unique(y_score))]*) -- An increasing count of true positives, at index i being the number
         of positive samples assigned a score >= thresholds[i]. The total
         number of positive samples is equal to tps[-1] (thus false negatives
         are given by tps[-1] - tps).
       * **thresholds** (*array, shape = [n_thresholds]*) -- Decreasing score values.

   .. rubric:: Example

   >>> import numpy as np
   >>> from kwcoco.metrics.sklearn_alts import _binary_clf_curve2
   >>> y_true = [      1,   1,   1,   1,   1,   1,   0]
   >>> y_score = [ np.nan, 0.2, 0.3, 0.4, 0.5, 0.6, 0.3]
   >>> sample_weight = None
   >>> pos_label = None
   >>> fps, tps, thresholds = _binary_clf_curve2(y_true, y_score)
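
The ``fps``, ``tps``, and ``thresholds`` relationship described for ``_binary_clf_curve2`` can be illustrated with a minimal sketch. This is not the kwcoco implementation: the name ``binary_clf_curve_sketch`` is hypothetical, and the sketch assumes non-empty input, 0/1 labels, and finite scores. It does not attempt to reproduce how the modified function treats the ``np.nan`` score or ``pos_label`` in the example above.

```python
import numpy as np


def binary_clf_curve_sketch(y_true, y_score, sample_weight=None):
    """Cumulative false/true positive counts per distinct score threshold.

    Hypothetical helper for illustration only. Assumes y_true holds 0/1
    labels, y_score holds finite values, and the input is non-empty.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_score = np.asarray(y_score, dtype=float)
    if sample_weight is None:
        weight = np.ones_like(y_true)
    else:
        weight = np.asarray(sample_weight, dtype=float)

    # Sort samples by decreasing score.
    order = np.argsort(y_score, kind="mergesort")[::-1]
    y_true, y_score, weight = y_true[order], y_score[order], weight[order]

    # Keep the last index of each run of equal scores, so that every
    # threshold counts all samples with score >= that threshold.
    distinct = np.where(np.diff(y_score))[0]
    threshold_idxs = np.r_[distinct, y_true.size - 1]

    # Weighted running totals of positives and negatives.
    tps = np.cumsum(y_true * weight)[threshold_idxs]
    fps = np.cumsum((1 - y_true) * weight)[threshold_idxs]
    return fps, tps, y_score[threshold_idxs]


fps, tps, thresholds = binary_clf_curve_sketch(
    [1, 1, 1, 1, 1, 0], [0.2, 0.3, 0.4, 0.5, 0.6, 0.3])
```

In this sketch ``fps[-1]`` is the total number of negatives and ``tps[-1]`` the total number of positives, which matches the identities stated in the return description (true negatives are ``fps[-1] - fps``, false negatives are ``tps[-1] - tps``).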
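
The module description says labels are assumed to be consecutive non-negative integers, which is what makes a check-free confusion matrix cheap to build. The sketch below shows one way to exploit that assumption with ``np.bincount``, and then derives two accuracy figures from the result. All three function names are hypothetical. The reference above does not document what ``global_accuracy_from_confusion`` and ``class_accuracy_from_confusion`` return, so the definitions here are common conventions (overall fraction correct, and fraction correct per true class), not a description of the kwcoco code.

```python
import numpy as np


def confusion_matrix_sketch(y_true, y_pred, n_labels, sample_weight=None):
    """Rows are true labels, columns are predicted labels.

    Hypothetical helper; assumes labels are integers in [0, n_labels).
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    # Encode each (true, pred) pair as one index into a flattened grid.
    flat_index = y_true * n_labels + y_pred
    counts = np.bincount(flat_index, weights=sample_weight,
                         minlength=n_labels * n_labels)
    return counts.reshape(n_labels, n_labels)


def global_accuracy_sketch(cfsn):
    """Fraction of all samples on the diagonal (assumed definition)."""
    cfsn = np.asarray(cfsn, dtype=float)
    return np.diag(cfsn).sum() / cfsn.sum()


def per_class_accuracy_sketch(cfsn):
    """Fraction correct within each true class (assumed definition).

    Classes with no true samples yield NaN instead of dividing by zero.
    """
    cfsn = np.asarray(cfsn, dtype=float)
    row_totals = cfsn.sum(axis=1)
    out = np.full(row_totals.shape, np.nan)
    return np.divide(np.diag(cfsn), row_totals, out=out, where=row_totals > 0)


y_true = np.array([0, 0, 0, 0, 1, 1, 1, 0, 0, 1])
y_pred = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])
cfsn = confusion_matrix_sketch(y_true, y_pred, 2)
```

With the inputs from the ``confusion_matrix`` example, ``cfsn`` is ``[[4, 2], [3, 1]]``, the global figure is 5 correct out of 10, and the per-class figures are 4/6 and 1/4.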