:py:mod:`kwcoco.metrics.sklearn_alts`
=====================================

.. py:module:: kwcoco.metrics.sklearn_alts

.. autoapi-nested-parse::

   Faster pure-python versions of sklearn functions that avoid expensive checks
   and label rectifications. It is assumed that all labels are consecutive
   non-negative integers.


Module Contents
---------------


Functions
~~~~~~~~~

.. autoapisummary::

   kwcoco.metrics.sklearn_alts.confusion_matrix
   kwcoco.metrics.sklearn_alts.global_accuracy_from_confusion
   kwcoco.metrics.sklearn_alts.class_accuracy_from_confusion
   kwcoco.metrics.sklearn_alts._binary_clf_curve2


Attributes
~~~~~~~~~~

.. autoapisummary::

   kwcoco.metrics.sklearn_alts.profile


.. py:data:: profile
   

.. py:function:: confusion_matrix(y_true, y_pred, n_labels=None, labels=None, sample_weight=None)

   Faster version of scikit-learn's ``confusion_matrix`` that skips the
   expensive input checks and label rectification.

   Runs in about 0.7 ms.

   :returns: matrix whose rows index the true labels and whose columns index the predicted labels
   :rtype: ndarray

   .. rubric:: Example

   >>> import numpy as np
   >>> from kwcoco.metrics.sklearn_alts import confusion_matrix
   >>> y_true = np.array([0, 0, 0, 0, 1, 1, 1, 0,  0, 1])
   >>> y_pred = np.array([0, 0, 0, 0, 0, 0, 0, 1,  1, 1])
   >>> confusion_matrix(y_true, y_pred, 2)
   array([[4, 2],
          [3, 1]])
   >>> confusion_matrix(y_true, y_pred, 2).ravel()
   array([4, 2, 3, 1])

   .. rubric:: Benchmarks

   .. code-block:: python

       import numpy as np
       import ubelt as ub
       from kwcoco.metrics.sklearn_alts import confusion_matrix

       y_true = np.random.randint(0, 2, 10000)
       y_pred = np.random.randint(0, 2, 10000)

       n = 1000
       # Weights supplied as a plain Python list
       for timer in ub.Timerit(n, bestof=10, label='py-time'):
           sample_weight = [1] * len(y_true)
           confusion_matrix(y_true, y_pred, 2, sample_weight=sample_weight)

       # Weights supplied as a NumPy integer array
       for timer in ub.Timerit(n, bestof=10, label='np-time'):
           sample_weight = np.ones(len(y_true), dtype=int)
           confusion_matrix(y_true, y_pred, 2, sample_weight=sample_weight)
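
   The idea of skipping label rectification can be illustrated with a minimal
   NumPy sketch. It assumes the labels are already integers in
   ``0 .. n_labels - 1`` and counts ``(true, pred)`` pairs with
   ``np.bincount``. The helper name ``confusion_matrix_sketch`` is
   hypothetical, and this is not necessarily how this module computes the
   matrix.

   .. code-block:: python

      import numpy as np

      def confusion_matrix_sketch(y_true, y_pred, n_labels, sample_weight=None):
          """Hypothetical helper for illustration, not the kwcoco code."""
          y_true = np.asarray(y_true)
          y_pred = np.asarray(y_pred)
          # Encode each (true, pred) pair as a row-major index into an
          # n_labels x n_labels grid; no label lookup or validation is done.
          flat = y_true * n_labels + y_pred
          counts = np.bincount(flat, weights=sample_weight,
                               minlength=n_labels * n_labels)
          # Rows index the true label, columns the predicted label.
          return counts.reshape(n_labels, n_labels)

      y_true = np.array([0, 0, 0, 0, 1, 1, 1, 0, 0, 1])
      y_pred = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])
      cm = confusion_matrix_sketch(y_true, y_pred, 2)

   With these inputs the sketch reproduces the matrix from the example above.
   Labels outside the assumed range would silently land in the wrong cell,
   which is the price of dropping the checks.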


.. py:function:: global_accuracy_from_confusion(cfsn)
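
   No description is given for this function. As a hedged sketch, assuming
   "global accuracy" has its usual meaning (correctly classified samples
   divided by all samples) and that ``cfsn`` follows the rows-are-true,
   columns-are-predicted layout used by ``confusion_matrix``, it could be
   computed as follows. The helper name ``global_accuracy_sketch`` is
   hypothetical.

   .. code-block:: python

      import numpy as np

      def global_accuracy_sketch(cfsn):
          """Hypothetical helper: trace over total (assumed definition)."""
          cfsn = np.asarray(cfsn)
          # The diagonal holds the samples whose prediction matches the truth.
          return np.trace(cfsn) / cfsn.sum()

      cfsn = np.array([[4, 2],
                       [3, 1]])
      acc = global_accuracy_sketch(cfsn)  # (4 + 1) / 10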


.. py:function:: class_accuracy_from_confusion(cfsn)
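
   No description is given for this function. As a hedged sketch, assuming
   "class accuracy" means the per-class accuracy (correct samples of a class
   divided by the true samples of that class) averaged over classes, and that
   ``cfsn`` has true labels on the rows, it could look like the code below.
   The helper name ``class_accuracy_sketch`` is hypothetical, and treating
   classes with no true samples as contributing zero is an assumption.

   .. code-block:: python

      import numpy as np

      def class_accuracy_sketch(cfsn):
          """Hypothetical helper: mean of per-class accuracies (assumed)."""
          cfsn = np.asarray(cfsn, dtype=float)
          n_correct = np.diag(cfsn)    # correct predictions per class
          n_real = cfsn.sum(axis=1)    # true samples per class (row sums)
          # Assumption: a class with no true samples contributes 0.
          per_class = np.divide(n_correct, n_real,
                                out=np.zeros_like(n_correct),
                                where=n_real > 0)
          return per_class.mean()

      cfsn = np.array([[4, 2],
                       [3, 1]])
      acc = class_accuracy_sketch(cfsn)  # mean of 4/6 and 1/4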


.. py:function:: _binary_clf_curve2(y_true, y_score, pos_label=None, sample_weight=None)

   Modified version of scikit-learn's private ``_binary_clf_curve`` helper.

   Calculate true and false positives per binary classification threshold.

   :Parameters: * **y_true** (*array, shape = [n_samples]*) -- True targets of binary classification
                * **y_score** (*array, shape = [n_samples]*) -- Estimated probabilities or decision function
                * **pos_label** (*int or str, default=None*) -- The label of the positive class
                * **sample_weight** (*array-like of shape (n_samples,), default=None*) -- Sample weights.

   :returns: * **fps** (*array, shape = [n_thresholds]*) -- A count of false positives, at index i being the number of negative
               samples assigned a score >= thresholds[i]. The total number of
               negative samples is equal to fps[-1] (thus true negatives are given by
               fps[-1] - fps).
             * **tps** (*array, shape = [n_thresholds <= len(np.unique(y_score))]*) -- An increasing count of true positives, at index i being the number
               of positive samples assigned a score >= thresholds[i]. The total
               number of positive samples is equal to tps[-1] (thus false negatives
               are given by tps[-1] - tps).
             * **thresholds** (*array, shape = [n_thresholds]*) -- Decreasing score values.

   .. rubric:: Example

   >>> import numpy as np
   >>> from kwcoco.metrics.sklearn_alts import _binary_clf_curve2
   >>> y_true  = [      1,   1,   1,   1,   1,   1,   0]
   >>> y_score = [ np.nan, 0.2, 0.3, 0.4, 0.5, 0.6, 0.3]
   >>> sample_weight = None
   >>> pos_label = None
   >>> fps, tps, thresholds = _binary_clf_curve2(y_true, y_score)
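
   The relationships between the returned arrays can be illustrated with a
   minimal NumPy sketch. It uses the usual sort-and-cumulative-sum approach
   and is not the code of ``_binary_clf_curve2``. The helper name
   ``binary_clf_curve_sketch`` is hypothetical, labels are assumed to be 0 or
   1, and NaN scores, ``pos_label`` and sample weights are deliberately left
   out.

   .. code-block:: python

      import numpy as np

      def binary_clf_curve_sketch(y_true, y_score):
          """Hypothetical helper for illustration only."""
          y_true = np.asarray(y_true)
          y_score = np.asarray(y_score, dtype=float)
          # Sort by decreasing score (mergesort keeps the order deterministic).
          order = np.argsort(y_score, kind="mergesort")[::-1]
          y_true = y_true[order]
          y_score = y_score[order]
          # Keep the last index of each run of equal scores, so that every
          # distinct score value becomes exactly one threshold.
          distinct = np.where(np.diff(y_score))[0]
          threshold_idxs = np.r_[distinct, y_true.size - 1]
          # Positives with score >= threshold, then the remaining negatives.
          tps = np.cumsum(y_true)[threshold_idxs]
          fps = 1 + threshold_idxs - tps
          return fps, tps, y_score[threshold_idxs]

      y_true = [1, 1, 1, 1, 1, 0]
      y_score = [0.2, 0.3, 0.4, 0.5, 0.6, 0.3]
      fps, tps, thresholds = binary_clf_curve_sketch(y_true, y_score)

      # Derived counts, as described in the returns section.
      fns = tps[-1] - tps   # false negatives per threshold
      tns = fps[-1] - fps   # true negatives per threshold

   Here the two samples that share the score 0.3 collapse into a single
   threshold, so five thresholds are produced for six samples.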