`kwcoco.metrics.sklearn_alts`

Faster pure-python versions of sklearn functions that avoid expensive checks and label rectifications. It is assumed that all labels are consecutive non-negative integers.

Module Contents

Functions

`confusion_matrix`(y_true, y_pred, n_labels=None, labels=None, sample_weight=None)	Faster version of sklearn confusion matrix that avoids the expensive checks and label rectification
`global_accuracy_from_confusion`(cfsn)
`class_accuracy_from_confusion`(cfsn)
`_binary_clf_curve2`(y_true, y_score, pos_label=None, sample_weight=None)	MODIFIED VERSION OF SCIKIT-LEARN API

Attributes

profile

kwcoco.metrics.sklearn_alts.profile

kwcoco.metrics.sklearn_alts.confusion_matrix(y_true, y_pred, n_labels=None, labels=None, sample_weight=None)

Faster version of the sklearn confusion matrix that avoids the expensive checks and label rectification.

Runs in about 0.7 ms.

Returns: matrix whose rows correspond to the real (true) labels and whose columns correspond to the predicted labels
Return type: ndarray

Example

>>> import numpy as np
>>> y_true = np.array([0, 0, 0, 0, 1, 1, 1, 0,  0, 1])
>>> y_pred = np.array([0, 0, 0, 0, 0, 0, 0, 1,  1, 1])
>>> confusion_matrix(y_true, y_pred, 2)
array([[4, 2],
       [3, 1]])
>>> confusion_matrix(y_true, y_pred, 2).ravel()
array([4, 2, 3, 1])

Benchmarks:

    import numpy as np
    import ubelt as ub
    y_true = np.random.randint(0, 2, 10000)
    y_pred = np.random.randint(0, 2, 10000)

    n = 1000
    # Time the call with sample_weight given as a Python list
    for timer in ub.Timerit(n, bestof=10, label='py-time'):
        sample_weight = [1] * len(y_true)
        confusion_matrix(y_true, y_pred, 2, sample_weight=sample_weight)

    # Time the call with sample_weight given as a NumPy array
    for timer in ub.Timerit(n, bestof=10, label='np-time'):
        sample_weight = np.ones(len(y_true), dtype=int)
        confusion_matrix(y_true, y_pred, 2, sample_weight=sample_weight)
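When labels are known to be consecutive non-negative integers, a confusion matrix can be counted without any label checks or remapping. The following is a minimal sketch of that idea using `np.bincount`; it is not necessarily how `confusion_matrix` is implemented in this module, and `confusion_matrix_sketch` is a hypothetical name used only for illustration.

```python
import numpy as np


def confusion_matrix_sketch(y_true, y_pred, n_labels, sample_weight=None):
    """Hypothetical helper: count (true, pred) pairs with one bincount call."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    # Encode each (row, col) pair as one flat index into an
    # n_labels x n_labels grid. Only valid for labels in [0, n_labels).
    flat_index = y_true * n_labels + y_pred
    counts = np.bincount(flat_index, weights=sample_weight,
                         minlength=n_labels * n_labels)
    # Rows are true labels, columns are predicted labels
    return counts.reshape(n_labels, n_labels)


y_true = np.array([0, 0, 0, 0, 1, 1, 1, 0, 0, 1])
y_pred = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])
cfsn = confusion_matrix_sketch(y_true, y_pred, 2)
print(cfsn)
```

On the example inputs above this reproduces the matrix shown in the doctest. Note that `np.bincount` returns floating-point counts when `weights` is given, so a weighted result in this sketch is a float array.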

kwcoco.metrics.sklearn_alts.global_accuracy_from_confusion(cfsn)

kwcoco.metrics.sklearn_alts.class_accuracy_from_confusion(cfsn)
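This page gives no description for the two accuracy functions. As an illustration only, the sketch below shows one common way such quantities are derived from a confusion matrix whose rows are true labels: global accuracy as the fraction of samples on the diagonal, and class accuracy as the mean of per-class accuracies. The function names, the handling of empty classes, and the exact definitions are assumptions and may differ from what the module computes.

```python
import numpy as np


def global_accuracy_sketch(cfsn):
    """Hypothetical helper: fraction of all samples on the diagonal."""
    cfsn = np.asarray(cfsn, dtype=float)
    return np.trace(cfsn) / cfsn.sum()


def class_accuracy_sketch(cfsn):
    """Hypothetical helper: mean per-class accuracy (rows are true labels)."""
    cfsn = np.asarray(cfsn, dtype=float)
    n_correct = np.diag(cfsn)      # correctly predicted samples per class
    n_real = cfsn.sum(axis=1)      # true samples per class (row sums)
    # Assumption of this sketch: classes with no true samples are skipped
    has_samples = n_real > 0
    return (n_correct[has_samples] / n_real[has_samples]).mean()


cfsn = np.array([[4, 2],
                 [3, 1]])
global_acc = global_accuracy_sketch(cfsn)  # (4 + 1) / 10
class_acc = class_accuracy_sketch(cfsn)    # mean of 4 / 6 and 1 / 4
```

The two numbers differ whenever classes are imbalanced, because the global value weights every sample equally while the class value weights every class equally.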

kwcoco.metrics.sklearn_alts._binary_clf_curve2(y_true, y_score, pos_label=None, sample_weight=None)

MODIFIED VERSION OF SCIKIT-LEARN API

Calculate true and false positives per binary classification threshold.

Parameters

y_true (array, shape = [n_samples]) – True targets of binary classification
y_score (array, shape = [n_samples]) – Estimated probabilities or decision function
pos_label (int or str, default=None) – The label of the positive class
sample_weight (array-like of shape (n_samples,), default=None) – Sample weights.

Returns

fps (array, shape = [n_thresholds]) – A count of false positives, at index i being the number of negative samples assigned a score >= thresholds[i]. The total number of negative samples is equal to fps[-1] (thus true negatives are given by fps[-1] - fps).
tps (array, shape = [n_thresholds <= len(np.unique(y_score))]) – An increasing count of true positives, at index i being the number of positive samples assigned a score >= thresholds[i]. The total number of positive samples is equal to tps[-1] (thus false negatives are given by tps[-1] - tps).
thresholds (array, shape = [n_thresholds]) – Decreasing score values.

Example

>>> import numpy as np
>>> y_true  = [      1,   1,   1,   1,   1,   1,   0]
>>> y_score = [ np.nan, 0.2, 0.3, 0.4, 0.5, 0.6, 0.3]
>>> sample_weight = None
>>> pos_label = None
>>> fps, tps, thresholds = _binary_clf_curve2(y_true, y_score)
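The return values described above can be illustrated with a small sketch: sort samples by decreasing score, take cumulative sums of positives and negatives, and keep one entry per distinct score. This follows the general scikit-learn approach and is not the body of `_binary_clf_curve2`; the name `binary_clf_curve_sketch` and the default `pos_label=1` are assumptions, and NaN scores (which appear in the example above) are not handled here.

```python
import numpy as np


def binary_clf_curve_sketch(y_true, y_score, pos_label=1, sample_weight=None):
    """Hypothetical helper: cumulative FP/TP counts per distinct threshold."""
    is_pos = np.asarray(y_true) == pos_label
    y_score = np.asarray(y_score, dtype=float)
    if sample_weight is None:
        weight = np.ones(len(is_pos))
    else:
        weight = np.asarray(sample_weight, dtype=float)

    # Sort by decreasing score; mergesort keeps ties in a stable order
    order = np.argsort(-y_score, kind="mergesort")
    is_pos, y_score, weight = is_pos[order], y_score[order], weight[order]

    # Keep the last index of each run of equal scores, plus the final index
    distinct = np.where(np.diff(y_score))[0]
    threshold_idxs = np.r_[distinct, len(y_score) - 1]

    # Weighted counts of samples with score >= each threshold
    tps = np.cumsum(is_pos * weight)[threshold_idxs]
    fps = np.cumsum((~is_pos) * weight)[threshold_idxs]
    return fps, tps, y_score[threshold_idxs]


y_true = [1, 1, 1, 1, 1, 0]
y_score = [0.2, 0.3, 0.4, 0.5, 0.6, 0.3]
fps, tps, thresholds = binary_clf_curve_sketch(y_true, y_score)

# Derived counts, as described in the Returns section
tns = fps[-1] - fps  # true negatives at each threshold
fns = tps[-1] - tps  # false negatives at each threshold
```

Here `thresholds` is decreasing, `fps[-1]` equals the number of negative samples, and `tps[-1]` equals the number of positive samples, matching the relationships stated in the Returns section.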
