kwcoco.metrics.sklearn_alts module
Faster pure-python versions of sklearn functions that avoid expensive checks and label rectifications. It is assumed that all labels are consecutive non-negative integers.
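Because the label rectification step is skipped, callers whose labels are not already consecutive non-negative integers need to encode them first. The following is a minimal sketch of one way to do that with NumPy; the `raw_true` and `raw_pred` arrays are hypothetical data for illustration and are not part of this module.

```python
import numpy as np

# Hypothetical raw labels that are neither integers nor consecutive.
raw_true = np.array(["cat", "dog", "dog", "bird", "cat"])
raw_pred = np.array(["cat", "cat", "dog", "bird", "dog"])

# np.unique with return_inverse maps each label to its position in the
# sorted array of unique labels, which yields codes 0 .. n_labels - 1.
classes, codes = np.unique(
    np.concatenate([raw_true, raw_pred]), return_inverse=True
)

# Split the shared encoding back into the true and predicted halves.
y_true = codes[: len(raw_true)]
y_pred = codes[len(raw_true):]
n_labels = len(classes)
```

Encoding both arrays together keeps the integer codes consistent between the true and predicted labels, even when one array is missing a class.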
- kwcoco.metrics.sklearn_alts.confusion_matrix(y_true, y_pred, n_labels=None, labels=None, sample_weight=None)
A faster version of the sklearn confusion matrix that avoids the expensive checks and label rectification.
Runs in about 0.7 ms.
- Returns:
Matrix where rows correspond to true labels and columns correspond to predicted labels.
- Return type:
ndarray
Example
>>> import numpy as np
>>> from kwcoco.metrics.sklearn_alts import confusion_matrix
>>> y_true = np.array([0, 0, 0, 0, 1, 1, 1, 0, 0, 1])
>>> y_pred = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])
>>> confusion_matrix(y_true, y_pred, 2)
array([[4, 2],
       [3, 1]]...)
>>> confusion_matrix(y_true, y_pred, 2).ravel()
array([4, 2, 3, 1]...)
Benchmark
>>> # xdoctest: +SKIP
>>> import numpy as np
>>> import ubelt as ub
>>> from kwcoco.metrics.sklearn_alts import confusion_matrix
>>> y_true = np.random.randint(0, 2, 10000)
>>> y_pred = np.random.randint(0, 2, 10000)
>>> n = 1000
>>> # Weights given as a Python list
>>> for timer in ub.Timerit(n, bestof=10, label='py-time'):
>>>     sample_weight = [1] * len(y_true)
>>>     confusion_matrix(y_true, y_pred, 2, sample_weight=sample_weight)
>>> # Weights given as a NumPy array
>>> for timer in ub.Timerit(n, bestof=10, label='np-time'):
>>>     sample_weight = np.ones(len(y_true), dtype=int)
>>>     confusion_matrix(y_true, y_pred, 2, sample_weight=sample_weight)
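To illustrate why consecutive integer labels make a check-free confusion matrix cheap, here is a minimal sketch that counts (true, pred) pairs with `np.bincount`. The function `confusion_matrix_sketch` is a hypothetical stand-in written for this example. It is not the kwcoco implementation, which may use a different technique.

```python
import numpy as np


def confusion_matrix_sketch(y_true, y_pred, n_labels, sample_weight=None):
    """Hypothetical illustration; assumes labels are in 0 .. n_labels - 1."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    # Encode each (true, pred) pair as one flat index: row * n_labels + col.
    flat_index = y_true * n_labels + y_pred
    # bincount tallies the (optionally weighted) occurrences of each index.
    # Note: the result is integer without weights and float with weights.
    counts = np.bincount(
        flat_index, weights=sample_weight, minlength=n_labels ** 2
    )
    # Rows are true labels, columns are predicted labels.
    return counts.reshape(n_labels, n_labels)


y_true = np.array([0, 0, 0, 0, 1, 1, 1, 0, 0, 1])
y_pred = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])
cm = confusion_matrix_sketch(y_true, y_pred, 2)
print(cm)
```

On the inputs from the example above, this sketch produces the same counts, `[[4, 2], [3, 1]]`. No label lookup or validation happens, so out-of-range labels would raise an error or give wrong counts instead of being rectified.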
- kwcoco.metrics.sklearn_alts._binary_clf_curve2(y_true, y_score, pos_label=None, sample_weight=None)
MODIFIED VERSION OF SCIKIT-LEARN API
Calculate true and false positives per binary classification threshold.
- Parameters:
y_true (array, shape = [n_samples]) – True targets of binary classification
y_score (array, shape = [n_samples]) – Estimated probabilities or decision function
pos_label (int or str, default=None) – The label of the positive class
sample_weight (array-like of shape (n_samples,), default=None) – Sample weights.
- Returns:
fps (array, shape = [n_thresholds]) – A count of false positives, at index i being the number of negative samples assigned a score >= thresholds[i]. The total number of negative samples is equal to fps[-1] (thus true negatives are given by fps[-1] - fps).
tps (array, shape = [n_thresholds <= len(np.unique(y_score))]) – An increasing count of true positives, at index i being the number of positive samples assigned a score >= thresholds[i]. The total number of positive samples is equal to tps[-1] (thus false negatives are given by tps[-1] - tps).
thresholds (array, shape = [n_thresholds]) – Decreasing score values.
Example
>>> import numpy as np
>>> from kwcoco.metrics.sklearn_alts import _binary_clf_curve2
>>> y_true = [1, 1, 1, 1, 1, 1, 0]
>>> y_score = [np.nan, 0.2, 0.3, 0.4, 0.5, 0.6, 0.3]
>>> sample_weight = None
>>> pos_label = None
>>> fps, tps, thresholds = _binary_clf_curve2(y_true, y_score)
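The relationship between `fps`, `tps`, and `thresholds` described in the Returns section can be sketched with a sort followed by a cumulative sum. The helper `binary_clf_curve_sketch` below is hypothetical and is not the function documented here. It assumes labels in {0, 1}, no sample weights, and finite scores, so the NaN score from the example above is left out, since how `_binary_clf_curve2` treats NaN is not stated in this section.

```python
import numpy as np


def binary_clf_curve_sketch(y_true, y_score):
    """Hypothetical sketch; assumes 0/1 labels and finite scores."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    # Sort samples by decreasing score (stable, so ties keep input order).
    order = np.argsort(-y_score, kind="stable")
    y_true = y_true[order]
    y_score = y_score[order]
    # Keep the last index of each run of tied scores so that every
    # distinct score contributes exactly one threshold.
    distinct = np.flatnonzero(np.diff(y_score))
    threshold_idxs = np.r_[distinct, y_true.size - 1]
    # Positives seen so far at each threshold; the rest are negatives.
    tps = np.cumsum(y_true)[threshold_idxs]
    fps = (1 + threshold_idxs) - tps
    return fps, tps, y_score[threshold_idxs]


y_true = [1, 1, 1, 1, 1, 0]
y_score = [0.2, 0.3, 0.4, 0.5, 0.6, 0.3]
fps, tps, thresholds = binary_clf_curve_sketch(y_true, y_score)

# Derived counts, as described in the Returns section.
tns = fps[-1] - fps  # true negatives per threshold
fns = tps[-1] - tps  # false negatives per threshold
```

For these inputs the sketch gives `thresholds = [0.6, 0.5, 0.4, 0.3, 0.2]`, `tps = [1, 2, 3, 4, 5]`, and `fps = [0, 0, 0, 1, 1]`, so `tps[-1]` and `fps[-1]` equal the total number of positive and negative samples.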