kwcoco.util.util_sklearn module¶
Extensions to sklearn constructs
- class kwcoco.util.util_sklearn.StratifiedGroupKFold(n_splits=3, shuffle=False, random_state=None)[source]¶
Bases:
_BaseKFold
Stratified K-Folds cross-validator with Grouping
Provides train/test indices to split data in train/test sets.
This cross-validation object is a variation of GroupKFold that returns stratified folds. The folds are made by preserving the percentage of samples for each class.
This is an old interface and should likely be refactored and modernized.
- Parameters
n_splits (int, default=3) – Number of folds. Must be at least 2.
- _make_test_folds(X, y=None, groups=None)[source]¶
- Parameters
X (ndarray) – data
y (ndarray) – labels
groups (ndarray) – groupids for items. Items with the same groupid must be placed in the same group.
- Returns
test_folds
- Return type
Example
>>> from kwcoco.util.util_sklearn import * # NOQA >>> import kwarray >>> rng = kwarray.ensure_rng(0) >>> groups = [1, 1, 3, 4, 2, 2, 7, 8, 8] >>> y = [1, 1, 1, 1, 2, 2, 2, 3, 3] >>> X = np.empty((len(y), 0)) >>> self = StratifiedGroupKFold(random_state=rng, shuffle=True) >>> skf_list = list(self.split(X=X, y=y, groups=groups)) >>> import ubelt as ub >>> print(ub.urepr(skf_list, nl=1, with_dtype=False)) [ (np.array([2, 3, 4, 5, 6]), np.array([0, 1, 7, 8])), (np.array([0, 1, 2, 7, 8]), np.array([3, 4, 5, 6])), (np.array([0, 1, 3, 4, 5, 6, 7, 8]), np.array([2])), ]
- _abc_impl = <_abc_data object>¶