:py:mod:`kwcoco.util.util_sklearn`
==================================

.. py:module:: kwcoco.util.util_sklearn

.. autoapi-nested-parse::

   Extensions to sklearn constructs


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   kwcoco.util.util_sklearn.StratifiedGroupKFold


.. py:class:: StratifiedGroupKFold(n_splits=3, shuffle=False, random_state=None)

   Bases: :py:obj:`sklearn.model_selection._split._BaseKFold`

   Stratified K-Folds cross-validator with Grouping

   Provides train/test indices to split data into train/test sets.

   This cross-validation object is a variation of GroupKFold that returns
   stratified folds. The folds are made by preserving the percentage of
   samples for each class.

   Read more in the :ref:`User Guide `.

   :Parameters: **n_splits** (*int, default=3*) -- Number of folds. Must be at least 2.

   .. py:method:: _make_test_folds(self, X, y=None, groups=None)

      :Parameters: * **X** (*ndarray*) -- data
                   * **y** (*ndarray*) -- labels
                   * **groups** (*ndarray*) -- group ids for the items. Items with the
                     same group id must be placed in the same fold.

      :returns: test_folds
      :rtype: list

      .. rubric:: Example

      >>> import numpy as np
      >>> import kwarray
      >>> from kwcoco.util.util_sklearn import StratifiedGroupKFold
      >>> rng = kwarray.ensure_rng(0)
      >>> groups = [1, 1, 3, 4, 2, 2, 7, 8, 8]
      >>> y = [1, 1, 1, 1, 2, 2, 2, 3, 3]
      >>> X = np.empty((len(y), 0))
      >>> self = StratifiedGroupKFold(random_state=rng, shuffle=True)
      >>> skf_list = list(self.split(X=X, y=y, groups=groups))
      ...
      >>> import ubelt as ub
      >>> print(ub.repr2(skf_list, nl=1, with_dtype=False))
      [
          (np.array([2, 3, 4, 5, 6]), np.array([0, 1, 7, 8])),
          (np.array([0, 1, 2, 7, 8]), np.array([3, 4, 5, 6])),
          (np.array([0, 1, 3, 4, 5, 6, 7, 8]), np.array([2])),
      ]

   .. py:method:: _iter_test_masks(self, X, y=None, groups=None)

      Generates boolean masks corresponding to test sets.

      By default, delegates to ``_iter_test_indices(X, y, groups)``.

   .. py:method:: split(self, X, y, groups=None)

      Generate indices to split data into training and test set.
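
The grouping constraint can be checked by hand against the folds printed in the ``_make_test_folds`` example. The sketch below is illustrative only: it copies the ``groups``, ``y``, and fold indices from that example as plain lists, so it needs neither kwcoco nor sklearn. The helpers ``groups_are_disjoint`` and ``label_counts`` are hypothetical names introduced here and are not part of ``kwcoco.util.util_sklearn``.

```python
from collections import Counter

# Inputs and (train, test) index pairs copied from the documented example.
groups = [1, 1, 3, 4, 2, 2, 7, 8, 8]
y = [1, 1, 1, 1, 2, 2, 2, 3, 3]
skf_list = [
    ([2, 3, 4, 5, 6], [0, 1, 7, 8]),
    ([0, 1, 2, 7, 8], [3, 4, 5, 6]),
    ([0, 1, 3, 4, 5, 6, 7, 8], [2]),
]


def groups_are_disjoint(train_idx, test_idx, groups):
    """Hypothetical helper: True if no group id occurs in both train and test."""
    train_groups = {groups[i] for i in train_idx}
    test_groups = {groups[i] for i in test_idx}
    return train_groups.isdisjoint(test_groups)


def label_counts(indices, y):
    """Hypothetical helper: count how often each label occurs at the given indices."""
    return dict(Counter(y[i] for i in indices))


# One flag per fold: does the fold keep every group on a single side?
disjoint_flags = [groups_are_disjoint(tr, te, groups) for tr, te in skf_list]

# Label distribution of each test fold, to inspect how well it is stratified.
test_label_counts = [label_counts(te, y) for _, te in skf_list]

# Sorted union of the test indices, to see whether each sample is tested once.
covered = sorted(i for _, te in skf_list for i in te)

print(disjoint_flags)
print(test_label_counts)
print(covered)
```

With so few samples per class the test folds cannot all mirror the overall label distribution, so the per-fold label counts are uneven even though no group is split across train and test.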