kwcoco._helpers module

These items were split out of coco_dataset.py, which is becoming too big.

These helper data structures handle tasks such as auto-incrementing ids, recycling ids, renaming, and extending sortedcontainers.

class kwcoco._helpers._NextId(parent)[source]

Bases: object

Helper class that tracks unused ids for new items

_update_unused(key)[source]

Scans for what the next safe id can be for key

get(key)[source]

Get the next safe item id for key
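
The scan-then-increment behaviour described above can be sketched as follows. This is a minimal illustration, not the kwcoco implementation: the class name NextIdSketch, the plain-dict dataset layout, and the choice to start at 1 for an empty table are all assumptions made for the example.

```python
class NextIdSketch:
    """Hypothetical stand-in for _NextId that works on a plain dict of tables."""

    def __init__(self, dataset):
        # Assumed layout: table name (e.g. 'images') -> list of dicts with an 'id'
        self.dataset = dataset
        # Next unused id per table; a missing entry means "not scanned yet"
        self.unused = {}

    def _update_unused(self, key):
        # Scan the table once to find an id larger than every existing id
        existing = [item['id'] for item in self.dataset.get(key, [])]
        self.unused[key] = max(existing, default=0) + 1

    def get(self, key):
        # Scan lazily on first use, then hand out consecutive ids
        if self.unused.get(key) is None:
            self._update_unused(key)
        new_id = self.unused[key]
        self.unused[key] += 1
        return new_id


dataset = {'images': [{'id': 1}, {'id': 7}], 'annotations': []}
next_id = NextIdSketch(dataset)
image_ids = [next_id.get('images'), next_id.get('images')]
annot_id = next_id.get('annotations')
print(image_ids, annot_id)
```

Scanning only on the first request keeps repeated calls cheap, at the cost of assuming nothing else adds items with larger ids in between.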

class kwcoco._helpers._ID_Remapper(reuse=False)[source]

Bases: object

Helper to recycle ids for unions.

For each dataset we create a mapping between each old id and a new id. If possible and reuse=True we allow the new id to match the old id. After each dataset is finished we mark all those ids as used and subsequent new-ids cannot be chosen from that pool.

Parameters:

reuse (bool) – if True we are allowed to reuse ids as long as they haven’t been used before.

Example

>>> video_trackids = [[1, 1, 3, 3, 200, 4], [204, 1, 2, 3, 3, 4, 5, 9]]
>>> self = _ID_Remapper(reuse=True)
>>> for tids in video_trackids:
>>>     new_tids = [self.remap(old_tid) for old_tid in tids]
>>>     self.block_seen()
>>>     print('new_tids = {!r}'.format(new_tids))
new_tids = [1, 1, 3, 3, 200, 4]
new_tids = [204, 205, 2, 206, 206, 207, 5, 9]
>>> #
>>> self = _ID_Remapper(reuse=False)
>>> for tids in video_trackids:
>>>     new_tids = [self.remap(old_tid) for old_tid in tids]
>>>     self.block_seen()
>>>     print('new_tids = {!r}'.format(new_tids))
new_tids = [0, 0, 1, 1, 2, 3]
new_tids = [4, 5, 6, 7, 7, 8, 9, 10]
remap(old_id)[source]

Convert an old id into a new id. If self.reuse is True then we will return the same id if it hasn’t been blocked yet.

block_seen()[source]

Mark all seen ids as unable to be used. Any ids sent to remap will now generate new ids.

next_id()[source]

Generate a new id that hasn’t been used yet
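
The interplay of remap, block_seen, and next_id can be sketched with a small stand-alone class. This is a hedged re-creation written from the description above, not the kwcoco source: the name IdRemapperSketch, its attribute names, and the restriction to integer ids are assumptions.

```python
class IdRemapperSketch:
    """Hypothetical re-creation of the documented _ID_Remapper behaviour."""

    def __init__(self, reuse=False):
        self.reuse = reuse
        self.blocklist = set()  # new ids issued for datasets that are finished
        self.mapping = {}       # old id -> new id for the current dataset
        self._issued = set()    # new ids issued for the current dataset
        self._nextid = 0        # counter used when an old id cannot be reused

    def next_id(self):
        new_id = self._nextid
        self._nextid += 1
        return new_id

    def remap(self, old_id):
        if old_id in self.mapping:
            # Repeated old ids within one dataset map to the same new id
            return self.mapping[old_id]
        free = old_id not in self.blocklist and old_id not in self._issued
        if self.reuse and free:
            new_id = old_id
            # Keep the counter above every reused id (assumes integer ids)
            self._nextid = max(self._nextid, old_id + 1)
        else:
            new_id = self.next_id()
        self.mapping[old_id] = new_id
        self._issued.add(new_id)
        return new_id

    def block_seen(self):
        # Ids issued for the finished dataset may no longer be chosen
        self.blocklist.update(self._issued)
        self.mapping = {}
        self._issued = set()


def remap_all(groups, reuse):
    remapper = IdRemapperSketch(reuse=reuse)
    results = []
    for old_ids in groups:
        results.append([remapper.remap(old_id) for old_id in old_ids])
        remapper.block_seen()
    return results


video_trackids = [[1, 1, 3, 3, 200, 4], [204, 1, 2, 3, 3, 4, 5, 9]]
reused = remap_all(video_trackids, reuse=True)
fresh = remap_all(video_trackids, reuse=False)
print(reused)
print(fresh)
```

Under these assumptions the sketch reproduces the new_tids values shown in the Example for this class, for both reuse=True and reuse=False.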

class kwcoco._helpers.UniqueNameRemapper(policy='warn', name_type='unspecified')[source]

Bases: object

Helper to ensure names will be unique by appending suffixes.

Depending on policy, the user is notified when a rename occurs; the default policy emits a warning.

Example

>>> from kwcoco._helpers import UniqueNameRemapper
>>> self = UniqueNameRemapper(policy='ignore')
>>> assert self.remap('foo') == 'foo'
>>> assert self.remap('foo') == 'foo_v001'
>>> assert self.remap('foo') == 'foo_v002'
>>> assert self.remap('foo_v001') == 'foo_v003'
>>> assert 'foo' in self

Example

>>> from kwcoco._helpers import UniqueNameRemapper
>>> import pytest
>>> # Test error policy
>>> self = UniqueNameRemapper(policy='error')
>>> assert self.remap('foo') == 'foo'
>>> with pytest.raises(Exception) as ex:
>>>     self.remap('foo')
>>> print(f'ex={ex}')
Parameters:
  • policy (str) – If “ignore”, does not notify the user of a rename. If “warn”, emits a warning when a rename occurs. If “error”, raises an exception if a rename occurs.

  • name_type (str) – A hint to the user about what type of name this was when an error or warning message is emitted.

remap(name)[source]
Parameters:

name (str) – name to check / rename

Returns:

a name guaranteed to be unique

Return type:

str
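
The suffixing scheme implied by the examples for this class can be sketched as a single function. This is an illustration under stated assumptions rather than the library code: the function name make_unique, the regular expression, and the use of ValueError for the “error” policy are all hypothetical.

```python
import re
import warnings

# Assumed suffix format, inferred from names such as 'foo_v001'
_SUFFIX_PAT = re.compile(r'(.*)_v(\d+)$')


def make_unique(name, seen, policy='warn'):
    """Return a name that is not in ``seen`` and record it there."""
    match = _SUFFIX_PAT.match(name)
    if match:
        # Continue counting from an existing version suffix
        prefix, num = match.group(1), int(match.group(2))
    else:
        prefix, num = name, 0
    if name in seen:
        if policy == 'error':
            # The exception type is an assumption of this sketch
            raise ValueError('duplicate name: {!r}'.format(name))
        elif policy == 'warn':
            warnings.warn('duplicate name {!r} will be renamed'.format(name))
        # Bump the version number until the candidate is unused
        while name in seen:
            num += 1
            name = '{}_v{:03d}'.format(prefix, num)
    seen.add(name)
    return name


seen = set()
names = [make_unique(n, seen, policy='ignore')
         for n in ['foo', 'foo', 'foo', 'foo_v001']]
print(names)
```

With policy='ignore' the sketch yields the same sequence of names as the first Example for this class.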

class kwcoco._helpers._CategoryID_Remapper[source]

Bases: object

Helper for a category union that re-uses ids whenever possible.

Given an old category dictionary, calling remap() will return a new category dictionary with updated properties if necessary.

Example

>>> from kwcoco._helpers import _CategoryID_Remapper
>>> self = _CategoryID_Remapper()
>>> self.remap({'name': 'cat5', 'id': 5})
>>> self.remap({'name': 'cat6', 'id': 9})
>>> self.remap({'name': 'cat9', 'id': 5})
>>> self.remap({'name': 'cat5', 'id': 9, 'special_property': 5})
>>> assert self._id_to_cat == {
>>>     5: {'name': 'cat5', 'id': 5, 'special_property': 5},
>>>     9: {'name': 'cat6', 'id': 9},
>>>     10: {'name': 'cat9', 'id': 10}}
remap(old_cat)[source]
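
One way to obtain the result shown in the Example above is sketched below. The class CategoryRemapperSketch is hypothetical; in particular, merging extra properties with setdefault and choosing one past the largest known id on a conflict are assumptions that happen to match the example, not confirmed details of the kwcoco implementation.

```python
class CategoryRemapperSketch:
    """Hypothetical category union that keeps ids when they do not conflict."""

    def __init__(self):
        self._name_to_cat = {}
        self._id_to_cat = {}

    def remap(self, old_cat):
        name = old_cat['name']
        if name in self._name_to_cat:
            # Known name: keep the id already assigned and absorb new properties
            new_cat = self._name_to_cat[name]
            for key, value in old_cat.items():
                if key != 'id':
                    new_cat.setdefault(key, value)
        else:
            # New name: copy the input so the caller's dict is not modified
            new_cat = dict(old_cat)
            if new_cat['id'] in self._id_to_cat:
                # The id belongs to another category, so pick an unused one
                new_cat['id'] = max(self._id_to_cat) + 1
            self._name_to_cat[name] = new_cat
            self._id_to_cat[new_cat['id']] = new_cat
        return new_cat


remapper = CategoryRemapperSketch()
for cat in [{'name': 'cat5', 'id': 5},
            {'name': 'cat6', 'id': 9},
            {'name': 'cat9', 'id': 5},
            {'name': 'cat5', 'id': 9, 'special_property': 5}]:
    remapper.remap(cat)
print(remapper._id_to_cat)
```
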
kwcoco._helpers._lut_image_frame_index(imgs, gid)[source]
kwcoco._helpers._lut_frame_index(imgs, gid)
kwcoco._helpers._lut_annot_frame_index(imgs, anns, aid)[source]
class kwcoco._helpers.SortedSet(iterable=None, key=None)[source]

Bases: SortedSet

Initialize sorted set instance.

Optional iterable argument provides an initial iterable of values to initialize the sorted set.

Optional key argument defines a callable that, like the key argument to Python’s sorted function, extracts a comparison key from each value. The default, none, compares values directly.

Runtime complexity: O(n*log(n))

>>> ss = SortedSet([3, 1, 2, 5, 4])
>>> ss
SortedSet([1, 2, 3, 4, 5])
>>> from operator import neg
>>> ss = SortedSet([3, 1, 2, 5, 4], neg)
>>> ss
SortedSet([5, 4, 3, 2, 1], key=<built-in function neg>)
Parameters:
  • iterable – initial values (optional)

  • key – function used to extract comparison key (optional)

_abc_impl = <_abc._abc_data object>
kwcoco._helpers.SortedSetQuiet

alias of SortedSet

kwcoco._helpers._delitems(items, remove_idxs, thresh=750)[source]
Parameters:
  • items (List) – list which will be modified

  • remove_idxs (List[int]) – indices of the items to remove (MUST BE UNIQUE)
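
A minimal sketch of in-place multi-index deletion follows. The role of thresh is not documented here, so treating it as the number of removals above which the list is rebuilt instead of deleted item by item is an assumption of this example, as is the name delitems_sketch.

```python
def delitems_sketch(items, remove_idxs, thresh=750):
    """Remove the given unique indices from ``items`` in place."""
    if len(remove_idxs) > thresh:
        # Assumed strategy for many removals: rebuild the list contents once
        remove = set(remove_idxs)
        items[:] = [item for idx, item in enumerate(items)
                    if idx not in remove]
    else:
        # Delete from the back so earlier indices stay valid
        for idx in sorted(remove_idxs, reverse=True):
            del items[idx]


few = list('abcdefg')
delitems_sketch(few, [5, 0, 2])

many = list('abcdefg')
delitems_sketch(many, [5, 0, 2], thresh=0)  # force the rebuild branch
print(few, many)
```

Both branches modify the original list object, which matters when other code holds a reference to it.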

kwcoco._helpers._load_and_postprocess(data, loader, postprocess, **loadkw)[source]
kwcoco._helpers._image_corruption_check(fpath, only_shape=False, imread_kwargs=None)[source]

Helper that checks if an image is readable or not

kwcoco._helpers._query_image_ids(coco_dset, select_images=None, select_videos=None, valid_image_ids=None)[source]

Filters to a specific set of images given query parameters based on json-query (jq).

Parameters:
  • select_images (str | List[int] | None) – Can be a coercible YAML list of image ids, or…

    A json query (via the jq spec) that specifies which images belong in the subset. Note, this is passed as the body of the following jq query format string to filter valid ids ‘.images[] | select({select_images}) | .id’.

    Examples for this argument are as follows:

      • ‘.id < 3’ will select all image ids less than 3.

      • ‘.file_name | test(“.*png”)’ will select only images with file names that end with png.

      • ‘.file_name | test(“.*png”) | not’ will select only images with file names that do not end with png.

      • ‘.myattr == “foo”’ will select only image dictionaries where the value of myattr is “foo”.

      • ‘.id < 3 and (.file_name | test(“.*png”))’ will select only images with id less than 3 that are also pngs.

      • ‘.myattr | in({“val1”: 1, “val4”: 1})’ will take images where myattr is either val1 or val4. An alternative syntax is: ‘[.myattr] | inside([“val1”, “val4”])’

    Requires that the “jq” Python library is installed.

  • select_videos (str | List[int] | None) – Can be a coercible YAML list of video ids, or…

    A json query (via the jq spec) that specifies which videos belong in the subset. Note, this is passed as the body of the following jq query format string to filter valid ids ‘.videos[] | select({select_videos}) | .id’.

    Examples for this argument are as follows: ‘.file_name | startswith(“foo”)’ will select only videos where the file name starts with foo, and ‘.file_name | contains(“foo”)’ will select videos where any part of the file name contains foo.

    Only applicable for datasets that contain videos.

    Requires that the “jq” Python library is installed.

  • valid_image_ids (Set[int] | None) – if specified use this initial set of image ids, otherwise select from all available.

Returns:

sorted list of filtered image ids

Return type:

List[int]

SeeAlso:

Based on ~/code/geowatch/geowatch/utils/kwcoco_extensions.py::filter_image_ids

Example

>>> # xdoctest: +REQUIRES(module:jq)
>>> from kwcoco._helpers import _query_image_ids
>>> import kwcoco
>>> coco_dset = kwcoco.CocoDataset.demo('vidshapes8', verbose=0)
>>> _query_image_ids(coco_dset, select_images='.id < 3')
>>> _query_image_ids(coco_dset, select_images='.file_name | test(".*.png")')
>>> _query_image_ids(coco_dset, select_images='.file_name | test(".*.png") | not')
>>> _query_image_ids(coco_dset, select_images='.id < 3 and (.file_name | test(".*.png"))')
>>> _query_image_ids(coco_dset, select_images='.id < 3 or (.file_name | test(".*.png"))')

Example

>>> # xdoctest: +REQUIRES(module:kwutil)
>>> from kwcoco._helpers import _query_image_ids
>>> import kwcoco
>>> coco_dset = kwcoco.CocoDataset.demo('vidshapes8', verbose=0)
>>> assert _query_image_ids(coco_dset, select_images='[2, 3, 4]') == [2, 3, 4]
>>> assert _query_image_ids(coco_dset, select_videos='[3]') == [5, 6]
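
To make the query format string described under Parameters concrete, the sketch below shows how a user-supplied selection is wrapped into a full jq program. The helper build_id_query is hypothetical and only builds the text; running the program would additionally require the jq library, which is not used here.

```python
def build_id_query(table, select):
    """Wrap a jq selection body into a program that emits matching ids.

    ``table`` is the top-level dataset key ('images' or 'videos') and
    ``select`` is the body placed inside jq's select(...).
    """
    return '.{table}[] | select({select}) | .id'.format(
        table=table, select=select)


image_query = build_id_query('images', '.id < 3')
video_query = build_id_query('videos', '.file_name | startswith("foo")')
print(image_query)
print(video_query)
```

Because the selection is pasted verbatim into select(...), any valid jq boolean expression over a single image or video dictionary can be used.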