kwcoco.category_tree module


The category_tree module defines the CategoryTree class, which maintains flat or hierarchical category information. The kwcoco version of this class contains only the data structure and no torch operations. See the ndsampler version for an extension with torch operations.

class kwcoco.category_tree.CategoryTree(graph=None, checks=True)[source]

Bases: NiceRepr

Wrapper that maintains flat or hierarchical category information.

Helps compute softmaxes and probabilities for tree-based categories where a directed edge (A, B) represents that A is a superclass of B.

Note

There are three basic properties that this object maintains:

node:
    Alphanumeric string names that should be generally descriptive.
    Using spaces and special characters in these names is
    discouraged, but can be done.  This is the COCO category "name"
    attribute.  For categories this may be denoted as (name, node,
    cname, catname).

id:
    The integer id of a category should ideally remain consistent.
    These are often given by a dataset (e.g. a COCO dataset).  This
    is the COCO category "id" attribute. For categories this is
    often denoted as (id, cid).

index:
    Contiguous zero-based indices that index the list of
    categories.  These should be used for the fastest access in
    backend computation tasks. Typically corresponds to the
    ordering of the channels in the final linear layer in an
    associated model.  For categories this is often denoted as
    (index, cidx, idx, or cx).
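
The relationship between the three properties can be sketched with plain dictionaries. This is a minimal illustration, not kwcoco's implementation: the category records are hypothetical, and the variable names only mirror the class attributes for readability.

```python
# Hypothetical COCO-style category records (ids need not be contiguous).
categories = [
    {'id': 130, 'name': 'dog'},
    {'id': 410, 'name': 'cat'},
    {'id': 220, 'name': 'zebra'},
]

# index -> node: the position in this list is the contiguous index.
idx_to_node = [cat['name'] for cat in categories]
# node -> index: inverse of the list above.
node_to_idx = {node: idx for idx, node in enumerate(idx_to_node)}
# id <-> node: driven by the dataset-provided ids.
id_to_node = {cat['id']: cat['name'] for cat in categories}
node_to_id = {node: cid for cid, node in id_to_node.items()}

# Composite lookups translate between ids and indexes via the node name.
id_to_idx = {cid: node_to_idx[node] for cid, node in id_to_node.items()}
idx_to_id = [node_to_id[node] for node in idx_to_node]
```

The name acts as the pivot: ids and indexes are never mapped to each other directly, only through the node they both describe.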
Variables:
  • idx_to_node (List[str]) – a list of class names. Implicitly maps from index to category name.

  • id_to_node (Dict[int, str]) – maps integer ids to category names

  • node_to_id (Dict[str, int]) – maps category names to ids

  • node_to_idx (Dict[str, int]) – maps category names to indexes

  • graph (networkx.Graph) – a Graph that stores any hierarchy information. For standard mutually exclusive classes, this graph is edgeless. Nodes in this graph can maintain category attributes / properties.

  • idx_groups (List[List[int]]) – groups of category indices that share the same parent category.
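
To illustrate why sibling groups matter for tree-based probabilities, here is a small stdlib-only sketch. It assumes a hypothetical hierarchy stored as a child-to-parent mapping and applies a softmax within each sibling group, then multiplies conditionals down the tree. The helper names are made up for this example; the torch-based version in ndsampler may work differently.

```python
import math

# Hypothetical hierarchy given as child -> parent (None marks a root).
parents = {
    'background': None,
    'foreground': None,
    'dog': 'foreground',
    'cat': 'foreground',
}
idx_to_node = list(parents)
node_to_idx = {node: idx for idx, node in enumerate(idx_to_node)}

# Group sibling indexes: categories sharing a parent compete with each other.
groups = {}
for node, parent in parents.items():
    groups.setdefault(parent, []).append(node_to_idx[node])
idx_groups = list(groups.values())


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def hierarchical_probs(logits):
    """Softmax within each sibling group, then multiply down the tree."""
    cond = [0.0] * len(logits)
    for group in idx_groups:
        for idx, p in zip(group, softmax([logits[i] for i in group])):
            cond[idx] = p  # P(node | parent)
    probs = []
    for node in idx_to_node:
        p, cur = 1.0, node
        while cur is not None:
            p *= cond[node_to_idx[cur]]
            cur = parents[cur]
        probs.append(p)  # P(node): product of conditionals up to the root
    return probs


probs = hierarchical_probs([0.0, 0.0, 1.0, -1.0])
```

With this construction the probabilities of 'dog' and 'cat' sum to the probability of their parent 'foreground'.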

Example

>>> from kwcoco.category_tree import *
>>> import networkx as nx
>>> graph = nx.from_dict_of_lists({
>>>     'background': [],
>>>     'foreground': ['animal'],
>>>     'animal': ['mammal', 'fish', 'insect', 'reptile'],
>>>     'mammal': ['dog', 'cat', 'human', 'zebra'],
>>>     'zebra': ['grevys', 'plains'],
>>>     'grevys': ['fred'],
>>>     'dog': ['boxer', 'beagle', 'golden'],
>>>     'cat': ['maine coon', 'persian', 'sphynx'],
>>>     'reptile': ['bearded dragon', 't-rex'],
>>> }, nx.DiGraph)
>>> self = CategoryTree(graph)
>>> print(self)
<CategoryTree(nNodes=22, maxDepth=6, maxBreadth=4...)>

Example

>>> # The coerce classmethod is the easiest way to create an instance
>>> import kwcoco
>>> kwcoco.CategoryTree.coerce(['a', 'b', 'c'])
<CategoryTree...nNodes=3, nodes=...'a', 'b', 'c'...
>>> kwcoco.CategoryTree.coerce(4)
<CategoryTree...nNodes=4, nodes=...'class_1', 'class_2', 'class_3', ...
Parameters:
  • graph (nx.DiGraph) – the graph representing a category hierarchy

  • checks (bool, default=True) – if false, bypass input checks

copy()[source]
classmethod from_mutex(nodes, bg_hack=True)[source]
Parameters:

nodes (List[str]) – a list of class names, all of which are assumed to be mutually exclusive

Example

>>> from kwcoco.category_tree import CategoryTree
>>> print(CategoryTree.from_mutex(['a', 'b', 'c']))
<CategoryTree(nNodes=3, ...)>
classmethod from_json(state)[source]
Parameters:

state (Dict) – see __getstate__ / __json__ for details

classmethod from_coco(categories)[source]

Create a CategoryTree object from coco categories

Parameters:

categories (List[Dict]) – list of coco-style categories

Example

>>> import kwcoco
>>> classes1 = kwcoco.CategoryTree.coerce([{'name': 'cat1'}, {'name': 'cat2', 'id': 1}])
>>> assert classes1.id_to_node == {2: 'cat1', 1: 'cat2'}
>>> classes2 = kwcoco.CategoryTree.coerce([{'name': 'cat4'}, {'name': 'cat5'}])
>>> assert classes2.id_to_node == {1: 'cat4', 2: 'cat5'}
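
The assertions above imply that a category without an 'id' receives one that does not collide with the ids already given. One rule consistent with both results is "take the smallest unused positive integer, in list order". The sketch below implements that rule as an assumption; the function name is hypothetical and kwcoco's actual assignment logic may differ.

```python
def fill_missing_ids(categories):
    """Return id -> name, giving unused positive ids to categories lacking one."""
    # Ids that were supplied explicitly must never be reassigned.
    used = {cat['id'] for cat in categories if 'id' in cat}
    next_id = 1
    id_to_node = {}
    for cat in categories:
        if 'id' in cat:
            cid = cat['id']
        else:
            # Skip forward to the smallest positive id that is still free.
            while next_id in used:
                next_id += 1
            cid = next_id
            used.add(cid)
        id_to_node[cid] = cat['name']
    return id_to_node


mapping1 = fill_missing_ids([{'name': 'cat1'}, {'name': 'cat2', 'id': 1}])
mapping2 = fill_missing_ids([{'name': 'cat4'}, {'name': 'cat5'}])
```
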
classmethod coerce(data, **kw)[source]

Attempt to coerce data as a CategoryTree object.

This is primarily useful for when the software stack depends on categories being represent

This works if the input data is a specially formatted JSON dict, a list of mutually exclusive classes, or an existing CategoryTree. Otherwise an error is raised.

Parameters:
  • data (object) – a known representation of a category tree.

  • **kw – input-type-specific arguments

Returns:

self

Return type:

CategoryTree

Raises:
  • TypeError - if the input format is unknown

  • ValueError - if kwargs are not compatible with the input format

Example

>>> import kwcoco
>>> classes1 = kwcoco.CategoryTree.coerce(3)  # integer
>>> classes2 = kwcoco.CategoryTree.coerce(classes1.__json__())  # graph dict
>>> classes3 = kwcoco.CategoryTree.coerce(['class_1', 'class_2', 'class_3'])  # mutex list
>>> classes4 = kwcoco.CategoryTree.coerce(classes1.graph)  # nx Graph
>>> classes5 = kwcoco.CategoryTree.coerce(classes1)  # cls
>>> classes_09 = kwcoco.CategoryTree.coerce([{'name': 'cat1'}])
>>> # xdoctest: +REQUIRES(module:ndsampler)
>>> import ndsampler
>>> classes6 = ndsampler.CategoryTree.coerce(3)
>>> classes7 = ndsampler.CategoryTree.coerce(classes1)
>>> classes8 = kwcoco.CategoryTree.coerce(classes6)
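
The idea of coercion can be sketched as a type dispatch. The function below is a simplified, hypothetical stand-in that normalizes a few of the input forms shown above into a flat list of names; it does not handle graphs or JSON state and is not kwcoco's implementation.

```python
def coerce_names(data):
    """Normalize several input forms into a flat list of category names."""
    if isinstance(data, int):
        # An integer n becomes n generic names: 'class_1', ..., 'class_n'.
        return ['class_{}'.format(i) for i in range(1, data + 1)]
    if isinstance(data, (list, tuple)):
        if all(isinstance(item, str) for item in data):
            # A list of names is already in the target form.
            return list(data)
        if all(isinstance(item, dict) and 'name' in item for item in data):
            # COCO-style category dicts: keep only the names.
            return [item['name'] for item in data]
    # Anything else is an unknown format.
    raise TypeError('unknown category format: {!r}'.format(type(data)))


names_from_int = coerce_names(3)
names_from_list = coerce_names(['a', 'b'])
names_from_coco = coerce_names([{'name': 'cat1'}])
```

Raising TypeError for unrecognized input matches the behavior documented in the Raises section.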
classmethod demo(key='coco', **kwargs)[source]
Parameters:

key (str) – specifies which demo dataset to use. ‘coco’ uses the default coco demo data. ‘btree’ creates a binary tree and accepts the kwargs ‘r’ and ‘h’ for branching factor and height. ‘btree2’ is the same as ‘btree’ but returns strings.
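
As a rough picture of what a branching factor and height describe, the following sketch builds a balanced tree as a child-to-parent mapping. It assumes that height counts edges from the root to a leaf; the function name is hypothetical and the node labels produced by the real demo may differ.

```python
def balanced_tree_parents(r=2, h=2):
    """Return child -> parent for a tree where each inner node has r children."""
    parents = {0: None}  # node 0 is the root
    frontier = [0]
    next_node = 1
    for _level in range(h):
        new_frontier = []
        for parent in frontier:
            for _child in range(r):
                parents[next_node] = parent
                new_frontier.append(next_node)
                next_node += 1
        frontier = new_frontier
    return parents


tree = balanced_tree_parents(r=2, h=2)
# A string-labelled variant, in the spirit of a demo that "returns strings".
str_tree = {str(k): (None if v is None else str(v)) for k, v in tree.items()}
```
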

CommandLine

xdoctest -m ~/code/kwcoco/kwcoco/category_tree.py CategoryTree.demo

Example

>>> from kwcoco.category_tree import *
>>> self = CategoryTree.demo()
>>> print('self = {}'.format(self))
self = <CategoryTree(nNodes=10, maxDepth=2, maxBreadth=4...)>
to_coco()[source]

Converts to a coco-style data structure

Yields:

Dict[str, Any] – coco category dictionaries
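
A generator of COCO-style category dictionaries could look roughly like the sketch below. It assumes the hierarchy is available as a child-to-parent mapping and records it through the 'supercategory' key; the function and argument names are hypothetical, and the real method may emit additional attributes.

```python
def to_coco_sketch(idx_to_node, node_to_id, parents):
    """Yield one COCO-style category dict per node, in index order."""
    for node in idx_to_node:
        cat = {'id': node_to_id[node], 'name': node}
        parent = parents.get(node)
        if parent is not None:
            # COCO records the hierarchy through the parent's name.
            cat['supercategory'] = parent
        yield cat


coco_cats = list(to_coco_sketch(
    idx_to_node=['animal', 'dog'],
    node_to_id={'animal': 1, 'dog': 2},
    parents={'animal': None, 'dog': 'animal'},
))
```
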

property id_to_idx

Example:

>>> import kwcoco
>>> self = kwcoco.CategoryTree.demo()
>>> self.id_to_idx[1]
property idx_to_id

Example:

>>> import kwcoco
>>> self = kwcoco.CategoryTree.demo()
>>> self.idx_to_id[0]
idx_to_ancestor_idxs

memoization decorator for a method that respects args and kwargs

References

Variables:

__func__ (Callable) – the wrapped function

Note

This is very thread-unsafe and has an issue pointed out in [ActiveState_Miller_2010]; a future version may fix this.

Example

>>> import ubelt as ub
>>> closure1 = closure = {'a': 'b', 'c': 'd', 'z': 'z1'}
>>> incr = [0]
>>> class Foo:
>>>     def __init__(self, instance_id):
>>>         self.instance_id = instance_id
>>>     @ub.memoize_method
>>>     def foo_memo(self, key):
>>>         "Wrapped foo_memo docstr"
>>>         value = closure[key]
>>>         incr[0] += 1
>>>         return value, self.instance_id
>>>     def foo(self, key):
>>>         value = closure[key]
>>>         incr[0] += 1
>>>         return value, self.instance_id
>>> self1 = Foo('F1')
>>> assert self1.foo('a') == ('b', 'F1')
>>> assert self1.foo('c') == ('d', 'F1')
>>> assert incr[0] == 2
>>> #
>>> print('Call memoized version')
>>> assert self1.foo_memo('a') == ('b', 'F1')
>>> assert self1.foo_memo('c') == ('d', 'F1')
>>> assert incr[0] == 4, 'should have called a function 4 times'
>>> #
>>> assert self1.foo_memo('a') == ('b', 'F1')
>>> assert self1.foo_memo('c') == ('d', 'F1')
>>> print('Counter should no longer increase')
>>> assert incr[0] == 4
>>> #
>>> print('Closure changes result without memoization')
>>> closure2 = closure = {'a': 0, 'c': 1, 'z': 'z2'}
>>> assert self1.foo('a') == (0, 'F1')
>>> assert self1.foo('c') == (1, 'F1')
>>> assert incr[0] == 6
>>> assert self1.foo_memo('a') == ('b', 'F1')
>>> assert self1.foo_memo('c') == ('d', 'F1')
>>> #
>>> print('Constructing a new object should get a new cache')
>>> self2 = Foo('F2')
>>> self2.foo_memo('a')
>>> assert incr[0] == 7
>>> self2.foo_memo('a')
>>> assert incr[0] == 7
>>> # Check that the decorator preserves the name and docstring
>>> assert self1.foo_memo.__doc__ == 'Wrapped foo_memo docstr'
>>> assert self1.foo_memo.__name__ == 'foo_memo'
>>> print(f'self1.foo_memo = {self1.foo_memo!r}, {hex(id(self1.foo_memo))}')
>>> print(f'self2.foo_memo = {self2.foo_memo!r}, {hex(id(self2.foo_memo))}')
>>> #
>>> # Test for the issue in the active state recipe
>>> method1 = self1.foo_memo
>>> method2 = self2.foo_memo
>>> assert method1('a') == ('b', 'F1')
>>> assert method2('a') == (0, 'F2')
>>> assert method1('z') == ('z2', 'F1')
>>> assert method2('z') == ('z2', 'F2')
idx_to_descendants_idxs

idx_pairwise_distance

is_mutex()[source]

Returns True if all categories are mutually exclusive (i.e. flat)

If true, then the classes may be represented as a simple list of class names without any loss of information, otherwise the underlying category graph is necessary to preserve all knowledge.
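
The "no loss of information" claim can be checked on a toy representation. The sketch below assumes a child-to-parent mapping and treats a hierarchy as flat when no node has a parent; the names are hypothetical and the real method operates on the underlying graph instead.

```python
def is_flat(parents):
    """True when no category has a parent, i.e. the hierarchy has no edges."""
    return all(parent is None for parent in parents.values())


flat = {'a': None, 'b': None, 'c': None}
tree = {'animal': None, 'dog': 'animal', 'cat': 'animal'}

# Rebuilding from the bare name list recovers a flat set exactly,
# but drops the parent links of a genuine hierarchy.
flat_roundtrip = {name: None for name in list(flat)}
tree_roundtrip = {name: None for name in list(tree)}
```
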

Todo

  • [ ] what happens when we have a dummy root?

property num_classes
property class_names
property category_names
property cats

Returns a mapping from category names to category attributes.

If this category tree was constructed from a coco-dataset, then this will contain the coco category attributes.

Returns:

Dict[str, Dict[str, object]]

Example

>>> from kwcoco.category_tree import *
>>> self = CategoryTree.demo()
>>> print('self.cats = {!r}'.format(self.cats))
index(node)[source]

Return the index that corresponds to the category name

Parameters:

node (str) – the name of the category

Returns:

int

take(indexes)[source]

Create a subgraph based on the selected class indexes

subgraph(subnodes, closure=True)[source]

Create a subgraph based on the selected class nodes (i.e. names)

Example

>>> import ubelt as ub
>>> from kwcoco.category_tree import CategoryTree
>>> self = CategoryTree.from_coco([
>>>     {'id': 130, 'name': 'n3', 'supercategory': 'n1'},
>>>     {'id': 410, 'name': 'n1', 'supercategory': None},
>>>     {'id': 640, 'name': 'n4', 'supercategory': 'n3'},
>>>     {'id': 220, 'name': 'n2', 'supercategory': 'n1'},
>>>     {'id': 560, 'name': 'n6', 'supercategory': 'n2'},
>>>     {'id': 350, 'name': 'n5', 'supercategory': 'n2'},
>>> ])
>>> self.print_graph()
>>> subnodes = ['n3', 'n6', 'n4', 'n1']
>>> new1 = self.subgraph(subnodes, closure=1)
>>> new1.print_graph()
...
>>> print('new1.idx_to_id = {}'.format(ub.urepr(new1.idx_to_id, nl=0)))
>>> print('new1.idx_to_node = {}'.format(ub.urepr(new1.idx_to_node, nl=0)))
new1.idx_to_id = [130, 560, 640, 410]
new1.idx_to_node = ['n3', 'n6', 'n4', 'n1']
>>> indexes = [2, 1, 0, 5]
>>> new2 = self.take(indexes)
>>> new2.print_graph()
...
>>> print('new2.idx_to_id = {}'.format(ub.urepr(new2.idx_to_id, nl=0)))
>>> print('new2.idx_to_node = {}'.format(ub.urepr(new2.idx_to_node, nl=0)))
new2.idx_to_id = [640, 410, 130, 350]
new2.idx_to_node = ['n4', 'n1', 'n3', 'n5']
>>> subnodes = ['n3', 'n6', 'n4', 'n1']
>>> new3 = self.subgraph(subnodes, closure=0)
>>> new3.print_graph()
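
One plausible reading of the closure flag, inferred from its name rather than from the source, is that a kept node whose parent was dropped gets reattached to its nearest kept ancestor. The stdlib-only sketch below applies that assumption to the same six categories, stored as a child-to-parent mapping; the function name is hypothetical.

```python
def subgraph_parents(parents, subnodes, closure=True):
    """Return child -> parent restricted to subnodes, in the order given.

    With closure, a kept node whose parent was dropped is attached to its
    nearest kept ancestor; without it, that node becomes a root.
    """
    keep = set(subnodes)
    new_parents = {}
    for node in subnodes:
        parent = parents[node]
        if closure:
            # Walk upward past dropped nodes until a kept ancestor is found.
            while parent is not None and parent not in keep:
                parent = parents[parent]
        elif parent not in keep:
            parent = None
        new_parents[node] = parent
    return new_parents


parents = {'n1': None, 'n2': 'n1', 'n3': 'n1',
           'n4': 'n3', 'n5': 'n2', 'n6': 'n2'}
subnodes = ['n3', 'n6', 'n4', 'n1']
with_closure = subgraph_parents(parents, subnodes, closure=True)
without_closure = subgraph_parents(parents, subnodes, closure=False)
```

Under this assumption 'n6' stays connected to 'n1' when closure is enabled even though its direct parent 'n2' was not selected, and becomes a root otherwise.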
_build_index()[source]

construct lookup tables

show()[source]
forest_str()[source]
print_graph()[source]
normalize()[source]

Applies a normalization scheme to the categories.

Note: this may break other tasks that depend on exact category names.

Returns:

CategoryTree

Example

>>> from kwcoco.category_tree import *  # NOQA
>>> import kwcoco
>>> import networkx as nx
>>> orig = kwcoco.CategoryTree.demo('animals_v1')
>>> self = kwcoco.CategoryTree(nx.relabel_nodes(orig.graph, str.upper))
>>> norm = self.normalize()