Welcome to kwcoco’s documentation!
If you are new, please see our getting started document: getting_started
Please also see information in the repo README, which contains similar but complementary information.
For notes about warping and spaces see warping_and_spaces.
The Kitware COCO module defines a variant of the Microsoft COCO format, originally developed for the “Common Objects in Context” object detection challenge. We are backwards compatible with the original module, but we also have improved implementations in several places, including segmentations, keypoints, annotation tracks, multi-spectral images, and videos (which represent generic sequences of images).
A kwcoco file is a “manifest” that serves as a single reference that points to all images, categories, and annotations in a computer vision dataset. Thus, when applying an algorithm to a dataset, it is sufficient to have the algorithm take one dataset parameter: the path to the kwcoco file. Generally a kwcoco file will live in a “bundle” directory along with the data that it references, and paths in the kwcoco file will be relative to the location of the kwcoco file itself.
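As a rough illustration of the bundle layout described above, the following sketch uses only the Python standard library to write a minimal COCO-style manifest and resolve an image path relative to the manifest's own directory. The directory name `demo_bundle`, the file name `data.kwcoco.json`, and the helper `resolve_image_path` are hypothetical choices made for this example; they are not part of the kwcoco API.

```python
import json
import tempfile
from pathlib import Path


def resolve_image_path(manifest_fpath, file_name):
    """Hypothetical helper: interpret file_name relative to the manifest's directory."""
    return Path(manifest_fpath).parent / file_name


# A minimal COCO-style manifest (illustrative content only)
manifest = {
    'categories': [{'id': 1, 'name': 'cat'}],
    'images': [{'id': 1, 'file_name': 'images/img1.png'}],
    'annotations': [
        {'id': 1, 'image_id': 1, 'category_id': 1, 'bbox': [0, 0, 10, 10]},
    ],
}

with tempfile.TemporaryDirectory() as tmp:
    # The "bundle" is a directory holding the manifest and the data it references
    bundle_dpath = Path(tmp) / 'demo_bundle'
    bundle_dpath.mkdir()
    manifest_fpath = bundle_dpath / 'data.kwcoco.json'
    manifest_fpath.write_text(json.dumps(manifest))

    # An algorithm only needs the manifest path to locate everything else
    loaded = json.loads(manifest_fpath.read_text())
    img_fpath = resolve_image_path(manifest_fpath, loaded['images'][0]['file_name'])
    relative = img_fpath.relative_to(bundle_dpath).as_posix()
    print(relative)
```

Because the paths are stored relative to the manifest, the whole bundle directory can be moved or copied without rewriting the file.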
The main data structure in this module is largely based on the implementation in https://github.com/cocodataset/cocoapi. It uses the same efficient core indexing data structures, but in our implementation the indexing can optionally be turned off, and functions are silent by default (with the exception of long-running processes, which show progress by default; this can optionally be disabled). We also provide helper functions that add and remove images, categories, and annotations.
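To make the idea of an id-based lookup index concrete, here is a minimal plain-Python sketch that builds tables named after the `imgs`, `cats`, `anns`, `gid_to_aids`, and `cid_to_aids` attributes mentioned in this documentation. The function `build_index` is hypothetical and is not how kwcoco implements its index; it only shows the kind of structure involved.

```python
from collections import defaultdict


def build_index(dataset):
    """Build id-based lookup tables from a COCO-style dictionary (illustrative sketch)."""
    index = {
        'imgs': {img['id']: img for img in dataset.get('images', [])},
        'cats': {cat['id']: cat for cat in dataset.get('categories', [])},
        'anns': {ann['id']: ann for ann in dataset.get('annotations', [])},
        'gid_to_aids': defaultdict(list),  # image id -> annotation ids
        'cid_to_aids': defaultdict(list),  # category id -> annotation ids
    }
    for ann in dataset.get('annotations', []):
        index['gid_to_aids'][ann['image_id']].append(ann['id'])
        index['cid_to_aids'][ann['category_id']].append(ann['id'])
    return index


dataset = {
    'categories': [{'id': 1, 'name': 'cat'}, {'id': 2, 'name': 'dog'}],
    'images': [{'id': 1, 'file_name': 'a.png'}, {'id': 2, 'file_name': 'b.png'}],
    'annotations': [
        {'id': 1, 'image_id': 1, 'category_id': 1},
        {'id': 2, 'image_id': 1, 'category_id': 2},
        {'id': 3, 'image_id': 2, 'category_id': 2},
    ],
}

index = build_index(dataset)
print(index['gid_to_aids'][1])
print(index['cid_to_aids'][2])
```

Building the tables costs one pass over the data, after which each lookup is a dictionary access. That trade-off is why skipping the index can be useful when a dataset is only read or written once.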
The kwcoco.CocoDataset class is capable of dynamic addition and removal
of categories, images, and annotations. It has better support for keypoints and
segmentation formats than the original COCO format. Despite being written in
Python, this data structure is reasonably efficient.
>>> import kwcoco
>>> import json
>>> # Create demo data
>>> demo = kwcoco.CocoDataset.demo()
>>> # could also use demo.dump / demo.dumps, but this is more explicit
>>> text = json.dumps(demo.dataset)
>>> with open('demo.json', 'w') as file:
...     file.write(text)
>>> # Read from disk
>>> self = kwcoco.CocoDataset('demo.json')
>>> # Add data
>>> cid = self.add_category('Cat')
>>> gid = self.add_image('new-img.jpg')
>>> aid = self.add_annotation(image_id=gid, category_id=cid, bbox=[0, 0, 100, 100])
>>> # Remove data
>>> self.remove_annotations([aid])
>>> self.remove_images([gid])
>>> self.remove_categories([cid])
>>> # Look at data
>>> import ubelt as ub
>>> print(ub.repr2(self.basic_stats(), nl=1))
>>> print(ub.repr2(self.extended_stats(), nl=2))
>>> print(ub.repr2(self.boxsize_stats(), nl=3))
>>> print(ub.repr2(self.category_annotation_frequency()))
>>> # Inspect data
>>> import kwplot
>>> kwplot.autompl()
>>> self.show_image(gid=1)
>>> # Access single-item data via imgs, cats, anns
>>> cid = 1
>>> self.cats[cid]
{'id': 1, 'name': 'astronaut', 'supercategory': 'human'}
>>> gid = 1
>>> self.imgs[gid]
{'id': 1, 'file_name': 'astro.png', 'url': 'https://i.imgur.com/KXhKM72.png'}
>>> aid = 3
>>> self.anns[aid]
{'id': 3, 'image_id': 1, 'category_id': 3, 'line': [326, 369, 500, 500]}
>>> # Access multi-item data via the annots and images helper objects
>>> aids = self.index.gid_to_aids[2]
>>> annots = self.annots(aids)
>>> print('annots = {}'.format(ub.repr2(annots, nl=1, sv=1)))
annots = <Annots(num=2)>
>>> annots.lookup('category_id')
[6, 4]
>>> annots.lookup('bbox')
[[37, 6, 230, 240], [124, 96, 45, 18]]
>>> # built in conversions to efficient kwimage array DataStructures
>>> print(ub.repr2(annots.detections.data))
{
    'boxes': <Boxes(xywh,
                 array([[ 37.,   6., 230., 240.],
                        [124.,  96.,  45.,  18.]], dtype=float32))>,
    'class_idxs': np.array([5, 3], dtype=np.int64),
    'keypoints': <PointsList(n=2) at 0x7f07eda33220>,
    'segmentations': <PolygonList(n=2) at 0x7f086365aa60>,
}
>>> gids = list(self.imgs.keys())
>>> images = self.images(gids)
>>> print('images = {}'.format(ub.repr2(images, nl=1, sv=1)))
images = <Images(num=3)>
>>> images.lookup('file_name')
['astro.png', 'carl.png', 'stars.png']
>>> print('images.annots = {}'.format(images.annots))
images.annots = <AnnotGroups(n=3, m=3.7, s=3.9)>
>>> print('images.annots.cids = {!r}'.format(images.annots.cids))
images.annots.cids = [[1, 2, 3, 4, 5, 5, 5, 5, 5], [6, 4], []]
CocoDataset API
The following is a logical grouping of the public kwcoco.CocoDataset API attributes and methods. See the in-code documentation for further details.
CocoDataset classmethods (via MixinCocoExtras)
kwcoco.CocoDataset.coerce
- Attempt to transform the input into the intended CocoDataset.
kwcoco.CocoDataset.demo
- Create a toy coco dataset for testing and demo purposes
kwcoco.CocoDataset.random
- Creates a random CocoDataset according to distribution parameters
CocoDataset classmethods (via CocoDataset)
kwcoco.CocoDataset.from_coco_paths
- Constructor from multiple coco file paths.
kwcoco.CocoDataset.from_data
- Constructor from a json dictionary
kwcoco.CocoDataset.from_image_paths
- Constructor from a list of image paths.
CocoDataset slots
kwcoco.CocoDataset.index
- An efficient lookup index into the coco data structure. The index defines its own attributes like anns, cats, imgs, gid_to_aids, file_name_to_img, etc. See CocoIndex for more details on which attributes are available.
kwcoco.CocoDataset.hashid
- If computed, this will be a hash uniquely identifying the dataset. To ensure this is computed see kwcoco.coco_dataset.MixinCocoExtras._build_hashid().
kwcoco.CocoDataset.hashid_parts
-
kwcoco.CocoDataset.tag
- A tag indicating the name of the dataset.
kwcoco.CocoDataset.dataset
- Raw JSON data structure. This is the base dictionary that contains {'annotations': List, 'images': List, 'categories': List}
kwcoco.CocoDataset.bundle_dpath
- If known, this is the root path that all image file names are relative to. This can also be manually overwritten by the user.
kwcoco.CocoDataset.assets_dpath
-
kwcoco.CocoDataset.cache_dpath
-
CocoDataset properties
kwcoco.CocoDataset.anns
-
kwcoco.CocoDataset.cats
-
kwcoco.CocoDataset.cid_to_aids
-
kwcoco.CocoDataset.data_fpath
-
kwcoco.CocoDataset.data_root
-
kwcoco.CocoDataset.fpath
- If known, this stores the file path the dataset was loaded from
kwcoco.CocoDataset.gid_to_aids
-
kwcoco.CocoDataset.img_root
-
kwcoco.CocoDataset.imgs
-
kwcoco.CocoDataset.n_annots
-
kwcoco.CocoDataset.n_cats
-
kwcoco.CocoDataset.n_images
-
kwcoco.CocoDataset.n_videos
-
kwcoco.CocoDataset.name_to_cat
-
CocoDataset methods (via MixinCocoAddRemove)
kwcoco.CocoDataset.add_annotation
- Add an annotation to the dataset (dynamically updates the index)
kwcoco.CocoDataset.add_annotations
- Faster less-safe multi-item alternative to add_annotation.
kwcoco.CocoDataset.add_category
- Adds a category
kwcoco.CocoDataset.add_image
- Add an image to the dataset (dynamically updates the index)
kwcoco.CocoDataset.add_images
- Faster less-safe multi-item alternative to add_image.
kwcoco.CocoDataset.add_video
- Add a video to the dataset (dynamically updates the index)
kwcoco.CocoDataset.clear_annotations
- Removes all annotations (but not images and categories)
kwcoco.CocoDataset.clear_images
- Removes all images and annotations (but not categories)
kwcoco.CocoDataset.ensure_category
- Like add_category(), but returns the existing category id if it already exists instead of failing. In this case all metadata is ignored.
kwcoco.CocoDataset.ensure_image
- Like add_image(), but returns the existing image id if it already exists instead of failing. In this case all metadata is ignored.
kwcoco.CocoDataset.remove_annotation
- Remove a single annotation from the dataset
kwcoco.CocoDataset.remove_annotation_keypoints
- Removes all keypoints with a particular category
kwcoco.CocoDataset.remove_annotations
- Remove multiple annotations from the dataset.
kwcoco.CocoDataset.remove_categories
- Remove categories and all annotations in those categories. Currently does not change any hierarchy information
kwcoco.CocoDataset.remove_images
- Remove images and any annotations contained by them
kwcoco.CocoDataset.remove_keypoint_categories
- Removes all keypoints of a particular category as well as all annotation keypoints with those ids.
kwcoco.CocoDataset.remove_videos
- Remove videos and any images / annotations contained by them
kwcoco.CocoDataset.set_annotation_category
- Sets the category of a single annotation
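The difference between the "add" and "ensure" variants listed above can be sketched in plain Python. The class `TinyCategoryTable` below is a hypothetical stand-in and not part of kwcoco. The use of `ValueError` for the duplicate case is an assumption made for this example; the source only states that adding an existing item fails.

```python
class TinyCategoryTable:
    """Hypothetical stand-in for a category table; not part of kwcoco."""

    def __init__(self):
        self.cats = {}          # category id -> category dict
        self.name_to_cat = {}   # category name -> category dict

    def add_category(self, name, **metadata):
        if name in self.name_to_cat:
            # The exception type is an assumption made for this sketch
            raise ValueError('category {!r} already exists'.format(name))
        cid = max(self.cats, default=0) + 1
        cat = {'id': cid, 'name': name, **metadata}
        # Both lookup tables are updated together so they stay consistent
        self.cats[cid] = cat
        self.name_to_cat[name] = cat
        return cid

    def ensure_category(self, name, **metadata):
        if name in self.name_to_cat:
            # The existing entry wins and the new metadata is ignored
            return self.name_to_cat[name]['id']
        return self.add_category(name, **metadata)


table = TinyCategoryTable()
first = table.ensure_category('cat', supercategory='animal')
second = table.ensure_category('cat', supercategory='ignored')
print(first, second, table.cats[first])
```

The "ensure" form is convenient in scripts that may run more than once, since repeated calls return the same id instead of raising.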
CocoDataset methods (via MixinCocoObjects)
kwcoco.CocoDataset.annots
- Return vectorized annotation objects
kwcoco.CocoDataset.categories
- Return vectorized category objects
kwcoco.CocoDataset.images
- Return vectorized image objects
kwcoco.CocoDataset.videos
- Return vectorized video objects
CocoDataset methods (via MixinCocoStats)
kwcoco.CocoDataset.basic_stats
- Reports number of images, annotations, and categories.
kwcoco.CocoDataset.boxsize_stats
- Compute statistics about bounding box sizes.
kwcoco.CocoDataset.category_annotation_frequency
- Reports the number of annotations of each category
kwcoco.CocoDataset.category_annotation_type_frequency
- Reports the number of annotations of each type for each category
kwcoco.CocoDataset.conform
- Make the COCO file conform to a stricter spec, inferring attributes where possible.
kwcoco.CocoDataset.extended_stats
- Reports number of images, annotations, and categories.
kwcoco.CocoDataset.find_representative_images
- Find images that have a wide array of categories. Attempt to find the fewest images that cover all categories using images that contain both a large and small number of annotations.
kwcoco.CocoDataset.stats
- This function corresponds to kwcoco.cli.coco_stats.
kwcoco.CocoDataset.validate
- Performs checks on this coco dataset.
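As an illustration of the kind of summary a per-category annotation count provides, the sketch below computes one from a raw COCO-style dictionary using the standard library. The function `category_frequency` is hypothetical, and the real method's output format may differ.

```python
from collections import Counter


def category_frequency(dataset):
    """Count annotations per category name in a COCO-style dict (illustrative sketch)."""
    id_to_name = {cat['id']: cat['name'] for cat in dataset['categories']}
    counts = Counter(id_to_name[ann['category_id']] for ann in dataset['annotations'])
    # Report categories without annotations as zero rather than omitting them
    return {name: counts.get(name, 0) for name in id_to_name.values()}


dataset = {
    'categories': [
        {'id': 1, 'name': 'cat'},
        {'id': 2, 'name': 'dog'},
        {'id': 3, 'name': 'bird'},
    ],
    'annotations': [
        {'id': 1, 'image_id': 1, 'category_id': 1},
        {'id': 2, 'image_id': 1, 'category_id': 2},
        {'id': 3, 'image_id': 2, 'category_id': 2},
    ],
}

freq = category_frequency(dataset)
print(freq)
```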
CocoDataset methods (via MixinCocoAccessors)
kwcoco.CocoDataset.category_graph
- Construct a networkx category hierarchy
kwcoco.CocoDataset.delayed_load
- Experimental method
kwcoco.CocoDataset.get_auxiliary_fpath
- Returns the full path to auxiliary data for an image
kwcoco.CocoDataset.get_image_fpath
- Returns the full path to the image
kwcoco.CocoDataset.keypoint_categories
- Construct a consistent CategoryTree representation of keypoint classes
kwcoco.CocoDataset.load_annot_sample
- Reads the chip of an annotation. Note this is much less efficient than using a sampler, but it doesn’t require disk cache.
kwcoco.CocoDataset.load_image
- Reads an image from disk and
kwcoco.CocoDataset.object_categories
- Construct a consistent CategoryTree representation of object classes
CocoDataset methods (via CocoDataset)
kwcoco.CocoDataset.copy
- Deep copies this object
kwcoco.CocoDataset.dump
- Writes the dataset out to the json format
kwcoco.CocoDataset.dumps
- Writes the dataset out to the json format
kwcoco.CocoDataset.subset
- Return a subset of the larger coco dataset by specifying which images to port. All annotations in those images will be taken.
kwcoco.CocoDataset.union
- Merges multiple CocoDataset items into one. Names and associations are retained, but ids may be different.
kwcoco.CocoDataset.view_sql
- Create a cached SQL interface to this dataset suitable for large scale multiprocessing use cases.
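The note that a union retains names and associations while ids may change can be illustrated with a plain-Python sketch. The function `union_datasets` is hypothetical and makes two simplifying assumptions that are not taken from kwcoco: categories are unified by name, and every image and annotation receives a fresh sequential id.

```python
def union_datasets(*datasets):
    """Merge COCO-style dicts, reassigning ids (illustrative sketch only)."""
    merged = {'categories': [], 'images': [], 'annotations': []}
    name_to_cid = {}
    for dset in datasets:
        # Categories with the same name map to one merged category
        cid_map = {}
        for cat in dset['categories']:
            if cat['name'] not in name_to_cid:
                new_cid = len(merged['categories']) + 1
                name_to_cid[cat['name']] = new_cid
                merged['categories'].append({**cat, 'id': new_cid})
            cid_map[cat['id']] = name_to_cid[cat['name']]
        # Every image gets a fresh id; remember the old -> new mapping
        gid_map = {}
        for img in dset['images']:
            new_gid = len(merged['images']) + 1
            gid_map[img['id']] = new_gid
            merged['images'].append({**img, 'id': new_gid})
        # Annotations keep their associations through the id mappings
        for ann in dset['annotations']:
            new_aid = len(merged['annotations']) + 1
            merged['annotations'].append({
                **ann,
                'id': new_aid,
                'image_id': gid_map[ann['image_id']],
                'category_id': cid_map[ann['category_id']],
            })
    return merged


dset_a = {
    'categories': [{'id': 1, 'name': 'cat'}],
    'images': [{'id': 1, 'file_name': 'a.png'}],
    'annotations': [{'id': 1, 'image_id': 1, 'category_id': 1}],
}
dset_b = {
    'categories': [{'id': 1, 'name': 'dog'}, {'id': 2, 'name': 'cat'}],
    'images': [{'id': 1, 'file_name': 'b.png'}],
    'annotations': [{'id': 1, 'image_id': 1, 'category_id': 2}],
}

merged = union_datasets(dset_a, dset_b)
print([cat['name'] for cat in merged['categories']])
print(merged['annotations'][1])
```

In the second input the "cat" category has id 2, but after merging its annotation points at the shared "cat" category with id 1 and at the renumbered image, so the association survives even though the ids changed.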
CocoDataset methods (via MixinCocoExtras)
kwcoco.CocoDataset.corrupted_images
- Check for images that don’t exist or can’t be opened
kwcoco.CocoDataset.missing_images
- Check for images that don’t exist
kwcoco.CocoDataset.rename_categories
- Rename categories with a potentially coarser categorization.
kwcoco.CocoDataset.reroot
- Rebase image/data paths onto a new image/data root.
CocoDataset methods (via MixinCocoDraw)
kwcoco.CocoDataset.draw_image
- Use kwimage to draw all annotations on an image and return the pixels as a numpy array.
kwcoco.CocoDataset.imread
- Loads a particular image
kwcoco.CocoDataset.show_image
- Use matplotlib to show an image with annotations overlaid
kwcoco
kwcoco.cli
kwcoco.cli.__main__
kwcoco.cli.coco_conform
kwcoco.cli.coco_eval
kwcoco.cli.coco_grab
kwcoco.cli.coco_modify_categories
kwcoco.cli.coco_reroot
kwcoco.cli.coco_show
kwcoco.cli.coco_split
kwcoco.cli.coco_stats
kwcoco.cli.coco_subset
kwcoco.cli.coco_toydata
kwcoco.cli.coco_union
kwcoco.cli.coco_validate
kwcoco.data
kwcoco.demo
kwcoco.examples
kwcoco.metrics
kwcoco.util
kwcoco.__main__
kwcoco.abstract_coco_dataset
kwcoco.category_tree
kwcoco.channel_spec
kwcoco.coco_dataset
kwcoco.coco_evaluator
kwcoco.coco_image
kwcoco.coco_objects1d
kwcoco.coco_schema
kwcoco.coco_sql_dataset
kwcoco.compat_dataset
kwcoco.kpf
kwcoco.kw18