kwcoco.cli.coco_stats module
- class kwcoco.cli.coco_stats.CocoStatsCLI(*args: Any, **kwargs: Any)
Bases: DataConfig
Compute summary statistics about a COCO dataset.
Basic stats are the number of images, annotations, categories, videos, and tracks. Extended stats are also available.
- SeeAlso:
kwcoco visual_stats --help
Valid options: []
- Parameters:
*args – positional arguments for this data config
**kwargs – keyword arguments for this data config
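To make the "basic stats" described above concrete, here is a minimal sketch that counts the top-level tables of a COCO-style dictionary using only the standard library. It is not kwcoco's implementation: the helper name `basic_counts`, the output key names, and the tiny in-memory dataset are assumptions made for illustration.

```python
def basic_counts(coco):
    """Count the rows in each top-level table of a COCO-style dict.

    Hypothetical helper for illustration; tables that are absent
    from the dict are reported as having zero rows.
    """
    tables = ['images', 'annotations', 'categories', 'videos', 'tracks']
    return {name: len(coco.get(name, [])) for name in tables}


# A tiny hand-written dataset (assumed structure, not real data).
coco = {
    'categories': [{'id': 1, 'name': 'star'}, {'id': 2, 'name': 'eff'}],
    'images': [
        {'id': 1, 'file_name': 'img1.png'},
        {'id': 2, 'file_name': 'img2.png'},
    ],
    'annotations': [
        {'id': 1, 'image_id': 1, 'category_id': 1},
        {'id': 2, 'image_id': 1, 'category_id': 2},
        {'id': 3, 'image_id': 2, 'category_id': 1},
    ],
}

counts = basic_counts(coco)
print(counts)
```

The `videos` and `tracks` tables are optional in this sketch, so a plain image dataset reports zero for both.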
- classmethod main(cmdline=True, **kw)
CommandLine
xdoctest -m kwcoco.cli.coco_stats CocoStatsCLI.main:0
xdoctest -m kwcoco.cli.coco_stats CocoStatsCLI.main:1
Example
>>> from kwcoco.cli.coco_stats import CocoStatsCLI
>>> kw = {'src': 'special:shapes8'}
>>> cmdline = False
>>> cls = CocoStatsCLI
>>> cls.main(cmdline, **kw)
Example
>>> # xdoctest: +REQUIRES(module:pyyaml)
>>> from kwcoco.cli.coco_stats import *  # NOQA
>>> kw = {
>>>     'src': ['special:shapes8', 'special:vidshapes8', 'special:vidshapes2'],
>>>     'basic': True,
>>>     'extended': True,
>>>     'catfreq': True,
>>>     'image_size': True,
>>>     'annot_attrs': True,
>>>     'image_attrs': True,
>>>     'video_attrs': True,
>>>     'disk_usage': True,
>>>     'boxes': True,
>>> }
>>> cmdline = False
>>> cls = CocoStatsCLI
>>> print('-- Test YAML format --')
>>> kw['format'] = 'yaml'
>>> cls.main(cmdline, **kw)
>>> print('-- Test Human format --')
>>> kw['format'] = 'human'
>>> cls.main(cmdline, **kw)
- default = {'annot_attrs': <Value(False)>, 'basic': <Value(True)>, 'boxes': <Value(False)>, 'catfreq': <Value(True)>, 'channels': <Value(False)>, 'disk_usage': <Value(False)>, 'embed': <Value(False)>, 'extended': <Value(True)>, 'format': <Value('human')>, 'image_attrs': <Value(False)>, 'image_size': <Value(False)>, 'io_workers': <Value(0)>, 'src': <Value(['special:shapes8'])>, 'video_attrs': <Value(False)>}
- kwcoco.cli.coco_stats._coco_channel_stats(coco_dset)
Return information about which channels and sensors are available.
This is a streamlined version of the richer geowatch stats, focused on generic kwcoco datasets.
The exact return values of this function may change in the future.
Example
>>> # xdoctest: +REQUIRES(module:lark)
>>> import kwcoco
>>> from kwcoco.cli.coco_stats import _coco_channel_stats
>>> dset = kwcoco.CocoDataset()
>>> dset.add_category('a')
>>> gid1 = dset.add_image(file_name='img1.tif', sensor_coarse='S1', width=1, height=1)
>>> gid2 = dset.add_image(file_name='img2.tif', sensor_coarse='S2', width=1, height=1)
>>> dset.add_asset(gid=gid1, file_name='a1.tif', channels='red,green', width=1, height=1)
>>> dset.add_asset(gid=gid1, file_name='a2.tif', channels='blue', width=1, height=1)
>>> dset.add_asset(gid=gid2, file_name='b1.tif', channels='red,green', width=1, height=1)
>>> dset.add_asset(gid=gid2, file_name='b2.tif', channels='nir', width=1, height=1)
>>> info = _coco_channel_stats(dset)
>>> assert info['sensor_hist'] == {'S1': 1, 'S2': 1}
>>> assert info['chan_hist']['blue,red,green,unknown-chan'] == 1
>>> assert info['chan_hist']['nir,red,green,unknown-chan'] == 1
- kwcoco.cli.coco_stats._dataset_disk_usage(dset)
Compute disk usage of all image assets referenced by this dataset.
- Returns:
- {
'num_files': int,
'total_bytes': int,
'total_gb': float,
'missing_files': List[str],
}
- Return type:
dict
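The shape of the returned dictionary can be illustrated with a small standard-library sketch. This is not the kwcoco implementation: the helper name `disk_usage_sketch` is hypothetical, it takes a plain list of file paths rather than a dataset, and the choice of 1024**3 bytes per GB is an assumption.

```python
import os


def disk_usage_sketch(paths):
    """Summarize the on-disk size of a list of file paths.

    Hypothetical helper for illustration. Paths that do not point
    to an existing file are collected instead of raising an error.
    """
    num_files = 0
    total_bytes = 0
    missing_files = []
    for path in paths:
        if os.path.isfile(path):
            total_bytes += os.path.getsize(path)
            num_files += 1
        else:
            missing_files.append(str(path))
    return {
        'num_files': num_files,
        'total_bytes': total_bytes,
        # Assumption: binary gigabytes (1024**3 bytes).
        'total_gb': total_bytes / (1024 ** 3),
        'missing_files': missing_files,
    }
```

Recording missing files in a list, rather than failing on the first one, lets a caller report every broken reference in a single pass.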
- kwcoco.cli.coco_stats.byte_str(num, unit='auto', precision=2)
Automatically chooses a relevant unit (such as KB, MB, GB, or TB) for displaying a number of bytes.
- Parameters:
num (int) – number of bytes
unit (str) – which unit to use, can be auto, B, KB, MB, GB, TB, PB, EB, ZB, or YB.
precision (int) – number of decimals of precision
References
https://en.wikipedia.org/wiki/Orders_of_magnitude_(data)
- Returns:
string representing the number of bytes with appropriate units
- Return type:
str
Example
>>> import ubelt as ub
>>> from kwcoco.cli.coco_stats import byte_str
>>> num_list = [1, 100, 1024, 1048576, 1073741824, 1099511627776]
>>> result = ub.urepr(list(map(byte_str, num_list)), nl=0)
>>> print(result)
['0.00 KB', '0.10 KB', '1.00 KB', '1.00 MB', '1.00 GB', '1.00 TB']
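The unit selection shown in the example can be approximated with the following sketch, which treats each unit as a power of 1024 and never goes below KB in automatic mode. It is an independent illustration, not the library's code: the name `byte_str_sketch` is hypothetical, and automatic selection beyond TB is an assumption that the example above does not cover.

```python
def byte_str_sketch(num, unit='auto', precision=2):
    """Format a byte count with a binary (power of 1024) unit.

    Hypothetical re-implementation for illustration only.
    """
    units = ['B', 'KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB']
    if unit == 'auto':
        # Start at KB so small values print as fractions of a KB,
        # then step up while the value fills the next larger unit.
        exponent = 1
        while exponent < len(units) - 1 and abs(num) >= 1024 ** (exponent + 1):
            exponent += 1
    else:
        # Raises ValueError for an unknown unit name.
        exponent = units.index(unit.upper())
    value = num / (1024 ** exponent)
    return f'{value:.{precision}f} {units[exponent]}'


num_list = [1, 100, 1024, 1048576, 1073741824, 1099511627776]
formatted = [byte_str_sketch(num) for num in num_list]
print(formatted)
# ['0.00 KB', '0.10 KB', '1.00 KB', '1.00 MB', '1.00 GB', '1.00 TB']
```

Passing an explicit unit skips the search, so `byte_str_sketch(2048, unit='B')` reports the raw byte count.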