kwcoco.coco_evaluator

Evaluates a predicted coco dataset against a truth coco dataset.

The components in this module work programmatically or as a command line script.

Todo

  • [ ] Does evaluate return one result or multiple results based on different configurations?

  • [ ] max_dets - TODO: present in original pycocoutils but not here

  • [ ] How do we note what iou_thresh and area-range were used in the result plots?

CommandLine

xdoctest -m kwcoco.coco_evaluator __doc__:0 --vd --slow

Example

>>> from kwcoco.coco_evaluator import *  # NOQA
>>> from kwcoco.coco_evaluator import CocoEvaluator
>>> import kwcoco
>>> import ubelt as ub
>>> from os.path import join
>>> true_dset = kwcoco.CocoDataset.demo('shapes128')
>>> from kwcoco.demo.perterb import perterb_coco
>>> kwargs = {
>>>     'box_noise': 0.5,
>>>     'n_fp': (0, 10),
>>>     'n_fn': (0, 10),
>>>     'with_probs': True,
>>> }
>>> pred_dset = perterb_coco(true_dset, **kwargs)
>>> print('true_dset = {!r}'.format(true_dset))
>>> print('pred_dset = {!r}'.format(pred_dset))
>>> config = {
>>>     'true_dataset': true_dset,
>>>     'pred_dataset': pred_dset,
>>>     'area_range': ['all', 'small'],
>>>     'iou_thresh': [0.3, 0.95],
>>> }
>>> coco_eval = CocoEvaluator(config)
>>> results = coco_eval.evaluate()
>>> # Now we can draw / serialize the results as we please
>>> dpath = ub.ensure_app_cache_dir('kwcoco/tests/test_out_dpath')
>>> results_fpath = join(dpath, 'metrics.json')
>>> print('results_fpath = {!r}'.format(results_fpath))
>>> results.dump(results_fpath, indent='    ')
>>> measures = results['area_range=all,iou_thresh=0.3'].nocls_measures
>>> import pandas as pd
>>> print(pd.DataFrame(ub.dict_isect(
>>>     measures, ['f1', 'g1', 'mcc', 'thresholds',
>>>                'ppv', 'tpr', 'tnr', 'npv', 'fpr',
>>>                'tp_count', 'fp_count',
>>>                'tn_count', 'fn_count'])).iloc[::100])
>>> # xdoctest: +REQUIRES(module:kwplot)
>>> # xdoctest: +REQUIRES(--slow)
>>> results.dump_figures(dpath)
>>> print('dpath = {!r}'.format(dpath))
>>> # xdoctest: +REQUIRES(--vd)
>>> if ub.argflag('--vd') or 1:
>>>     import xdev
>>>     xdev.view_directory(dpath)
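
Each entry in results corresponds to one (area_range, iou_thresh) combination from the config. A hedged sketch of iterating over them, assuming the dict-like interface implied by the DictProxy base class and that the measures dictionary exposes an 'ap' entry:

>>> # Assumption: results behaves like a dict of CocoSingleResult keyed by config strings
>>> for key in results.keys():
>>>     single = results[key]
>>>     print('{}: ap={:.3f}'.format(key, single.nocls_measures['ap']))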

Module Contents

Classes

CocoEvalConfig

Evaluate and score predicted versus truth detections / classifications in a COCO dataset

CocoEvaluator

Abstracts the evaluation process to execute on two coco datasets.

CocoResults

Example

CocoSingleResult

Container class to store, draw, summarize, and serialize results from CocoEvaluator

CocoEvalCLIConfig

Evaluate detection metrics using a predicted and truth coco file.

Functions

dmet_area_weights(dmet, orig_weights, cfsn_vecs, area_ranges, coco_eval, use_area_attr=False)

Hacky function to compute confusion vector ignore weights for different area thresholds

_load_dets(pred_fpaths, workers=0)

Example

_load_dets_worker(single_pred_fpath, with_coco=True)

main(cmdline=True, **kw)

Todo

  • [ ] should live in kwcoco.cli.coco_eval

Attributes

COCO_SAMPLER_CLS

kwcoco.coco_evaluator.COCO_SAMPLER_CLS[source]
class kwcoco.coco_evaluator.CocoEvalConfig(data=None, default=None, cmdline=False)[source]

Bases: scriptconfig.Config

Evaluate and score predicted versus truth detections / classifications in a COCO dataset
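
A hedged construction sketch; the keys mirror those used in the examples on this page, the paths are placeholders, and CocoEvalConfig is assumed to follow the usual scriptconfig pattern of accepting a dict of overrides:

>>> from kwcoco.coco_evaluator import CocoEvalConfig
>>> # Placeholder paths; anything CocoEvaluator can coerce (path or CocoDataset) should work
>>> config = CocoEvalConfig({
>>>     'true_dataset': 'true.mscoco.json',
>>>     'pred_dataset': 'pred.mscoco.json',
>>>     'iou_thresh': [0.5],
>>> })
>>> print(config['iou_thresh'])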

default[source]
normalize(self)[source]

overloadable function called after each load

class kwcoco.coco_evaluator.CocoEvaluator(coco_eval, config)[source]

Bases: object

Abstracts the evaluation process to execute on two coco datasets.

This can be run as a standalone script where the user specifies the paths to the true and predicted datasets explicitly, or it can be used by a higher-level script that produces the predictions and then sends them to this evaluator.

Example

>>> from kwcoco.coco_evaluator import CocoEvaluator
>>> from kwcoco.demo.perterb import perterb_coco
>>> import kwcoco
>>> true_dset = kwcoco.CocoDataset.demo('shapes8')
>>> kwargs = {
>>>     'box_noise': 0.5,
>>>     'n_fp': (0, 10),
>>>     'n_fn': (0, 10),
>>>     'with_probs': True,
>>> }
>>> pred_dset = perterb_coco(true_dset, **kwargs)
>>> config = {
>>>     'true_dataset': true_dset,
>>>     'pred_dataset': pred_dset,
>>>     'classes_of_interest': [],
>>> }
>>> coco_eval = CocoEvaluator(config)
>>> results = coco_eval.evaluate()
Config[source]
log(coco_eval, msg, level='INFO')[source]
_init(coco_eval)[source]

Performs initial coercion from given inputs into dictionaries of kwimage.Detection objects and attempts to ensure comparable category and image ids.

_ensure_init(coco_eval)[source]
classmethod _rectify_classes(coco_eval, true_classes, pred_classes)[source]
classmethod _coerce_dets(CocoEvaluator, dataset, verbose=0, workers=0)[source]

Coerce the input to a mapping from image-id to kwimage.Detection

Also capture a CocoDataset if possible.

Returns

gid_to_det: mapping from gid to dets
extra: any extra information we gathered via coercion

Return type

Tuple[Dict[int, Detections], Dict]

Example

>>> from kwcoco.coco_evaluator import *  # NOQA
>>> import kwcoco
>>> coco_dset = kwcoco.CocoDataset.demo('shapes8')
>>> gid_to_det, extras = CocoEvaluator._coerce_dets(coco_dset)

Example

>>> # xdoctest: +REQUIRES(module:sqlalchemy)
>>> from kwcoco.coco_evaluator import *  # NOQA
>>> import kwcoco
>>> coco_dset = kwcoco.CocoDataset.demo('shapes8').view_sql()
>>> gid_to_det, extras = CocoEvaluator._coerce_dets(coco_dset)
_build_dmet(coco_eval)[source]

Builds the detection metrics object

Returns

DetectionMetrics - object that can perform assignment and build confusion vectors.
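
A hedged sketch of how the returned object might be used directly, assuming the standard kwcoco.metrics.DetectionMetrics API where confusion_vectors() performs the assignment (evaluate() normally drives these steps internally):

>>> # Reuses the coco_eval object from the class-level example above
>>> coco_eval._ensure_init()              # coerce inputs if not already done
>>> dmet = coco_eval._build_dmet()        # returns a DetectionMetrics object
>>> cfsn_vecs = dmet.confusion_vectors()  # assignment + confusion vectors (assumed API)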

evaluate(coco_eval)[source]

Executes the main evaluation logic. Performs assignment between detections to build a DetectionMetrics object, then creates per-item and ovr confusion vectors, and performs various threshold-vs-confusion analyses.

Returns

container storing (and capable of drawing / serializing) results

Return type

CocoResults

kwcoco.coco_evaluator.dmet_area_weights(dmet, orig_weights, cfsn_vecs, area_ranges, coco_eval, use_area_attr=False)[source]

Hacky function to compute confusion vector ignore weights for different area thresholds. Needs to be slightly refactored.

class kwcoco.coco_evaluator.CocoResults(results, resdata=None)[source]

Bases: ubelt.NiceRepr, kwcoco.metrics.util.DictProxy

Example

>>> from kwcoco.coco_evaluator import *  # NOQA
>>> from kwcoco.coco_evaluator import CocoEvaluator
>>> import kwcoco
>>> import ubelt as ub
>>> from os.path import join
>>> true_dset = kwcoco.CocoDataset.demo('shapes2')
>>> from kwcoco.demo.perterb import perterb_coco
>>> kwargs = {
>>>     'box_noise': 0.5,
>>>     'n_fp': (0, 10),
>>>     'n_fn': (0, 10),
>>> }
>>> pred_dset = perterb_coco(true_dset, **kwargs)
>>> print('true_dset = {!r}'.format(true_dset))
>>> print('pred_dset = {!r}'.format(pred_dset))
>>> config = {
>>>     'true_dataset': true_dset,
>>>     'pred_dataset': pred_dset,
>>>     'area_range': ['small'],
>>>     'iou_thresh': [0.3],
>>> }
>>> coco_eval = CocoEvaluator(config)
>>> results = coco_eval.evaluate()
>>> # Now we can draw / serialize the results as we please
>>> dpath = ub.ensure_app_cache_dir('kwcoco/tests/test_out_dpath')
>>> #
>>> # test deserialization works
>>> state = results.__json__()
>>> self2 = CocoResults.from_json(state)
>>> #
>>> # xdoctest: +REQUIRES(module:kwplot)
>>> results.dump_figures(dpath)
>>> results.dump(join(dpath, 'metrics.json'), indent='    ')
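
For inspection, a hedged follow-up sketch that reads the serialized metrics file back with the standard library (no kwcoco-specific API assumed):

>>> import json
>>> # Load the metrics file written by results.dump above
>>> with open(join(dpath, 'metrics.json'), 'r') as fp:
>>>     metrics_state = json.load(fp)
>>> print(list(metrics_state.keys()))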
dump_figures(results, out_dpath, expt_title=None)[source]
__json__(results)[source]
classmethod from_json(cls, state)[source]
dump(result, file, indent='    ')[source]

Serialize to json file

class kwcoco.coco_evaluator.CocoSingleResult(result, nocls_measures, ovr_measures, cfsn_vecs, meta=None)[source]

Bases: ubelt.NiceRepr

Container class to store, draw, summarize, and serialize results from CocoEvaluator.
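
A hedged access sketch, reusing the results object from the module-level example at the top of this page (the key string matches the one used there; the output path is a placeholder):

>>> # Pull out one configuration's result and serialize it on its own
>>> single = results['area_range=all,iou_thresh=0.3']
>>> print(single)
>>> # dump has the same signature as CocoResults.dump; a path argument is assumed to be accepted here too
>>> single.dump(join(dpath, 'single_metrics.json'), indent='    ')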

__nice__(result)[source]
classmethod from_json(cls, state)[source]
__json__(result)[source]
dump(result, file, indent='    ')[source]

Serialize to json file

dump_figures(result, out_dpath, expt_title=None)[source]
kwcoco.coco_evaluator._load_dets(pred_fpaths, workers=0)[source]

Example

>>> from kwcoco.coco_evaluator import _load_dets, _load_dets_worker
>>> import ubelt as ub
>>> import kwcoco
>>> from os.path import join
>>> dpath = ub.ensure_app_cache_dir('kwcoco/tests/load_dets')
>>> N = 4
>>> pred_fpaths = []
>>> for i in range(1, N + 1):
>>>     dset = kwcoco.CocoDataset.demo('shapes{}'.format(i))
>>>     dset.fpath = join(dpath, 'shapes_{}.mscoco.json'.format(i))
>>>     dset.dump(dset.fpath)
>>>     pred_fpaths.append(dset.fpath)
>>> dets, coco_dset = _load_dets(pred_fpaths)
>>> print('dets = {!r}'.format(dets))
>>> print('coco_dset = {!r}'.format(coco_dset))
kwcoco.coco_evaluator._load_dets_worker(single_pred_fpath, with_coco=True)[source]
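
A hedged single-file sketch, reusing pred_fpaths from the example above; the assumption here is that the worker mirrors _load_dets and yields a (Detections, CocoDataset) pair when with_coco=True:

>>> # Assumed return shape: (kwimage.Detections, kwcoco.CocoDataset)
>>> dets, coco = _load_dets_worker(pred_fpaths[0], with_coco=True)
>>> print('dets = {!r}'.format(dets))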
class kwcoco.coco_evaluator.CocoEvalCLIConfig(data=None, default=None, cmdline=False)[source]

Bases: scriptconfig.Config

Evaluate detection metrics using a predicted and truth coco file.

default[source]
kwcoco.coco_evaluator.main(cmdline=True, **kw)[source]

Todo

  • [ ] should live in kwcoco.cli.coco_eval

CommandLine

# Generate test data
xdoctest -m kwcoco.cli.coco_eval CocoEvalCLI.main

kwcoco eval \
    --true_dataset=$HOME/.cache/kwcoco/tests/eval/true.mscoco.json \
    --pred_dataset=$HOME/.cache/kwcoco/tests/eval/pred.mscoco.json \
    --out_dpath=$HOME/.cache/kwcoco/tests/eval/out \
    --force_pycocoutils=False \
    --area_range=all,0-4096,4096-inf

nautilus $HOME/.cache/kwcoco/tests/eval/out
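
For completeness, a hedged sketch of the equivalent programmatic call, assuming the usual scriptconfig convention where cmdline=False lets keyword arguments populate the config (paths are placeholders mirroring the flags above):

>>> from kwcoco.coco_evaluator import main
>>> # kwargs mirror the CLI flags above; cmdline=False skips sys.argv parsing (assumed scriptconfig behavior)
>>> main(cmdline=False,
>>>      true_dataset='true.mscoco.json',
>>>      pred_dataset='pred.mscoco.json',
>>>      out_dpath='./eval_out')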