[Validate] Add Metadata and Attribute filters to Metrics #269
Conversation
Commits:
- linting for circle ci
- version used
- native polygon
- adding shapely
- adding shapely
- changing shapely
- changing shapely
- updating shapely
- poetry added shapely
- edge case np type
sasha-scale
left a comment
Overall code looks really well structured and clean - amazing work 🙌
Testing locally, I think I found a small bug (although I wasn't able to root cause it).
Here's the bug report info:
https://dashboard.scale.com/nucleus/ut_c933c89spgb00ehnftjg?spoof=sasha.harrison@scale.com
Here's how I created the unit test:
```python
from nucleus.metrics import MetadataFilter, FieldFilter

BDD_TEST_SLICE = "slc_c933az3ptmpg0cs2vyk0"

example_annotation_filter = [
    MetadataFilter("occluded", "==", False),
    FieldFilter("label", "=", "pedestrian"),
]
evaluation_functions = [
    client.validate.eval_functions.cuboid_precision(annotation_filters=example_annotation_filter),
    client.validate.eval_functions.cuboid_recall(annotation_filters=example_annotation_filter),
]
config_description = [f"{f.key} {f.op} {f.value}" for f in example_annotation_filter]
test_name = f"BDD Experiments 1 - ({config_description})"
test = client.validate.create_scenario_test(
    test_name, BDD_TEST_SLICE, evaluation_functions=evaluation_functions
)
```
When I evaluated the unit test, there was an exception on the ML side:
"Traceback (most recent call last):\n File \"/usr/local/lib/python3.8/site-packages/celery/app/trace.py\", line 405, in trace_task\n R = retval = fun(*args, **kwargs)\n File \"/usr/local/lib/python3.8/site-packages/celery/app/trace.py\", line 697, in __protected_call__\n return self.run(*args, **kwargs)\n File \"/workspace/model_ci/services/evaluate/tasks.py\", line 51, in evaluate\n response: MetricEvaluationResponse = evaluate_metric(request)\n File \"/workspace/model_ci/services/metrics.py\", line 66, in evaluate_metric\n per_item_metrics, overall_metric = compute_metric(preds_grouped, gt_grouped, eval_func)\n File \"/workspace/model_ci/services/metrics.py\", line 42, in compute_metric\n result = nucleus_metric(anno_list, pred_list)\n File \"/usr/local/lib/python3.8/site-packages/nucleus/metrics/cuboid_metrics.py\", line 272, in __call__\n cuboid_annotations = filter_by_metadata_fields(\n File \"/usr/local/lib/python3.8/site-packages/nucleus/metrics/cuboid_metrics.py\", line 190, in filter_by_metadata_fields\n and_conditions = [\n File \"/usr/local/lib/python3.8/site-packages/nucleus/metrics/cuboid_metrics.py\", line 191, in <listcomp>\n filter_to_comparison_function(cond) for cond in or_branch\n File \"/usr/local/lib/python3.8/site-packages/nucleus/metrics/cuboid_metrics.py\", line 127, in filter_to_comparison_function\n if FilterType(metadata_filter.type) == FilterType.FIELD:\nAttributeError: 'str' object has no attribute 'type'\n"
```
slice_id=slice_id,
evaluation_functions=[
    ef.to_entry() for ef in evaluation_functions  # type:ignore
    EvalFunctionListEntry(
```
No problem! Your solution was fine but I just disliked that it made some immutable fields look mutable 🙂
I'll have a look. I actually just fixed list of …

I've updated the deployment and tested all public functions. They all work in my tests 🙂
```
commit 7aef2e8
Author: Gunnar Atli Thoroddsen <gunnar.thoroddsen@scale.com>
Date:   Wed Jan 19 12:02:58 2022 +0000

    Update version and add changelog

commit 5c0fec5
Author: Gunnar Atli Thoroddsen <gunnar.thoroddsen@scale.com>
Date:   Mon Jan 17 13:04:00 2022 +0000

    Improved error message

commit a0d58c2
Author: Gunnar Atli Thoroddsen <gunnar.thoroddsen@scale.com>
Date:   Mon Jan 17 13:00:03 2022 +0000

    Add shapely specific import error

commit 39299d5
Author: Gunnar Atli Thoroddsen <gunnar.thoroddsen@scale.com>
Date:   Mon Jan 17 12:24:19 2022 +0000

    Add apt-get update

commit 1fc50b9
Author: Gunnar Atli Thoroddsen <gunnar.thoroddsen@scale.com>
Date:   Mon Jan 17 12:19:39 2022 +0000

    Add README section

commit 708d9e3
Author: Gunnar Atli Thoroddsen <gunnar.thoroddsen@scale.com>
Date:   Mon Jan 17 12:07:54 2022 +0000

    Adapt test suite to skip metrics tests if shapely not found

commit f50bebd
Author: Gunnar Atli Thoroddsen <gunnar.thoroddsen@scale.com>
Date:   Mon Jan 17 11:08:06 2022 +0000

    Make Shapely installation optional with extras flag [metrics]
```
pfmark
left a comment
overall really great work! 🚀
Tested it locally and I have a few questions:
- I added pretty arbitrary filters (`[MetadataFilter("relative_distance", ">", 50000), FieldFilter("label", "=", "lalala")]`), which should result in no items being passed to the test. Evaluating the test with a model that has the metadata and a model that doesn't have the relevant metadata still produces results. Any thoughts on why this happens? Am I getting something wrong here? (A local sanity-check sketch follows below.)
- Any chance we can add some feedback to the scenario test creation that would let the user know how many items were filtered per evaluation function? I think this would be super useful.
README.md
Outdated
```sh
apt-get install libgeos-dev
```

`pip install scale-nucleus[metrics]`
The default installation would still be `pip install scale-nucleus`?
Yup! But this should be `pip install scale-nucleus[shapely]`
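For reference, the full optional-install path pieced together from the README diff and commit messages above (Debian/Ubuntu assumed for the system package; note the extras name appears both as `[metrics]` in the README diff and commits and as `[shapely]` in the comment above, and `[metrics]` is used here):

```sh
# System dependency for shapely (from the README snippet above)
apt-get update && apt-get install -y libgeos-dev

# Base client only
pip install scale-nucleus

# Client plus the optional metrics dependencies (shapely)
pip install scale-nucleus[metrics]
```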
```python
EQ = "="
EQEQ = "=="
```
Can we just put `==` here? They are both mapped to `==` anyway. I think having both of them is confusing.
One is Python style and the other is SQL style; I think it is fine practice to support both, and PyArrow does the same. It has no effect on maintainability but might save somebody a headache, so I'll keep it in. 🙂
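A minimal sketch of the design being discussed, assuming a `FilterOp`-style enum like the diff above (only `EQ`/`EQEQ` appear in the diff; the extra members and the mapping table are illustrative):

```python
import operator
from enum import Enum

class FilterOp(Enum):
    EQ = "="     # SQL-style equality
    EQEQ = "=="  # Python-style equality
    GT = ">"     # illustrative; not shown in the diff
    LT = "<"     # illustrative; not shown in the diff

# Both equality spellings resolve to the same comparison function,
# so supporting both adds no maintenance burden.
OP_TO_FUNC = {
    FilterOp.EQ: operator.eq,
    FilterOp.EQEQ: operator.eq,
    FilterOp.GT: operator.gt,
    FilterOp.LT: operator.lt,
}

assert OP_TO_FUNC[FilterOp("=")] is OP_TO_FUNC[FilterOp("==")]
```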
Co-authored-by: Mark Pfeiffer <mark.pfeiffer@scale.com>
…-attribute-filters
This builds on @Anirudh-Scale's 3D Cuboid metrics PR: #261
It hasn't been merged so the diff is huge 😅
I've added `annotation_filters` and `prediction_filters`, which allow arbitrary filtering of the data inputs to the metrics, supporting `(cond AND cond AND cond ...) OR (cond AND cond AND cond ...) OR ...`. For a concrete example: if you would like to test pedestrians closer than 70 meters with more than 15 points, and pedestrians further away than 70 m with 5 points, only against pedestrian annotations, you could write something like this:
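The original snippet isn't preserved here; what follows is a sketch under the stated semantics, where the outer list is OR-ed and each inner list is AND-ed. The metadata keys `distance_to_ego` and `num_points` are made-up names for the example, and `client` is an existing Nucleus client as in the snippets above:

```python
from nucleus.metrics import FieldFilter, MetadataFilter

# (cond AND cond AND ...) OR (cond AND cond AND ...)
prediction_filters = [
    [  # pedestrians closer than 70 m with more than 15 points
        MetadataFilter("distance_to_ego", "<", 70),
        MetadataFilter("num_points", ">", 15),
        FieldFilter("label", "==", "pedestrian"),
    ],
    [  # pedestrians further away than 70 m with 5 points (assumed to mean "more than 5")
        MetadataFilter("distance_to_ego", ">", 70),
        MetadataFilter("num_points", ">", 5),
        FieldFilter("label", "==", "pedestrian"),
    ],
]
# only score against pedestrian annotations
annotation_filters = [FieldFilter("label", "==", "pedestrian")]

eval_fn = client.validate.eval_functions.cuboid_recall(
    annotation_filters=annotation_filters,
    prediction_filters=prediction_filters,
)
```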
The filters support the usual comparison operators (`=`/`==`, `<`, `>`, ...).
TODO:
- `Metric` classes
- `shapely` installation optional