
Commit 529fb01

Explain EvaluationCriterion "magic" and documenting Pydantic usage (#198)

* Add initial README around EvaluationCriterion
* Add Pydantic section to README.md
* Move EvaluationCriteria README to docstring
* Minor markdown problem

1 parent 77dd069 commit 529fb01

4 files changed: +123 -3 lines changed

README.md

Lines changed: 84 additions & 0 deletions
@@ -62,6 +62,90 @@ poetry run pytest tests/test_dataset.py
## Pydantic Models

Prefer using [Pydantic](https://pydantic-docs.helpmanual.io/usage/models/) models rather than raw dictionaries
or dataclasses for payloads sent or received over the wire as JSON. Pydantic is built with data validation in mind and provides very clear
error messages when it encounters a problem with the payload.

The Pydantic model(s) should mirror the payload to send. To represent a JSON payload that looks like this:
```json
{
  "example_json_with_info": {
    "metadata": {
      "frame": 0
    },
    "reference_id": "frame0",
    "url": "s3://example/scale_nucleus/2021/lidar/0038711321865000.json",
    "type": "pointcloud"
  },
  "example_image_with_info": {
    "metadata": {
      "author": "Picasso"
    },
    "reference_id": "frame0",
    "url": "s3://bucket/0038711321865000.jpg",
    "type": "image"
  }
}
```

the payload could be represented as the following structure. Note how the field names map to the JSON keys, and note the use of field
validators (`@validator`):
```python
import os.path
from typing import Literal

from pydantic import BaseModel, validator


class JsonWithInfo(BaseModel):
    metadata: dict  # any dict is valid
    reference_id: str
    url: str
    type: Literal["pointcloud", "recipe"]

    @validator("url")
    def has_json_extension(cls, v):
        if not v.endswith(".json"):
            raise ValueError(f"Expected '.json' extension got {v}")
        return v


class ImageWithInfo(BaseModel):
    metadata: dict  # any dict is valid
    reference_id: str
    url: str
    type: Literal["image", "mask"]

    @validator("url")
    def has_valid_extension(cls, v):
        valid_extensions = {".jpg", ".jpeg", ".png", ".tiff"}
        _, extension = os.path.splitext(v)
        if extension not in valid_extensions:
            raise ValueError(f"Expected extension in {valid_extensions} got {v}")
        return v


class ExampleNestedModel(BaseModel):
    example_json_with_info: JsonWithInfo
    example_image_with_info: ImageWithInfo


# Usage:
import requests

payload = requests.get("https://example.com/example")  # requests needs an absolute URL
parsed_model = ExampleNestedModel.parse_obj(payload.json())
requests.post("https://example.com/example/post_to", json=parsed_model.dict())
```
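
For example, an invalid payload raises a descriptive `ValidationError` that names the offending field. A minimal sketch (the exact message text depends on your Pydantic version):

```python
from pydantic import ValidationError

try:
    JsonWithInfo(
        metadata={"frame": 0},
        reference_id="frame0",
        url="s3://example/not_a_json.txt",  # fails the has_json_extension validator
        type="pointcloud",
    )
except ValidationError as e:
    print(e)  # output points at "url" and repeats the validator's message
```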

### Migrating to Pydantic
- When migrating an interface from a dictionary, use `nucleus.pydantic_base.DictCompatibleModel`. That allows you to get
  the benefits of Pydantic while maintaining backwards compatibility with a Python dictionary by delegating `__getitem__` to
  fields (see the sketch after this list).
- When migrating a frozen dataclass, use `nucleus.pydantic_base.ImmutableModel`. That is a base class set up to be
  immutable after initialization.
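
A minimal sketch of the dictionary compatibility, assuming only the `__getitem__` delegation described above (the model and its fields are hypothetical):

```python
from nucleus.pydantic_base import DictCompatibleModel


class LegacyItem(DictCompatibleModel):  # hypothetical model for illustration
    reference_id: str
    url: str


item = LegacyItem(reference_id="frame0", url="s3://bucket/0038711321865000.jpg")
# Old call sites that treated the payload as a dict keep working:
assert item["reference_id"] == item.reference_id
```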

**Updating documentation:**
We use [Sphinx](https://www.sphinx-doc.org/en/master/) to autogenerate our API Reference from docstrings.

nucleus/modelci/__init__.py

Lines changed: 1 addition & 0 deletions
```diff
@@ -3,6 +3,7 @@
 __all__ = [
     "ModelCI",
     "UnitTest",
+    "EvaluationCriterion",
 ]

 from .client import ModelCI
```

nucleus/modelci/client.py

Lines changed: 3 additions & 3 deletions
```diff
@@ -67,8 +67,8 @@ def create_unit_test(
     Args:
         name: unique name of test
         slice_id: id of (pre-defined) slice of items to evaluate test on.
-        evaluation_criteria: Pass/fail criteria for the test. Created with a comparison with an eval functions.
-            See :class:`eval_functions`.
+        evaluation_criteria: :class:`EvaluationCriterion` defines a pass/fail criterion for the test. Created by
+            comparing with an eval function. See :class:`eval_functions`.

     Returns:
         Created UnitTest object.
@@ -117,7 +117,7 @@ def delete_unit_test(self, unit_test_id: str) -> bool:
         client = nucleus.NucleusClient("YOUR_SCALE_API_KEY")
         unit_test = client.modelci.list_unit_tests()[0]

-        success = client.modelci.create_unit_test(unit_test.id)
+        success = client.modelci.delete_unit_test(unit_test.id)

     Args:
         unit_test_id: unique ID of unit test
```

nucleus/modelci/data_transfer_objects/eval_function.py

Lines changed: 35 additions & 0 deletions
```diff
@@ -11,6 +11,41 @@ class EvaluationCriterion(ImmutableModel):

     An Evaluation Criterion is defined as an evaluation function, threshold, and comparator.
     It describes how to apply an evaluation function.

+    Notes:
+        To define the evaluation criteria for a scenario test we've created some syntactic sugar that makes it look
+        closer to an actual function call and hides away implementation details of our data model that simply are
+        not clear, UX-wise.
+
+        Instead of defining criteria like this::
+
+            from nucleus.modelci.data_transfer_objects.eval_function import (
+                EvaluationCriterion,
+                ThresholdComparison,
+            )
+
+            criteria = [
+                EvaluationCriterion(
+                    eval_function_id="ef_c6m1khygqk400918ays0",  # bbox_recall
+                    threshold_comparison=ThresholdComparison.GREATER_THAN,
+                    threshold=0.5,
+                ),
+            ]
+
+        we define it like this::
+
+            bbox_recall = client.modelci.eval_functions.bbox_recall
+            criteria = [
+                bbox_recall() > 0.5
+            ]
+
+        The chosen method allows us to document the available evaluation functions in an IDE-friendly fashion and
+        hides away details like internal IDs (`"ef_...."`).
+
+        The actual `EvaluationCriterion` is created by overloading the comparison operators on the base class of an
+        evaluation function. Instead of the comparison returning a bool, we've made it create an `EvaluationCriterion`
+        with the correct signature to send over the wire to our API.
+
     Parameters:
         eval_function_id (str): ID of evaluation function
         threshold_comparison (:class:`ThresholdComparison`): comparator for evaluation. i.e. threshold=0.5 and threshold_comparator > implies that a test only passes if score > 0.5.
```
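
To make the overloading trick concrete, here is a minimal sketch of the idea; the stand-in class and its single `__gt__` overload are illustrative, not the actual nucleus implementation:

```python
from nucleus.modelci.data_transfer_objects.eval_function import (
    EvaluationCriterion,
    ThresholdComparison,
)


class EvalFunctionSketch:
    """Illustrative stand-in for the eval function base class."""

    def __init__(self, eval_function_id: str):
        self.eval_function_id = eval_function_id

    def __gt__(self, threshold: float) -> EvaluationCriterion:
        # The comparison builds a criterion instead of returning a bool.
        return EvaluationCriterion(
            eval_function_id=self.eval_function_id,
            threshold_comparison=ThresholdComparison.GREATER_THAN,
            threshold=threshold,
        )


criteria = [EvalFunctionSketch("ef_c6m1khygqk400918ays0") > 0.5]
```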
