
Commit 389275c

Endpoint to fetch annotations grouped by scenes (#439)

* initial
* tune page size
* formatting
* Add docstring, parameterize page_size

1 parent 629b899 commit 389275c

File tree

3 files changed: +57 −1 lines changed


CHANGELOG.md

Lines changed: 13 additions & 0 deletions

````diff
@@ -5,6 +5,19 @@ All notable changes to the [Nucleus Python Client](https://github.com/scaleapi/n
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.17.6](https://github.com/scaleapi/nucleus-python-client/releases/tag/v0.17.6) - 2024-07-03
+
+### Added
+- Method for downloading all annotations grouped by `scene` and `track_reference_id`.
+
+Example usage:
+
+```python
+dataset = client.get_dataset("ds_...")
+for scene in dataset.scene_and_annotation_generator():
+    #...
+```
+
 ## [0.17.5](https://github.com/scaleapi/nucleus-python-client/releases/tag/v0.17.5) - 2024-04-15
 
 ### Added
````
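Each item the new generator yields nests annotations by track ID and then by frame (per the docstring added in `nucleus/dataset.py`). A minimal sketch of post-processing one yielded scene into flat per-frame records; the `flatten_scene` helper and the sample payload are illustrative assumptions, not part of the client:

```python
from typing import Any, Dict, List


def flatten_scene(scene: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten one scene payload (shaped as documented for
    scene_and_annotation_generator) into per-frame records."""
    records = []
    for track_id, track in scene.get("annotations", {}).items():
        for frame in track["frames"]:
            records.append({
                "file_location": scene["file_location"],
                "track_id": track_id,
                "label": track["label"],
                "key": frame["key"],
                # Bounding box as (left, top, width, height).
                "box": (frame["left"], frame["top"],
                        frame["width"], frame["height"]),
            })
    return records


# Hypothetical payload matching the documented structure.
sample_scene = {
    "file_location": "s3://bucket/scene_0001",
    "metadata": {},
    "annotations": {
        "track_1": {
            "label": "car",
            "name": "car_front",
            "frames": [
                {"left": 10, "top": 20, "width": 50, "height": 30,
                 "key": "frame_0", "metadata": {}},
                {"left": 12, "top": 21, "width": 50, "height": 30,
                 "key": "frame_1", "metadata": {}},
            ],
        }
    },
}

rows = flatten_scene(sample_scene)
```

One record per track per frame makes the export easy to load into a dataframe or write out as CSV.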

nucleus/dataset.py

Lines changed: 43 additions & 0 deletions

```diff
@@ -1449,6 +1449,49 @@ def items_and_annotations(
         )
         return convert_export_payload(api_payload[EXPORTED_ROWS])
 
+    def scene_and_annotation_generator(self, page_size=10):
+        """Provides a generator of all DatasetItems and Annotations in the dataset grouped by scene.
+
+
+        Returns:
+            Generator where each element is a nested dict (representing a JSON) structured in the following way:
+
+            Iterable[{
+                "file_location": str,
+                "metadata": Dict[str, Any],
+                "annotations": {
+                    "{trackId}": {
+                        "label": str,
+                        "name": str,
+                        "frames": List[{
+                            "left": int,
+                            "top": int,
+                            "width": int,
+                            "height": int,
+                            "key": str,  # frame key
+                            "metadata": Dict[str, Any]
+                        }]
+                    }
+                }
+            }]
+
+        This is similar to how the Scale API returns task data
+        """
+
+        if page_size > 30:
+            raise ValueError("Page size must be less than or equal to 30")
+
+        endpoint_name = "exportForTrainingByScene"
+        json_generator = paginate_generator(
+            client=self._client,
+            endpoint=f"dataset/{self.id}/{endpoint_name}",
+            result_key=EXPORT_FOR_TRAINING_KEY,
+            page_size=page_size,
+        )
+
+        for data in json_generator:
+            yield data
+
     def items_and_annotation_generator(
         self,
         query: Optional[str] = None,
```
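The new method delegates paging to `paginate_generator` and caps `page_size` at 30. As a rough illustration of the cursor-style pagination such a helper typically implements (the `fetch_page` callable, the `cursor` key, and the fake endpoint below are assumptions for the sketch, not the actual Nucleus endpoint contract):

```python
from typing import Any, Callable, Dict, Iterator


def paginate(
    fetch_page: Callable[[str, int], Dict[str, Any]],
    result_key: str,
    page_size: int = 10,
) -> Iterator[Any]:
    """Yield items one at a time, requesting pages until the
    server stops returning a continuation cursor."""
    if page_size > 30:
        raise ValueError("Page size must be less than or equal to 30")
    cursor = ""
    while True:
        page = fetch_page(cursor, page_size)
        for item in page[result_key]:
            yield item
        cursor = page.get("cursor")
        if not cursor:  # no more pages
            break


# Fake two-page endpoint standing in for the real API call.
def fake_fetch(cursor: str, page_size: int) -> Dict[str, Any]:
    if cursor == "":
        return {"scenes": [{"id": 1}, {"id": 2}], "cursor": "p2"}
    return {"scenes": [{"id": 3}], "cursor": None}


scenes = list(paginate(fake_fetch, result_key="scenes", page_size=2))
```

Because the scenes are yielded lazily, callers can start processing the first page while later pages are still being fetched, which matters for large exports.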

pyproject.toml

Lines changed: 1 addition & 1 deletion

```diff
@@ -25,7 +25,7 @@ ignore = ["E501", "E741", "E731", "F401"] # Easy ignore for getting it running
 
 [tool.poetry]
 name = "scale-nucleus"
-version = "0.17.5"
+version = "0.17.6"
 description = "The official Python client library for Nucleus, the Data Platform for AI"
 license = "MIT"
 authors = ["Scale AI Nucleus Team <nucleusapi@scaleapi.com>"]
```

0 commit comments
