Commit 317d2bf

allow for private dataset creation (#403)
1 parent 3b0abe7 commit 317d2bf

4 files changed: 15 additions & 1 deletion


CHANGELOG.md

Lines changed: 8 additions & 0 deletions
@@ -6,6 +6,14 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).


+## [0.16.6](https://github.com/scaleapi/nucleus-python-client/releases/tag/v0.16.6) - 2023-11-01
+
+### Added
+- Allow datasets to be created in "privacy mode". For example, `client.create_dataset('name', use_privacy_mode=True)`.
+- Privacy Mode lets customers use Nucleus without sensitive raw data ever leaving their servers.
+- When set to `True`, you can submit URLs to Nucleus that link to raw data assets like images or point clouds, instead of transferring that data to Scale. Access control is then completely in the hands of users: URLs may optionally be protected behind your corporate VPN or an IP whitelist. When you load a Nucleus web page, your browser will directly fetch the raw data from your servers without it ever being accessible to Scale.
+
+
 ## [0.16.5](https://github.com/scaleapi/nucleus-python-client/releases/tag/v0.16.5) - 2023-10-30

 ### Added
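
The call described in the 0.16.6 changelog entry can be sketched as follows. This is a minimal illustration, not the library's implementation: `RecordingClient` and `create_private_dataset` are hypothetical names, and the stand-in client only mirrors the `create_dataset` keyword arguments shown in this commit so the snippet runs without an API key or network access.

```python
from typing import Any, Dict, List, Optional


class RecordingClient:
    """Hypothetical stand-in for the real client; it only records calls."""

    def __init__(self) -> None:
        self.calls: List[Dict[str, Any]] = []

    def create_dataset(
        self,
        name: str,
        is_scene: Optional[bool] = None,
        use_privacy_mode: bool = False,
    ) -> Dict[str, Any]:
        # Mirror the keyword arguments from the diff and remember them.
        call = {
            "name": name,
            "is_scene": is_scene,
            "use_privacy_mode": use_privacy_mode,
        }
        self.calls.append(call)
        return call


def create_private_dataset(client: Any, name: str) -> Any:
    """Request a dataset in privacy mode (raw assets are referenced by URL)."""
    return client.create_dataset(name, use_privacy_mode=True)


client = RecordingClient()
dataset = create_private_dataset(client, "name")
```

With the real package, the same helper would presumably be called with a `nucleus.NucleusClient` instance in place of the stand-in.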

nucleus/__init__.py

Lines changed: 5 additions & 0 deletions
@@ -79,6 +79,7 @@
     AUTOTAGS_KEY,
     DATASET_ID_KEY,
     DATASET_IS_SCENE_KEY,
+    DATASET_PRIVACY_MODE_KEY,
     DEFAULT_NETWORK_TIMEOUT_SEC,
     EMBEDDING_DIMENSION_KEY,
     EMBEDDINGS_URL_KEY,
@@ -429,6 +430,7 @@ def create_dataset(
         self,
         name: str,
         is_scene: Optional[bool] = None,
+        use_privacy_mode: bool = False,
         item_metadata_schema: Optional[Dict] = None,
         annotation_metadata_schema: Optional[Dict] = None,
     ) -> Dataset:
@@ -443,6 +445,8 @@ def create_dataset(
             is_scene: Whether the dataset contains strictly :class:`scenes
                 <LidarScene>` or :class:`items <DatasetItem>`. This value is immutable.
                 Default is False (dataset of items).
+            use_privacy_mode: Whether the images of this dataset should be uploaded to Scale. If set to True,
+                customer will have to adjust their file access policy with Scale.
             item_metadata_schema: Dict defining item-level metadata schema. See below.
             annotation_metadata_schema: Dict defining annotation-level metadata schema.

@@ -473,6 +477,7 @@ def create_dataset(
             {
                 NAME_KEY: name,
                 DATASET_IS_SCENE_KEY: is_scene,
+                DATASET_PRIVACY_MODE_KEY: use_privacy_mode,
                 ANNOTATION_METADATA_SCHEMA_KEY: annotation_metadata_schema,
                 ITEM_METADATA_SCHEMA_KEY: item_metadata_schema,
             },
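
The hunks above thread the new flag from the `create_dataset` signature into the request payload. A rough sketch of that payload construction, under stated assumptions: `build_create_dataset_payload` is a hypothetical helper, and only the value of `DATASET_PRIVACY_MODE_KEY` is confirmed by this commit (see `nucleus/constants.py`); the other key strings are guesses for illustration.

```python
from typing import Any, Dict, Optional

# Confirmed by this commit's change to nucleus/constants.py.
DATASET_PRIVACY_MODE_KEY = "use_privacy_mode"

# Assumed values, for illustration only; the real constants may differ.
NAME_KEY = "name"
DATASET_IS_SCENE_KEY = "is_scene"
ANNOTATION_METADATA_SCHEMA_KEY = "annotation_metadata_schema"
ITEM_METADATA_SCHEMA_KEY = "item_metadata_schema"


def build_create_dataset_payload(
    name: str,
    is_scene: Optional[bool] = None,
    use_privacy_mode: bool = False,
    item_metadata_schema: Optional[Dict] = None,
    annotation_metadata_schema: Optional[Dict] = None,
) -> Dict[str, Any]:
    """Assemble the dict in the same key order as the diff above."""
    return {
        NAME_KEY: name,
        DATASET_IS_SCENE_KEY: is_scene,
        DATASET_PRIVACY_MODE_KEY: use_privacy_mode,
        ANNOTATION_METADATA_SCHEMA_KEY: annotation_metadata_schema,
        ITEM_METADATA_SCHEMA_KEY: item_metadata_schema,
    }


default_payload = build_create_dataset_payload("demo")
private_payload = build_create_dataset_payload("demo", use_privacy_mode=True)
```

Because the parameter defaults to `False`, existing callers that omit it send an explicit `False` rather than leaving the key out.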

nucleus/constants.py

Lines changed: 1 addition & 0 deletions
@@ -43,6 +43,7 @@
 DATASET_LENGTH_KEY = "length"
 DATASET_MODEL_RUNS_KEY = "model_run_ids"
 DATASET_NAME_KEY = "name"
+DATASET_PRIVACY_MODE_KEY = "use_privacy_mode"
 DATASET_SLICES_KEY = "slice_ids"
 DEFAULT_ANNOTATION_UPDATE_MODE = False
 DEFAULT_NETWORK_TIMEOUT_SEC = 120

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ ignore = ["E501", "E741", "E731", "F401"] # Easy ignore for getting it running

 [tool.poetry]
 name = "scale-nucleus"
-version = "0.16.5"
+version = "0.16.6"
 description = "The official Python client library for Nucleus, the Data Platform for AI"
 license = "MIT"
 authors = ["Scale AI Nucleus Team <nucleusapi@scaleapi.com>"]
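
Since the `use_privacy_mode` argument first appears in 0.16.6, calling code may want to check the client version before passing it. A minimal sketch, assuming plain `MAJOR.MINOR.PATCH` version strings; `parse_version` and `supports_privacy_mode` are hypothetical helpers, not part of the library, and any pre-release suffix is ignored.

```python
import re
from typing import Tuple

# First release that accepts use_privacy_mode, per the version bump above.
PRIVACY_MODE_MIN_VERSION = (0, 16, 6)


def parse_version(version: str) -> Tuple[int, ...]:
    """Extract the leading MAJOR.MINOR.PATCH numbers from a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def supports_privacy_mode(version: str) -> bool:
    """Return True if this client version accepts use_privacy_mode."""
    return parse_version(version) >= PRIVACY_MODE_MIN_VERSION
```

With the package installed, `importlib.metadata.version("scale-nucleus")` returns the version string to pass to `supports_privacy_mode`.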
