Commit 01052e3

author
Matt Sokoloff
committed
update basic examples and added labels notebook
1 parent 78e9d93 commit 01052e3

2 files changed: 489 additions & 42 deletions

examples/basics/basics.ipynb

Lines changed: 221 additions & 42 deletions
@@ -5,7 +5,7 @@
55
"id": "complimentary-passing",
66
"metadata": {},
77
"source": [
8-
"# Basics"
8+
"# Basics\n"
99
]
1010
},
1111
{
@@ -24,9 +24,19 @@
2424
"* For more details : https://docs.labelbox.com/python-sdk/en/index-en#labelbox-python-sdk"
2525
]
2626
},
27+
{
28+
"cell_type": "markdown",
29+
"id": "cheap-damages",
30+
"metadata": {},
31+
"source": [
32+
"#### The remainder of this notebook is an interactive version of the fundamental concepts docs.\n",
33+
"* For more details you can read the docs here: \n",
34+
" * https://docs.labelbox.com/python-sdk/en/index-en#fundamental-concepts"
35+
]
36+
},
2737
{
2838
"cell_type": "code",
29-
"execution_count": 1,
39+
"execution_count": 5,
3040
"id": "everyday-street",
3141
"metadata": {},
3242
"outputs": [],
@@ -56,7 +66,7 @@
5666
},
5767
{
5868
"cell_type": "code",
59-
"execution_count": 9,
69+
"execution_count": 6,
6070
"id": "instructional-reply",
6171
"metadata": {},
6272
"outputs": [],
@@ -67,9 +77,18 @@
6777
"DATASET_NAME = \"Example Jellyfish Dataset\""
6878
]
6979
},
80+
{
81+
"cell_type": "markdown",
82+
"id": "chinese-playing",
83+
"metadata": {},
84+
"source": [
85+
"#### Client\n",
86+
"* Starting point for all db interactions"
87+
]
88+
},
7089
{
7190
"cell_type": "code",
72-
"execution_count": 10,
91+
"execution_count": 7,
7392
"id": "thick-gasoline",
7493
"metadata": {},
7594
"outputs": [],
@@ -81,7 +100,7 @@
81100
},
82101
{
83102
"cell_type": "code",
84-
"execution_count": 11,
103+
"execution_count": 8,
85104
"id": "victorian-consumer",
86105
"metadata": {},
87106
"outputs": [],
@@ -93,7 +112,7 @@
93112
},
94113
{
95114
"cell_type": "code",
96-
"execution_count": 12,
115+
"execution_count": 9,
97116
"id": "industrial-onion",
98117
"metadata": {},
99118
"outputs": [
@@ -103,7 +122,7 @@
103122
"<Project ID: ckk4q1viuc0w20704eh69u28h>"
104123
]
105124
},
106-
"execution_count": 12,
125+
"execution_count": 9,
107126
"metadata": {},
108127
"output_type": "execute_result"
109128
}
@@ -112,73 +131,205 @@
112131
"project"
113132
]
114133
},
134+
{
135+
"cell_type": "markdown",
136+
"id": "popular-nylon",
137+
"metadata": {},
138+
"source": [
139+
"#### Fields\n",
140+
"* All db objects have fields (see the source code for the full list, e.g. https://github.com/Labelbox/labelbox-python/blob/develop/labelbox/schema/project.py)\n",
141+
"* These fields are attributes of the object"
142+
]
143+
},
115144
{
116145
"cell_type": "code",
117-
"execution_count": null,
118-
"id": "superb-revolution",
146+
"execution_count": 12,
147+
"id": "guided-institute",
119148
"metadata": {},
120-
"outputs": [],
121-
"source": []
149+
"outputs": [
150+
{
151+
"name": "stdout",
152+
"output_type": "stream",
153+
"text": [
154+
"Sample Project\n",
155+
"Demonstrating image segmentation and object detection\n",
156+
"Example Jellyfish Dataset\n"
157+
]
158+
}
159+
],
160+
"source": [
161+
"print(project.name)\n",
162+
"print(project.description)\n",
163+
"print(dataset.name)"
164+
]
165+
},
166+
{
167+
"cell_type": "markdown",
168+
"id": "protective-multimedia",
169+
"metadata": {},
170+
"source": [
171+
"* Fields can be updated. The change is reflected server side (you will see it in Labelbox)"
172+
]
122173
},
123174
{
124175
"cell_type": "code",
125176
"execution_count": 13,
126-
"id": "cubic-joint",
177+
"id": "according-subdivision",
178+
"metadata": {},
179+
"outputs": [],
180+
"source": [
181+
"project.update(description = \"new description field\")\n",
182+
"print(project.description)"
183+
]
184+
},
185+
{
186+
"cell_type": "markdown",
187+
"id": "viral-power",
188+
"metadata": {},
189+
"source": [
190+
"#### Pagination\n",
191+
"* Queries that return a list of database objects return them as a PaginatedCollection\n",
192+
"* The goal is to return only the data that is actually needed."
193+
]
194+
},
195+
{
196+
"cell_type": "code",
197+
"execution_count": 17,
198+
"id": "ideal-processing",
127199
"metadata": {},
128200
"outputs": [
129201
{
130202
"data": {
131203
"text/plain": [
132-
"<labelbox.pagination.PaginatedCollection at 0x10caa6160>"
204+
"<labelbox.pagination.PaginatedCollection at 0x1110afe80>"
133205
]
134206
},
135-
"execution_count": 13,
207+
"execution_count": 17,
136208
"metadata": {},
137209
"output_type": "execute_result"
138210
}
139211
],
140212
"source": [
141-
"#Or you can fetch all based on a condition\n",
142-
"projects = client.get_projects(where = Project.name == PROJECT_NAME)\n",
143-
"datasets = client.get_datasets(where = Dataset.name == DATASET_NAME)\n",
144-
"projects"
213+
"labels_paginated_collection = project.labels()\n",
214+
"labels_paginated_collection"
145215
]
146216
},
147217
{
148218
"cell_type": "code",
149-
"execution_count": null,
150-
"id": "rational-marshall",
219+
"execution_count": 19,
220+
"id": "convinced-force",
151221
"metadata": {},
152-
"outputs": [],
222+
"outputs": [
223+
{
224+
"data": {
225+
"text/plain": [
226+
"<Label ID: cklw9cboq00063h68gqrsvi15>"
227+
]
228+
},
229+
"execution_count": 19,
230+
"metadata": {},
231+
"output_type": "execute_result"
232+
}
233+
],
153234
"source": [
235+
"#Iterate over them to get the items out.\n",
236+
"next(labels_paginated_collection)\n",
237+
"#Be careful not to call list(paginated_collection) on a large collection"
238+
]
239+
},
240+
{
241+
"cell_type": "markdown",
242+
"id": "widespread-startup",
243+
"metadata": {},
244+
"source": [
245+
"#### Query parameters\n",
246+
"* Query with the following conventions:\n",
247+
" * `DbObject.Field`"
248+
]
249+
},
250+
{
251+
"cell_type": "code",
252+
"execution_count": 28,
253+
"id": "cubic-joint",
254+
"metadata": {},
255+
"outputs": [
256+
{
257+
"name": "stdout",
258+
"output_type": "stream",
259+
"text": [
260+
"<labelbox.pagination.PaginatedCollection object at 0x114255640>\n",
261+
"<Project {'auto_audit_number_of_labels': 3, 'auto_audit_percentage': 0.1, 'created_at': datetime.datetime(2021, 1, 20, 1, 2, 31, tzinfo=datetime.timezone.utc), 'description': 'new description field', 'last_activity_time': datetime.datetime(2021, 3, 19, 13, 46, 50, 920000, tzinfo=datetime.timezone.utc), 'name': 'Sample Project', 'setup_complete': datetime.datetime(2021, 1, 20, 1, 2, 31, 152000, tzinfo=datetime.timezone.utc), 'uid': 'ckk4q1viuc0w20704eh69u28h', 'updated_at': datetime.datetime(2021, 3, 19, 13, 46, 50, 920000, tzinfo=datetime.timezone.utc)}>\n",
262+
"None\n",
263+
"None\n"
264+
]
265+
}
266+
],
267+
"source": [
268+
"datasets = client.get_datasets(where = Dataset.name == DATASET_NAME )\n",
269+
"\n",
270+
"projects = client.get_projects(where = (\n",
271+
" (Project.name == PROJECT_NAME)\n",
272+
" & \n",
273+
" (Project.description == \"new description field\")\n",
274+
"))\n",
275+
" \n",
154276
"#The above two queries return PaginatedCollections because the filter parameters aren't guaranteed to be unique.\n",
155-
"#This object is an iterable containing the query results\n",
156-
"next(projects)"
277+
"#So even if only one element is returned, it is still wrapped in a PaginatedCollection.\n",
278+
"print(projects)\n",
279+
"print(next(projects, None))\n",
280+
"print(next(projects, None))\n",
281+
"print(next(projects, None))\n",
282+
"#We can see there is only one."
283+
]
284+
},
285+
{
286+
"cell_type": "markdown",
287+
"id": "french-toner",
288+
"metadata": {},
289+
"source": [
290+
"#### Querying Limitations\n",
291+
"* The DbObject used for the query must be the same as the DbObject returned by the querying function. \n",
292+
"* e.g. the following is not valid, since get_projects returns Projects but we are filtering on a Dataset field\n",
293+
"> `>>> projects = client.get_projects(where = Dataset.name == \"dataset_name\")`\n"
294+
]
295+
},
296+
{
297+
"cell_type": "markdown",
298+
"id": "defensive-bidder",
299+
"metadata": {},
300+
"source": [
301+
"#### Relationship\n",
302+
"* Relationships solve the above problem: they let you query across related object types\n",
303+
"* You can find all relationships of a DB object in the source code\n",
304+
"    * e.g. for a Project (https://github.com/Labelbox/labelbox-python/blob/develop/labelbox/schema/project.py)"
157305
]
158306
},
159307
{
160308
"cell_type": "code",
161-
"execution_count": null,
309+
"execution_count": 31,
162310
"id": "handmade-yugoslavia",
163311
"metadata": {},
164-
"outputs": [],
312+
"outputs": [
313+
{
314+
"data": {
315+
"text/plain": [
316+
"[<Project ID: ckk4q1viuc0w107041siuht7p>]"
317+
]
318+
},
319+
"execution_count": 31,
320+
"metadata": {},
321+
"output_type": "execute_result"
322+
}
323+
],
165324
"source": [
166-
"# Filtering is only supported using the object you are querying for\n",
167-
"#eg. is not valid since get_project returns a Project but we are filtering on a Dataset\n",
168-
"projects = client.get_projects(where = Dataset.name == \"dataset_name\") #INVALID\n",
169-
"\n",
170-
"## Instead we should use relationships.\n",
171-
"#If we want all projects where there is a particular attached dataset we can do\n",
172-
"list(dataset.project())\n",
173-
"#Filtering Takeaways\n",
174-
"#1. Filtering only works on a single object type at a time\n",
175-
"#2. The where clause requires that we pass an object that is of the same type that is being retured by the query\n",
176-
"#3. If we want to filter based on a relationship, we should use the relationship attribute of objects"
325+
"#Dataset has a Relationship to a Project so we can use the following\n",
326+
"list(dataset.projects())\n",
327+
"#This will return all projects that are attached to this dataset"
177328
]
178329
},
179330
{
180331
"cell_type": "code",
181-
"execution_count": 15,
332+
"execution_count": 32,
182333
"id": "future-bargain",
183334
"metadata": {},
184335
"outputs": [
@@ -188,23 +339,51 @@
188339
"[<Dataset ID: cklv1qzlv1oqn0y9ne7b9gtpb>]"
189340
]
190341
},
191-
"execution_count": 15,
342+
"execution_count": 32,
192343
"metadata": {},
193344
"output_type": "execute_result"
194345
}
195346
],
196347
"source": [
197-
"# If you are interested in the relationship between objects then \n",
198-
"#You can only filter on attributes of either a dataset or a project.\n",
199-
"#If you want all datasets that belongs to a particular project then you can do that with the following query.\n",
200348
"sample_project_datasets = project.datasets()\n",
201349
"list(sample_project_datasets)"
202350
]
203351
},
352+
{
353+
"cell_type": "markdown",
354+
"id": "metric-speaker",
355+
"metadata": {},
356+
"source": [
357+
"#### Delete\n",
358+
"* Most DBObjects support deletion"
359+
]
360+
},
361+
{
362+
"cell_type": "code",
363+
"execution_count": 37,
364+
"id": "persistent-briefs",
365+
"metadata": {},
366+
"outputs": [],
367+
"source": [
368+
"#Eg.\n",
369+
"##### project.delete()\n",
370+
"##### dataset.delete()\n",
371+
"##### data_row.delete()"
372+
]
373+
},
374+
{
375+
"cell_type": "markdown",
376+
"id": "confused-peace",
377+
"metadata": {},
378+
"source": [
379+
"* We recommend using bulk operations where possible.\n",
380+
"* You can find specific deletion instructions in tutorials on each object."
381+
]
382+
},
204383
{
205384
"cell_type": "code",
206385
"execution_count": null,
207-
"id": "bacterial-yield",
386+
"id": "thirty-interval",
208387
"metadata": {},
209388
"outputs": [],
210389
"source": []
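
Note on the Pagination cells added in this commit: the notebook warns against calling `list(...)` on a large collection and uses `next(...)` instead. The sketch below illustrates why, using a hypothetical `LazyPages` class and a fake in-memory backend. Neither is part of the Labelbox SDK, and the real `PaginatedCollection` may differ in its details; this only shows the general lazy-paging idea under those assumptions.

```python
from typing import Callable, List


class LazyPages:
    """Hypothetical stand-in for a paginated collection (not the SDK class)."""

    def __init__(self, fetch_page: Callable[[int], List[str]]) -> None:
        # fetch_page(page_index) returns one page, or [] when no pages remain.
        self._fetch_page = fetch_page
        self._page_index = 0
        self._buffer: List[str] = []
        self._exhausted = False
        self.pages_fetched = 0  # counts simulated round trips

    def __iter__(self) -> "LazyPages":
        return self

    def __next__(self) -> str:
        # Only request another page once the current one is used up.
        while not self._buffer:
            if self._exhausted:
                raise StopIteration
            page = self._fetch_page(self._page_index)
            self.pages_fetched += 1
            self._page_index += 1
            if page:
                self._buffer = list(page)
            else:
                self._exhausted = True
        return self._buffer.pop(0)


def make_fake_backend(total_items: int, page_size: int) -> Callable[[int], List[str]]:
    """Build an in-memory 'server' that serves items one page at a time."""
    items = [f"label-{i}" for i in range(total_items)]

    def fetch_page(page_index: int) -> List[str]:
        start = page_index * page_size
        return items[start:start + page_size]

    return fetch_page


labels = LazyPages(make_fake_backend(total_items=250, page_size=100))

first = next(labels)                    # triggers a single page request
pages_after_first = labels.pages_fetched

remaining = list(labels)                # walks every remaining page
pages_after_list = labels.pages_fetched

print(first, pages_after_first, len(remaining), pages_after_list)
```

In this sketch, `next(labels)` costs one request, while `list(labels)` has to request every remaining page before it returns. `next(labels, None)` returns `None` once the collection is exhausted, which is the pattern the Query parameters cell relies on.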

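Note on the Query parameters cell: the `where = (Project.name == ...) & (Project.description == ...)` form depends on `==` and `&` building a filter object rather than evaluating to a boolean. The sketch below shows one way such an expression can be built with operator overloading. The `Field`, `Comparison`, and `And` classes and the local `Project` class here are hypothetical illustrations, not the SDK's implementation.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class And:
    """Hypothetical node combining two filter clauses."""
    left: Any
    right: Any

    def __and__(self, other: Any) -> "And":
        return And(self, other)

    def matches(self, record: Dict[str, Any]) -> bool:
        return self.left.matches(record) and self.right.matches(record)


@dataclass(frozen=True)
class Comparison:
    """Hypothetical 'field equals value' clause."""
    field_name: str
    value: Any

    def __and__(self, other: Any) -> And:
        return And(self, other)

    def matches(self, record: Dict[str, Any]) -> bool:
        return record.get(self.field_name) == self.value


class Field:
    """Hypothetical field descriptor: `==` builds a clause instead of a bool."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __eq__(self, value: Any) -> Comparison:  # type: ignore[override]
        return Comparison(self.name, value)


class Project:
    """Illustrative object type; only the two fields used below are declared."""
    name = Field("name")
    description = Field("description")


# Parentheses are required: in Python, `&` binds more tightly than `==`.
where = (
    (Project.name == "Sample Project")
    & (Project.description == "new description field")
)

records: List[Dict[str, Any]] = [
    {"name": "Sample Project", "description": "new description field"},
    {"name": "Sample Project", "description": "old description"},
    {"name": "Other Project", "description": "new description field"},
]
matching = [record for record in records if where.matches(record)]
print(where)
print(matching)
```

Because each clause is tied to the field's owning type, a design like this can also reject a `Dataset` field passed to a `Project` query, which is the limitation described in the Querying Limitations cell.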