Share your data with import/export features #245
leoll2 started this conversation in Show and tell
Looking for a way to share your datasets with other Geti users?
Or do you need to migrate your projects to another instance?
Or do you want a backup of your experiments?
All of this is possible in Geti thanks to its versatile import and export features. Specifically, Geti supports two ways to transfer data to and from the platform, the UI and the SDK, and both let you export and import either single datasets or entire projects.
Dataset I/E
The first option allows you to export a dataset and its annotations into one of the popular standard formats such as COCO, VOC or YOLO. The list of supported formats varies depending on the task, but one format is particularly flexible and suitable for almost any need: Datumaro. In addition to being a convenient dataset format, Datumaro is the Swiss Army knife of dataset conversion: its CLI lets you convert datasets between many different formats (see the docs). Another reason for choosing Datumaro is that it natively supports videos, whereas other formats convert the frames to individual images while exporting.
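As a rough illustration of the CLI-based conversion mentioned above, here is a minimal Python sketch that builds a `datum convert` command. The flag names (`-i`, `-if`, `-f`, `-o`) reflect the Datumaro CLI as I understand it, so double-check them with `datum convert --help`. The helper names, paths and format strings are hypothetical placeholders.

```python
import shlex
import subprocess
from pathlib import Path


def build_convert_command(src, src_format, dst, dst_format):
    """Return the argv list for a `datum convert` call (flag names are assumed)."""
    return [
        "datum", "convert",
        "-i", str(Path(src)),   # input dataset directory
        "-if", src_format,      # input format, e.g. "coco"
        "-f", dst_format,       # output format, e.g. "voc"
        "-o", str(Path(dst)),   # output directory
    ]


def convert_dataset(src, src_format, dst, dst_format, dry_run=True):
    """Print the command and, unless dry_run is set, execute it."""
    cmd = build_convert_command(src, src_format, dst, dst_format)
    print(shlex.join(cmd))
    if not dry_run:
        # Requires the `datum` executable (Datumaro) to be installed.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Hypothetical paths: an unpacked COCO export converted to VOC.
    convert_dataset("exported_coco", "coco", "converted_voc", "voc")
```

The dry-run default only prints the command, which makes it easy to review before running the actual conversion.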
For reference, more information about dataset I/E is available in the public docs.
Export
You can easily export a dataset through the UI by clicking on the "Export dataset" button.
In the next menu you can choose the format and how to export the videos, if any.
After you click "Export", the operation proceeds in the background. It may take several minutes, depending on the size of the dataset; you can monitor its progress via the jobs panel.
Finally, download the dataset to your computer as a zip file. This archive is portable: you can later import it into another Geti instance, or even into another application that supports the chosen dataset format.
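Before moving the downloaded archive to another instance or tool, it can be worth a quick sanity check. The sketch below uses only Python's standard `zipfile` module to verify integrity and summarize the contents. It assumes nothing about the internal layout of the archive, and the function name is just an illustrative choice.

```python
import zipfile
from collections import Counter
from pathlib import Path


def summarize_archive(archive_path):
    """Check archive integrity and count its files by extension."""
    archive_path = Path(archive_path)
    if not zipfile.is_zipfile(archive_path):
        raise ValueError(f"{archive_path} is not a valid zip archive")
    with zipfile.ZipFile(archive_path) as zf:
        # testzip() returns the name of the first corrupt member, or None.
        corrupt = zf.testzip()
        if corrupt is not None:
            raise ValueError(f"corrupt member in archive: {corrupt}")
        files = [info for info in zf.infolist() if not info.is_dir()]
        by_ext = Counter(
            Path(info.filename).suffix.lower() or "(none)" for info in files
        )
        total_bytes = sum(info.file_size for info in files)
    return {
        "files": len(files),
        "bytes": total_bytes,  # uncompressed size
        "by_extension": dict(by_ext),
    }
```

For example, calling `summarize_archive("my_dataset.zip")` (a hypothetical file name) returns the number of files, their total uncompressed size, and a per-extension count.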
Import
Geti lets you create a new project from a dataset archive, or import the archive into an existing project.
Create a new project
Choose "Create from dataset" to import your annotated dataset into a new project. Geti analyzes the content of the archive and offers you a list of task types compatible with the annotations; after you have chosen the desired option, it creates the project.
In fact, this is one of the fastest ways to set up a new Geti project from a pre-annotated dataset.
Import into an existing project
Alternatively, you may import a dataset to extend an existing one: in this case, choose "Import dataset" from the dataset selector.
Project I/E
Geti lets you export not only datasets but also full projects, including models and evaluations. The output is a single zip archive, which you can think of as a complete snapshot of the project; you can later import that archive into the same or another Geti instance to obtain an almost exact copy of the original project.
For reference, more information about project I/E is available in the public docs.
Export
Exporting a project is a one-click operation: after you select "Export", the server collects and packages all the files (media, models, etc.) and eventually returns a zip archive ready to be downloaded.
Note
The zip archive may be quite large if the project contains several models. Make sure you have enough free space on your disk before initiating the download.
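One way to act on this note is to check the free space programmatically before downloading. This is a minimal sketch with Python's standard `shutil.disk_usage`. The expected archive size is something you have to estimate yourself, and the 20% safety margin is an arbitrary assumption.

```python
import shutil
from pathlib import Path


def has_enough_space(target_dir, expected_bytes, safety_factor=1.2):
    """Return True if the disk holding target_dir can fit the download plus a margin."""
    free_bytes = shutil.disk_usage(Path(target_dir)).free
    return free_bytes >= expected_bytes * safety_factor


if __name__ == "__main__":
    # Hypothetical estimate: a 5 GiB project archive saved to the current directory.
    estimate = 5 * 1024**3
    if has_enough_space(".", estimate):
        print("Enough free space for the download.")
    else:
        print("Not enough free space; free up some disk before downloading.")
```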
Import
To import a project, simply click "Create from exported project". This operation may take a while: check the jobs panel to monitor its progress.
Other options
The UI is the most user-friendly way to import and export data in Geti, but it is not the only one: users who prefer a code-based solution can take advantage of the Geti SDK. If you're not already familiar with the Geti SDK and its usage, its README and the notebooks are a great starting point.
The GetiIE class exposes handy methods to trigger all the import/export operations described above:
export_dataset(...)
import_dataset_as_new_project(...)
export_project(...)
import_project(...)
It also implements an additional path to export and import data in Geti, namely the download_project_data and upload_project_data functions. These utilities leverage the media and annotation REST API to download/upload the dataset content to/from a local folder. Although less portable than the other dataset I/E approach (i.e. zip archive in standardized format), this solution comes in handy in some cases with special resource/compatibility constraints.
Comparison
As a user who wants to export data from Geti, you might wonder whether to use dataset export or project export. The answer depends on the specific use case and its constraints. First, be aware of their differences, summarized in the following table:
Then we recommend using project export when you want to:
Conversely, use dataset export if you want to:
The following diagram may help with the decision. If you are still unsure, feel free to ask!