LS Converter - COCO Dataset - KeyError #8707

@wasson5e

I'm trying to retrain a YOLO model using Label Studio (probably more than I want to do), but I don't want to lose the data the model was already trained on. Since the YOLO model was trained on the COCO dataset, I'm trying to add my own data to COCO. To do that, I'm trying to import COCO into Label Studio (LS), but I'm having trouble.

I've pulled down the COCO data from three different sources:

  1. From Ultralytics, which downloads the dataset during YOLO training
  2. From Kaggle - https://www.kaggle.com/datasets/awsaf49/coco-2017-dataset/data
  3. From COCO - https://cocodataset.org/#download

When I try to run the conversion, I hit the same kind of issue each time, typically a KeyError in the JSON data:

When using the JSON file from #3:

label-studio-converter import coco -i '/orin-robotics/datasets/annotations/captions_train2017.json' -o '/orin-robotics/datasets/coco_output' 
INFO:root:Reading COCO notes and categories from /orin-robotics/datasets/annotations/captions_train2017.json
Traceback (most recent call last):
  File "/root/Documents/label-studio/bin/label-studio-converter", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/root/Documents/label-studio/lib/python3.12/site-packages/label_studio_converter/main.py", line 191, in main
    imports(args)
  File "/root/Documents/label-studio/lib/python3.12/site-packages/label_studio_converter/main.py", line 173, in imports
    import_coco.convert_coco_to_ls(
  File "/root/Documents/label-studio/lib/python3.12/site-packages/label_studio_converter/imports/coco.py", line 139, in convert_coco_to_ls
    categories = {int(category['id']): category for category in coco['categories']}
                                                                ~~~~^^^^^^^^^^^^^^
KeyError: 'categories'
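
The traceback above shows that the loaded JSON has no top-level `categories` key. One way to check this before running the converter is to list the top-level keys of the annotation file. The following is a minimal sketch, not part of label-studio-converter: the helper names `summarize_coco_keys` and `load_and_summarize` are hypothetical, and the in-memory dictionaries are stand-ins I made up to illustrate the two cases (a file without `categories` and a file with it).

```python
import json


def summarize_coco_keys(coco: dict) -> dict:
    """Report the top-level keys of a COCO-style dict and whether 'categories' exists."""
    return {
        "keys": sorted(coco.keys()),
        "has_categories": "categories" in coco,
    }


def load_and_summarize(path: str) -> dict:
    """Load a COCO-style JSON file from disk and summarize its top-level keys."""
    with open(path, "r", encoding="utf-8") as f:
        return summarize_coco_keys(json.load(f))


# Illustrative stand-ins (assumed shapes, not real COCO data).
file_without_categories = {"info": {}, "licenses": [], "images": [], "annotations": []}
file_with_categories = {
    **file_without_categories,
    "categories": [{"id": 1, "name": "person"}],
}

print(summarize_coco_keys(file_without_categories))
print(summarize_coco_keys(file_with_categories))
```

Running `load_and_summarize` on the file passed to `-i` would show whether the converter's `coco['categories']` lookup can succeed for that file.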

When using the data that Ultralytics pulled down in #1:

(label-studio) root@94c756140b69:/orin-robotics/datasets/coco/annotations# label-studio-converter import coco -i '/orin-robotics/datasets/coco/annotations/instances_val2017.json' -o '/orin-robotics/datasets/coco_output' 
INFO:root:Reading COCO notes and categories from /orin-robotics/datasets/coco/annotations/instances_val2017.json
INFO:root:Found 80 categories, 5000 images and 36781 annotations
WARNING:root:Segmentation in COCO is experimental
ERROR:root:RLE in segmentation is not yet supported in COCO
Traceback (most recent call last):
  File "/root/Documents/label-studio/bin/label-studio-converter", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/root/Documents/label-studio/lib/python3.12/site-packages/label_studio_converter/main.py", line 191, in main
    imports(args)
  File "/root/Documents/label-studio/lib/python3.12/site-packages/label_studio_converter/main.py", line 173, in imports
    import_coco.convert_coco_to_ls(
  File "/root/Documents/label-studio/lib/python3.12/site-packages/label_studio_converter/imports/coco.py", line 219, in convert_coco_to_ls
    item = create_segmentation(
           ^^^^^^^^^^^^^^^^^^^^
  File "/root/Documents/label-studio/lib/python3.12/site-packages/label_studio_converter/imports/coco.py", line 54, in create_segmentation
    segmentation = annotation['segmentation'][0]
                   ~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
KeyError: 0
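
The `KeyError: 0` on `annotation['segmentation'][0]`, together with the "RLE in segmentation is not yet supported" message, is consistent with at least one annotation storing its segmentation as a dictionary rather than a list of polygons. A minimal sketch for checking this is below. It assumes the usual COCO layout where each annotation has a `segmentation` field; the helper `split_by_segmentation_type` and the sample annotations are hypothetical and only for illustration.

```python
from typing import Any


def split_by_segmentation_type(annotations: list[dict[str, Any]]):
    """Group annotations by how their 'segmentation' field is stored.

    Returns three lists: list-style (polygon) segmentations, dict-style
    segmentations, and annotations with anything else or nothing at all.
    """
    polygons, dict_style, other = [], [], []
    for ann in annotations:
        seg = ann.get("segmentation")
        if isinstance(seg, list):
            polygons.append(ann)
        elif isinstance(seg, dict):
            # Indexing a dict with [0] raises KeyError: 0, as in the traceback.
            dict_style.append(ann)
        else:
            other.append(ann)
    return polygons, dict_style, other


# Made-up sample annotations covering the three cases.
sample = [
    {"id": 1, "iscrowd": 0, "segmentation": [[10.0, 10.0, 20.0, 10.0, 20.0, 20.0]]},
    {"id": 2, "iscrowd": 1, "segmentation": {"size": [480, 640], "counts": [5, 3, 2]}},
    {"id": 3, "iscrowd": 0},
]

polygons, dict_style, other = split_by_segmentation_type(sample)
print(len(polygons), len(dict_style), len(other))
```

Applied to the `annotations` list of `instances_val2017.json`, this would show how many entries are stored in the dict form that the converter trips over.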

So my question is, am I using the correct data? Or am I doing something else wrong?
