IvisionLab/MEDIA-datasets

A holistic approach for classifying dental conditions from textual reports and panoramic radiographs

In this repository, you will find instructions on how to request the data sets used in the aforementioned paper. We provide three data sets: Raw Panoramic Radiographs (RPR), OdontoAI Open Panoramic Radiographs (O$^2$PR), and Textual Reports Panoramic Radiographs (TRPR). The data sets are described in the following sections.

To be notified of code releases, new data sets, and errata, watch this repo.

Raw Panoramic Radiographs (RPR) Data Set

The RPR data set comprises 4,795 unannotated panoramic radiographs in their raw format. These were all acquired using the same imaging device as the O$^2$PR and TRPR data sets, ensuring consistent quality and resolution.

OdontoAI Open Panoramic Radiographs (O$^2$PR) Data Set

The O$^2$PR data set was constructed from the DNS Panoramic Images v2 data set, which comprises 450 panoramic dental radiographs. That initial data set was used to benchmark several instance segmentation networks, and the best-performing architecture was then used to speed up annotation through the Human-in-the-Loop (HITL) concept. We consider that this new data set supersedes all of our previously published tooth instance segmentation data sets: UFBA UESC DENTAL IMAGES, UFBA UESC DENTAL IMAGES Deep, DNS Panoramic Images, and DNS Panoramic Images v2. Therefore, we have deprecated all of these data sets in favor of the new one, called OdontoAI Open Panoramic Radiographs (O$^2$PR).

The dataset comprises 4,000 panoramic radiograph images along with corresponding annotations. These annotations cover all 52 tooth types (32 permanent and 20 deciduous) in the human dentition.
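
The count of 52 tooth types can be reproduced by enumerating tooth codes. The sketch below assumes the two-digit FDI (ISO 3950) notation, where the first digit is the quadrant and the second is the position within it. This README does not state which naming scheme the category names actually use, so the codes here are illustrative only.

```python
# Assumption: FDI two-digit notation; the data set's category names may differ.
def fdi_codes():
    """Return (permanent, deciduous) lists of FDI tooth codes as strings."""
    # Permanent teeth: quadrants 1-4, positions 1-8 (8 teeth per quadrant).
    permanent = [f"{q}{p}" for q in range(1, 5) for p in range(1, 9)]
    # Deciduous teeth: quadrants 5-8, positions 1-5 (5 teeth per quadrant).
    deciduous = [f"{q}{p}" for q in range(5, 9) for p in range(1, 6)]
    return permanent, deciduous


permanent, deciduous = fdi_codes()
print(len(permanent), len(deciduous), len(permanent) + len(deciduous))  # 32 20 52
```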

The provided annotations are in a COCO-style data set format, which includes segmentation masks for the teeth. The format follows this structure:

{
    "images": [image],
    "annotations": [annotation],
    "categories": [category]
}

where each image is defined as

{
    "id": int,
    "width": int,
    "height": int,
    "file_name": str
}

The categories are defined as:

{
    "id": int,
    "name": str,
    "supercategory": str
}

The annotations are defined as:

{
    "id": int,
    "image_id": int,
    "category_id": int,
    "segmentation": [[polygon]],
    "area": float,
    "bbox": [x, y, width, height],
    "iscrowd": 0 or 1,
    "width": int,
    "height": int
}
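
As a minimal sketch of how the structure above might be consumed, the following code builds lookup tables from a COCO-style dictionary and converts a `bbox` from `[x, y, width, height]` to corner coordinates. The function names, the example values, the category name `tooth-a`, and the file name `annotations.json` are all hypothetical and are not taken from the repository.

```python
import json
from collections import defaultdict


def load_coco(path):
    """Read a COCO-style annotation file (the path is supplied by the caller)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def index_coco(coco):
    """Build lookups: image id -> image, category id -> name, image id -> annotations."""
    images = {img["id"]: img for img in coco["images"]}
    category_names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    anns_by_image = defaultdict(list)
    for ann in coco["annotations"]:
        anns_by_image[ann["image_id"]].append(ann)
    return images, category_names, anns_by_image


def bbox_to_corners(bbox):
    """Convert [x, y, width, height] to (x_min, y_min, x_max, y_max)."""
    x, y, w, h = bbox
    return (x, y, x + w, y + h)


# Hand-made example with made-up values; real data would come from
# load_coco("annotations.json"), where the file name is an assumption.
example = {
    "images": [{"id": 1, "width": 2000, "height": 1000, "file_name": "example.png"}],
    "categories": [{"id": 1, "name": "tooth-a", "supercategory": "tooth"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1,
         "segmentation": [[100, 200, 150, 200, 150, 300, 100, 300]],
         "area": 5000.0, "bbox": [100, 200, 50, 100], "iscrowd": 0,
         "width": 2000, "height": 1000},
    ],
}

images, category_names, anns_by_image = index_coco(example)
for image_id, anns in anns_by_image.items():
    for ann in anns:
        print(images[image_id]["file_name"],
              category_names[ann["category_id"]],
              bbox_to_corners(ann["bbox"]))
```

Grouping annotations by `image_id` once avoids rescanning the full annotation list for every image.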

The data set statistics can be found in the original paper, titled "Boosting research on dental panoramic radiographs: a challenging data set, baselines, and a task central online platform for benchmark" (Silva et al., 2023).

Demonstration of Annotations in the O$^2$PR Dataset

Follow the provided Jupyter notebook demo.ipynb to get a quick overview of the dataset. The conversions.py file defines useful functions for visualizing the annotations.
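
The contents of conversions.py are not reproduced here. As an independent sketch of a typical first step before drawing an annotation, the code below turns a flat COCO polygon (`[x1, y1, x2, y2, ...]`) into point pairs and computes its area with the shoelace formula. The function names are hypothetical, and the computed polygon area may differ slightly from the stored `area` field, which is usually derived from the rasterized mask.

```python
def polygon_points(flat):
    """Turn [x1, y1, x2, y2, ...] into [(x1, y1), (x2, y2), ...]."""
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of coordinates")
    return list(zip(flat[0::2], flat[1::2]))


def polygon_area(points):
    """Unsigned polygon area via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# A 20 x 10 rectangle written as a flat coordinate list (made-up values).
rectangle = [10, 10, 30, 10, 30, 20, 10, 20]
points = polygon_points(rectangle)
print(points)
print(polygon_area(points))  # 200.0
```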

Textual Reports Panoramic Radiographs (TRPR) Data Set

The TRPR dataset contains 8,029 panoramic radiographs along with their corresponding textual reports, which do not include tooth segmentation labels. The reports, written in Portuguese, were independently authored by two radiologists.

Each report shares the filename of its corresponding image, differing only in the extension (.txt instead of .png). A report is composed of written entries that, in our format, each consist of a two-digit number followed by a line of text, as in:

01) Maxila edêntula.
02) Dentes ausentes na mandíbula: 35, 36, 37, 38, 45, 46, 47 e 48.
03) Dente 34: Núcleo metálico e tratamento endodôntico.
...
07) Calcificação do complexo ligamentar estilo hioideo direito e esquerdo.
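
The entry format shown above can be parsed with a short regular expression. The following is a minimal sketch, assuming each entry sits on a single line and that the files are UTF-8 encoded (the encoding is not stated here). The names `parse_report` and `report_path_for` are hypothetical helpers, not part of the repository.

```python
import re
from pathlib import Path

# Two digits, a closing parenthesis, then the entry text.
ENTRY_RE = re.compile(r"^\s*(\d{2})\)\s*(.+?)\s*$")


def parse_report(text):
    """Return a list of (entry_number, entry_text) tuples; other lines are skipped."""
    entries = []
    for line in text.splitlines():
        match = ENTRY_RE.match(line)
        if match:
            entries.append((int(match.group(1)), match.group(2)))
    return entries


def report_path_for(image_path):
    """Map an image path to its report path by swapping the extension to .txt."""
    return Path(image_path).with_suffix(".txt")


sample = (
    "01) Maxila edêntula.\n"
    "02) Dentes ausentes na mandíbula: 35, 36, 37, 38, 45, 46, 47 e 48.\n"
    "03) Dente 34: Núcleo metálico e tratamento endodôntico.\n"
)

for number, text in parse_report(sample):
    print(number, text)
print(report_path_for("0001.png"))  # 0001.txt
```

With files on disk, the text would be read with something like `report_path_for(image_path).read_text(encoding="utf-8")`, under the encoding assumption above.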

Citation

If you use any of the provided data sets, fully or partially, please cite the following paper:

B. Silva, L. Pinheiro, B. Sobrinho, F. Lima, B. Sobrinho, K. Abdalla, M. Pithon, P. Cury, and L. Oliveira, “A holistic approach for classifying dental conditions from textual reports and panoramic radiographs,” in Medical Image Analysis, 2024.

Until the paper is published, you can cite the following BibTeX entry:

@misc{silva2024semisupervisedclassificationdentalconditions,
      title={Semi-supervised classification of dental conditions in panoramic radiographs using large language model and instance segmentation: A real-world dataset evaluation}, 
      author={Bernardo Silva and Jefferson Fontinele and Carolina Letícia Zilli Vieira and João Manuel R. S. Tavares and Patricia Ramos Cury and Luciano Oliveira},
      year={2024},
      eprint={2406.17915},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2406.17915}, 
}

Previous Works

These data sets and the corresponding paper continue earlier work by our group. Please consider reading and citing:

  • B. Silva, L. Pinheiro, B. Sobrinho, F. Lima, B. Sobrinho, K. Abdalla, M. Pithon, P. Cury, and L. Oliveira, “Boosting research on dental panoramic radiographs: a challenging data set, baselines, and a task central online platform for benchmark,” in Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, 2023.
@article{doi:10.1080/21681163.2022.2157747,
author = {Bernardo Peters Menezes Silva and Laís Bastos Pinheiro and Brenda Pereira Pinheiro Sobrinho and Fernanda Pereira Lima and Bruna Pereira Pinheiro Sobrinho and Kalyf Abdalla Buzar Lima and Matheus Melo Pithon and Patricia Ramos Cury and Luciano Rebouças de Oliveira},
title = {Boosting research on dental panoramic radiographs: a challenging data set, baselines, and a task central online platform for benchmark},
journal = {Computer Methods in Biomechanics and Biomedical Engineering: Imaging \& Visualization},
volume = {0},
number = {0},
pages = {1-21},
year  = {2023},
publisher = {Taylor \& Francis},
doi = {10.1080/21681163.2022.2157747}}

  • B. Silva, L. Pinheiro, L. Oliveira, and M. Pithon, “A study on tooth segmentation and numbering using end-to-end deep neural networks,” in Conference on Graphics, Patterns and Images. IEEE, 2020.
@inproceedings{silva2020study,
  title={A study on tooth segmentation and numbering using end-to-end deep neural networks},
  author={Silva, Bernardo and Pinheiro, La{\'\i}s and Oliveira, Luciano and Pithon, Matheus},
  booktitle={Conference on Graphics, Patterns and Images (SIBGRAPI)},
  year={2020},
  organization={IEEE}
}
  • G. Jader, J. Fontineli, M. Ruiz, K. Abdalla, M. Pithon, and L. Oliveira, “Deep instance segmentation of teeth in panoramic X-ray images,” in Conference on Graphics, Patterns and Images. IEEE, 2018.
@inproceedings{jader2018deep,
  title={Deep instance segmentation of teeth in panoramic X-ray images},
  author={Jader, Gil and Fontineli, Jefferson and Ruiz, Marco and Abdalla, Kalyf and Pithon, Matheus and Oliveira, Luciano},
  booktitle={Conference on Graphics, Patterns and Images (SIBGRAPI)},
  pages={400--407},
  year={2018},
  organization={IEEE}
}
  • G. Silva, L. Oliveira, and M. Pithon, “Automatic segmenting teeth in X-ray images: Trends, a novel data set, benchmarking and future perspectives,” Expert Systems with Applications, vol. 107, pp. 15-31, 2018.
@article{silva2018automatic,
  title={Automatic segmenting teeth in X-ray images: Trends, a novel data set, benchmarking and future perspectives},
  author={Silva, Gil and Oliveira, Luciano and Pithon, Matheus},
  journal={Expert Systems with Applications},
  volume={107},
  pages={15--31},
  year={2018},
  publisher={Elsevier}
}
  • L. Pinheiro, B. Silva, B. Sobrinho, F. Lima, P. Cury, and L. Oliveira, “Numbering permanent and deciduous teeth via deep instance segmentation in panoramic X-rays,” in Symposium on Medical Information Processing and Analysis (SIPAIM). SPIE, 2021.
@inproceedings{pinheiro2021numbering,
  title={Numbering permanent and deciduous teeth via deep instance segmentation in panoramic X-rays},
  author={Pinheiro, La{\'\i}s and Silva, Bernardo and Sobrinho, Brenda and Lima, Fernanda and Cury, Patr{\'\i}cia and Oliveira, Luciano},
  booktitle={Symposium on Medical Information Processing and Analysis (SIPAIM)},
  year={2021},
  organization={SPIE}
}

Request the Data Sets

Copy the text below into a PDF file, fill out the fields in the header, and sign it at the end. Then send an e-mail with the PDF attached to lrebouca@ufba.br to receive a link to download the OdontoAI Open Panoramic Radiographs data set. The e-mail must be sent from a professor's valid institutional account:

Subject: Request to download the data sets of dental conditions.

"Name: [your first and last name]

Affiliation: [university where you work]

Department: [your department]

Current position: [your job title]

E-mail: [must be the e-mail at the above-mentioned institution]

I have read and agreed to follow the terms and conditions below: The following conditions define the use of the three data sets of dental conditions:

This data set is provided "AS IS" without any express or implied warranty. Although every effort has been made to ensure accuracy, IvisionLab does not take any responsibility for errors or omissions;

Without the express permission of IvisionLab, any of the following will be considered illegal: redistribution, modification, and commercial usage of this data set in any way or form, either partially or in its entirety;

All images in this data set are only allowed for demonstration in academic publications and presentations;

This data set will only be used for research purposes. I will not make any part of this data set available to a third party. I'll not sell any part of this data set or make any profit from its use.

[your signature]"

P.S. A link to the data set file will be sent as soon as possible.
