Tabular datamodule (Custom dataset from DataFrame, CSV, or Parquet) #2713

Merged
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -11,6 +11,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
- 🚀 Add new SOTA video Anomaly detection module FUVAS
- 🚀 Add VAD dataset by @abc-125 in https://github.com/open-edge-platform/anomalib/pull/2603
- 🚀 Add Tiled Ensemble for V2 by @blaz-r in https://github.com/open-edge-platform/anomalib/pull/2660
- 🚀 Add Tabular datamodule by @manuelkonrad in https://github.com/openvinotoolkit/anomalib/pull/2713

### Removed

@@ -28,6 +28,13 @@ Dataset format compatible with Intel Geti™.
Custom folder-based dataset organization.
:::

:::{grid-item-card} Tabular
:link: anomalib.data.datamodules.image.Tabular
:link-type: doc

Custom tabular dataset.
:::

:::{grid-item-card} Kolektor
:link: anomalib.data.datamodules.image.Kolektor
:link-type: doc
@@ -0,0 +1,7 @@
# Tabular Datamodule

```{eval-rst}
.. automodule:: anomalib.data.datamodules.image.tabular
   :members:
   :show-inheritance:
```
73 changes: 73 additions & 0 deletions examples/configs/data/tabular.yaml
@@ -0,0 +1,73 @@
class_path: anomalib.data.Tabular
init_args:
  name: bottle
  root: "datasets/MVTecAD/bottle"
  train_batch_size: 32
  eval_batch_size: 32
  num_workers: 8
  test_split_mode: from_dir
  test_split_ratio: 0.2
  val_split_mode: same_as_test
  val_split_ratio: 0.5
  seed: null
  samples:
    - image_path: train/good/000.png
      label_index: 0
      mask_path: ""
      split: train
    - image_path: train/good/001.png
      label_index: 0
      mask_path: ""
      split: train
    - image_path: train/good/002.png
      label_index: 0
      mask_path: ""
      split: train
    - image_path: train/good/003.png
      label_index: 0
      mask_path: ""
      split: train
    - image_path: train/good/004.png
      label_index: 0
      mask_path: ""
      split: train
    - image_path: test/broken_large/000.png
      label_index: 1
      mask_path: ground_truth/broken_large/000_mask.png
      split: test
    - image_path: test/broken_large/002.png
      label_index: 1
      mask_path: ground_truth/broken_large/002_mask.png
      split: test
    - image_path: test/broken_large/004.png
      label_index: 1
      mask_path: ground_truth/broken_large/004_mask.png
      split: test
    - image_path: test/good/000.png
      label_index: 0
      mask_path: ""
      split: test
    - image_path: test/good/001.png
      label_index: 0
      mask_path: ""
      split: test
    - image_path: test/good/003.png
      label_index: 0
      mask_path: ""
      split: test
    - image_path: test/broken_large/001.png
      label_index: 1
      mask_path: ground_truth/broken_large/001_mask.png
      split: test
    - image_path: test/broken_large/003.png
      label_index: 1
      mask_path: ground_truth/broken_large/003_mask.png
      split: test
    - image_path: test/good/002.png
      label_index: 0
      mask_path: ""
      split: test
    - image_path: test/good/004.png
      label_index: 0
      mask_path: ""
      split: test
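The sample rows in the config above follow a regular pattern, so they could be generated rather than typed by hand. The following is a minimal sketch using only the standard library; it does not import anomalib. The helper names `make_sample` and `samples_to_csv` are hypothetical, and the labelling and mask-naming rules are assumptions inferred from the rows shown (`good` maps to `label_index: 0` with an empty mask, anything else to `1` with a mask under `ground_truth/`).

```python
import csv
import io
from pathlib import PurePosixPath

# Column names taken from the sample entries in the config above.
COLUMNS = ["image_path", "label_index", "mask_path", "split"]


def make_sample(image_path: str) -> dict:
    """Build one row from a relative path such as 'test/broken_large/000.png'."""
    split, category, filename = PurePosixPath(image_path).parts
    is_normal = category == "good"
    # Assumed rule, inferred from the config rows: anomalous images have a mask
    # at ground_truth/<category>/<stem>_mask.png; normal images have none.
    stem = PurePosixPath(filename).stem
    mask_path = "" if is_normal else f"ground_truth/{category}/{stem}_mask.png"
    return {
        "image_path": image_path,
        "label_index": 0 if is_normal else 1,
        "mask_path": mask_path,
        "split": split,
    }


def samples_to_csv(samples: list[dict]) -> str:
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(samples)
    return buffer.getvalue()


paths = ["train/good/000.png", "test/broken_large/000.png", "test/good/000.png"]
samples = [make_sample(p) for p in paths]
csv_text = samples_to_csv(samples)
print(csv_text)
```

The resulting rows have the same four fields as the `samples` entries in the config; whether the datamodule accepts a CSV produced this way depends on the loader added in this PR, which is not shown in this excerpt.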
3 changes: 3 additions & 0 deletions examples/notebooks/100_datamodules/105_tabular.ipynb
Git LFS file not shown
4 changes: 4 additions & 0 deletions src/anomalib/data/__init__.py
@@ -61,6 +61,7 @@
    MVTecAD2,
    MVTecLOCO,
    RealIAD,
    Tabular,
    Visa,
)
from .datamodules.video import Avenue, ShanghaiTech, UCSDped, VideoDataFormat
@@ -75,6 +76,7 @@
    KolektorDataset,
    MVTecADDataset,
    MVTecLOCODataset,
    TabularDataset,
    VADDataset,
    VisaDataset,
)
@@ -181,6 +183,7 @@ def get_datamodule(config: DictConfig | ListConfig | dict) -> AnomalibDataModule
    "MVTecAD2",
    "MVTecLOCO",
    "RealIAD",
    "Tabular",
    "VAD",
    "Visa",
    # Video Data Modules
@@ -196,6 +199,7 @@ def get_datamodule(config: DictConfig | ListConfig | dict) -> AnomalibDataModule
    "KolektorDataset",
    "MVTecADDataset",
    "MVTecLOCODataset",
    "TabularDataset",
    "VADDataset",
    "VisaDataset",
    "AvenueDataset",
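For readers unfamiliar with the `class_path` / `init_args` layout used in `tabular.yaml`, the general pattern is to import the class named by `class_path` and call it with `init_args` as keyword arguments. The sketch below illustrates that pattern only; it is not the body of `get_datamodule`, which this excerpt does not show. The `instantiate` helper is hypothetical, and `fractions.Fraction` stands in for a datamodule class so the example runs without anomalib installed.

```python
import importlib
from typing import Any


def instantiate(config: dict[str, Any]) -> Any:
    """Import the class named by 'class_path' and call it with 'init_args'."""
    module_name, _, class_name = config["class_path"].rpartition(".")
    cls = getattr(importlib.import_module(module_name), class_name)
    return cls(**config.get("init_args", {}))


# Stand-in config: a standard-library class instead of anomalib.data.Tabular.
config = {
    "class_path": "fractions.Fraction",
    "init_args": {"numerator": 1, "denominator": 2},
}
half = instantiate(config)
print(half)
```

Exporting `Tabular` from `anomalib.data` is what makes a dotted path such as `anomalib.data.Tabular` importable in this style.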
3 changes: 2 additions & 1 deletion src/anomalib/data/datamodules/__init__.py
@@ -4,7 +4,7 @@
# SPDX-License-Identifier: Apache-2.0

from .depth import Folder3D, MVTec3D
from .image import VAD, BTech, Datumaro, Folder, Kolektor, MVTec, MVTecAD, Visa
from .image import VAD, BTech, Datumaro, Folder, Kolektor, MVTec, MVTecAD, Tabular, Visa
from .video import Avenue, ShanghaiTech, UCSDped

__all__ = [
@@ -16,6 +16,7 @@
    "Kolektor",
    "MVTec",  # Include MVTec for backward compatibility
    "MVTecAD",
    "Tabular",
    "VAD",
    "Visa",
    "Avenue",
5 changes: 5 additions & 0 deletions src/anomalib/data/datamodules/image/__init__.py
@@ -10,6 +10,7 @@
- ``MVTecAD``: MVTec Anomaly Detection Dataset
- ``MVTecAD2``: MVTec Anomaly Detection Dataset 2
- ``MVTecLOCO``: MVTec LOCO Dataset with logical and structural anomalies
- ``Tabular``: Custom tabular dataset with image paths and labels
- ``VAD``: Valeo Anomaly Detection Dataset
- ``Visa``: Visual Anomaly Dataset

@@ -36,6 +37,7 @@
from .mvtecad import MVTec, MVTecAD
from .mvtecad2 import MVTecAD2
from .realiad import RealIAD
from .tabular import Tabular
from .vad import VAD
from .visa import Visa

@@ -54,6 +56,7 @@ class ImageDataFormat(str, Enum):
- ``MVTEC_AD_2``: MVTec AD 2 Dataset
- ``MVTEC_3D``: MVTec 3D AD Dataset
- ``MVTEC_LOCO``: MVTec LOCO Dataset
- ``TABULAR``: Custom Tabular Dataset
- ``REALIAD``: Real-IAD Dataset
- ``VAD``: Valeo Anomaly Detection Dataset
- ``VISA``: Visual Anomaly Dataset
@@ -69,6 +72,7 @@ class ImageDataFormat(str, Enum):
    MVTEC_3D = "mvtec_3d"
    MVTEC_LOCO = "mvtec_loco"
    REAL_IAD = "realiad"
    TABULAR = "tabular"
    VAD = "vad"
    VISA = "visa"

@@ -83,6 +87,7 @@ class ImageDataFormat(str, Enum):
    "MVTecAD2",
    "MVTecLOCO",
    "RealIAD",
    "Tabular",
    "VAD",
    "Visa",
]
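Because `ImageDataFormat` subclasses both `str` and `Enum`, the new `TABULAR` member can be looked up from its string value and compared directly against plain strings. The sketch below demonstrates this with a stand-in enum, `DataFormat`, that copies three of the members shown in the diff; the stand-in name is hypothetical and is used so the example runs without anomalib.

```python
from enum import Enum


class DataFormat(str, Enum):
    """Stand-in mirroring a few members of ImageDataFormat from the diff."""

    MVTEC_LOCO = "mvtec_loco"
    TABULAR = "tabular"
    VISA = "visa"


# Lookup by value returns the existing member, not a new object.
fmt = DataFormat("tabular")
print(fmt is DataFormat.TABULAR)

# The str mixin makes members compare equal to their raw string value.
print(fmt == "tabular")
print(fmt.value)
```

An unknown value such as `DataFormat("csv")` raises `ValueError`, which is the usual way a typo in a format name surfaces with this kind of enum.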