Commit c8ab62d

manuelkonrad, samet-akcay, and rajeshgangireddy authored
Tabular datamodule (Custom dataset from DataFrame, CSV, or Parquet) (#2713)
* added first draft of tabular datamodule
  Signed-off-by: Manuel Konrad <84141230+manuelkonrad@users.noreply.github.com>
* refactored make_tabular_dataset and addressed some minor comments
  Signed-off-by: Manuel Konrad <84141230+manuelkonrad@users.noreply.github.com>
* added example notebook for the tabular datamodule
  Signed-off-by: Manuel Konrad <84141230+manuelkonrad@users.noreply.github.com>
* added docstring example for Tabular.from_file constructor
  Signed-off-by: Manuel Konrad <84141230+manuelkonrad@users.noreply.github.com>

---------

Signed-off-by: Manuel Konrad <84141230+manuelkonrad@users.noreply.github.com>
Co-authored-by: Samet Akcay <samet.akcay@intel.com>
Co-authored-by: Rajesh Gangireddy <rajesh.gangireddy@intel.com>

1 parent 2563624 commit c8ab62d

15 files changed, +755 -4 lines changed

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -11,6 +11,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
 - 🚀 Add new SOTA video Anomaly detection module FUVAS
 - 🚀 Add VAD dataset by @abc-125 in https://github.com/open-edge-platform/anomalib/pull/2603
 - 🚀 Add Tiled Ensemble for V2 by @blaz-r in https://github.com/open-edge-platform/anomalib/pull/2660
+- 🚀 Add Tabular datamodule by @manuelkonrad in https://github.com/openvinotoolkit/anomalib/pull/2713

 ### Removed

docs/source/markdown/guides/reference/data/datamodules/image.md

Lines changed: 7 additions & 0 deletions
@@ -28,6 +28,13 @@ Dataset format compatible with Intel Geti™.
 Custom folder-based dataset organization.
 :::

+:::{grid-item-card} Tabular
+:link: anomalib.data.datamodules.image.Tabular
+:link-type: doc
+
+Custom tabular dataset.
+:::
+
 :::{grid-item-card} Kolektor
 :link: anomalib.data.datamodules.image.Kolektor
 :link-type: doc

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+# Tabular Datamodule
+
+```{eval-rst}
+.. automodule:: anomalib.data.datamodules.image.tabular
+   :members:
+   :show-inheritance:
+```

examples/configs/data/tabular.yaml

Lines changed: 73 additions & 0 deletions
@@ -0,0 +1,73 @@
+class_path: anomalib.data.Tabular
+init_args:
+  name: bottle
+  root: "datasets/MVTecAD/bottle"
+  train_batch_size: 32
+  eval_batch_size: 32
+  num_workers: 8
+  test_split_mode: from_dir
+  test_split_ratio: 0.2
+  val_split_mode: same_as_test
+  val_split_ratio: 0.5
+  seed: null
+  samples:
+    - image_path: train/good/000.png
+      label_index: 0
+      mask_path: ""
+      split: train
+    - image_path: train/good/001.png
+      label_index: 0
+      mask_path: ""
+      split: train
+    - image_path: train/good/002.png
+      label_index: 0
+      mask_path: ""
+      split: train
+    - image_path: train/good/003.png
+      label_index: 0
+      mask_path: ""
+      split: train
+    - image_path: train/good/004.png
+      label_index: 0
+      mask_path: ""
+      split: train
+    - image_path: test/broken_large/000.png
+      label_index: 1
+      mask_path: ground_truth/broken_large/000_mask.png
+      split: test
+    - image_path: test/broken_large/002.png
+      label_index: 1
+      mask_path: ground_truth/broken_large/002_mask.png
+      split: test
+    - image_path: test/broken_large/004.png
+      label_index: 1
+      mask_path: ground_truth/broken_large/004_mask.png
+      split: test
+    - image_path: test/good/000.png
+      label_index: 0
+      mask_path: ""
+      split: test
+    - image_path: test/good/001.png
+      label_index: 0
+      mask_path: ""
+      split: test
+    - image_path: test/good/003.png
+      label_index: 0
+      mask_path: ""
+      split: test
+    - image_path: test/broken_large/001.png
+      label_index: 1
+      mask_path: ground_truth/broken_large/001_mask.png
+      split: test
+    - image_path: test/broken_large/003.png
+      label_index: 1
+      mask_path: ground_truth/broken_large/003_mask.png
+      split: test
+    - image_path: test/good/002.png
+      label_index: 0
+      mask_path: ""
+      split: test
+    - image_path: test/good/004.png
+      label_index: 0
+      mask_path: ""
+      split: test

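
Note on the config above: each entry under `samples` carries the same four keys (`image_path`, `label_index`, `mask_path`, `split`), so the table can also live in a CSV file, which the commit title lists as a supported source. The sketch below writes and re-reads such a table using only the Python standard library. It assumes the CSV column names match the YAML keys one-to-one, which this diff does not confirm, and it deliberately stops short of calling the `Tabular` datamodule because its constructor signature is not shown here.

```python
import csv
import io

# Column names copied from the keys under `samples:` in tabular.yaml.
# Assumption: a CSV passed to the datamodule would use the same names.
FIELDS = ["image_path", "label_index", "mask_path", "split"]

# Three rows taken from the config: one training image, one normal test
# image, and one anomalous test image with its ground-truth mask.
rows = [
    {"image_path": "train/good/000.png", "label_index": 0,
     "mask_path": "", "split": "train"},
    {"image_path": "test/good/000.png", "label_index": 0,
     "mask_path": "", "split": "test"},
    {"image_path": "test/broken_large/000.png", "label_index": 1,
     "mask_path": "ground_truth/broken_large/000_mask.png", "split": "test"},
]

# Write the table to an in-memory buffer instead of a real file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(rows)

# Read it back; csv returns every value as a string, so the label is cast.
buffer.seek(0)
restored = [
    {**record, "label_index": int(record["label_index"])}
    for record in csv.DictReader(buffer)
]

# Simple sanity check: anomalous samples should reference a mask.
anomalous_without_mask = [
    r["image_path"] for r in restored
    if r["label_index"] == 1 and not r["mask_path"]
]
```

Normal samples keep an empty `mask_path`, mirroring the `mask_path: ""` entries in the YAML.
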
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e38748d6c72f0c5a115c1e925b6012c72be1cf2529f03d508f533ca11634033c
+size 9777

src/anomalib/data/__init__.py

Lines changed: 4 additions & 0 deletions
@@ -61,6 +61,7 @@
     MVTecAD2,
     MVTecLOCO,
     RealIAD,
+    Tabular,
     Visa,
 )
 from .datamodules.video import Avenue, ShanghaiTech, UCSDped, VideoDataFormat
@@ -75,6 +76,7 @@
     KolektorDataset,
     MVTecADDataset,
     MVTecLOCODataset,
+    TabularDataset,
     VADDataset,
     VisaDataset,
 )
@@ -181,6 +183,7 @@ def get_datamodule(config: DictConfig | ListConfig | dict) -> AnomalibDataModule
     "MVTecAD2",
     "MVTecLOCO",
     "RealIAD",
+    "Tabular",
     "VAD",
     "Visa",
     # Video Data Modules
@@ -196,6 +199,7 @@ def get_datamodule(config: DictConfig | ListConfig | dict) -> AnomalibDataModule
     "KolektorDataset",
     "MVTecADDataset",
     "MVTecLOCODataset",
+    "TabularDataset",
     "VADDataset",
     "VisaDataset",
     "AvenueDataset",

src/anomalib/data/datamodules/__init__.py

Lines changed: 2 additions & 1 deletion
@@ -4,7 +4,7 @@
 # SPDX-License-Identifier: Apache-2.0

 from .depth import Folder3D, MVTec3D
-from .image import VAD, BTech, Datumaro, Folder, Kolektor, MVTec, MVTecAD, Visa
+from .image import VAD, BTech, Datumaro, Folder, Kolektor, MVTec, MVTecAD, Tabular, Visa
 from .video import Avenue, ShanghaiTech, UCSDped

 __all__ = [
@@ -16,6 +16,7 @@
     "Kolektor",
     "MVTec",  # Include MVTec for backward compatibility
     "MVTecAD",
+    "Tabular",
     "VAD",
     "Visa",
     "Avenue",

src/anomalib/data/datamodules/image/__init__.py

Lines changed: 5 additions & 0 deletions
@@ -10,6 +10,7 @@
 - ``MVTecAD``: MVTec Anomaly Detection Dataset
 - ``MVTecAD2``: MVTec Anomaly Detection Dataset 2
 - ``MVTecLOCO``: MVTec LOCO Dataset with logical and structural anomalies
+- ``Tabular``: Custom tabular dataset with image paths and labels
 - ``VAD``: Valeo Anomaly Detection Dataset
 - ``Visa``: Visual Anomaly Dataset

@@ -36,6 +37,7 @@
 from .mvtecad import MVTec, MVTecAD
 from .mvtecad2 import MVTecAD2
 from .realiad import RealIAD
+from .tabular import Tabular
 from .vad import VAD
 from .visa import Visa

@@ -54,6 +56,7 @@ class ImageDataFormat(str, Enum):
     - ``MVTEC_AD_2``: MVTec AD 2 Dataset
     - ``MVTEC_3D``: MVTec 3D AD Dataset
     - ``MVTEC_LOCO``: MVTec LOCO Dataset
+    - ``TABULAR``: Custom Tabular Dataset
     - ``REALIAD``: Real-IAD Dataset
     - ``VAD``: Valeo Anomaly Detection Dataset
     - ``VISA``: Visual Anomaly Dataset
@@ -69,6 +72,7 @@ class ImageDataFormat(str, Enum):
     MVTEC_3D = "mvtec_3d"
     MVTEC_LOCO = "mvtec_loco"
     REAL_IAD = "realiad"
+    TABULAR = "tabular"
     VAD = "vad"
     VISA = "visa"

@@ -83,6 +87,7 @@ class ImageDataFormat(str, Enum):
     "MVTecAD2",
     "MVTecLOCO",
     "RealIAD",
+    "Tabular",
     "VAD",
     "Visa",
 ]
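
Note on the `ImageDataFormat` hunk: because the enum subclasses both `str` and `Enum`, the new `TABULAR` member can be looked up from its plain string value and compared directly against strings. The sketch below is a trimmed stand-in that reproduces only the members visible in the hunk. It is not the full class from `src/anomalib/data/datamodules/image/__init__.py`, and how anomalib itself consumes the enum is not shown in this diff.

```python
from enum import Enum


class ImageDataFormat(str, Enum):
    """Stand-in containing only the members visible in the diff hunk."""

    MVTEC_3D = "mvtec_3d"
    MVTEC_LOCO = "mvtec_loco"
    REAL_IAD = "realiad"
    TABULAR = "tabular"
    VAD = "vad"
    VISA = "visa"


# Lookup by value returns the existing member, not a new object.
fmt = ImageDataFormat("tabular")

# The str mixin makes members compare equal to their raw values.
is_tabular = fmt == "tabular"

# Unknown values raise ValueError, which is useful for validating config input.
try:
    ImageDataFormat("spreadsheet")
    rejected_unknown = False
except ValueError:
    rejected_unknown = True
```
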
