# feat(model): Add Dinomaly Model #2835
**Open** — rajeshgangireddy wants to merge 46 commits into `open-edge-platform:main` from `rajeshgangireddy:dinomaly_workspace`
## Commits
All 46 commits are by rajeshgangireddy; commit subjects truncated by GitHub are left as `…`.

- `25b1b42` Rebuilt again
- `5c7355b` feat(ViTill): enhance model initialization and validation, improve fe…
- `049c7c5` feat(Dinomaly): enhance model documentation and improve training/vali…
- `d9ddea3` fix block's mem attention giving only one output in return
- `cb85343` feat(Dinomaly): Working model. update model initialization and optimi…
- `f77de9f` Refactor DINOv2 training code: remove deprecated training scripts and…
- `b7760bf` feat(DINOmaly): Start cleaning up and adding doc strings
- `4d2c62e` feat(Dinomaly): start adding doc strings
- `a7990d9` feat(ModelLoader): simplify class design, improve API, and enhance er…
- `fbfe346` refactor: remove model loader test script and improvement summary
- `6684f85` feat(Dinomaly): add StableAdamW optimizer and WarmCosineScheduler cla…
- `b5891ab` feat(Dinomaly): implement WarmCosineScheduler and refactor model load…
- `1c4bfa8` Merge remote-tracking branch 'upstream/main' into dinomaly_workspace
- `510802c` Refactor and optimize code across multiple modules
- `a0003f6` docs: update README and module docstrings for Dinomaly model; improve…
- `b9ac935` Remove files not used bu dinov2
- `e442e1b` fix: update import paths for model components and adjust README table…
- `1938628` refactor: remove xFormers dependency checks from attention and block …
- `cc07edd` refactor: remove SwiGLUFFN and related xFormers logic from swiglu_ffn.py
- `5c9c9b9` refactor: remove unused NestedTensorBlock and SwiGLUFFN imports from …
- `d8212ec` refactor: clean up imports and remove unused code in dinov2 components
- `600e8aa` feat: add utility functions for Dinomaly model and benchmark configur…
- `69113ab` feat: implement DinomalyMLP class and update model loader for DINOv2 …
- `f0482da` refactor: replace Mlp with DinomalyMLP in model layers and update ref…
- `6aa9c24` feat: implement global cosine hard mining loss function and refactor …
- `9ee0123` refactor: replace custom DropPath and LayerScale implementations with…
- `1fbc37a` refactor: reorganize Dinomaly model components and update imports for…
- `f95baf5` feat: add layer implementations and training utilities for Dinomaly m…
- `af8511c` refactor: reorganize Dinomaly model components and update imports for…
- `a3391b5` refactor: clean up code formatting and improve import organization ac…
- `f45bfbe` refactor: improve readability by formatting parameters in patch embed…
- `1af6c76` Remove workspace from Git tracking
- `279699b` Refactor Dinomaly model components for improved type safety and error…
- `8c24fc2` fix: update error message for sparse gradients in StableAdamW optimiz…
- `254c2a5` feat: add training utilities and update Dinomaly model for enhanced l…
- `5280841` refactor: standardize weight downloading process and improve cache di…
- `b81b065` refactor: update image transformation methods and enhance training st…
- `cdf8640` refactor: remove example usage from ViTill class docstrings for clarity
- `87927d5` docs: enhance README.md with detailed architecture and key components…
- `06882d2` Small refactor and minor improvements as per PR comments
- `1e5246f` refactor: replace einsum operations with matrix operations for OpenVI…
- `51cd329` ruff complains about commented code
- `d1162e2` refactor: add comments for security mitigations in weight loading and…
- `462ea07` add dinomaly entry in reference guide
- `3eb4711` fix: make ruff/linters happy
- `5786251` refactor: remove outdated note about DDPStrategy in Dinomaly class
rajeshgangireddy File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
```yaml
model:
  class_path: anomalib.models.Dinomaly
  init_args:
    encoder_name: dinov2reg_vit_base_14
    bottleneck_dropout: 0.2
    decoder_depth: 8

trainer:
  max_steps: 5000
  callbacks:
    - class_path: lightning.pytorch.callbacks.EarlyStopping
      init_args:
        patience: 20
        monitor: image_AUROC
        mode: max
```
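The `EarlyStopping` callback configured above halts training once `image_AUROC` has not improved for 20 consecutive validation checks (`mode: max` means higher is better). A minimal stdlib sketch of that patience logic — a hypothetical helper for illustration, not Lightning's actual implementation:

```python
def early_stop_step(scores, patience=20):
    """Return the index at which training would stop under patience-based
    early stopping (mode=max), or len(scores) if it never stops."""
    best = float("-inf")
    wait = 0
    for i, score in enumerate(scores):
        if score > best:          # improvement resets the patience counter
            best = score
            wait = 0
        else:
            wait += 1
            if wait >= patience:  # no improvement for `patience` checks
                return i
    return len(scores)
```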
# Dinomaly: Vision Transformer-based Anomaly Detection with Feature Reconstruction

This is the implementation of the Dinomaly model based on the [original implementation](https://github.com/guojiajeremy/Dinomaly).

Model Type: Segmentation

## Description

Dinomaly is a Vision Transformer-based anomaly detection model that uses an encoder-decoder architecture for feature reconstruction. The model leverages pre-trained DINOv2 Vision Transformer features and employs a reconstruction-based approach to detect anomalies by comparing encoder and decoder features.

### Feature Extraction

Features are extracted from multiple intermediate layers of a pre-trained DINOv2 Vision Transformer encoder. The model typically uses features from layers 2-9 for base models, providing multi-scale feature representations that capture both low-level and high-level semantic information.

### Architecture

The Dinomaly model consists of three main components:

1. **DINOv2 Encoder**: Pre-trained Vision Transformer that extracts multi-layer features
2. **Bottleneck MLP**: Compresses the multi-layer features before reconstruction
3. **Vision Transformer Decoder**: Reconstructs the compressed features back to the original feature space

### Anomaly Detection

Anomaly detection is performed by computing cosine similarity between encoder and decoder features at multiple scales. The model generates anomaly maps by analyzing the reconstruction quality of features, where poor reconstruction indicates anomalous regions. Both anomaly detection (image-level) and localization (pixel-level) are supported.
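The reconstruction-quality idea above can be sketched in a few lines: score a feature location as one minus the cosine similarity between an encoder feature and its decoder reconstruction, so poor reconstruction yields a high anomaly score. This is a stdlib illustration of the principle only, not the model's actual batched implementation:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors given as plain lists."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def anomaly_score(encoder_feat, decoder_feat):
    """Low similarity (poor reconstruction) -> high anomaly score."""
    return 1.0 - cosine_similarity(encoder_feat, decoder_feat)
```

A perfectly reconstructed feature scores near 0, an orthogonal (badly reconstructed) one near 1; aggregating such scores spatially yields a pixel-level anomaly map, and pooling over the map yields an image-level score.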
## Usage

`anomalib train --model Dinomaly --data MVTecAD --data.category <category>`
## Benchmark

All results gathered with seed `42`.

## [MVTec AD Dataset](https://www.mvtec.com/company/research/datasets/mvtec-ad)

### Image-Level AUC

|          | Avg | Carpet | Grid | Leather | Tile | Wood | Bottle | Cable | Capsule | Hazelnut | Metal Nut | Pill | Screw | Toothbrush | Transistor | Zipper |
| -------- | :-: | :----: | :--: | :-----: | :--: | :--: | :----: | :---: | :-----: | :------: | :-------: | :--: | :---: | :--------: | :--------: | :----: |
| Dinomaly |  -  |   -    |  -   |    -    |  -   |  -   |   -    |   -   |    -    |    -     |     -     |  -   |   -   |     -      |     -      |   -    |

### Pixel-Level AUC

|          | Avg | Carpet | Grid | Leather | Tile | Wood | Bottle | Cable | Capsule | Hazelnut | Metal Nut | Pill | Screw | Toothbrush | Transistor | Zipper |
| -------- | :-: | :----: | :--: | :-----: | :--: | :--: | :----: | :---: | :-----: | :------: | :-------: | :--: | :---: | :--------: | :--------: | :----: |
| Dinomaly |  -  |   -    |  -   |    -    |  -   |  -   |   -    |   -   |    -    |    -     |     -     |  -   |   -   |     -      |     -      |   -    |

### Image F1 Score

|          | Avg | Carpet | Grid | Leather | Tile | Wood | Bottle | Cable | Capsule | Hazelnut | Metal Nut | Pill | Screw | Toothbrush | Transistor | Zipper |
| -------- | :-: | :----: | :--: | :-----: | :--: | :--: | :----: | :---: | :-----: | :------: | :-------: | :--: | :---: | :--------: | :--------: | :----: |
| Dinomaly |  -  |   -    |  -   |    -    |  -   |  -   |   -    |   -   |    -    |    -     |     -     |  -   |   -   |     -      |     -      |   -    |
```python
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

"""Dinomaly: Vision Transformer-based Anomaly Detection with Feature Reconstruction.

The Dinomaly model implements a Vision Transformer encoder-decoder architecture for
anomaly detection using pre-trained DINOv2 features. The model extracts features from
multiple intermediate layers of a DINOv2 encoder, compresses them through a bottleneck
MLP, and reconstructs them using a Vision Transformer decoder.

Anomaly detection is performed by computing cosine similarity between encoder and decoder
features at multiple scales. The model is particularly effective for visual anomaly
detection tasks where the goal is to identify regions or images that deviate from
normal patterns learned during training.

Example:
    >>> from anomalib.models.image import Dinomaly
    >>> model = Dinomaly()

The model can be used with any of the supported datasets and task modes in
anomalib. It leverages the powerful feature representations from DINOv2 Vision
Transformers combined with a reconstruction-based approach for robust anomaly detection.

Notes:
    - Uses DINOv2 Vision Transformer as the backbone encoder
    - Features are extracted from intermediate layers for multi-scale analysis
    - Employs feature reconstruction loss for unsupervised learning
    - Supports both anomaly detection and localization tasks
    - Requires significant GPU memory due to Vision Transformer architecture

See Also:
    :class:`anomalib.models.image.dinomaly.lightning_model.Dinomaly`:
        Lightning implementation of the Dinomaly model.
"""

from anomalib.models.image.dinomaly.lightning_model import Dinomaly

__all__ = ["Dinomaly"]
```
```python
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

"""Components module for Dinomaly model.

This module provides all the necessary components for the Dinomaly Vision Transformer
architecture including layers, model loader, utilities, and vision transformer implementations.
"""

# Layer components
from .layers import (
    Attention,
    Block,
    DinomalyMLP,
    LinearAttention,
    MemEffAttention,
)

# Model loader
from .model_loader import DinoV2Loader, load

# Utility functions and classes
from .training_utils import (
    CosineHardMiningLoss,
    StableAdamW,
    WarmCosineScheduler,
)

# Vision transformer components
from .vision_transformer import (
    DinoVisionTransformer,
)

__all__ = [
    # Layers
    "Attention",
    "Block",
    "DinomalyMLP",
    "LinearAttention",
    "MemEffAttention",
    # Model loader
    "DinoV2Loader",
    "load",
    # Utils
    "StableAdamW",
    "WarmCosineScheduler",
    "CosineHardMiningLoss",
    # Vision transformer
    "DinoVisionTransformer",
]
```
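The `WarmCosineScheduler` exported above suggests a linear warmup followed by cosine decay of the learning rate. A minimal stdlib sketch of such a schedule — this is an assumption based on the class name, not the class's actual code:

```python
import math

def warm_cosine_lr(step, total_steps, warmup_steps, base_lr, final_lr=0.0):
    """Learning rate at `step`: linear warmup to base_lr over warmup_steps,
    then cosine decay from base_lr down to final_lr by total_steps."""
    if step < warmup_steps:
        # Linear ramp from base_lr/warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Cosine decay over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return final_lr + 0.5 * (base_lr - final_lr) * (1 + math.cos(math.pi * progress))
```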