🔒 fix(model): huggingface unsafe download #2823

Open. samet-akcay wants to merge 14 commits into base: main

samet-akcay (Contributor) commented Jul 9, 2025

📝 Description

  • This PR addresses a security finding (B615: huggingface_unsafe_download) reported by bandit. The from_pretrained() calls in the Hugging Face backend did not specify a revision, so a download could pick up whatever the model repository currently contained, including a compromised model.
    This change resolves the finding by explicitly pinning the model revision to "main" for both the processor and the model in src/anomalib/models/image/vlm_ad/backends/huggingface.py, so the revision being loaded is stated in the code rather than left implicit.
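
    As a rough illustration of the pinning described above, the sketch below forwards an explicit revision to a from_pretrained-style loader. The helper name load_pinned and the stand-in loader fake_from_pretrained are hypothetical and are not part of anomalib; with the transformers library, the loader would be something like AutoProcessor.from_pretrained, which accepts a revision keyword (a branch name, a tag, or a commit hash).

```python
from typing import Any, Callable


def load_pinned(
    loader: Callable[..., Any],
    model_name: str,
    revision: str = "main",
    **kwargs: Any,
) -> Any:
    """Call a from_pretrained-style loader with an explicit revision.

    `load_pinned` is a hypothetical helper used only for illustration.
    """
    # The revision is always passed explicitly instead of relying on the
    # loader's implicit default.
    return loader(model_name, revision=revision, **kwargs)


def fake_from_pretrained(model_name: str, **kwargs: Any) -> dict[str, Any]:
    """Stand-in loader that records its arguments instead of downloading."""
    return {"model_name": model_name, **kwargs}


if __name__ == "__main__":
    # "some-org/some-model" is a placeholder, not a real model id.
    print(load_pinned(fake_from_pretrained, "some-org/some-model"))
```

    A branch name such as "main" can still move over time; passing a full commit hash as the revision would pin the download to one exact snapshot.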

✨ Changes

Select what type of change your PR is:

  • 🚀 New feature (non-breaking change which adds functionality)
  • 🐞 Bug fix (non-breaking change which fixes an issue)
  • 🔄 Refactor (non-breaking change which refactors the code base)
  • ⚡ Performance improvements
  • 🎨 Style changes (code style/formatting)
  • 🧪 Tests (adding/modifying tests)
  • 📚 Documentation update
  • 📦 Build system changes
  • 🚧 CI/CD configuration
  • 🔧 Chore (general maintenance)
  • 🔒 Security update
  • 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)

✅ Checklist

Before you submit your pull request, please make sure you have completed the following steps:

  • 📚 I have made the necessary updates to the documentation (if applicable).
  • 🧪 I have written tests that support my changes and prove that my fix is effective or my feature works (if applicable).
  • 🏷️ My PR title follows conventional commit format.

For more information about code review checklists, see the Code Review Checklist.

…base

This commit updates various DataLoader instances in the project to enable the  option, enhancing performance for data loading on GPU. Changes were made in the following files:

- : Updated train and test DataLoader configurations.
- : Modified datamodule DataLoader to include .
- : Added  to evaluation DataLoader.
- : Updated DataLoader for datasets to utilize .
- : Enabled  for reference dataset DataLoader.
- : Adjusted inference DataLoader to include .

These changes aim to optimize memory usage and improve data transfer speeds during model training and inference.

Signed-off-by: samet-akcay <samet.akcay@intel.com>
This commit refactors the  function in  to utilize a dictionary mapping for decoder architectures, improving readability and maintainability. The previous conditional checks have been replaced with a more efficient approach, enhancing the overall structure of the code.

Signed-off-by: samet-akcay <samet.akcay@intel.com>
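
The dictionary-mapping refactor described in the commit above can be sketched roughly as follows. The commit message does not show the actual function or decoder names, so get_decoder, DECODER_BUILDERS, and the two builder functions are hypothetical placeholders.

```python
from typing import Callable


def build_small_decoder() -> str:
    """Hypothetical builder; a real one would return a decoder module."""
    return "small-decoder"


def build_large_decoder() -> str:
    """Hypothetical builder; a real one would return a decoder module."""
    return "large-decoder"


# One lookup table replaces a chain of if/elif checks on the name.
DECODER_BUILDERS: dict[str, Callable[[], str]] = {
    "small": build_small_decoder,
    "large": build_large_decoder,
}


def get_decoder(name: str) -> str:
    """Look up the builder for `name` and call it."""
    try:
        builder = DECODER_BUILDERS[name]
    except KeyError:
        msg = f"Unknown decoder architecture: {name!r}. Available: {sorted(DECODER_BUILDERS)}"
        raise ValueError(msg) from None
    return builder()
```

Adding a new architecture then only requires a new entry in the table, and the error message lists the supported names automatically.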
…nsistency

This commit modifies the initialization of the logit_scale parameter in the CLIP model to utilize torch.log instead of np.log. This change ensures consistency in tensor operations and improves compatibility with PyTorch's computation graph.

Signed-off-by: samet-akcay <samet.akcay@intel.com>
…lculations

This commit modifies the anomaly map generation logic to utilize PyTorch tensors instead of NumPy arrays for various calculations. This change enhances compatibility with the PyTorch computation graph and improves performance by leveraging GPU acceleration. Key updates include the conversion of statistical calculations and tensor operations to use PyTorch functions, ensuring consistency in tensor handling throughout the code.

Signed-off-by: samet-akcay <samet.akcay@intel.com>
…istical calculations

This commit refactors the anomaly map generation logic to replace NumPy-based statistical calculations with PyTorch equivalents, specifically using the  distribution for computing tau. Additionally, it improves precision handling by allowing the use of float64 in high precision mode. The changes streamline the computation process and maintain compatibility with the PyTorch computation graph.

Signed-off-by: samet-akcay <samet.akcay@intel.com>
Signed-off-by: samet-akcay <samet.akcay@intel.com>
This commit improves the URL validation in the download function to ensure only http and https schemes are allowed. Additionally, it adds comments to clarify the safety of using  under these conditions, enhancing code readability and security awareness.

Signed-off-by: samet-akcay <samet.akcay@intel.com>
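
A minimal sketch of the scheme check described in the commit above, using only the standard library. The function name validate_download_url is hypothetical; the commit message does not show the actual code.

```python
from urllib.parse import urlparse

# Only these schemes are accepted; anything else (file:, ftp:, ...) is rejected.
ALLOWED_SCHEMES = frozenset({"http", "https"})


def validate_download_url(url: str) -> str:
    """Return `url` unchanged if its scheme is http or https, else raise."""
    scheme = urlparse(url).scheme.lower()
    if scheme not in ALLOWED_SCHEMES:
        msg = f"Unsupported URL scheme {scheme!r} in {url!r}; expected http or https."
        raise ValueError(msg)
    return url
```

Checking the scheme before opening the URL keeps local paths such as file:///etc/passwd from being passed to the download routine.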
This commit introduces an optional `revision` parameter in the Huggingface backend's initialization, allowing users to specify the model revision when loading the processor and model. The changes ensure that the correct model version is utilized, enhancing flexibility and usability for different model configurations.

Signed-off-by: samet-akcay <samet.akcay@intel.com>
…rameter

This commit removes the optional `revision` parameter from the Huggingface backend's initialization, defaulting to "main" for model loading. This change streamlines the code and ensures consistent model version usage, enhancing clarity and maintainability.

Signed-off-by: Samet Akcay <samet.akcay@intel.com>