Adding the pipeline for the task explanation and Llm #2190
Open
Bepitic wants to merge 50 commits into open-edge-platform:main from Bepitic:llm-pipeline
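In short, this PR adds a GPTVad model together with a GPTWrapper around the OpenAI chat-completions API, enabling zero-/few-shot visual anomaly detection by prompting a GPT-4-class vision model with reference and query images (see the diffs below).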
Commits (50, all by Bepitic):
adbca17 Add Task EXPLANATION and the visualization of images with description.
5611ec1 upd dataset task with explanation
8ed23a3 fix tasktype on metrics, depth, cataset, inferencer.
a463b5b Merge branch 'main' into llm-pipeline
d5baf6b fix lint on visualization/image
b7c8eaa Merge branch 'openvinotoolkit:main' into llm-pipeline
5b563d9 Merge branch 'llm-pipeline' of github.com:Bepitic/anomalib into llm-p…
bfd936e Fix formatting dataset
f541316 fix format data/base/depth
4e392a9 Fix formatting openvino_inferencer
5fc70ba fix formatting
75099af Add Explanation to error-msg.
e5040d3 OpenAI - VLM init
86ad803 Add wrapper to run OpenAI
3678f72 add in ppyproject
7413842 Add Test and fix description/title
dc42cbd Add Readme and fix bug.
5788d22 Update src/anomalib/models/image/openai_vlm/lightning_model.py
e4f6bec Update src/anomalib/models/image/openai_vlm/__init__.py
5437467 Add fix pipeline bug.
982c9ca Add test.
642fd26 Merge branch 'OpenAI-VLM' of github.com:Bepitic/anomalib into OpenAI-VLM
b8cacf0 add changes
0929dc9 Add integration test and unit test + skip export.
39cf996 change to LANGUAGE
671693d Update images in Readme.
224118b Update src/anomalib/models/image/chatgpt_vision/__init__.py
b703a41 Update src/anomalib/models/image/chatgpt_vision/chatgpt.py
24c5486 Update src/anomalib/models/image/chatgpt_vision/lightning_model.py
68e757e Update tests/integration/model/test_models.py
86714a1 Update src/anomalib/models/image/chatgpt_vision/lightning_model.py
196d2a3 Update src/anomalib/models/image/chatgpt_vision/lightning_model.py
b7f345a fix comments
b285d10 remove last file of chatgpt_vision.
a688530 fix tests
0fb5f79 Merge pull request #1 from Bepitic/OpenAI-VLM (GPTVad)
6503543 Merge branch 'main' into llm-pipeline
8e92e5e Update src/anomalib/models/image/gptvad/chatgpt.py
5ab044d upd: language -> VISUAL_PROMPTING
3f9ca93 fix visual prompting and model_name
391b4c4 fix GPT for Gpt and the folder of the tests.
ca1a0bb fix: change import error outside.
022dcb7 fix readme pointing to the right model.
af7b9e9 fix import cycle, and separate usecase by explicit if.
faf334f upd: add comments to the few shot / zero shot.
3ed8d3f fix: dataset expected colums
7f454c4 upd: add the same logic of the label on visualize_full.
45bd520 Merge branch 'main' into llm-pipeline
44586d6 Fix in the logic of the code.
7adb835 Merge branch 'llm-pipeline' of github.com:Bepitic/anomalib into llm-p…
src/anomalib/models/image/gptvad/__init__.py (new file, +8 lines):

"""Generative Pre-Trained Transformer (GPT) based Large Language Model (LLM)."""

# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

from .lightning_model import GPTVad

__all__ = ["GPTVad"]
src/anomalib/models/image/gptvad/chatgpt.py (new file, +127 lines):
"""Wrapper for the OpenAI calls to the VLM model.""" | ||
|
||
import logging | ||
import os | ||
from typing import Any | ||
|
||
import openai | ||
|
||
|
||
class GPTWrapper: | ||
"""A wrapper class for making API calls to OpenAI's GPT-4 model to detect anomalies in images. | ||
|
||
Environment variable OPENAI_API_KEY (str): API key for OpenAI. | ||
https://platform.openai.com/docs/quickstart/step-2-set-up-your-api-key | ||
Other possible models: https://platform.openai.com/docs/models/gpt-4-turbo-and-gpt-4 | ||
All models with vision capabilities: 'gpt-4-turbo-2024-04-09', 'gpt-4-turbo', | ||
all versions of 'gpt-4o-mini', and 'gpt-4o' | ||
|
||
Args: | ||
model_name (str): Model name for OpenAI API VLM. Default "gpt-4o" | ||
detail (bool): If the images will be sended with high detail or low detail. | ||
|
||
""" | ||
|
||
def __init__(self, model_name: str = "gpt-4o", detail: bool = True) -> None: | ||
openai_key = os.getenv("OPENAI_API_KEY") | ||
self.model_name = model_name | ||
self.detail = detail | ||
if not openai_key: | ||
from anomalib.engine.engine import UnassignedError | ||
Bepitic marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
msg = "OpenAI environment key not found.(OPENAI_API_KEY)" | ||
raise UnassignedError(msg) | ||
|
||
def api_call( | ||
self, | ||
images: list[str], | ||
extension: str = "png", | ||
) -> str: | ||
"""Makes an API call to OpenAI's GPT-4 model to detect anomalies in an image. | ||
|
||
Args: | ||
images (list[str]): List of base64 images that serve as examples and last one to check for anomalies. | ||
extension (str): Extension of the group of images that needs to be checked for anomalies. Default = 'png' | ||
|
||
Returns: | ||
str: The response from the GPT-4 model indicating whether the image has anomalies or not. | ||
It returns 'NO' if there are no anomalies and 'YES: description' if there are anomalies, | ||
where 'description' provides details of the anomaly and its position. | ||
|
||
Raises: | ||
openai.error.OpenAIError: If there is an error during the API call. | ||
""" | ||
prompt: str = "" | ||
if len(images) > 0: | ||
prompt = """ | ||
You will receive an image that is going to be an example of the typical image without any anomaly, | ||
and the last image that you need to decide if it has an anomaly or not. | ||
Answer with a 'NO' if it does not have any anomalies and 'YES: description' | ||
where description is a description of the anomaly provided, position. | ||
""" | ||
else: | ||
prompt = """ | ||
Examine the provided image carefully to determine if there is an obvious anomaly present. | ||
Anomalies may include mechanical malfunctions, unexpected objects, safety hazards, structural damages, | ||
or unusual patterns or defects in the objects. | ||
|
||
Instructions: | ||
|
||
1. Thoroughly inspect the image for any irregularities or deviations from normal operating conditions. | ||
|
||
2. Clearly state if an obvious anomaly is detected. | ||
- If an anomaly is detected, begin with 'YES,' followed by a detailed description of the anomaly. | ||
- If no anomaly is detected, simply state 'NO' and end the analysis. | ||
|
||
Example Output Structure: | ||
|
||
'YES: | ||
- Description: Conveyor belt misalignment causing potential blockages. | ||
This may result in production delays and equipment damage. | ||
Immediate realignment and inspection are recommended.' | ||
|
||
'NO' | ||
|
||
Considerations: | ||
|
||
- Ensure accuracy in identifying anomalies to prevent overlooking critical issues. | ||
- Provide clear and concise descriptions for any detected anomalies. | ||
- Focus on obvious anomalies that could impact final use of the object operation or safety. | ||
""" | ||
|
||
detail_img = "high" if self.detail else "low" | ||
messages: list[dict[str, Any]] = [ | ||
{ | ||
"role": "system", | ||
"content": prompt, | ||
}, | ||
] | ||
for image in images: | ||
Bepitic marked this conversation as resolved.
Show resolved
Hide resolved
|
||
image_message = [ | ||
{ | ||
"role": "user", | ||
"content": [ | ||
{ | ||
"type": "image_url", | ||
"image_url": { | ||
"url": f"data:image/{extension};base64,{image}", | ||
"detail": detail_img, | ||
}, | ||
}, | ||
], | ||
}, | ||
] | ||
messages.extend(image_message) | ||
|
||
try: | ||
# Make the API call using the openai library | ||
response = openai.chat.completions.create( | ||
model=self.model_name, | ||
messages=messages, | ||
max_tokens=300, | ||
) | ||
return response.choices[-1].message.content or "" | ||
except Exception: | ||
msg = "The openai API trow an exception." | ||
Bepitic marked this conversation as resolved.
Show resolved
Hide resolved
|
||
logging.exception(msg) | ||
raise |
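To make the wrapper's contract concrete, here is a minimal smoke test of GPTWrapper used outside the Lightning pipeline. It is a sketch, not part of the diff: sample.png is a hypothetical file, the module path is inferred from the commit list, and the run assumes OPENAI_API_KEY is exported.

import base64
from pathlib import Path

from anomalib.models.image.gptvad.chatgpt import GPTWrapper

# Hypothetical input image; any PNG on disk works for this smoke test.
image_b64 = base64.b64encode(Path("sample.png").read_bytes()).decode("utf-8")

wrapper = GPTWrapper(model_name="gpt-4o", detail=False)
# A single image takes the zero-shot branch; prepend reference images for few-shot.
answer = wrapper.api_call([image_b64], extension="png")
print(answer)  # expected to start with 'NO' or 'YES: ...'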
src/anomalib/models/image/gptvad/lightning_model.py (new file, +155 lines):
"""OpenAI Visual Large Model: Zero-/Few-Shot Anomaly Classification. | ||
|
||
Paper (No paper) | ||
""" | ||
# Copyright (C) 2024 Intel Corporation | ||
# SPDX-License-Identifier: Apache-2.0 | ||
|
||
import base64 | ||
import logging | ||
from pathlib import Path | ||
|
||
import torch | ||
from lightning.pytorch.utilities.types import STEP_OUTPUT | ||
from torch.utils.data import DataLoader | ||
|
||
from anomalib import LearningType | ||
from anomalib.metrics.threshold import ManualThreshold | ||
from anomalib.models.components import AnomalyModule | ||
|
||
from .chatgpt import GPTWrapper | ||
|
||
logger = logging.getLogger(__name__) | ||
|
||
__all__ = ["GPTVad"] | ||
|
||
|
||
class GPTVad(AnomalyModule): | ||
Bepitic marked this conversation as resolved.
Show resolved
Hide resolved
|
||
"""OpenAI VLM Lightning model using OpenAI's GPT-4 for image anomaly detection. | ||
|
||
Args: | ||
k_shot(int): The number of images that will compare to detect if it is an anomaly. | ||
model_name (str): The OpenAI VLM for visual anomaly detection. | ||
detail (bool): The detail of the input in the vlm for the image detection 'high'(true) 'low'(false). | ||
""" | ||
|
||
def __init__( | ||
self, | ||
k_shot: int = 0, | ||
model_name: str = "gpt-4o", | ||
detail: bool = True, | ||
) -> None: | ||
super().__init__() | ||
|
||
self.k_shot = k_shot | ||
|
||
self.model_name = model_name | ||
self.detail = detail | ||
self.image_threshold = ManualThreshold() | ||
self.vlm = GPTWrapper(model_name=self.model_name, detail=self.detail) | ||
|
||
def _setup(self) -> None: | ||
dataloader = self.trainer.datamodule.train_dataloader() | ||
pre_images = self.collect_reference_images(dataloader) | ||
self.pre_images = pre_images | ||
|
||
def _encode_image(self, image_path: str) -> str: | ||
"""Function to encode the image into base64 to send it with the prompt.""" | ||
path = Path(image_path) | ||
with path.open("rb") as image_file: | ||
return base64.b64encode(image_file.read()).decode("utf-8") | ||
|
||
def training_step(self, batch: dict[str, str | torch.Tensor], *args, **kwargs) -> dict[str, str | torch.Tensor]: | ||
"""Train Step of LLM.""" | ||
del args, kwargs # These variables are not used. | ||
# no train on llm | ||
return batch | ||
|
||
@staticmethod | ||
def configure_optimizers() -> None: | ||
"""OpenaiVlm doesn't require optimization, therefore returns no optimizers.""" | ||
return | ||
|
||
def validation_step( | ||
self, | ||
batch: dict[str, str | list[str] | torch.Tensor], | ||
*args, | ||
**kwargs, | ||
) -> STEP_OUTPUT: | ||
"""Get batch of anomaly maps from input image batch. | ||
|
||
Args: | ||
batch (dict[str, str | list[str] | torch.Tensor]): Batch containing image filename, image, label and mask | ||
args: Additional arguments. | ||
kwargs: Additional keyword arguments. | ||
|
||
Returns: | ||
dict[str, Any]: str_otput and pred_scores, the output of the Llm and pred_scores 1.0 if is an anomaly image. | ||
""" | ||
del args, kwargs # These variables are not used. | ||
batch_size = len(batch["image_path"]) | ||
outputs: list[str] = [] | ||
predictions: list[float] = [] | ||
for i in range(batch_size): | ||
# Getting the base64 string | ||
base64_images = [self._encode_image(img) for img in self.pre_images] | ||
base64_images.append(self._encode_image(batch["image_path"][i])) | ||
|
||
try: | ||
output = self.vlm.api_call(base64_images) | ||
except Exception: | ||
logging.exception( | ||
f"Error calling openAI API for image {batch['image_path'][i]}", | ||
) | ||
output = "Error" | ||
|
||
# set an error and get to normal if not followed | ||
prediction = 0.0 | ||
if output.startswith("N"): | ||
prediction = 0.0 | ||
elif output.startswith("Y"): | ||
prediction = 1.0 | ||
else: | ||
logging.warning( | ||
f"(Set predition to '0' Normal)Could not identify if there is anomaly by the output:\n{output}", | ||
) | ||
|
||
outputs.append(output) | ||
predictions.append(prediction) | ||
logging.debug(f"Output: {output}, Prediction: {prediction}") | ||
|
||
batch["str_output"] = outputs | ||
batch["pred_scores"] = torch.tensor(predictions).to(self.device) | ||
batch["pred_labels"] = torch.tensor(predictions).to(self.device) | ||
return batch | ||
|
||
@property | ||
def trainer_arguments(self) -> dict[str, int | float]: | ||
"""Set model-specific trainer arguments.""" | ||
return {} | ||
|
||
@property | ||
def learning_type(self) -> LearningType: | ||
"""The learning type of the model. | ||
|
||
Llm is a zero-/few-shot model, depending on the user configuration. Therefore, the learning type is | ||
set to ``LearningType.FEW_SHOT`` when ``k_shot`` is greater than zero and ``LearningType.ZERO_SHOT`` otherwise. | ||
""" | ||
return LearningType.ZERO_SHOT if self.k_shot == 0 else LearningType.FEW_SHOT | ||
|
||
def collect_reference_images(self, dataloader: DataLoader) -> list[str]: | ||
"""Collect reference images for few-shot inference. | ||
|
||
The reference images are collected by iterating the training dataset until the required number of images are | ||
collected. | ||
|
||
Returns: | ||
ref_images list[str]: A list containing the reference images path. | ||
""" | ||
reference_images_paths: list[str] = [] | ||
for batch in dataloader: | ||
image_paths = batch["image_path"][: self.k_shot - len(reference_images_paths)] | ||
reference_images_paths.extend(image_paths) | ||
if self.k_shot == len(reference_images_paths): | ||
break | ||
return reference_images_paths |
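For context, an end-to-end run of GPTVad would follow the usual anomalib pattern sketched below. The Engine and MVTec names are the standard anomalib v1.x API, not code introduced by this PR, so treat the exact calls as assumptions.

# Sketch of the standard anomalib workflow; not part of this PR's diff.
from anomalib.data import MVTec
from anomalib.engine import Engine
from anomalib.models.image.gptvad import GPTVad

datamodule = MVTec(category="bottle")  # assumed dataset; any anomalib datamodule should do
model = GPTVad(k_shot=2, detail=False)  # few-shot with two reference images

engine = Engine()
# Validation is where the GPT-4 API calls happen; requires OPENAI_API_KEY.
engine.validate(model=model, datamodule=datamodule)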