An Industrial Vision Foundation Model
This repository provides an example Jupyter Notebook on how to use the ZERO model package on Amazon SageMaker. ZERO is an industrial Vision Foundation Model (VFM) ready for immediate deployment without the need for data labeling or model retraining. It leverages zero-shot and open-world technologies to instantly detect and pinpoint new or unseen objects using simple text or image box prompts.
- About the Model
- Key Features & Highlights
- SageMaker Deployment & Usage
  - Step 1: Subscribe and Deploy the Model
  - Step 2: Set Up Environment for Inference
  - Step 3: Perform Real-time Inference
  - Step 4: Clean Up Resources
- Input/Output Interface
- Technical Specifications
- License
- Support
About the Model

Traditional Vision AI demands extensive data labeling and repetitive model retraining, a process that consumes significant time, cost, and specialized expertise. Superb AI's ZERO brings a paradigm shift as an industrial-specialized Vision Foundation Model (VFM).
Leveraging Open World Visual Grounding technology, ZERO comprehends novel concepts without prior training. This zero-shot capability empowers instant AI adoption for new tasks and flexible, on-the-fly changes to detection targets, eliminating the need for additional training. Instead of time-consuming retraining, you simply describe your target in text (e.g., "a dent on the car door") or provide an example image box, and ZERO adapts instantly. This dramatically cuts the time and cost of AI solution development, making AI adoption faster and more accessible.
This repository and the accompanying Jupyter Notebook (`/zero/ZERO-Marketplace.ipynb`) demonstrate how to subscribe, deploy, and run inference with ZERO on Amazon SageMaker.
Key Features & Highlights

- 🚀 Zero-Shot Deployment: Instantly detect untrained objects without complex data collection, labeling, or model retraining. Adapt immediately to new products, defect types, or environment changes, dramatically cutting development time and costs.
- ✍️ Flexible Multi-Prompt Input: Deploy and operate AI instantly by simply describing your target object in text or providing an example image box. ZERO supports diverse input prompts for intuitive, human-like interaction.
- 🏭 Industrial-Specialized VFM: Trained on invaluable, real-world data from dozens of industrial sectors including manufacturing, logistics, and retail. ZERO delivers high performance and immediate usability across complex industrial domains.
- 💻 Edge & On-Premise Ready: Engineered for high efficiency with a lightweight 622M parameters and 1.03 TFLOPS processing, ensuring seamless operation on both cloud infrastructure and resource-constrained edge devices without requiring expensive GPU hardware.
SageMaker Deployment & Usage

This section provides a step-by-step guide to deploying ZERO and running inference, based on the provided Jupyter Notebook.
Step 1: Subscribe and Deploy the Model

Before you can use the model, you need to subscribe to it on the AWS Marketplace and deploy it to a SageMaker endpoint.
- Navigate to the ZERO Model Listing on AWS Marketplace.
- Click the Continue to Subscribe button.
- Review the terms and conditions, then click "Accept Terms".
- Once the subscription is active, follow the deployment instructions in the "Usage Information" tab on the Marketplace listing to create a SageMaker endpoint; the recommended instance type is `ml.g4dn.xlarge`. A programmatic alternative is sketched just after this list.
- Make a note of your endpoint's name.
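If you prefer to create the endpoint from code rather than the console, the SageMaker Python SDK can deploy a subscribed model package directly. This is a minimal sketch, not the notebook's exact flow; the model package ARN below is a placeholder, and the real, region-specific ARN is shown on the Marketplace listing once you subscribe.

import sagemaker
from sagemaker import ModelPackage

session = sagemaker.Session()
role = sagemaker.get_execution_role()  # or an IAM role ARN with SageMaker permissions

# Placeholder ARN; copy the real one from the Marketplace listing for your region
model_package_arn = "arn:aws:sagemaker:<region>:<account>:model-package/<zero-package-id>"

model = ModelPackage(
    role=role,
    model_package_arn=model_package_arn,
    sagemaker_session=session,
)
model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.xlarge",
    endpoint_name="zero-marketplace",
)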
Step 2: Set Up Environment for Inference

Configure your environment with your AWS credentials and initialize the Boto3 client to interact with SageMaker.
import boto3
import os
# Configure your AWS credentials and region.
# The hard-coded values below are placeholders; prefer an IAM role or
# credentials already set in your environment over committing keys to code.
os.environ['AWS_ACCESS_KEY_ID'] = 'YOUR_AWS_ACCESS_KEY_ID'
os.environ['AWS_SECRET_ACCESS_KEY'] = 'YOUR_AWS_SECRET_ACCESS_KEY'
os.environ['AWS_REGION'] = 'your-aws-region' # e.g., 'ap-northeast-2'
# Initialize Boto3 session and SageMaker runtime client
boto_session = boto3.Session(region_name=os.environ['AWS_REGION'])
sm_runtime = boto_session.client("sagemaker-runtime")
# The name of the endpoint you deployed in Step 1
zero_endpoint_name = "zero-marketplace" # Or your custom endpoint name
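Before sending requests, you can confirm that the endpoint has finished deploying. A quick status check with the regular SageMaker client (as opposed to the runtime client used for inference):

# Verify the endpoint is ready; the status should be 'InService'
sagemaker_client = boto_session.client("sagemaker")
status = sagemaker_client.describe_endpoint(EndpointName=zero_endpoint_name)["EndpointStatus"]
print(f"Endpoint status: {status}")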
Step 3: Perform Real-time Inference

Once the endpoint is `InService`, you can send it image data and prompts to get predictions. The payload should be a JSON object containing the `search_image` and a list of `queries`.
Example 1: Using a Text Prompt
Here, we find all instances of "strawberry" in an image.
import json
import base64
from io import BytesIO
import requests
from PIL import Image
# Helper function to encode image to base64
def base64_encode(data):
    if isinstance(data, str):
        data = data.encode("utf-8")
    return base64.b64encode(data).decode("utf-8")
# 1. Load your image
image_url = "https://www.californiastrawberries.com/wp-content/uploads/2021/05/Rainbow-Fruit-Salad-1024.jpg"
response = requests.get(image_url)
image = Image.open(BytesIO(response.content)).convert("RGB")
# 2. Convert image to base64
buffered = BytesIO()
image.save(buffered, format="JPEG")
search_image_base64 = base64_encode(buffered.getvalue())
# 3. Construct the payload
body = {
    "search_image": f"data:image/jpeg;base64,{search_image_base64}",
    "queries": [
        {
            "prompt_image": "",
            "prompts": [
                {
                    "text": "strawberry",
                    "box": [],
                    "box_threshold": 0.1,
                    "multimodal_threshold": 0.22
                }
            ]
        }
    ]
}
# 4. Get prediction
response = sm_runtime.invoke_endpoint(
    EndpointName=zero_endpoint_name,
    Body=json.dumps(body),
    ContentType="application/json",
)
# The 'result' will contain bounding boxes for the detected objects.
result = json.load(response["Body"])
# visualize_detection(result, image)
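The commented-out `visualize_detection` call refers to a helper defined in the notebook. The sketch below is one possible implementation, assuming only the response schema documented in the Input/Output Interface section (parallel `boxes` and `text` lists under `output`):

from PIL import ImageDraw

def visualize_detection(result, image):
    # Draw each predicted box and its label onto a copy of the image
    annotated = image.copy()
    draw = ImageDraw.Draw(annotated)
    detections = result["output"][0]
    for box, label in zip(detections["boxes"], detections["text"]):
        x_min, y_min, x_max, y_max = box
        draw.rectangle([x_min, y_min, x_max, y_max], outline="red", width=3)
        draw.text((x_min, max(y_min - 12, 0)), label, fill="red")
    return annotated

visualize_detection(result, image)  # in a notebook cell, the returned image is displayed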
Example 2: Using a Box (Semantic) Prompt
Provide an image, a text label, and a bounding box around an object of interest. The model will find other, similar objects in the `search_image`.
# (Image loading and encoding is similar to the above example)
# Construct the payload with a box prompt
# The box coordinates are [x_min, y_min, x_max, y_max] in absolute pixel format.
body = {
    "search_image": f"data:image/jpeg;base64,{search_image_base64}",
    "queries": [
        {
            "prompt_image": f"data:image/jpeg;base64,{search_image_base64}",
            "prompts": [
                {
                    "text": "potato",
                    "box": [1779.58, 2096.02, 1891.69, 2158.37],
                    "box_threshold": 0.1,
                    "multimodal_threshold": 0.3
                }
            ]
        }
    ]
}
# Get prediction from the endpoint
# ...
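The elided call is identical to the one in Example 1; for completeness, a sketch:

# Send the box-prompt payload to the endpoint (same call as Example 1)
response = sm_runtime.invoke_endpoint(
    EndpointName=zero_endpoint_name,
    Body=json.dumps(body),
    ContentType="application/json",
)
result = json.load(response["Body"])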
Step 4: Clean Up Resources

To avoid incurring ongoing charges, delete the SageMaker endpoint and its associated resources when you are finished.
# Initialize a SageMaker client for the cleanup calls
sagemaker_client = boto_session.client("sagemaker")

# Look up the endpoint's configuration and model names before deleting anything
config_name = sagemaker_client.describe_endpoint(EndpointName=zero_endpoint_name)["EndpointConfigName"]
model_name = sagemaker_client.describe_endpoint_config(EndpointConfigName=config_name)["ProductionVariants"][0]["ModelName"]

# Delete the endpoint, then its configuration and model
sagemaker_client.delete_endpoint(EndpointName=zero_endpoint_name)
sagemaker_client.delete_endpoint_config(EndpointConfigName=config_name)
sagemaker_client.delete_model(ModelName=model_name)
Input/Output Interface

The model expects a JSON object with the following structure:
{
  "search_image": "data:image/jpeg;base64,<base64_encoded_string>",
  "queries": [
    {
      "prompt_image": "data:image/jpeg;base64,<base64_encoded_string>",
      "prompts": [
        {
          "text": "string_prompt_1",
          "box": [x_min, y_min, x_max, y_max],
          "box_threshold": 0.1,
          "multimodal_threshold": 0.22
        }
      ]
    }
  ]
}
- `search_image`: Required. The image to perform detection on, as a Base64-encoded string with a data URI prefix.
- `queries`: Required. A list of query objects. Each query can have its own `prompt_image`.
  - `prompt_image`: The image containing the example objects for semantic search. Can be an empty string (`""`) for simple text-based search.
  - `prompts`: A list of prompt definitions.
    - `text`: Required. The text label for the object you want to find.
    - `box`: A list of four numbers defining a bounding box `[x_min, y_min, x_max, y_max]`. Coordinates must be in absolute pixel values. For text-only prompts, provide an empty list (`[]`).
    - `box_threshold`: A confidence threshold for box-only prompts.
    - `multimodal_threshold`: A confidence threshold for combined text/box prompts.
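To keep request construction tidy, these rules can be wrapped in a small helper. The `build_query` function below is a hypothetical convenience, not part of the model package:

# Hypothetical helper applying the field rules above; not part of the ZERO package
def build_query(text, box=None, prompt_image="", box_threshold=0.1, multimodal_threshold=0.22):
    return {
        "prompt_image": prompt_image,  # "" for a plain text search
        "prompts": [
            {
                "text": text,
                "box": box if box is not None else [],  # [] for text-only prompts
                "box_threshold": box_threshold,
                "multimodal_threshold": multimodal_threshold,
            }
        ],
    }

# Text-only query for the strawberry example above
body = {
    "search_image": f"data:image/jpeg;base64,{search_image_base64}",
    "queries": [build_query("strawberry")],
}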
The model returns a JSON object containing the prediction results:
{
  "output": [
    {
      "boxes": [
        [x1_min, y1_min, x1_max, y1_max],
        [x2_min, y2_min, x2_max, y2_max],
        ...
      ],
      "text": [
        "string_prompt_1",
        "string_prompt_2",
        ...
      ]
    }
  ]
}
- `output`: A list containing a single dictionary with the detection results.
  - `boxes`: A list of predicted bounding boxes. Each box corresponds to the label in the `text` list at the same index.
  - `text`: A list of labels corresponding to the detected objects.
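Because `boxes` and `text` are index-aligned, pairing detections with their labels is a one-line zip. A minimal sketch, assuming `result` holds a parsed response like the one above:

# Walk the parallel lists; each box shares an index with its label
detections = result["output"][0]
for label, box in zip(detections["text"], detections["boxes"]):
    print(f"{label}: {box}")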
Technical Specifications

- Model Size: 622M parameters
- Performance: 1.03 TFLOPS
- Recommended Instance Type: `ml.g4dn.xlarge` or other GPU instances
License

This project is licensed under the MIT License; see the LICENSE file for details.
Support

For questions, issues, or support regarding the ZERO model or this sample notebook, please open an issue in this GitHub repository.
For business inquiries or questions about Superb AI's other offerings, please contact us at contact@superb-ai.com.