
Hailo Model Zoo GenAI


The Hailo Model Zoo GenAI is a curated collection of pre-trained GenAI models and example applications optimized for Hailo's AI processors, designed to accelerate GenAI application development.

It includes Hailo-Ollama, an Ollama-compatible REST API written in C++ on top of HailoRT, enabling seamless integration with various external tools and frameworks.

Ollama simplifies running large language models locally by managing model downloads, deployments, and interactions through a convenient REST API.

Models are specifically optimized for Hailo hardware, providing efficient, high-performance inference tailored for GenAI tasks.

Models

For a detailed list of supported models, including download links and relevant information, visit the models page.

Installation

Prerequisites

  • A Hailo-10H module.
  • HailoRT installed.

Two installation methods are available:

  1. Pre-built Debian package (Recommended):
  • Download the latest Debian package from the Developer Zone.

  • Install it:

    sudo dpkg -i hailo_model_zoo_gen_ai_<ver>_<arch>.deb
    
  2. Build from source (Alternative):
  • Clone the repository and build the Hailo-Ollama server:

    git clone https://github.com/hailo-ai/hailo_model_zoo_gen_ai.git
    cd hailo_model_zoo_gen_ai/
    mkdir build && cd build
    cmake -DCMAKE_BUILD_TYPE=Release ..
    cmake --build .
    
  • Install to user home (still in the build dir):

    cp ./src/apps/server/hailo-ollama ~/.local/bin/
    mkdir -p ~/.config/hailo-ollama/
    cp ../config/hailo-ollama.json ~/.config/hailo-ollama/
    mkdir -p ~/.local/share/hailo-ollama
    cp -r ../models/ ~/.local/share/hailo-ollama
    

Basic Usage

  • Start the Hailo-Ollama server:

    hailo-ollama
    
  • List available models:

    curl --silent http://localhost:8000/hailo/v1/list
    
  • Pull a specific model. For example:

    curl --silent http://localhost:8000/api/pull \
         -H 'Content-Type: application/json' \
         -d '{ "model": "qwen2:1.5b", "stream": true }'
    
  • Chat with the model:

    curl --silent http://localhost:8000/api/chat \
         -H 'Content-Type: application/json' \
         -d '{"model": "qwen2:1.5b", "messages": [{"role": "user", "content": "Translate to French: The cat is on the table."}]}'
    
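When `stream` is enabled, the server sends the response as newline-delimited JSON, one object per chunk, with the assistant text accumulating across the `content` fields (assuming the same wire format as upstream Ollama's `/api/chat`). A minimal sketch that reassembles the reply from a canned sample; the `sed` expression is an illustration, not a real JSON parser:

```shell
# Canned sample of a streamed /api/chat response (NDJSON, one object per line).
sample='{"model":"qwen2:1.5b","message":{"role":"assistant","content":"Le chat"},"done":false}
{"model":"qwen2:1.5b","message":{"role":"assistant","content":" est sur la table."},"done":true}'

# Concatenate the "content" fields into the full reply.
reply=$(printf '%s\n' "$sample" | sed -n 's/.*"content":"\([^"]*\)".*/\1/p' | tr -d '\n')
echo "$reply"
```

In a real pipeline, a JSON-aware tool such as `jq` (if installed) is a more robust way to extract the fields.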

Optional Open WebUI

An example of running the Hailo-Ollama server with Open WebUI:

  • Install the Open WebUI client (an Ollama-compatible front end).

  • Start the Hailo-Ollama server:

    hailo-ollama
    
  • Run WebUI Ollama client:

    OLLAMA_BASE_URL=http://127.0.0.1:8000 DATA_DIR=~/.open-webui uvx --python 3.10 open-webui@latest serve
    
  • Access the WebUI at http://localhost:8080
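Before launching the WebUI, it can help to confirm that the server is actually answering. This sketch probes the `/hailo/v1/list` endpoint shown earlier and reports the result either way:

```shell
# Check whether a Hailo-Ollama server answers on the default port.
base_url="http://localhost:8000"
if curl --silent --max-time 2 "$base_url/hailo/v1/list" >/dev/null 2>&1; then
  status="reachable"
else
  status="not reachable"
fi
echo "Hailo-Ollama at $base_url: $status"
```

If the server is not reachable, start `hailo-ollama` first, then point `OLLAMA_BASE_URL` at the same address.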

For detailed usage instructions and advanced examples, see the USAGE page.

Changelog

See the CHANGELOG page for detailed release notes.

License

The Hailo Model Zoo GenAI is distributed under the MIT license. Refer to the LICENSE file for details.

Support

For support, please post your question on the Hailo Community Forum or contact us directly via hailo.ai.

About Hailo

Hailo provides innovative AI Inference Accelerators and AI Vision Processors specifically engineered for efficient, high-performance embedded deep learning applications on edge devices.

Hailo's AI Inference Accelerators enable edge devices to execute deep learning applications at full scale, leveraging architectures optimized for neural network operations. The Hailo AI Vision Processors (SoC) integrate powerful AI inferencing with advanced computer vision, delivering superior image quality and sophisticated video analytics.

For more information, visit hailo.ai.
