
Enhanced Image Analysis with Ollama Vision Model

This Streamlit application lets users upload images and hold interactive conversations about them using the Ollama Vision Model (llama3.2-vision). It combines visual input with natural language processing to deliver detailed, context-aware responses, and includes conversation management features such as saving, loading, and deleting chats.

Requirements

  • Python 3.x
  • Streamlit
  • Ollama
  • Pillow
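
If the repository's requirements.txt is ever missing or incomplete, a minimal file covering the packages listed above would look like this (package names only; pin versions yourself if you need reproducible installs):

# requirements.txt (minimal)
streamlit
ollama
Pillow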

Setup

The steps below set up this ChatGPT-style conversational vision model locally.

  1. Clone the repository:

    git clone https://github.com/agituts/ollama-vision-model-enhanced.git
    cd ollama-vision-model-enhanced
  2. Set Up a Virtual Environment (Recommended):

    Create a virtual environment to isolate project dependencies.

    • On Windows
    python -m venv venv
    venv\Scripts\activate
    • On macOS/Linux
    python3 -m venv venv
    source venv/bin/activate
  3. Install Python Dependencies:

    Install the required Python packages using pip:

    pip install -r requirements.txt

    If requirements.txt is missing or incomplete, install the packages manually:

    pip install streamlit ollama Pillow
  4. Install and Set Up Ollama:

    • Download and Install Ollama: Visit https://ollama.com/ and follow the installation instructions for your operating system.

    • Verify Ollama Installation:

    ollama --version
  5. Pull the Required Ollama Model:

    • Pull the llama3.2-vision model using Ollama:
    ollama pull llama3.2-vision
  6. Start the Ollama server:

    • Start the Ollama server to enable communication with the model:
    ollama serve

    Tips:

    • Keep this terminal window open as the server needs to run continuously. You may open a new terminal window for the next steps.
    • On Windows, also exit any instance of Ollama already running in the system tray before starting the server.
  7. Launch the App:

    • In a new terminal window (with your virtual environment activated), navigate to the project directory if you're not already there:
     cd ollama-vision-model-enhanced
    • Launch the app:
    streamlit run app.py

    This command starts the Streamlit server and opens the app in your default web browser. If it doesn't open automatically, visit http://localhost:8501 manually.
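
Before relying on the app, you can optionally run a quick standalone check to confirm that the Ollama server is reachable and the vision model responds. This script is not part of the repository and the prompt text is arbitrary:

# Optional smoke test: confirms the server and model answer a simple request
import ollama

reply = ollama.chat(
    model='llama3.2-vision',
    messages=[{'role': 'user', 'content': 'Reply with one word: ready'}],
)
print(reply['message']['content'])  # should print a short acknowledgement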

Usage

Basic Operations:

  • Upload an Image: Use the file uploader to select and upload an image (PNG, JPG, or JPEG).
  • Add Context (Optional): In the sidebar under "Conversation Management", you can add any relevant context for the conversation.
  • Enter Prompts: Use the chat input at the bottom of the app to ask questions or provide prompts related to the uploaded image.
  • View Responses: The app displays the AI assistant's responses based on the image analysis and your prompts (a sketch of the underlying model call follows this list).
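
Behind the scenes, the app forwards the uploaded image together with your prompt to the llama3.2-vision model via the Ollama Python client (the troubleshooting section below refers to a process_image_and_text function for this). The snippet here is only a rough sketch of that kind of call; the file name, prompt, and variable names are placeholders, not the app's actual code:

# Illustrative only -- how an image plus a text prompt is sent to the model
import ollama

with open('example.jpg', 'rb') as f:   # placeholder path for an uploaded image
    image_bytes = f.read()

response = ollama.chat(
    model='llama3.2-vision',
    messages=[{
        'role': 'user',
        'content': 'What is shown in this image?',
        'images': [image_bytes],       # the client accepts raw bytes or file paths
    }],
)
print(response['message']['content'])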

Conversation Management

  • Save Conversations: Conversations are saved automatically and can be managed from the sidebar under "Previous Conversations".
  • Load Conversations: Load previous conversations by clicking the folder icon (📂) next to the conversation title.
  • Edit Titles: Edit conversation titles by clicking the pencil icon (✏️) and saving your changes.
  • Delete Conversations: Delete individual conversations using the trash icon (🗑️) or delete all conversations using the "Delete All Conversations" button.

Troubleshooting

Issue: Ollama Model Not Found

Symptoms:

  • Errors indicating the model cannot be found.
  • The app fails to generate responses.

Solution:

  • Ensure you've pulled the correct model with the exact name used in the code.
  • Double-check the model name in the process_image_and_text function:
# Verify that the model name in process_image_and_text matches the pulled model
response = ollama.chat(
    model='llama3.2-vision',  # must be exactly the name you pulled
    # ... remaining arguments unchanged
)
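
You can also list the models that have been pulled locally to confirm the exact name and tag the code should use:

ollama list

The model string in the code must match one of the names printed by this command.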

Issue: Connection Error with Ollama Server

Symptoms:

  • Errors related to connecting to the Ollama server.
  • The app is unable to process image and text prompts.

Solution:

  • Ensure the Ollama server is running in a terminal (ollama serve).
  • Verify there are no firewall restrictions blocking communication.
  • Restart the Ollama server if necessary.
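
A quick way to confirm the server is reachable is to query its HTTP endpoint (11434 is Ollama's default port unless the OLLAMA_HOST environment variable overrides it):

curl http://localhost:11434

A running server typically answers with a short "Ollama is running" message.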

Issue: Missing Python Packages

Symptoms:

  • Import errors when running the app (e.g., ModuleNotFoundError).
  • The app fails to start due to missing packages.

Solution:

  • Ensure all dependencies are installed:
pip install -r requirements.txt
  • If using a virtual environment, ensure it's activated when installing packages.
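
With the environment activated, a one-line check confirms that all three packages import correctly (note that Pillow is imported under the name PIL):

python -c "import streamlit, ollama, PIL; print('imports OK')"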

Issue: Streamlit App Not Starting

Symptoms:

  • Terminal shows errors when running streamlit run app.py.
  • The app doesn't open in the browser.

Solution:

  • Verify that you're in the correct directory (ollama-vision-model-enhanced).
  • Ensure app.py exists in the directory.
  • Check for syntax errors or typos in app.py.
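
If the error is that the streamlit command itself cannot be found, invoking Streamlit through the Python interpreter of the active environment usually works:

python -m streamlit run app.py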

Additional Tips

Ollama Server:

  • The Ollama server needs to run continuously while you're using the app.
  • If you close the terminal or the server stops, restart it with:
ollama serve

Running on a Different Port:

  • If you need to run the Streamlit app on a different port:
streamlit run app.py --server.port <PORT_NUMBER>

Stopping the App:

  • To stop the Streamlit app, press Ctrl+C in the terminal where it's running.

Updating the App Code:

  • If you make changes to app.py, Streamlit will prompt you to rerun the app. Click "Rerun" or press R in the terminal.
