This document provides step-by-step instructions for setting up the development environment and running the application.
Before starting, ensure that you have all the necessary tools installed on your system.
Ollama is required to provide model inference capabilities.
- Download and install Ollama from https://ollama.com/
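- Start the Ollama service, then pull the model used in this guide (run the second command in a separate terminal, because ollama serve keeps running in the foreground):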
ollama serve
ollama pull llama3.2:3b
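To confirm that the service is up and the model has been pulled, a small check along these lines may help. It is a minimal sketch that assumes Ollama is listening on its default local address (http://localhost:11434) and queries its /api/tags endpoint, which lists locally available models. The function names are hypothetical helpers, not part of this repository.

```python
import json
import urllib.request
from typing import List, Optional

# Assumption: Ollama is listening on its default local address and port.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def model_names(tags_payload: dict) -> List[str]:
    """Extract model names from an /api/tags response body."""
    return [entry.get("name", "") for entry in tags_payload.get("models", [])]


def is_model_available(tags_payload: dict, model: str) -> bool:
    """Return True if `model` appears in the list of pulled models."""
    return model in model_names(tags_payload)


def fetch_tags(url: str = OLLAMA_TAGS_URL, timeout: float = 3.0) -> Optional[dict]:
    """Fetch the list of local models, or None if the service is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.load(response)
    except (OSError, ValueError):
        # OSError covers connection failures; ValueError covers invalid JSON.
        return None


if __name__ == "__main__":
    payload = fetch_tags()
    if payload is None:
        print("Ollama is not reachable; is `ollama serve` running?")
    elif is_model_available(payload, "llama3.2:3b"):
        print("llama3.2:3b is available")
    else:
        print("Ollama is running, but llama3.2:3b has not been pulled yet")
```

If the script reports that Ollama is not reachable, start the service first; if it reports that the model is missing, run the pull command again.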
Llama Stack is used to manage the inference environment.
- Install the uv package manager
- Set up a virtual environment (venv)
- Run one of the following commands inside the virtual environment:
INFERENCE_MODEL=llama3.2:3b llama stack build --template ollama --image-type venv --run
or
INFERENCE_MODEL=meta-llama/Llama-3.3-70B-Instruct llama stack build --template together --image-type venv --run
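The two commands above differ only in the template name and the model. If you prefer to launch the build from a script, the choice could be sketched as follows. The helper names (STACK_OPTIONS, build_command, run_stack) are hypothetical; the command-line arguments and model names are copied from the commands above, and the llama CLI is assumed to be on your PATH inside the virtual environment.

```python
import os
import subprocess
from typing import Dict, List, Tuple

# Template name -> model, copied from the two commands in this guide.
STACK_OPTIONS: Dict[str, str] = {
    "ollama": "llama3.2:3b",
    "together": "meta-llama/Llama-3.3-70B-Instruct",
}


def build_command(template: str) -> Tuple[Dict[str, str], List[str]]:
    """Return the extra environment variables and argv for a template."""
    if template not in STACK_OPTIONS:
        raise ValueError(f"unknown template: {template!r}")
    extra_env = {"INFERENCE_MODEL": STACK_OPTIONS[template]}
    argv = [
        "llama", "stack", "build",
        "--template", template,
        "--image-type", "venv",
        "--run",
    ]
    return extra_env, argv


def run_stack(template: str) -> int:
    """Run the build command and return its exit code."""
    extra_env, argv = build_command(template)
    # Merge with the current environment so PATH and the venv stay intact.
    return subprocess.call(argv, env={**os.environ, **extra_env})
```

Calling run_stack("ollama") would then be equivalent to typing the first command by hand.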
Clone this repository and install the necessary dependencies:
- Clone the repository:
git clone https://github.com/ricardoborges/chatlab.git
cd chatlab
- Create a virtual environment and install dependencies:
uv venv
uv pip install -r myproject.toml
If you do not plan to run the local Ollama service, create a Together AI account and get an API key from Together, if you do not already have one.
How to get your API key: https://docs.google.com/document/d/1Vg998IjRW_uujAPnHdQ9jQWvtmkZFt74FldW2MblxPY/edit?tab=t.0
You will need these environment variables in your .env file:
TAVILY_SEARCH_API_KEY=
TOGETHER_API_KEY=
Alternatively, if you run the local Ollama service, skip this step and set DEFAULT_STACK="Ollama" in main.py.
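Before starting the application, it can be useful to check which of these keys are still empty. The following is a minimal sketch that reads a .env file from the current directory using only the standard library. The parser is deliberately simple (KEY=value lines, optional quotes, # comments) and the function names are hypothetical; how main.py actually loads these variables is not shown in this guide.

```python
from pathlib import Path
from typing import Dict, List

# Key names taken from this guide.
EXPECTED_KEYS = ["TAVILY_SEARCH_API_KEY", "TOGETHER_API_KEY"]


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def unset_keys(values: Dict[str, str], expected: List[str] = EXPECTED_KEYS) -> List[str]:
    """Return the expected keys that are missing or empty."""
    return [key for key in expected if not values.get(key)]


if __name__ == "__main__":
    env_file = Path(".env")
    text = env_file.read_text(encoding="utf-8") if env_file.exists() else ""
    for key in unset_keys(parse_env(text)):
        print(f"{key} is not set in .env")
```

An empty TOGETHER_API_KEY is expected if you are using the local Ollama service instead of Together.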
Start the Gradio application with the following command:
gradio main.py
After running this command, Gradio prints a local URL in the terminal (http://127.0.0.1:7860 by default); open it in your browser to use the application interface.
If you encounter any issues during installation, check:
- That the Ollama service is running
- That the virtual environment was activated correctly
- That all dependencies were successfully installed
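The last two checks can be partly automated. The sketch below reports whether the current Python interpreter is running inside a virtual environment and whether given modules can be imported. The function names are hypothetical, and the module list contains only gradio, which this guide uses to start the application; extend it with the project's other dependencies as needed.

```python
import importlib.util
import sys
from typing import List


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a virtual environment."""
    # Inside a venv, sys.prefix points at the venv while sys.base_prefix
    # still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def missing_modules(names: List[str]) -> List[str]:
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    # Assumption: gradio is the only dependency checked here; add others as needed.
    print("missing modules:", missing_modules(["gradio"]))
```

Run the script with the same interpreter you use to start the application, otherwise the result describes a different environment.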
For more information about Llama Stack, refer to the official documentation.