
GenAI/LLM Demo Toolkit

Get an LLM or GenAI stack running on your GPU-enabled PC/server with four commands. This repository provides scripts that automate the installation of the LLM/GenAI (Generative AI) software stack on a single node (server/PC). Ideal for PoC (proof of concept), demonstration, and testing purposes, this stack simplifies the setup process, allowing you to focus on exploring and evaluating various GenAI tools and capabilities. You can also run NIM/NGC containers on this node.

This toolkit installs the following payloads in containers, enabling you to quickly configure a system with a GPU to run open-source GenAI/LLMs locally. Currently, only NVIDIA GPUs are supported. Refer to the documentation in each upstream repository for detailed instructions.

Special thanks to AI Toolkit for the inspiration.

Screenshots: Text Gen UI, Open WebUI, Open WebUI RAG, and Stable Diffusion image generation.

What's Included

  • Installation Scripts: Automated scripts to install baseline packages and dependencies.
  • LLM Text Gen UI: To run various models on the local node.
  • OpenWebUI: For chat and RAG.
  • Stable Diffusion: For image generation.
  • Docker Infrastructure: In case you'd like to run NVIDIA NIMs.
  • Baseline Libraries: Torch, Conda, and others, in case you'd like to experiment or run bare-metal workloads.

Requirements

  • Operating System: Ubuntu 22.04 LTS
  • Hardware:
    • NVIDIA GPUs (1 or more) with CUDA support
    • At least 100 GB free disk space
  • Software:
    • Ubuntu minimal install
    • sudo access

Installation Instructions

  1. Clone the Repository to your home directory

    cd ~
    git clone https://github.com/lazyelectrons/GenAI-LLM-Demo-Toolkit.git
    cd GenAI-LLM-Demo-Toolkit
    
  2. Run the CUDA/Driver Installation Script

    ./ai.sh

    This script installs all necessary drivers and platform tools, then reboots the server.

    After the reboot, you can proceed with the next steps.
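
    After the reboot, you can verify the driver installation with standard NVIDIA tooling (not part of this toolkit's scripts):

    # Should list your GPU(s) along with the driver and CUDA versions
    nvidia-smi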

  3. Install and Start LLM/Web UI containers

    ./llm-install.sh

    This command installs the textgen UI and Open WebUI, downloads the Microsoft Phi-3-mini-4k-instruct model for the textgen UI, and starts both applications. Once the installation is complete, you can access the UIs at the following URLs. Note: it can take up to a minute to bring up each UI, depending on your compute/network speed.

    • Text Gen Web UI: Access via http://<serverIP>:7070
    • Open Web UI: Access via http://<serverIP>:8080

Note: Check the Troubleshooting section if you are facing issues with Open WebUI.
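
To confirm both UIs are reachable from the command line, a generic check (substitute your server's IP for the placeholder) is to probe the ports with curl:

    # Expect HTTP 200 once each UI has finished starting
    curl -s -o /dev/null -w "Text Gen UI: %{http_code}\n" http://<serverIP>:7070
    curl -s -o /dev/null -w "Open WebUI: %{http_code}\n" http://<serverIP>:8080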

  4. Monitor GPU/CPU Utilization

    In a separate terminal, run the following command to monitor CPU and GPU utilization:

    python /ai/ai-monitor/ai-monitor.py

    You can also use nvtop in the terminal window to monitor GPU performance.
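
    nvtop is not installed by default on a minimal Ubuntu 22.04 system; assuming the standard Ubuntu package, it can be added with apt:

    # nvtop ships in Ubuntu's universe repository
    sudo apt install nvtop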

  5. To stop the LLM Containers

    ./llm-stop.sh
    

    This will stop the LLM containers but will not remove them.

  6. To run the LLM Containers again

    ./llm-start.sh
    

    This will start the LLM containers again.

  7. To Install/Start Stable Diffusion ImageGen

    ./image-gen-install.sh
    

    This will install the Stable Diffusion image generator and start the application. You can access the image generation application via http://<serverIP>:7860. Note: it can take up to a minute to bring up the UI, depending on your compute/network speed.

  8. To stop the Stable Diffusion Image Gen

    ./image-gen-stop.sh
    

    This will stop the Stable Diffusion image generator but will not remove it.

  9. To run the Stable Diffusion Image Gen again

    ./image-gen-start.sh
    

    This will restart the Stable Diffusion image generator.

  10. Running NVIDIA NIMs

    You need an access/API key from NVIDIA to access their repo/NIMs (containers). Here is an example script to run the llama3-8b-instruct NIM on this node:

    docker login nvcr.io
    # When prompted:
    #   Username: $oauthtoken
    #   Password: <API KEY>

    export NGC_API_KEY=<KEY>
    export LOCAL_NIM_CACHE=~/.cache/nim
    mkdir -p "$LOCAL_NIM_CACHE"
    docker run -it --rm \
      --gpus all \
      --shm-size=16GB \
      -e NGC_API_KEY \
      -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
      -u $(id -u) \
      -p 8000:8000 \
      nvcr.io/nim/meta/llama3-8b-instruct:1.0.0
    

Note: You need access to specific NIMs to download and run them locally. Securing an NVIDIA API key alone does not grant NIM download access.
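
Once the container is up, you can sanity-check it through its OpenAI-compatible API. A minimal sketch, assuming the llama3-8b-instruct NIM's published model identifier meta/llama3-8b-instruct and the port mapping above:

    # Ask the NIM's OpenAI-compatible chat endpoint for a short completion
    curl -s http://localhost:8000/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
            "model": "meta/llama3-8b-instruct",
            "messages": [{"role": "user", "content": "Say hello."}],
            "max_tokens": 64
          }'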

Behind the scenes

Text Gen UI is deployed with API support. Open WebUI connects to the text-gen API port for chat/RAG. The model name displayed in Open WebUI is maintained for compatibility with the OpenAI API.
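
You can probe this endpoint yourself. A minimal check, assuming the text-gen API listens on port 5000 (the port referenced in docker-compose-ow.yml) and follows the standard OpenAI path layout:

    # List the models the text-gen API exposes to Open WebUI
    curl -s http://<serverIP>:5000/v1/models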

The default model for text gen is Microsoft Phi. It is highly recommended to switch to a Llama 2.x or 3.x model (or similar) for better performance, especially for RAG. You can do that via the text gen web UI or manually using the Hugging Face CLI:

huggingface-cli download meta-llama/Meta-Llama-3.1-8B-Instruct --local-dir ~/text-generation-webui-docker/config/models/Meta-Llama-3.1-8B-Instruct --token <your HF token> 
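
After the download completes, a quick way to confirm the files landed (path taken from the command above) is to list the target directory before selecting the model in the Text Gen UI:

# The directory should contain the model's config and weight shards
ls ~/text-generation-webui-docker/config/models/Meta-Llama-3.1-8B-Instruct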

Troubleshooting

All paths are relative; ensure you run the scripts exactly as specified above. Check the $HOME/ucsx-ai.log file for the driver installation log.

If you have multiple network interfaces, ensure the docker binding is on the correct interface.

To troubleshoot container start-up, run each container manually to isolate the error.

If Open WebUI is not listing the model, run ping <hostname> on the server and ensure it resolves to an interface IP, not 127.0.0.1. If it resolves to 127.0.0.1, edit /etc/hosts and make sure 127.0.0.1 does not point to the hostname:

    IPAddress     Hostname
    127.0.0.1     localhost
    10.1.1.1      my-llm-host

You can also verify this by issuing hostname -i and ensuring it returns only one interface IP, not the loopback address.

Alternatively, you can edit the docker-compose-ow.yml file, update the following section with the server's interface IP, and restart the containers with ./llm-stop.sh and ./llm-start.sh:

    - OPENAI_API_BASE_URL=http://<IP Address>:5000/v1
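
As a sketch, the edit and restart can be combined (the IP below is hypothetical; substitute your server's interface IP):

    # Point Open WebUI at the text-gen API on the server's interface IP
    sed -i 's|OPENAI_API_BASE_URL=.*|OPENAI_API_BASE_URL=http://10.1.1.1:5000/v1|' docker-compose-ow.yml
    ./llm-stop.sh && ./llm-start.sh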
