This is a RAG chatbot built with LangGraph, LangChain, and one of the following LLMs: the IBM Granite model, Llama 3.1 (via Ollama), or Llama 2. The chatbot answers technical questions based on the Krkn pod scenarios documentation.

Note: To ensure accurate responses based on the provided documentation, include the keyword "krkn" (or other Krkn-related context) in your questions. This helps the system retrieve relevant context from the Krkn knowledge base rather than generating general answers from unrelated sources.
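To illustrate why the "krkn" keyword matters, here is a toy sketch (not the project's actual retriever, which uses LangChain): a retriever ranks document chunks by overlap with the question, so question terms that appear in the Krkn docs pull those chunks to the top. The chunk texts below are made up for the example.

```python
# Toy chunks standing in for indexed Krkn documentation.
chunks = [
    "krkn pod scenarios kill target pods to test resilience",
    "general kubernetes scheduling overview",
    "krkn supports chaos scenarios via config files",
]

def retrieve(question: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank chunks by naive word overlap with the question and return the top k."""
    q_words = set(question.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

# A question containing "krkn" surfaces the Krkn chunks first.
print(retrieve("how do krkn pod scenarios work", chunks))
```

The real pipeline uses embedding similarity rather than word overlap, but the effect is the same: questions phrased with Krkn vocabulary retrieve Krkn context.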
```shell
git clone https://github.com/tejugang/krkn-lightspeed-rag-chatbot.git
cd krkn-lightspeed-rag-chatbot
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
If using the Llama 3.1 LLM (recommended), run:

```shell
brew install ollama
ollama run llama3.1
```
If using the Llama 2 7B LLM, run:

```shell
brew install ollama
ollama pull llama2:7b
```
Download instructions here
Ensure that Ollama is running in the background.
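One way to confirm Ollama is up before launching the chatbot: by default the Ollama server listens on `http://localhost:11434`. The snippet below is a small stdlib-only check (the function name and messages are our own, not part of this repo).

```python
import urllib.request

def ollama_is_running(url: str = "http://localhost:11434") -> bool:
    """Return True if the local Ollama server answers on its default port."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200
    except OSError:
        # Connection refused / timeout: the server is not reachable.
        return False

if __name__ == "__main__":
    if ollama_is_running():
        print("Ollama is up")
    else:
        print("Ollama is not reachable -- start it with `ollama serve`")
```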
- Open `main.py` and uncomment the code for the LLM you would like to use.
- Run `python3 main.py` (use `python` instead of `python3`, depending on your Python version).
- Run `streamlit run app.py`.
LLM performance improves significantly with better laptop hardware. The chatbot was tested on two laptops:
- Laptop 1: Apple M3 Pro, 36 GB RAM, 12-core CPU, 18-core GPU
- Laptop 2: Apple M1, 16 GB RAM, 8-core CPU, 12-core GPU

With the Llama 3.1 LLM, answers were generated in under 10 seconds on Laptop 1 and in 15-30 seconds on Laptop 2.
Planned enhancements can be found in the roadmap.
If you want to evaluate the performance of the LLM used to generate answers, see the user guide to the evaluation pipeline.

Note: The output of steps 1-3 is the set of files in the `evaluationPipeline` folder.
- Open `eval.py` and uncomment the code for the model you are evaluating.
- Edit the email field on line 121 with the email address the evaluation metrics should be sent to.
- After the script runs, open the JSON file (the file name is on line 125).
- Copy the entire JSON file and open the Evaluation Pipeline Endpoint (you must be connected to the VPN).
- Make sure the JSON structure matches the required format in the endpoint, then paste it into these three endpoints: `/evaluate_context_retrieval`, `/evaluate_response`, and `/evaluate_all`.
- The evaluation metrics will be emailed to you.
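Before pasting, it can help to confirm the file is valid JSON and inspect its shape; malformed JSON will fail at the endpoint with a less helpful error. The snippet below is a generic check: the file name `evaluation_results.json` is a placeholder (use the actual name from line 125 of `eval.py`), and the required schema is defined by the endpoint itself.

```python
import json
from pathlib import Path

# Placeholder name -- substitute the file name printed on line 125 of eval.py.
results_path = Path("evaluation_results.json")

def load_results(path: Path) -> dict:
    """Parse the results file so malformed JSON fails here, not at the endpoint."""
    return json.loads(path.read_text())

if results_path.exists():
    data = load_results(results_path)
    # Pretty-print the payload so it is easy to review before pasting.
    print(json.dumps(data, indent=2))
else:
    print(f"{results_path} not found -- run eval.py first")
```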