This repository provides a comprehensive framework for building a Retrieval-Augmented Generation (RAG) chatbot designed for researchers. It combines several components into a system that retrieves relevant information from documents and generates accurate, contextually appropriate responses.
RAG combines retrieval systems with large language models to generate more accurate and relevant responses by incorporating external knowledge.
- Dataprep Component: The framework includes a dataprep component that processes input documents, breaking them into manageable chunks that can be embedded and indexed.
- Embedding Service: Documents are transformed into vector representations using embedding models like `BAAI/bge-base-en-v1.5`, which capture the semantic meaning of text.
- Vector Storage: These embeddings are stored in a `Redis` vector database, allowing for efficient similarity searches.
- Retriever Component: When a user asks a question, the retriever component finds the most relevant document chunks by comparing the query embedding with the stored document embeddings.
- Reranker Component: Retrieved documents are further refined through a reranking process using models like `BAAI/bge-reranker-base` to ensure the most relevant context is provided to the LLM.
- LLM Backend: The LLM (served via either `vLLM` or `Groq`) generates responses based on the retrieved context and the user's query.
- Audio Component: Transcribes audio input into text, enabling voice interaction with the RAG system. Uses OpenAI's `Whisper base` model.
- Frontend UI: User interface to interact with the chatbot.
This architecture enables real-time integration of external information and improves response quality by combining search capabilities with generative models.
Each service is containerized for modularity and scalability.
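As an illustration of how the retrieval stages can be exercised individually, the sketch below calls the embedding and reranking services directly. It assumes both run Hugging Face Text Embeddings Inference (TEI) and are reachable on the hostnames and ports used later in this README (from inside the compose network; substitute `localhost` and the published port if calling from the host).

```bash
# Illustrative smoke test of the retrieval stages (hosts/ports are assumptions
# based on the environment variables in this README).

# Embed a query with BAAI/bge-base-en-v1.5 via the TEI /embed endpoint
curl http://tei-embedding-service:80/embed \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"inputs": "What is retrieval-augmented generation?"}'

# Rerank candidate chunks against the query via the TEI /rerank endpoint
curl http://tei-reranking-service:80/rerank \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"query": "What is retrieval-augmented generation?",
       "texts": ["RAG combines retrieval with generation.", "Unrelated text."]}'
```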
The repository offers two options for the LLM serving component: Groq and vLLM.
- vLLM is an open-source high-performance inference engine for large language models.
- Groq is a proprietary inference engine designed for ultra-fast response times using specialized hardware.
Groq is recommended for running the application on personal computers: its API-based approach does not require running the LLM locally, making it far less resource-intensive on machines without powerful GPUs. In addition, using Groq requires only an API key rather than setting up a local inference environment.
The vLLM option is more suitable for server deployments with dedicated GPU resources where running models locally might be preferred for reasons like data privacy.
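Both options expose an OpenAI-compatible chat completions API, which is why the backend only needs `LLM_SERVER_HOST_IP` and `LLM_SERVER_PORT` to switch between them. The sketch below shows the two request shapes; the in-network `vllm-service` hostname and port follow the environment variables later in this README, while the second call goes to Groq's hosted API.

```bash
# Illustrative only: query a vLLM server through its OpenAI-compatible API
curl http://vllm-service:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "meta-llama/Llama-3.1-8B-Instruct",
        "messages": [{"role": "user", "content": "Summarize RAG in one sentence."}]
      }'

# Groq's hosted API uses the same OpenAI-compatible schema, authenticated with an API key
curl https://api.groq.com/openai/v1/chat/completions \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "llama-3.3-70b-versatile",
        "messages": [{"role": "user", "content": "Summarize RAG in one sentence."}]
      }'
```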
- Build dataprep component
- Build retriever component
- Build backend component
- Build Whisper component
- Build UI component
- Build vLLM image
- Build Groq component
export no_proxy="127.0.0.1,localhost,dataprep-redis,tei-embedding-service,retriever,tei-reranking-service,backend,mongodb,vllm-service,whisper-service,groq-service"
export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
export RERANK_MODEL_ID="BAAI/bge-reranker-base"
export LLM_MODEL_ID="meta-llama/Llama-3.1-8B-Instruct"
export REDIS_URL="redis://redis-vector-db:6379"
export INDEX_NAME="rag-redis"
export EMBEDDING_SERVER_HOST_IP=tei-embedding-service
export EMBEDDING_SERVER_PORT=80
export RETRIEVER_SERVICE_HOST_IP=retriever
export RETRIEVER_SERVICE_PORT=7000
export RERANK_SERVER_HOST_IP=tei-reranking-service
export RERANK_SERVER_PORT=80
export WHISPER_SERVICE_HOST_IP=whisper-service
export WHISPER_SERVICE_PORT=8765
export LLM_SERVER_HOST_IP=vllm-service
export LLM_SERVER_PORT=8000
export MONGO_HOST=mongodb
export MEGA_SERVICE_PORT=5008
export SERVER_HOST_URL="localhost:$MEGA_SERVICE_PORT"
export NEXT_PUBLIC_SERVER_URL=$SERVER_HOST_URL
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export DATAPREP_OUT_DIR=</path/to/dataprep/out/dir>
export HF_CACHE=</path/to/hf_cache/dir>
Note: the host is `localhost` for local development or the server hostname for a remote deployment.
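A convenient way to manage these variables is to keep the exports in a small script and source it before launching the stack; the filename below (`set_env.sh`) is only an example, not a file shipped in the repository.

```bash
# Example only: load the exports above into the current shell before running docker compose
source set_env.sh
echo "$LLM_MODEL_ID"   # quick sanity check that the variables are set
```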
export LLM_SERVER_HOST_IP=groq-service
export LLM_SERVER_PORT=8000
export GROQ_MODEL=llama-3.3-70b-versatile
export GROQ_API_KEY=${GROQ_API_KEY}
docker compose -f install/docker/docker-compose-groq.yaml up
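Once the stack is up, a quick health check can be done with standard docker compose commands; the `backend` service name below is assumed to match the compose file (it appears in the `no_proxy` list above).

```bash
# List the services started by the Groq compose file and their status
docker compose -f install/docker/docker-compose-groq.yaml ps

# Tail the backend logs if a service is not responding
docker compose -f install/docker/docker-compose-groq.yaml logs -f backend
```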
export SERVER_HOST_IP=vllm-service
export LLM_SERVER_HOST_IP=vllm-service
export LLM_SERVER_PORT=8000
docker compose -f install/docker/research-assistant/docker-compose.yaml up
The application will be available at http://localhost:5009
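With the vLLM option, the model weights are downloaded and loaded locally, which can take several minutes on first start. One way to confirm the LLM server is ready before opening the UI is sketched below; it assumes port 8000 of `vllm-service` is published to the host, so adjust if your compose file maps it differently.

```bash
# Poll the OpenAI-compatible /v1/models endpoint until vLLM reports the loaded model
curl http://localhost:8000/v1/models

# Follow the vLLM container logs while the model weights are being loaded
docker compose -f install/docker/research-assistant/docker-compose.yaml logs -f vllm-service
```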
- Build dataprep component
- Build retriever component
- Build backend component
- Build Groq component
- Build UI component
Note: the host is `localhost` for local development or the server hostname for a remote deployment.