Skip to content

Built a bilingual Retrieval-Augmented Generation (RAG) chatbot for Odisha government schemes using OpenRouter-hosted LLMs. Embedded chunked PDF data using all-MiniLM-L6 v2, performed cosine similarity-based retrieval, and integrated a 49B instruction-tuned model (thedrummer/valkyrie-49b-v1) for English and cohere/command-r-plus for odia responses

Notifications You must be signed in to change notification settings

anurag965/OdiaGenAI_JanaSathi

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

7 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

πŸ—£οΈ JanaSathi: Odia E-Governance Chatbot

JanaSathi is a bilingual AI chatbot that helps users understand and access Odisha Government schemes. It uses a Retrieval-Augmented Generation (RAG) architecture to retrieve relevant content from official documents and generate responses in English and Odia.


πŸ€– Features

  • βœ… Answers queries about government schemes like KALIA Yojana, Mission Shakti, and Biju Swasthya Kalyan Yojana
  • πŸ“„ Reads and processes official government PDF documents
  • 🌐 Uses Large Language Models via OpenRouter API
  • πŸ” Retrieves relevant information using semantic search (MiniLM)
  • 🧠 Generates responses in English and translates to Odia
  • πŸ–₯️ Streamlit frontend for interactive usage

πŸ› οΈ Tech Stack

  • Python 3.10+
  • Streamlit
  • Sentence-Transformers (all-MiniLM-L6-v2)
  • OpenAI-compatible LLMs via OpenRouter
  • Cohere models for Odia translation
  • PyPDF2 for PDF parsing
  • scikit-learn for cosine similarity

πŸš€ Getting Started

1. Clone the repository

git clone https://github.com/your-username/odia-e-gov-chatbot.git
cd odia-e-gov-chatbot

2. Install dependencies

pip install -r requirements.txt

3. Add government scheme PDFs

Place your scheme-related PDFs in the project root directory.

4. Run the chatbot

streamlit run streamlit_app.py

πŸ’‘ How It Works

  • PDF documents are processed and chunked into smaller sections.
  • Embeddings are generated using all-MiniLM-L6-v2.
  • The chatbot retrieves top relevant chunks using cosine similarity.
  • Uses a 49B LLM to generate responses in English.
  • Translates the responses into Odia using Cohere’s LLM.
  • Presents both responses through a clean Streamlit interface.

πŸ“· Demo

Streamlit UI


πŸ“Œ To Do

  • Improve document chunking for regional formatting
  • Add document upload feature in the UI
  • Support voice-based queries and responses
  • Log chat history for audit and learning

πŸ™‹β€β™‚οΈ Author

Anurag Pradhan
πŸ“§ anuragpradhancb@gmail.com
πŸ”— LinkedIn β€’ GitHub


πŸ“„ License

This project is licensed under the MIT License.

About

Built a bilingual Retrieval-Augmented Generation (RAG) chatbot for Odisha government schemes using OpenRouter-hosted LLMs. Embedded chunked PDF data using all-MiniLM-L6 v2, performed cosine similarity-based retrieval, and integrated a 49B instruction-tuned model (thedrummer/valkyrie-49b-v1) for English and cohere/command-r-plus for odia responses

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages