Multimodal Voice RAG Agent using Speech-to-Text, FAISS Search, and Text-to-Speech
Updated Jun 22, 2025 - Python
A pure Python implementation of a ReAct agent, built without frameworks such as LangChain. It follows the standard ReAct loop of Thought, Action, PAUSE, and Observation. The agent has access to several tools, including Calculator, Wikipedia, Web Search, and Weather. A Streamlit web UI is also provided.
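The Thought, Action, PAUSE, Observation loop can be illustrated with a minimal sketch. This is not the repository's code: `scripted_llm` is a hypothetical stand-in for a real model call, the `Action: tool: input` line format is an assumed convention, and only a calculator tool is wired up.

```python
import re

def calculator(expression: str) -> str:
    """Evaluate a plain arithmetic expression (digits and + - * / ( ) only)."""
    if not re.fullmatch(r"[0-9+\-*/(). ]+", expression):
        return "error: unsupported expression"
    try:
        # Builtins are disabled; the regex above already limits the input.
        return str(eval(expression, {"__builtins__": {}}, {}))
    except (SyntaxError, ZeroDivisionError) as exc:
        return f"error: {exc}"

# Hypothetical tool registry; a fuller agent would add Wikipedia, search, weather.
TOOLS = {"calculator": calculator}

ACTION_RE = re.compile(r"^Action:\s*(\w+):\s*(.+)$", re.MULTILINE)
ANSWER_RE = re.compile(r"^Answer:\s*(.+)$", re.MULTILINE)

def scripted_llm(messages):
    """Stand-in for a model: asks for a tool first, answers once it sees an observation."""
    last = messages[-1]
    if last.startswith("Observation:"):
        value = last.split(":", 1)[1].strip()
        return f"Thought: I have the result.\nAnswer: {value}"
    return "Thought: I should compute this.\nAction: calculator: 6 * 7\nPAUSE"

def run_agent(question, llm=scripted_llm, max_turns=5):
    """Alternate model replies and tool observations until an Answer line appears."""
    messages = [question]
    for _ in range(max_turns):
        reply = llm(messages)
        messages.append(reply)
        answer = ANSWER_RE.search(reply)
        if answer:
            return answer.group(1).strip()
        action = ACTION_RE.search(reply)
        if action is None:
            return None  # the reply contained neither an action nor an answer
        name, arg = action.group(1), action.group(2).strip()
        tool = TOOLS.get(name)
        observation = tool(arg) if tool else f"error: unknown tool {name}"
        # The observation is fed back so the next reply can use it.
        messages.append(f"Observation: {observation}")
    return None

print(run_agent("What is 6 times 7?"))
```

The `max_turns` cap is a design choice in this sketch so that a model that never emits an `Answer:` line cannot loop forever.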
A hybrid Research Assistant that combines an exact Knowledge Graph (Neo4j) with a Retrieval‑Augmented Generation pipeline (FAISS + Cross‑Encoder + FLAN‑T5) behind a sleek Streamlit interface.
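The retrieval half of such a pipeline is commonly a two-stage retrieve-then-rerank flow followed by prompt construction. The sketch below only illustrates that shape and is not the repository's code: bag-of-words cosine similarity stands in for the FAISS vector search, a query-term coverage score stands in for the Cross-Encoder, the prompt string is what a generator such as FLAN-T5 would receive, and the Neo4j knowledge-graph side is not covered. All function names are hypothetical.

```python
import math
import re
from collections import Counter

def tokens(text):
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def first_stage(query, docs, k):
    """Cheap recall step over every document (the role a vector index plays)."""
    q = Counter(tokens(query))
    order = sorted(range(len(docs)),
                   key=lambda i: cosine(q, Counter(tokens(docs[i]))),
                   reverse=True)
    return order[:k]

def rerank(query, docs, candidates):
    """Score each (query, passage) pair; here, the fraction of query terms covered."""
    q = set(tokens(query))
    def score(i):
        return len(q & set(tokens(docs[i]))) / len(q) if q else 0.0
    return sorted(candidates, key=score, reverse=True)

def build_prompt(query, passages):
    """Assemble the text that would be handed to the generator model."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only the context.\nContext:\n{context}\nQuestion: {query}"

docs = [
    "FAISS builds indexes for vector similarity search.",
    "Neo4j stores data as a property graph.",
    "Streamlit renders Python scripts as web apps.",
]
query = "What does Neo4j store?"
candidates = first_stage(query, docs, k=2)
ranked = rerank(query, docs, candidates)
print(build_prompt(query, [docs[i] for i in ranked[:1]]))
```

Splitting the work this way keeps the expensive pairwise scorer off the full corpus: it only sees the `k` candidates that the cheap first stage returns.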