This project explores the application of Retrieval-Augmented Generation (RAG) across different environments, leveraging both local and cloud-based AI models to enhance information retrieval, extraction, and reasoning capabilities.
RAG combines retrieval mechanisms with generative models, enabling more contextually relevant and informed responses. This project investigates RAG in two distinct domains:
- Enhancing local Large Language Models (LLMs) with vector embeddings to retrieve and analyze TED Talks data, using a dataset of transcripts and metadata to extract meaningful insights (see the local retrieval sketch below).
  - Local LLM installation reference: Mozilla Ocho llamafile
  - Dataset source: Kaggle TED Talks Dataset
- Implementing a database-aware AI assistant using OpenAI models hosted on Azure, exploring how retrieval-augmented AI can interact with structured data to improve query answering and insights (see the SQL-grounded sketch below).
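The local pipeline follows the standard RAG loop: embed the transcripts, retrieve the closest matches to a question, and hand them to the model as context. Below is a minimal sketch of that loop, assuming llamafile is serving its OpenAI-compatible API on localhost:8080; the CSV path, the `transcript` column, and the embedding model are illustrative choices rather than the project's fixed configuration.

```python
# Minimal local RAG sketch: embed TED Talk transcripts, retrieve the most
# relevant ones for a question, and pass them as context to a llamafile
# server. Assumes llamafile is serving its OpenAI-compatible API on
# localhost:8080 and that the Kaggle CSV exposes a "transcript" column.
import numpy as np
import pandas as pd
from openai import OpenAI
from sentence_transformers import SentenceTransformer

talks = pd.read_csv("ted_talks.csv")           # path and column are assumptions
docs = talks["transcript"].fillna("").tolist()

embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = embedder.encode(docs, normalize_embeddings=True)

def retrieve(question: str, k: int = 3) -> list[str]:
    """Return the k transcripts whose embeddings are closest to the question."""
    q_vec = embedder.encode([question], normalize_embeddings=True)[0]
    scores = doc_vecs @ q_vec                  # cosine similarity (unit-norm vectors)
    return [docs[i] for i in np.argsort(scores)[::-1][:k]]

client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-no-key-required")
question = "What do speakers say about the future of education?"
context = "\n---\n".join(t[:2000] for t in retrieve(question))
reply = client.chat.completions.create(
    model="LLaMA_CPP",                         # placeholder model name for llamafile
    messages=[
        {"role": "system", "content": "Answer using only the provided TED Talk excerpts."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ],
)
print(reply.choices[0].message.content)
```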
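For the database-aware assistant, one hedged sketch of the idea: the model drafts a SQL query from the schema, the application executes it, and the model answers from the returned rows. The environment variables, the `gpt-4o` deployment name, and the `talks` schema below are placeholders, and sqlite3 stands in for whatever SQL database the Azure setup actually uses.

```python
# Sketch of a database-aware assistant: the model drafts SQL from the schema,
# the app executes it, and the model summarizes the result. Endpoint,
# deployment name, and schema are placeholders; sqlite3 stands in for the
# project's SQL database.
import os
import sqlite3
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)
DEPLOYMENT = "gpt-4o"  # your Azure deployment name

conn = sqlite3.connect("talks.db")
schema = "CREATE TABLE talks (title TEXT, speaker TEXT, views INTEGER);"  # example schema

def ask(question: str) -> str:
    # Step 1: have the model translate the question into SQL for this schema.
    sql = client.chat.completions.create(
        model=DEPLOYMENT,
        messages=[
            {"role": "system",
             "content": f"Write one SQLite query answering the user. Schema:\n{schema}\nReply with SQL only."},
            {"role": "user", "content": question},
        ],
    ).choices[0].message.content.strip().strip("`")  # crude code-fence stripping
    # Step 2: run the query and feed the rows back for a grounded answer.
    # In any real deployment, use read-only credentials for model-written SQL.
    rows = conn.execute(sql).fetchall()
    return client.chat.completions.create(
        model=DEPLOYMENT,
        messages=[
            {"role": "system", "content": "Answer the question from the query results only."},
            {"role": "user", "content": f"Question: {question}\nSQL: {sql}\nRows: {rows}"},
        ],
    ).choices[0].message.content

print(ask("Which speaker has the most-viewed talk?"))
```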
Across both domains, the project aims to:
- Experiment with different retrieval mechanisms and embedding strategies (a toy comparison follows this list).
- Compare local and cloud-based approaches to augmenting generative models.
- Analyze the performance and use cases of RAG in diverse applications.
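As a toy illustration of why the retrieval mechanism matters, the sketch below scores the same query against a tiny corpus with a crude lexical-overlap scorer and with dense sentence embeddings; the corpus, query, and embedding model are made up for the example.

```python
# Toy comparison of two retrieval mechanisms over the same corpus:
# lexical term overlap vs. dense cosine similarity. In the project these
# documents would be TED Talk transcripts.
from collections import Counter

import numpy as np
from sentence_transformers import SentenceTransformer

corpus = [
    "Schools should teach creativity as rigorously as literacy.",
    "Machine learning models can predict protein structures.",
    "Cities are redesigning streets around pedestrians.",
]
query = "How might education change?"

# Lexical: score by shared lowercase terms (a crude BM25 stand-in). Here the
# query shares no terms with any document, so all scores tie at zero.
q_terms = Counter(query.lower().split())
lexical = [sum((Counter(d.lower().split()) & q_terms).values()) for d in corpus]

# Dense: score by cosine similarity between sentence embeddings, which can
# still surface the schooling sentence despite zero word overlap.
model = SentenceTransformer("all-MiniLM-L6-v2")
vecs = model.encode(corpus + [query], normalize_embeddings=True)
dense = vecs[:-1] @ vecs[-1]

print("lexical ranking:", np.argsort(lexical)[::-1])
print("dense ranking:  ", np.argsort(dense)[::-1])
```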
Running the experiments requires:
- Python, with the relevant dependencies for the LLM and vector-embedding frameworks
- Jupyter Notebook as the execution environment
- Access to a local or cloud-based LLM
- An SQL database (for the Azure OpenAI use case)
Planned next steps include:
- Expanding RAG applications to additional datasets and domains.
- Fine-tuning retrieval strategies to improve accuracy and efficiency.
- Benchmarking response quality and performance across the different setups (a minimal timing harness is sketched below).
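A minimal sketch of what that benchmarking harness could look like, assuming each setup is wrapped in an `answer_fn` callable like the pipelines sketched above; quality scoring against reference answers would sit alongside the latency numbers.

```python
# Hedged benchmarking sketch: time each setup's answer function on the same
# questions and record latency. answer_fn stands in for either the local
# llamafile pipeline or the Azure assistant shown earlier.
import statistics
import time

def benchmark(answer_fn, questions):
    latencies, answers = [], []
    for q in questions:
        start = time.perf_counter()
        answers.append(answer_fn(q))
        latencies.append(time.perf_counter() - start)
    return {
        "median_latency_s": statistics.median(latencies),
        "answers": answers,  # inspect manually or score against references
    }
```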
This project provides a structured exploration of how RAG can improve AI-driven knowledge retrieval, offering insights into both local and cloud-based implementations.