This project uses LangChain, OpenAI, ChromaDB, and Gradio to build a question-answering system for YouTube videos. Users ask questions about the content of the provided videos and receive an answer along with the corresponding YouTube video.
- Extracts and translates text from YouTube videos.
- Splits text into manageable chunks for processing.
- Embeds text data using OpenAI embeddings.
- Stores and retrieves text data efficiently using the Chroma vector store.
- Generates answers to questions using OpenAI's language model.
- Displays both the answer and the related YouTube video in an interactive Gradio interface.
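
The chunk, embed, and retrieve steps listed above can be illustrated with a minimal, standard-library-only sketch. This is not the project's code: `split_text`, `embed`, `cosine`, and `retrieve` are hypothetical helpers, the word-count "embedding" is a toy stand-in for OpenAI embeddings, and the in-memory list stands in for the Chroma vector store. The chunk sizes are arbitrary assumptions.

```python
import math
import re
from collections import Counter


def split_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into character chunks of chunk_size that overlap by `overlap`."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


if __name__ == "__main__":
    transcript = (
        "Gradio builds web interfaces for models. "
        "Chroma stores embedding vectors for retrieval."
    )
    pieces = split_text(transcript, chunk_size=45, overlap=5)
    print(retrieve("Where are the vectors stored?", pieces))
```

In the actual pipeline, the retrieved chunks would be passed to the language model as context for generating the answer; the sketch stops at retrieval.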
- OpenAI API key
- Necessary Python packages (specified in requirements.txt)
- Clone the repository:

      git clone https://github.com/MuratcanLaloglu/openai-rag-with-youtube-transcripts-and-chromadb-ai-assistant.git
      cd openai-rag-with-youtube-transcripts-and-chromadb-ai-assistant

- Install the required packages:

      pip install -r requirements.txt

- Set your OpenAI API key: Put your OpenAI API key in the .env file.
- Add YouTube URLs: Use Dataset_Creator.ipynb to create video_links.txt (the URLs of the YouTube videos you want to process).
- Run the notebook.
- Open your web browser and go to the URL provided by Gradio.
- Ask a question: Type your question related to a video in the input box.
- View the answer and video: The system will display the answer and embed the related YouTube video.
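
To make the video-link and embedding steps more concrete, here is a minimal sketch of how a list of YouTube URLs could be turned into embeddable players. It assumes video_links.txt holds one URL per line, which the project does not state explicitly. `read_video_links`, `extract_video_id`, `embed_url`, and `embed_html` are hypothetical helpers, not functions from this repository, and the notebook may build its video display differently.

```python
from typing import Iterable, Optional
from urllib.parse import parse_qs, urlparse


def read_video_links(lines: Iterable[str]) -> list[str]:
    """Return the non-empty lines, stripped (assumes one URL per line)."""
    return [line.strip() for line in lines if line.strip()]


def extract_video_id(url: str) -> Optional[str]:
    """Extract the video ID from common YouTube URL shapes, or return None."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host == "youtu.be":
        # Short links carry the ID as the path: https://youtu.be/<id>
        return parsed.path.lstrip("/") or None
    if host in ("youtube.com", "m.youtube.com"):
        if parsed.path == "/watch":
            # Standard links carry the ID in the "v" query parameter
            return parse_qs(parsed.query).get("v", [None])[0]
        for prefix in ("/embed/", "/shorts/"):
            if parsed.path.startswith(prefix):
                return parsed.path[len(prefix):].split("/")[0] or None
    return None


def embed_url(video_id: str) -> str:
    """Build the embeddable player URL for a video ID."""
    return f"https://www.youtube.com/embed/{video_id}"


def embed_html(video_id: str) -> str:
    """Build an iframe snippet that an HTML-capable UI component could render."""
    return (
        f'<iframe width="560" height="315" src="{embed_url(video_id)}" '
        'frameborder="0" allowfullscreen></iframe>'
    )


if __name__ == "__main__":
    # Reads the file produced by Dataset_Creator.ipynb (path is an assumption)
    with open("video_links.txt", encoding="utf-8") as handle:
        for link in read_video_links(handle):
            video_id = extract_video_id(link)
            if video_id is not None:
                print(embed_html(video_id))
```

An iframe string like this could be shown next to the answer through an HTML component in the Gradio interface, although that is only one possible way to display the video.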