A conversational AI application that lets you interact with PDF documents through OpenAI's LLM. It was built as a tutorial project to demonstrate NLP skills, LangChain integration, and Streamlit UI development, and it suits document analysis, Q&A, and chatbot use cases.
- Upload PDFs and extract text content.
- Split documents into manageable chunks for LLM processing.
- Generate embeddings and build a searchable index using LangChain.
- Ask questions in natural language and receive AI-powered answers.
- User-friendly interface built with Streamlit.
- LangChain: Framework for LLM integration, text splitting, and embeddings.
- OpenAI: GPT model and API for generative responses.
- Streamlit: Frontend UI for PDF uploads and chat interface.
- PyPDF2: PDF text extraction.
- Clone the repository:
git clone https://github.com/your-username/your-repo-name.git
cd your-repo-name
- Install dependencies:
pip install langchain openai streamlit pypdf2 python-dotenv
- Set up your OpenAI API key in a .env file:
OPENAI_API_KEY="your-api-key-here"
- Run the Streamlit app:
streamlit run app.py
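
Before starting the app, it can help to confirm that the `.env` file is well formed and that the key is no longer the placeholder. The sketch below is a minimal stand-in written with the standard library only; the app itself presumably loads the file with `load_dotenv()` from python-dotenv, and the helper names `parse_env_text` and `require_api_key` are hypothetical, not part of this project.

```python
def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Strip one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def require_api_key(env):
    """Return the OpenAI key, or raise if it is missing or still a placeholder."""
    key = env.get("OPENAI_API_KEY", "")
    if not key or key == "your-api-key-here":
        raise RuntimeError("OPENAI_API_KEY is missing or still a placeholder")
    return key
```

For example, `require_api_key(parse_env_text(open(".env").read()))` would raise early with a clear message instead of failing later on the first API call.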
- Upload a PDF file through the Streamlit interface.
- Ask questions about the PDF content in the chatbox.
- View the AI-generated responses in real time.
- Split PDF text into chunks.
- Generate embeddings for each chunk.
- Build a searchable index using LangChain.
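
The three indexing steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the app's code: the real project presumably uses a LangChain text splitter and OpenAI embeddings, whereas here `split_text` cuts fixed-size character windows with overlap and `toy_embed` is a hypothetical word-count stand-in for a real embedding model.

```python
import re
from collections import Counter


def split_text(text, chunk_size=200, overlap=50):
    """Split text into character windows that overlap by `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def toy_embed(text):
    """Stand-in embedding: lowercase word counts (NOT a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(chunks):
    """Pair each chunk with its embedding so it can be searched later."""
    return [(chunk, toy_embed(chunk)) for chunk in chunks]
```

The overlap is there so that a sentence cut at a chunk boundary still appears whole in at least one neighbouring chunk.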
- Convert user questions into embeddings.
- Retrieve relevant text chunks from the index.
- Generate answers using OpenAI's LLM.
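
The question-answering steps above can be sketched as a small retrieval routine. As an assumption for illustration, `toy_embed` again stands in for OpenAI embeddings, cosine similarity stands in for whatever scoring the vector index uses, and `build_prompt` is a hypothetical helper showing how retrieved chunks might be passed to the LLM; the actual call to OpenAI is left out.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Stand-in embedding: lowercase word counts (NOT a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = toy_embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, toy_embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Assemble the text that would be sent to the LLM."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

Restricting the prompt to the retrieved chunks keeps the request small and grounds the answer in the uploaded PDF rather than in the model's general knowledge.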
- Backend: LangChain for data processing, OpenAI API for LLM.
- Frontend: Streamlit for UI, chat interface, and PDF uploads.
🔗 Connect with me on LinkedIn or GitHub.