Welcome to the AI-Powered MCQ and Q&A Generator, a tool that turns your documents into interactive learning resources. The project uses natural language processing (NLP) models to generate multiple-choice questions (MCQs) and to answer custom questions about uploaded documents. Whether you're a student, educator, or professional, it simplifies studying and content exploration through a Streamlit web interface and command-line scripts.
- MCQ Generation: Automatically create multiple-choice questions from PDF documents with one correct answer and three distractors.
- Question Answering: Ask any question about the content of your document and receive accurate, context-based answers.
- Multi-Format Support: Process PDFs, DOCX, and TXT files (with primary focus on PDFs).
- Streamlit Web Interface: A user-friendly UI powered by Streamlit for seamless interaction.
- Chunked Text Processing: Efficiently handle large documents by splitting them into manageable chunks.
- Hugging Face Integration: Utilize powerful NLP models like `EleutherAI/gpt-neo-1.3B` and `mistralai/Mistral-7B-Instruct-v0.1` for text generation.
- GPU Acceleration: Optimized for GPU usage to speed up processing (configurable for lighter models).
- Vector Store (Chroma): Experimental support for document embeddings to enhance question answering (placeholder implementation).
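
To make the chunked text processing feature concrete, here is a minimal sketch of splitting text into overlapping character windows. It is plain Python rather than the project's actual code: the function name `split_into_chunks` and the `chunk_size` and `overlap` defaults are assumptions chosen for illustration, and the repository may rely on LangChain's splitters instead.

```python
from typing import List


def split_into_chunks(text: str, chunk_size: int = 1000, overlap: int = 100) -> List[str]:
    """Split text into character windows that overlap by `overlap` characters.

    Hypothetical helper: the name and defaults are illustrative assumptions.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for chunk in split_into_chunks("abcdefghij", chunk_size=4, overlap=1):
        print(chunk)
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of some repeated text being sent to the model.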
- Python 3.8+
- A Hugging Face API key (set up in a `.env` file)
- GPU (optional, for faster processing)
- Required libraries (listed in `requirements.txt`)
- Clone the Repository:

      git clone https://github.com/rohanmistry231/AI-Powered-MCQ-QA-Generator.git
      cd AI-Powered-MCQ-QA-Generator

- Set Up a Virtual Environment:

      python -m venv venv
      source venv/bin/activate  # On Windows: venv\Scripts\activate

- Install Dependencies:

      pip install -r requirements.txt

- Configure Environment Variables: Create a `.env` file in the project root and add your Hugging Face API key:

      HUGGINGFACE_API_KEY=your_huggingface_api_key
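
As an illustration of how a script might pick up the key from `.env`, the sketch below reads the file with the standard library only. The helper names `read_env_file` and `get_api_key` are hypothetical, and the project itself may use a package such as `python-dotenv` instead; only the variable name `HUGGINGFACE_API_KEY` comes from the step above.

```python
import os
from pathlib import Path
from typing import Dict


def read_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse simple KEY=value lines; blank lines and # comments are skipped."""
    values: Dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(path: str = ".env") -> str:
    """Prefer an already exported variable, then fall back to the .env file."""
    key = os.environ.get("HUGGINGFACE_API_KEY") or read_env_file(path).get("HUGGINGFACE_API_KEY")
    if not key:
        raise RuntimeError("HUGGINGFACE_API_KEY is not set; add it to .env or export it")
    return key
```

Failing early with a clear error is usually friendlier than letting a later model request fail with an authentication message.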
- Streamlit App (`app.py`): Launch the web interface:

      streamlit run app.py

  Open your browser to http://localhost:8501 and upload a PDF to start generating MCQs or asking questions.

- Command-Line MCQ Generator (`mcq_app.py` or `model.py`): Run the script to generate MCQs from a PDF:

      python mcq_app.py

  Enter the path to your PDF and the number of questions when prompted.

- Command-Line Q&A Tool (`qa_app.py`): Run the script to ask questions about a document:

      python qa_app.py

  Provide the file path and ask questions interactively.
ai-mcq-qa-generator/
├── app.py # Streamlit web app for MCQ and Q&A
├── mcq_app.py # Command-line MCQ generator
├── qa_app.py # Command-line Q&A tool
├── model.py # Core logic for MCQ generation (shared with mcq_app.py)
├── .env # Environment variables (API keys)
├── .gitignore # Git ignore file
├── requirements.txt # Python dependencies
└── README.md # Project documentation
- Document Loading: Upload a PDF (or other supported formats) via the Streamlit UI or command-line scripts.
- Text Extraction: Extract text using `PyPDF2` or `fitz` (PyMuPDF) for PDFs, with support for chunking large documents.
- MCQ Generation:
  - Text is processed in chunks to handle large documents.
  - Hugging Face models generate MCQs with one correct answer and three incorrect options.
- Question Answering:
  - Text is chunked and optionally stored in a Chroma vector store (placeholder embedding).
  - Custom questions are answered using context-aware prompts sent to Hugging Face models.
- Output: Results are displayed in the Streamlit UI or printed to the console.
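
The question-answering step above can be sketched as follows. Because the Chroma embedding is described as a placeholder, this example swaps in a simple keyword-overlap ranking to pick the most relevant chunks; that ranking, the prompt wording, and the names `select_context` and `build_qa_prompt` are all assumptions for illustration, not the project's implementation. The resulting string is what would be sent to a text-generation model.

```python
import re
from typing import List, Set


def _tokens(text: str) -> Set[str]:
    """Lowercase alphanumeric word set used for a rough relevance score."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_context(question: str, chunks: List[str], top_k: int = 2) -> List[str]:
    """Rank chunks by shared words with the question (stand-in for embeddings)."""
    question_tokens = _tokens(question)
    ranked = sorted(
        chunks,
        key=lambda chunk: len(question_tokens & _tokens(chunk)),
        reverse=True,  # sorted() stays stable, so ties keep document order
    )
    return ranked[:top_k]


def build_qa_prompt(question: str, chunks: List[str], top_k: int = 2) -> str:
    """Assemble a context-aware prompt from the best-matching chunks."""
    context = "\n\n".join(select_context(question, chunks, top_k))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    doc_chunks = [
        "Paris is the capital of France.",
        "Bananas are yellow.",
        "France borders Spain.",
    ]
    print(build_qa_prompt("What is the capital of France?", doc_chunks))
```

Limiting the prompt to a few top-ranked chunks keeps it within the model's context window; real embeddings would likely rank paraphrased questions better than word overlap does.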
- Education: Generate practice quizzes for students from textbooks or lecture notes.
- Training: Create assessments for corporate training materials.
- Self-Study: Explore and understand complex documents by asking targeted questions.
- Content Review: Quickly generate questions to test comprehension of reports or articles.
- Add support for more document formats (e.g., Markdown, HTML).
- Implement real embeddings for the Chroma vector store to improve Q&A accuracy.
- Enhance MCQ formatting for export to formats like JSON or CSV.
- Optimize chunking and model parameters for faster processing.
- Add user authentication and session management for the Streamlit app.
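
As a rough idea of what the JSON and CSV export item could look like, here is a standard-library sketch. Nothing in it exists in the project yet: the `MCQ` dataclass, its field names, and the two export functions are hypothetical, and the only detail taken from this README is that each question has one correct answer and three distractors.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class MCQ:
    """Hypothetical record: one correct answer plus three distractors."""
    question: str
    correct: str
    distractors: List[str]


def mcqs_to_json(mcqs: List[MCQ]) -> str:
    """Serialize questions as a JSON array of objects."""
    return json.dumps([asdict(m) for m in mcqs], indent=2)


def mcqs_to_csv(mcqs: List[MCQ]) -> str:
    """Serialize questions as CSV, one row per question."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["question", "correct", "distractor_1", "distractor_2", "distractor_3"])
    for m in mcqs:
        writer.writerow([m.question, m.correct, *m.distractors])
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [MCQ("What is 2 + 2?", "4", ["3", "5", "22"])]
    print(mcqs_to_json(sample))
    print(mcqs_to_csv(sample))
```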
We welcome contributions! To get started:
- Fork the repository.
- Create a feature branch (`git checkout -b feature/YourFeature`).
- Commit your changes (`git commit -m "Add YourFeature"`).
- Push to the branch (`git push origin feature/YourFeature`).
- Open a pull request.
- Hugging Face for providing powerful NLP models.
- Streamlit for the intuitive web interface framework.
- PyMuPDF and PyPDF2 for PDF processing.
- LangChain for document chunking and vector store utilities.
Ready to transform your documents into interactive learning tools? Dive in and start generating MCQs and answers today! 🚀