Comprehensive-Lecture-Notes-Generator

A powerful tool that automates the generation of comprehensive lecture notes by combining transcribed audio from university lectures with content from lecture slides. The system uses OpenAI's Whisper model for transcription, vector embeddings for semantic search, and LLaMa 70B for content generation.

Overview

This project streamlines the note-taking process by:

Transcribing lecture videos using OpenAI's Whisper model
Processing lecture slides (PowerPoint) to extract content
Storing transcriptions in a vector database for semantic search
Generating comprehensive notes by combining relevant transcripts with slide content
Outputting professional-quality PDF notes
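
Before transcriptions can be stored for semantic search, the transcript has to be split into pieces small enough to embed and retrieve individually. The following is a minimal sketch of one way to do that, assuming word-based chunks with a small overlap. The function name `chunk_transcript` and the default sizes are illustrative assumptions, not taken from the notebook.

```python
def chunk_transcript(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split a transcript into overlapping word-based chunks.

    chunk_size and overlap are counted in words; both defaults are
    illustrative assumptions, not values from the project.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the transcript.
        if start + chunk_size >= len(words):
            break
    return chunks
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from either side, at the cost of some duplicated text in the store.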

Technical Architecture

Requirements

Python 3.10+
Groq API key (for LLM access)
Required Python libraries:
- groq
- moviepy
- pydub
- langchain
- langchain_groq
- python-pptx
- markdown2
- xhtml2pdf
- Pillow
- sentence-transformers

Installation and Setup

Clone the repository

Install required dependencies:

pip install groq moviepy pydub langchain langchain_groq python-pptx markdown2 xhtml2pdf pillow sentence-transformers

Create a key.txt file containing your Groq API key.
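
Reading the key back from key.txt could look like the sketch below. The helper name `load_api_key` is hypothetical, and the notebook may load the key differently.

```python
from pathlib import Path


def load_api_key(path: str = "key.txt") -> str:
    """Return the API key stored in `path`, without surrounding whitespace."""
    key = Path(path).read_text(encoding="utf-8").strip()
    if not key:
        raise ValueError(f"{path} is empty")
    return key
```

The returned string can then be passed to the Groq client, for example `Groq(api_key=load_api_key())`.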

Note Generation Pipeline

LangChain is used to create a processing pipeline that:

- Takes a topic from the lecture slides
- Finds relevant transcription chunks from the vector database
- Uses the LLaMa 70B model (via Groq) to generate comprehensive notes
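
The retrieval step can be illustrated without a vector database. The sketch below ranks transcript chunks by cosine similarity to a topic vector, which is the kind of search a vector store typically runs internally. The toy vectors stand in for real embeddings (for example from sentence-transformers), and the function names are assumptions for illustration only.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_chunks(query_vec, chunk_vecs, chunks, k=3):
    """Return the k chunks whose vectors are most similar to query_vec."""
    scored = sorted(
        zip(chunks, chunk_vecs),
        key=lambda pair: cosine_similarity(query_vec, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in scored[:k]]


# Toy data: in practice each vector would be the embedding of its chunk.
chunks = ["sorting", "graphs", "hashing"]
chunk_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
topic_vec = [1.0, 0.0, 0.0]  # embedding of the slide topic (assumed)

relevant = top_k_chunks(topic_vec, chunk_vecs, chunks, k=2)
```

The selected chunks are what would be joined and passed to the LLM as the transcript context for that topic.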

The prompt template is defined as:

from langchain_core.prompts import PromptTemplate

prompt_temp = PromptTemplate.from_template('''
    ### UNIVERSITY LECTURE TRANSCRIPT:
    {lecture_transcript}

    ### LECTURE SLIDE CONTENT:
    {slides_content}

    ### TOPIC:
    {unique_topic}

    ### INSTRUCTIONS:
    You are John, an expert at making comprehensive academic notes.
    You are required to do exactly what you are good at. Given the lecture transcripts and the lecture slide content for a particular topic, generate
    comprehensive lecture notes for the same covering all important details, merging the information from the slides and the transcripts.
    Ensure the text is correctly formatted.

    DO NOT provide a preamble.
    ### ANSWER (NO PREAMBLE):
''')
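
The template expects three inputs: `lecture_transcript`, `slides_content`, and `unique_topic`. As a rough illustration of how the placeholders are filled, the sketch below uses plain `str.format` on a shortened copy of the template in place of LangChain. The shortened template text and the helper `template_fields` are assumptions made for the example.

```python
from string import Formatter


def template_fields(template: str) -> list[str]:
    """List the placeholder names that appear in a format-style template."""
    return [name for _, name, _, _ in Formatter().parse(template) if name]


# Shortened stand-in for the real prompt; only the placeholders matter here.
TEMPLATE = (
    "### UNIVERSITY LECTURE TRANSCRIPT:\n{lecture_transcript}\n\n"
    "### LECTURE SLIDE CONTENT:\n{slides_content}\n\n"
    "### TOPIC:\n{unique_topic}\n"
)

fields = template_fields(TEMPLATE)

# Filling the template for one topic (sample values are made up).
rendered = TEMPLATE.format(
    lecture_transcript="Today we cover binary search trees.",
    slides_content="BST: insert, search, delete",
    unique_topic="Binary Search Trees",
)
```

Checking the field list before calling the model is a cheap way to catch a misspelled input key early.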

Usage

- Place your lecture video file in the project directory
- Place your lecture PowerPoint slides in the project directory
- Update file paths in the notebook
- Run the notebook cells sequentially
- Retrieve generated PDF notes

Benefits

- Time Efficiency: Automates the tedious process of manual note-taking
- Comprehensive Coverage: Combines visual slide content with spoken explanations
- Semantic Relevance: Uses vector search to match transcripts with relevant slide topics
- High Quality Output: Leverages state-of-the-art LLMs for coherent, well-structured notes

Limitations

- Requires high-quality audio for accurate transcription
- Slide content extraction works best with text-based slides (vs. image-heavy slides)
- Processing large videos may require substantial computational resources

Future Improvements

- Support for additional slide formats (Google Slides, PDF)
- Multi-language support
- Improved image extraction and processing
- Interactive web interface
- Real-time processing capabilities

License

MIT License
