# Surely You’re Joking, Mr. Llama


A multimodal physics‑textbook summarizer & experiment visualizer inspired by Richard Feynman’s legendary Caltech lectures.

> “If you can’t explain something to a first‑year student, then you haven’t really understood.” — Richard Feynman


## Table of Contents

  1. Introduction
  2. Features
  3. System Architecture
  4. Quick Start
  5. Installation
  6. Usage
  7. Examples
  8. Configuration
  9. Roadmap
  10. Contributing
  11. Citation
  12. License

## Introduction

Physics is best learned through clear explanations and vivid demonstrations. Surely You’re Joking, Mr. Llama (SYJML) automates both:

  1. Summarize. It extracts chapter‑level key insights, formulas, and historical context from any physics textbook (PDF, EPUB, or OCR images).
  2. Explain Experiments. It isolates every described experiment—from Millikan’s oil‑drop to J.J. Thomson’s cathode ray—and rewrites them into concise, step‑by‑step protocols.
  3. Visualize. Each protocol is piped to Gemini 2.5 Pro to generate accurate, high‑resolution schematic images suitable for lecture slides or lab manuals.

The result: a Feynman‑style digest of theory and hands‑on intuition in minutes.
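The three stages above can be sketched as a minimal pipeline. This is an illustrative toy, not the actual package API: the function names are hypothetical, and the "summarization" and "extraction" here are naive string heuristics standing in for the Llama-backed models.

```python
# Toy sketch of the SYJML stages (hypothetical names, not the real API).
import re

def summarize(chapter_text: str, max_sentences: int = 2) -> list[str]:
    """Keep the leading sentences as a stand-in for LLM summarization."""
    sentences = re.split(r"(?<=[.!?])\s+", chapter_text.strip())
    return sentences[:max_sentences]

def extract_experiments(chapter_text: str) -> list[str]:
    """Naive span extraction: sentences that mention an experiment."""
    sentences = re.split(r"(?<=[.!?])\s+", chapter_text.strip())
    return [s for s in sentences if "experiment" in s.lower()]

def run_pipeline(chapter_text: str) -> dict:
    """Stage 1 + 2 of the digest; stage 3 (visualization) would consume
    each extracted experiment as an image-generation prompt."""
    return {
        "summary": summarize(chapter_text),
        "experiments": extract_experiments(chapter_text),
    }

digest = run_pipeline(
    "Charge is quantized. Millikan's oil-drop experiment measured e directly. "
    "Field lines visualize force."
)
```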


## Features

  • Meta Llama MM backbone for multimodal text+diagram understanding.
  • Chunk‑aware summarization that respects textbook structure, preserving sections, subsections, and equation numbering.
  • Experiment extractor powered by a fine‑tuned span‑classification head.
  • Gemini 2.5 Pro image pipeline with automatic prompt engineering (temperature, negative prompts, safety filters).
  • Export to Markdown, LaTeX, and interactive HTML (Reveal.js slides).
  • CLI & Python SDK plus optional Streamlit web UI.
  • Extensible: plug‑in support for GPT‑4o checks, custom vision models, or bespoke citation styles.
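Chunk-aware summarization means splitting on the textbook's own structure before falling back to fixed-size windows. A minimal sketch of that idea, assuming numbered section headings (the helper name and regex are illustrative, not the shipped implementation):

```python
# Sketch: split on section headings first, window only within a section.
import re

def chunk_by_section(text: str, max_chars: int = 400) -> list[str]:
    # Treat lines like "1.2 Conservation of Energy" as section boundaries.
    sections = re.split(r"(?m)^(?=\d+(?:\.\d+)*\s+[A-Z])", text)
    chunks = []
    for section in filter(None, sections):
        section = section.strip()
        # Keep whole sections when they fit; otherwise window within them,
        # so no chunk ever straddles a section boundary.
        for start in range(0, len(section), max_chars):
            chunks.append(section[start:start + max_chars])
    return chunks

doc = "1.1 Kinematics\nBodies move.\n1.2 Dynamics\nForces cause acceleration."
chunks = chunk_by_section(doc)
```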

## System Architecture

```mermaid
flowchart LR
    A[PDF / EPUB / Image] -->|OCR & segmentation| B(Text Chunks)
    B -->|Meta Llama MM| C{Classifier}
    C -->|Insights| D[Summary Store]
    C -->|Experiments| E[Experiment DSL]
    E -->|Prompt Builder| F[Gemini 2.5 Pro]
    F --> G[PNG / SVG Renders]
    D & G --> H[Assembler]
    H --> I[Export Package]
```
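The "Experiment DSL" node in the diagram could be as simple as a record carrying everything the prompt builder needs. A hypothetical shape (field names are illustrative, not the project's actual schema):

```python
# Hypothetical record shape for the "Experiment DSL" node in the flowchart.
from dataclasses import dataclass

@dataclass
class Experiment:
    name: str              # e.g. "Millikan oil-drop"
    apparatus: list[str]   # equipment the schematic must show
    steps: list[str]       # ordered protocol steps
    chapter: str = ""      # where it was found in the textbook

    def to_prompt(self) -> str:
        """Flatten the record into a single image-generation prompt."""
        parts = [f"Schematic of the {self.name} experiment."]
        parts.append("Apparatus: " + ", ".join(self.apparatus) + ".")
        parts.extend(f"Step {i}: {s}" for i, s in enumerate(self.steps, 1))
        return " ".join(parts)

exp = Experiment(
    name="Millikan oil-drop",
    apparatus=["atomizer", "charged plates", "microscope"],
    steps=["Spray oil droplets", "Balance gravity with the electric field"],
)
```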

## Quick Start

```bash
# 1. Clone & create env
git clone https://github.com/kenneally15/Surely-You-re-Joking-Mr-Llama.git
cd Surely-You-re-Joking-Mr-Llama
python -m venv .venv && source .venv/bin/activate

# 2. Install core deps
pip install -r requirements.txt

# 3. Download models (≈ 12 GB)
python tools/download_models.py --llama-mm --gemini-pro

# 4. Run demo summary
python syjml/run.py --input docs/feynman_vol1.pdf --out out/demo
```

The default config will output:

  • summary.md – textbook digest
  • experiments/ – folder of PNG schematics & captions
  • slides.html – interactive deck

## Installation

SYJML targets Python ≥ 3.10 and CUDA 12. Detailed steps for Linux & macOS are in INSTALL.md. A Dockerfile is provided for one‑command setup.


## Usage

### CLI

```bash
python syjml/run.py \
  --input path/to/textbook.pdf \
  --chapters 1-7 9 \
  --vision-resolution 2048 \
  --export md pdf
```
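A selector like `--chapters 1-7 9` mixes inclusive ranges and single chapters. One way it could be expanded into explicit chapter numbers (a hypothetical helper, not the actual CLI code):

```python
# Sketch: expand mixed chapter tokens like ["1-7", "9"] into a sorted list.
def parse_chapters(tokens: list[str]) -> list[int]:
    chapters: list[int] = []
    for token in tokens:
        if "-" in token:
            lo, hi = map(int, token.split("-"))
            chapters.extend(range(lo, hi + 1))   # inclusive range "1-7"
        else:
            chapters.append(int(token))          # single chapter "9"
    return sorted(set(chapters))

selected = parse_chapters(["1-7", "9"])   # → [1, 2, 3, 4, 5, 6, 7, 9]
```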

### Python API

```python
from syjml import Pipeline

pipe = Pipeline.from_pretrained()
results = pipe("/data/griffiths_optics.pdf", chapters=["2", "3"])
print(results.summary[0].text)        # digest of the first selected chapter
results.experiments[0].image.show()   # rendered schematic for an experiment
```

### Streamlit UI

```bash
streamlit run syjml/app.py
```

## Configuration

  • The configuration file controls chunk size, overlap, Gemini prompt templates, image specs, and safety settings.
  • Environment variables (LLAMA_API_KEY, GEMINI_API_KEY, etc.) manage credential storage.
  • Adapters. Replace Gemini with Stable Diffusion XL by pointing vision_backend: "sdxl".
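A configuration file along these lines would cover the knobs listed above. This is a sketch only: every key name here is hypothetical, not the shipped schema.

```yaml
# Hypothetical config sketch; key names are illustrative.
chunking:
  size: 2048             # characters per chunk
  overlap: 128           # characters shared between neighboring chunks
vision_backend: gemini   # or "sdxl" to swap in Stable Diffusion XL
gemini:
  prompt_template: templates/schematic.txt
  resolution: 2048
  safety: strict
```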

## Roadmap

| Phase | Goal | ETA |
|-------|------|-----|
| 1 | End‑to‑end MVP (current) | ✅ Now |
| 2 | Interactive Jupyter plugin for inline explanations | Q3 2025 |
| 3 | Cross‑textbook knowledge graph for concept linking | Q4 2025 |
| 4 | Voice‑over narrated videos (TTS) | 2026 |

## Contributing

Pull requests are welcome! Please run pre‑commit and ensure new unit tests pass (pytest -q). For major changes, open a discussion first.


## Citation

If you use SYJML in academic work, please cite:

```bibtex
@software{kenneally2025syjml,
  author       = {Kenneally, Kevin},
  title        = {Surely You’re Joking, Mr. Llama: Multimodal Physics Textbook Summarization},
  year         = {2025},
  url          = {https://github.com/kenneally15/Surely-You-re-Joking-Mr-Llama},
  version      = {1.0.0}
}
```

## License

This project is licensed under the MIT License. See LICENSE for details.

## About

Built at the Meta LlamaCon Hackathon NYC.
