Hearth - MP4 Audio Transcription Tool

A cross-platform CLI tool for transcribing audio from MP4 files locally using Whisper AI. Built with Rust for performance and reliability.

🎯 Project Status

Current Phase: Planning & Design
Target: Alpha Release in ~2 weeks
Priority: #1 Active Project

This repository currently contains planning documents and technical specifications. Implementation begins with Phase 1 of our roadmap.

✨ Features (Planned)

  • Local Processing: Completely offline transcription using Whisper AI
  • Long File Support: Handle 45+ minute MP4 files efficiently through chunking
  • Cross-Platform: Native binaries for macOS and Windows
  • CLI-First Design: Simple command-line interface for rapid development
  • Memory Efficient: Large files are processed chunk by chunk to keep memory use bounded
  • Progress Tracking: Real-time progress indication during transcription

🚀 Quick Start (When Implemented)

# Basic transcription
./hearth input.mp4

# With custom output
./hearth input.mp4 --output transcript.txt

# With options
./hearth input.mp4 --model small --chunk-size 300 --verbose
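
The flags above could map onto a small options struct. The project plans to use clap for this; the sketch below is a dependency-free stand-in using only the standard library so that it compiles on its own. The names `Options` and `parse_args`, the default model `base`, the default chunk size of 300, and the reading of `--chunk-size` as seconds are all assumptions, not settled design.

```rust
use std::path::PathBuf;

/// Hypothetical options mirroring the Quick Start flags.
#[derive(Debug, PartialEq)]
pub struct Options {
    pub input: PathBuf,
    pub output: Option<PathBuf>,
    pub model: String,   // assumed default: "base"
    pub chunk_size: u64, // assumed unit: seconds; assumed default: 300
    pub verbose: bool,
}

/// Parses the arguments that follow the program name.
pub fn parse_args<I>(args: I) -> Result<Options, String>
where
    I: IntoIterator<Item = String>,
{
    let mut iter = args.into_iter();
    let mut input: Option<PathBuf> = None;
    let mut output: Option<PathBuf> = None;
    let mut model = String::from("base");
    let mut chunk_size: u64 = 300;
    let mut verbose = false;

    while let Some(arg) = iter.next() {
        match arg.as_str() {
            "--output" => {
                let value = iter.next().ok_or("--output needs a value")?;
                output = Some(PathBuf::from(value));
            }
            "--model" => {
                model = iter.next().ok_or("--model needs a value")?;
            }
            "--chunk-size" => {
                let value = iter.next().ok_or("--chunk-size needs a value")?;
                chunk_size = value
                    .parse::<u64>()
                    .map_err(|_| format!("invalid chunk size: {value}"))?;
                if chunk_size == 0 {
                    return Err(String::from("chunk size must be greater than zero"));
                }
            }
            "--verbose" => verbose = true,
            other if other.starts_with("--") => {
                return Err(format!("unknown flag: {other}"));
            }
            other => {
                // The first bare argument is the input file; a second one is an error.
                if input.is_some() {
                    return Err(format!("unexpected extra argument: {other}"));
                }
                input = Some(PathBuf::from(other));
            }
        }
    }

    let input = input.ok_or("missing input file")?;
    Ok(Options { input, output, model, chunk_size, verbose })
}
```

In a binary, `parse_args(std::env::args().skip(1))` would feed it the real command line. A clap-based version would replace the manual loop but could keep the same struct.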

📋 Requirements

  • Rust 1.70+ (for development)
  • FFmpeg (for audio extraction)
  • ~2GB disk space (for Whisper models)

macOS Setup

# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Install FFmpeg
brew install ffmpeg

Windows Setup

# Install Rust from https://rustup.rs/
# Install FFmpeg from https://ffmpeg.org/download.html

🗂️ Repository Structure

├── ROADMAP_TO_ALPHA.md        # Detailed development plan
├── CLAUDE.md                  # AI assistant guidance
├── tauri_transcription_memo.md # CLI-first technical spec
├── tauri_transcription_memo-v-1.md # Original GUI approach
└── claude-web-conversation-summary.md # Design decisions

🛣️ Roadmap to Alpha

Phase    Duration  Goal
Phase 1  2 days    Foundation (Rust project + audio extraction)
Phase 2  2 days    Whisper integration (short file transcription)
Phase 3  2 days    Long file handling (45-minute support)
Phase 4  1 day     macOS polish and testing
Phase 5  2 days    Windows cross-compilation

Total Timeline: 9 development days (~1.5-2 weeks)

See ROADMAP_TO_ALPHA.md for detailed daily tasks and milestones.

🏗️ Architecture

CLI-First Approach

We're building a command-line tool first for rapid prototyping, then extracting the core logic for a future GUI wrapper.

Technology Stack

  • Language: Rust
  • CLI Framework: clap
  • Speech Recognition: Local Whisper (candle-whisper or whisper-rs)
  • Audio Processing: ffmpeg-rs or system FFmpeg
  • Async Runtime: tokio
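
If the system FFmpeg route is chosen, audio extraction could be a thin wrapper around `std::process::Command`. The sketch below assumes an `ffmpeg` binary on `PATH` and converts to 16 kHz mono 16-bit PCM, the input format Whisper models expect. The function names `ffmpeg_extract_args` and `extract_audio` are hypothetical.

```rust
use std::ffi::OsString;
use std::io;
use std::path::Path;
use std::process::{Command, ExitStatus};

/// Builds the FFmpeg arguments for extracting 16 kHz mono PCM audio.
pub fn ffmpeg_extract_args(input: &Path, output: &Path) -> Vec<OsString> {
    vec![
        OsString::from("-y"), // overwrite the output file without prompting
        OsString::from("-i"),
        input.as_os_str().to_os_string(),
        OsString::from("-vn"), // ignore the video stream
        OsString::from("-ac"),
        OsString::from("1"), // one channel (mono)
        OsString::from("-ar"),
        OsString::from("16000"), // 16 kHz sample rate
        OsString::from("-c:a"),
        OsString::from("pcm_s16le"), // 16-bit little-endian PCM
        output.as_os_str().to_os_string(),
    ]
}

/// Runs FFmpeg and returns its exit status.
/// Assumes `ffmpeg` is installed and discoverable on PATH.
pub fn extract_audio(input: &Path, output: &Path) -> io::Result<ExitStatus> {
    Command::new("ffmpeg")
        .args(ffmpeg_extract_args(input, output))
        .status()
}
```

Keeping argument construction separate from process spawning makes the command testable without FFmpeg installed.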

Core Components (Planned)

hearth/
├── src/
│   ├── main.rs           # CLI interface
│   ├── audio.rs          # MP4 → audio extraction
│   ├── transcription.rs  # Whisper integration
│   ├── chunking.rs       # Audio chunking logic
│   └── progress.rs       # Progress display
├── models/               # Whisper models
└── tests/                # Integration tests
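
As an illustration of what `chunking.rs` might contain, the sketch below splits a recording into fixed-length, non-overlapping time ranges. The `Chunk` type and `plan_chunks` function are hypothetical, and the unit is assumed to be seconds. A real implementation may want overlapping chunks or silence-aware boundaries to avoid cutting words in half; this sketch does neither.

```rust
/// A half-open time range [start_secs, end_secs) within the audio.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Chunk {
    pub index: usize,
    pub start_secs: u64,
    pub end_secs: u64,
}

/// Splits `total_secs` of audio into chunks of at most `chunk_secs` each.
/// The final chunk is shorter when the duration is not an exact multiple.
pub fn plan_chunks(total_secs: u64, chunk_secs: u64) -> Vec<Chunk> {
    assert!(chunk_secs > 0, "chunk_secs must be greater than zero");

    let mut chunks = Vec::new();
    let mut start = 0u64;
    while start < total_secs {
        // Clamp the end of the last chunk to the total duration.
        let end = start.saturating_add(chunk_secs).min(total_secs);
        let index = chunks.len();
        chunks.push(Chunk { index, start_secs: start, end_secs: end });
        start = end;
    }
    chunks
}
```

With 300-second chunks, a 45-minute file (2700 seconds) yields nine chunks, so only one chunk of audio needs to be held in memory at a time.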

🎯 Alpha Success Criteria

  • ✅ Transcribe MP4 files up to 45 minutes
  • ✅ Generate accurate TXT output
  • ✅ Work completely offline
  • ✅ Native macOS and Windows binaries
  • ✅ Reasonable performance (processing time under 2x the audio duration)
  • ✅ Handle common MP4 formats
  • ✅ Memory-efficient processing
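
The performance criterion can be checked with a real-time factor: processing time divided by audio duration. The sketch below assumes that "< 2x real-time" means a factor below 2.0, so a 45-minute file should finish in under 90 minutes. The function names are hypothetical.

```rust
use std::time::Duration;

/// Returns processing time divided by audio duration,
/// or `None` when the audio duration is zero.
pub fn real_time_factor(processing: Duration, audio: Duration) -> Option<f64> {
    if audio.is_zero() {
        return None;
    }
    Some(processing.as_secs_f64() / audio.as_secs_f64())
}

/// Checks the assumed Alpha target: a real-time factor below 2.0.
pub fn meets_alpha_target(processing: Duration, audio: Duration) -> bool {
    matches!(real_time_factor(processing, audio), Some(rtf) if rtf < 2.0)
}
```

Wrapping a transcription run with `std::time::Instant::now()` and `elapsed()` would supply the `processing` value.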

🧪 Testing Strategy

  1. Short files (2-3 minutes) for rapid development
  2. Medium files (15 minutes) for chunking validation
  3. Long files (45 minutes) for stress testing
  4. Cross-platform testing on macOS and Windows

🔄 Future Roadmap

After the Alpha release:

  1. Beta: Add SRT/VTT export, batch processing
  2. GUI Phase: Extract core to library, build Tauri frontend
  3. Advanced Features: Multiple languages, speaker detection
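
For the planned SRT export, each cue needs timestamps in `HH:MM:SS,mmm` form. The sketch below shows one way to format them from millisecond offsets. The function names are hypothetical, and how segment times would come out of the Whisper integration is not yet decided.

```rust
/// Formats a millisecond offset as an SRT timestamp (HH:MM:SS,mmm).
pub fn srt_timestamp(millis: u64) -> String {
    let ms = millis % 1000;
    let total_secs = millis / 1000;
    let hours = total_secs / 3600;
    let minutes = (total_secs % 3600) / 60;
    let seconds = total_secs % 60;
    format!("{hours:02}:{minutes:02}:{seconds:02},{ms:03}")
}

/// Formats one SRT cue: a 1-based index, a time range, and the text.
pub fn srt_cue(index: usize, start_ms: u64, end_ms: u64, text: &str) -> String {
    format!(
        "{index}\n{} --> {}\n{text}\n",
        srt_timestamp(start_ms),
        srt_timestamp(end_ms)
    )
}
```

WebVTT uses a period instead of a comma before the milliseconds, so a VTT exporter could reuse the same arithmetic with a different separator.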

📚 Documentation

  • CLAUDE.md - AI assistant guidance for development
  • ROADMAP_TO_ALPHA.md - Detailed development roadmap
  • Technical Specs - See tauri_transcription_memo.md and tauri_transcription_memo-v-1.md for architectural decisions

🤝 Contributing

This is currently a solo project in active development. The codebase will be ready for contributions after the Alpha release.

For now, you can:

  • Review the technical specifications
  • Suggest improvements to the roadmap
  • Test Alpha builds when available

📄 License

[License TBD - will be added before first release]

📞 Contact

This is an indie development project. Issues and discussions are welcome once implementation begins.


Note: This project is in active development. Star and watch for updates as we progress through the roadmap! 🚀
