
Whisper Benchmark for Apple Silicon

A comprehensive benchmarking tool for comparing Whisper implementations optimized for Apple Silicon, focusing on speed while maintaining accuracy. It now supports nine implementations, including native Swift frameworks and MLX-based solutions.

Example output

Test results from a MacBook Pro (M4, 24 GB):

=== Benchmark Summary for 'large' model ===

Implementation         Avg Time (s)    Parameters
--------------------------------------------------------------------------------
fluidaudio-coreml      0.1935          model=parakeet-tdt-0.6b-v2-coreml, backend=FluidAudio Swift Bridge, platform=Apple Silicon
    "Which is the fastest transcription on my Mac?"

parakeet-mlx           0.4995          model=parakeet-tdt-0.6b-v2, implementation=parakeet-mlx, platform=Apple Silicon (MLX)
    "Which is the fastest transcription on my Mac?"

mlx-whisper            1.0230          model=whisper-large-v3-turbo, quantization=none
    "Which is the fastest transcription on my Mac?"

insanely-fast-whisper  1.1324          model=whisper-large-v3-turbo, device_id=mps, batch_size=12, compute_type=float16, quantization=4bit
    "Which is the fastest transcription on my Mac?"

whisper.cpp            1.2293          model=large-v3-turbo-q5_0, coreml=True, n_threads=4
    "Which is the fastest transcription on my Mac?"

lightning-whisper-mlx  1.8160          model=large, batch_size=12, quant=none
    "which is the fastest transcription on my Mac"

whisperkit             2.2190          model=large-v3, backend=WhisperKit Swift Bridge, platform=Apple Silicon
    "Which is the fastest transcription on my Mac?"

whisper-mps            5.3722          model=large, backend=whisper-mps, device=mps, language=None
    "Which is the fastest transcription on my Mac?"

faster-whisper         6.9613          model=large-v3-turbo, device=cpu, compute_type=int8, beam_size=1, cpu_threads=12, original_model_requested=large
    "Which is the fastest transcription on my Mac?"

Demo: https://x.com/anvanvan/status/1913624854584037443

Overview

This tool measures transcription performance across different implementations of the same base model (e.g., all variants of "small"), helping you find the fastest implementation on your Apple Silicon Mac for a given base model. With nine implementations spanning native Swift frameworks, MLX-based solutions, GPU/MPS backends, and CPU-optimized variants, you can find the right balance of speed and accuracy for your use case.
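
Under the hood the measurement itself is simple: each implementation transcribes the same recorded audio several times and the wall-clock times are averaged. The sketch below illustrates that idea only; the wrapper interface and attribute names are illustrative, not the project's actual API.

import time
import statistics

def time_implementation(impl, audio, num_runs: int = 3) -> float:
    """Transcribe the same audio num_runs times and return the average wall-clock time."""
    timings, text = [], ""
    for _ in range(num_runs):
        start = time.perf_counter()
        text = impl.transcribe(audio)            # hypothetical wrapper method
        timings.append(time.perf_counter() - start)
    avg = statistics.mean(timings)
    print(f'{impl.name:<22} {avg:.4f}    "{text}"')   # mirrors the summary layout above
    return avg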

Features

  • Live speech recording with automatic audio preprocessing
  • Nine implementations (Whisper and Parakeet variants) with Apple Silicon optimizations
  • Base model selection (tiny, base, small, medium, large) with automatic fallback chains (see the sketch after this list)
  • Transcription quality comparison - see actual transcription text alongside performance metrics
  • Apple Silicon-specific optimizations with up to 16% performance improvements
  • Native Swift bridge support for WhisperKit and FluidAudio frameworks
  • Comprehensive parameter reporting showing actual models and configurations used
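
Model fallback can be pictured as walking an ordered list of candidate model names and keeping the first one the implementation can actually load (the large → large-v3-turbo → large-v3 chain mentioned in the release notes below is one example). The table and helper here are a minimal sketch, not the project's real registry:

# Hypothetical fallback table; each implementation wrapper keeps its own mapping.
FALLBACK_CHAINS = {
    "large": ["large", "large-v3-turbo", "large-v3"],
}

def resolve_model(requested: str, is_available) -> str:
    """Return the first model in the chain that is_available() accepts."""
    for candidate in FALLBACK_CHAINS.get(requested, [requested]):
        if is_available(candidate):              # implementation-specific availability check
            return candidate
    raise ValueError(f"no usable model found for '{requested}'")

This is why the benchmark summary reports fields such as original_model_requested=large alongside the model that was actually loaded.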

Implementations Tested

🚀 Native Apple Silicon Implementations

  1. WhisperKit ⚡ Apple Silicon Native

    • Source: https://github.com/argmaxinc/WhisperKit
    • Technology: Native Swift + CoreML, optimized for Apple Neural Engine
    • Performance: Fast, fully on-device inference
    • Bridge: Custom Swift bridge for seamless Python integration
  2. FluidAudio CoreML ⚡ Apple Silicon Native

    • Source: https://github.com/FluidInference/FluidAudio
    • Technology: Native Swift + CoreML with Parakeet TDT models
    • Performance: Real-time streaming ASR with ~110x RTF on M4 Pro
    • Bridge: Custom Swift bridge with internal timing measurement (see the bridge-call sketch below)
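
Both native implementations follow the same pattern on the Python side: the Swift bridge is a standalone executable that receives an audio file, transcribes it natively, and returns the text together with its internally measured time. The snippet below is a hedged sketch of that pattern only; the real bridge arguments and output format may differ, and the binary path assumes the default swift build -c release layout.

import json
import subprocess

# Assumed binary location after `swift build -c release` (see Installation below).
BRIDGE = "tools/whisperkit-bridge/.build/release/whisperkit-bridge"

def transcribe_via_bridge(wav_path: str) -> dict:
    """Invoke the Swift bridge on a WAV file and parse its output (assumed to be JSON)."""
    proc = subprocess.run([BRIDGE, wav_path], capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)               # e.g. {"text": ..., "transcribe_time": ...} (assumed shape)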

🔥 MLX-Accelerated Implementations

  1. mlx-whisper

  2. Parakeet MLX ⚡ New Implementation

  3. lightning-whisper-mlx ⚡ Apple Silicon Optimized

⚡ GPU/MPS-Accelerated Implementations

  1. insanely-fast-whisper ⚡ Apple Silicon Optimized

  2. whisper-mps ⚡ New Implementation

🖥️ CPU-Optimized Implementations

  1. faster-whisper ⚡ Apple Silicon Optimized

    • Source: https://github.com/SYSTRAN/faster-whisper
    • Optimization: Dynamic CPU thread allocation, Apple Accelerate framework
    • Performance: 2.0% faster with intelligent core detection (performance vs. efficiency cores); a sketch of such core detection appears after this list
  2. whisper.cpp + CoreML

Installation

# Clone the repository
git clone https://github.com/anvanvan/mac-whisper-speedtest.git
cd mac-whisper-speedtest

# Install dependencies (Python 3.11+ required)
uv sync

# Build Swift bridges (required for native implementations)
# WhisperKit bridge
cd tools/whisperkit-bridge && swift build -c release && cd ../..

# FluidAudio bridge (optional - only if you want FluidAudio support)
cd tools/fluidaudio-bridge && swift build -c release && cd ../..

Usage

# Run benchmark with default settings (small model, all implementations)
.venv/bin/mac-whisper-speedtest

# Run benchmark with a specific model
.venv/bin/mac-whisper-speedtest --model small

# Run benchmark with specific implementations
.venv/bin/mac-whisper-speedtest --model small --implementations "WhisperKitImplementation,FluidAudioCoreMLImplementation,ParakeetMLXImplementation"

# Test only the fastest native implementations
.venv/bin/mac-whisper-speedtest --model small --implementations "WhisperKitImplementation,FluidAudioCoreMLImplementation"

# Test MLX-based implementations
.venv/bin/mac-whisper-speedtest --model small --implementations "MLXWhisperImplementation,ParakeetMLXImplementation,LightningWhisperMLXImplementation"

# Run benchmark with more runs for statistical accuracy
.venv/bin/mac-whisper-speedtest --model small --num-runs 5

Universal Transcription Display 🎯

All implementations display their transcription results in the benchmark summary, allowing you to compare both performance and transcription quality across different implementations.

Key features:

  • ✅ Universal: Works with all 9 supported implementations, including native Swift bridges
  • ✅ Smart formatting: Long text is truncated, empty results show "(no transcription)" (see the sketch after this list)
  • ✅ Clean display: Consistent indentation and formatting across all implementations
  • ✅ Model transparency: Shows actual models used (including fallback substitutions)
  • ✅ Bridge timing: Native implementations report internal transcription time (excluding bridge overhead)
  • ✅ Performance focus: Transcription display doesn't interfere with timing comparisons
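
The formatting rules above boil down to a small helper; here is a sketch of the truncation and indentation behavior (the function name and length limit are illustrative):

def format_transcription(text: str, max_len: int = 80) -> str:
    """Indent the transcription under its result row, truncating long text."""
    text = (text or "").strip()
    if not text:
        return "    (no transcription)"
    if len(text) > max_len:
        text = text[: max_len - 3] + "..."
    return f'    "{text}"'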

Requirements

  • macOS with Apple Silicon (M1/M2/M3/M4) - Required for optimal performance
  • macOS 14.0+ (for WhisperKit and FluidAudio native support)
  • Xcode 15.0+ (for building Swift bridges)
  • Python 3.11+ (updated requirement for latest dependencies)
  • PyAudio and its dependencies (for audio recording)
  • Swift Package Manager (for building native bridges)
  • Various Whisper implementations (installed automatically via uv)

Project Structure

mac-whisper-speedtest/
├── pyproject.toml                    # Updated dependencies (Python 3.11+)
├── docs/
│   └── APPLE_SILICON_OPTIMIZATIONS.md  # Detailed optimization guide
├── src/
│   └── mac_whisper_speedtest/
│       ├── __init__.py
│       ├── audio.py                  # Audio recording/processing
│       ├── benchmark.py              # Enhanced benchmarking with transcription display
│       ├── implementations/          # Individual implementation wrappers
│       │   ├── __init__.py           # Implementation registry (9 implementations)
│       │   ├── base.py               # Abstract base class
│       │   ├── coreml.py             # WhisperCpp with CoreML
│       │   ├── faster.py             # Faster Whisper (Apple Silicon optimized)
│       │   ├── insanely.py           # Insanely Fast Whisper (Apple Silicon optimized)
│       │   ├── mlx.py                # MLX Whisper
│       │   ├── lightning.py          # Lightning Whisper MLX (4-bit quantization)
│       │   ├── whisperkit.py         # WhisperKit (Swift bridge)
│       │   ├── fluidaudio_coreml.py  # FluidAudio CoreML (Swift bridge)
│       │   ├── parakeet_mlx.py       # Parakeet MLX (new implementation)
│       │   └── whisper_mps.py        # whisper-mps (new implementation)
│       └── cli.py                    # Command line interface
├── tools/
│   ├── whisperkit-bridge/           # Swift bridge for WhisperKit
│   │   ├── Package.swift
│   │   └── Sources/whisperkit-bridge/main.swift
│   └── fluidaudio-bridge/           # Swift bridge for FluidAudio
│       ├── Package.swift
│       └── Sources/fluidaudio-bridge/main.swift
├── tests/
│   ├── test_model_params.py         # Model parameter validation tests
│   └── test_parakeet_integration.py # Parakeet MLX integration tests
└── README.md

Version 2.0 (Latest)

  • ✅ 4 new implementations: WhisperKit, FluidAudio CoreML, Parakeet MLX, whisper-mps
  • ✅ Native Swift bridges: Direct integration with macOS frameworks
  • ✅ Enhanced Apple Silicon optimizations: Up to 16% performance improvements
  • ✅ Transcription quality comparison: See actual transcription text in results
  • ✅ Model fallback chains: Automatic fallback for unavailable models (e.g., large → large-v3-turbo → large-v3)
  • ✅ Python 3.11+ support: Updated dependencies and requirements

License

MIT
