FreeScribe is an open-source transcription and translation web application that runs on-device machine learning models entirely in your browser using Web Workers. Users can record or upload audio, transcribe speech to text, translate between languages, and export the results, all with privacy and speed and without sending data to any backend server.
- Live demo: https://free-scribe-arnob.vercel.app/
- Project Summary
- Features
- Technology Stack
- Project Structure
- How It Works
- Getting Started
- Usage Walkthrough
- Teaching Content & Examples
- Keywords
- Conclusion
- License
- Audio Input: Record live or upload MP3/WAV files for transcription.
- Transcription: Converts speech to text using ML models (OpenAI Whisper).
- Translation: Translate transcribed text into multiple languages.
- Runs Locally: All ML inference runs in-browser via Web Workers for privacy and speed.
- Export: Download or copy the resulting text.
- Modern UI: Built with React, Vite, and TailwindCSS.
- No Cost: 100% free and open-source.
- Frontend: React 18, Vite, TailwindCSS
- Web Worker ML: @xenova/transformers
- Transcription Model: OpenAI Whisper (via transformers.js)
- Other: ESLint, PostCSS, modern ES2020+ JavaScript
/
├── public/
│   └── vite.svg                 # App icon
├── src/
│   ├── components/
│   │   ├── Header.jsx           # Top navigation and branding
│   │   ├── Footer.jsx           # Footer
│   │   ├── HomePage.jsx         # Landing/upload UI
│   │   ├── FileDisplay.jsx      # Audio file display and controls
│   │   ├── Information.jsx      # Output display
│   │   └── Transcribing.jsx     # Loading/transcribing UI
│   ├── utils/
│   │   ├── presets.js           # Worker message types, language codes, model names
│   │   └── whisper.worker.js    # Main ML Web Worker logic
│   ├── App.jsx                  # Main application logic
│   ├── main.jsx                 # Entry point
│   └── index.css                # Tailwind and custom styles
├── index.html                   # HTML template
├── package.json                 # Dependencies & scripts
└── ... (config files)
- The app delegates heavy ML inference to a Web Worker (whisper.worker.js). This keeps the main thread free, so the UI stays responsive while the model runs.
- The worker receives audio data, loads the ML model (Whisper), and performs transcription/translation asynchronously.
- Communication uses structured messages (see presets.js for message types).
- Transcription uses the OpenAI Whisper model, via @xenova/transformers, running entirely in-browser (no server needed).
- Translation is performed using Whisper's multilingual capabilities and language codes defined in presets.js.
- Model progress and results are streamed back to the main app for display.
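The message flow described above can be sketched from the main thread's point of view. This is a minimal illustration, not the app's actual code: the message type names and the shape of the state object are assumptions (the real message types are defined in presets.js), and `reduceWorkerMessage` is a hypothetical helper.

```javascript
// Hypothetical message types; the real names live in src/utils/presets.js.
const MessageTypes = {
  DOWNLOADING: "DOWNLOADING",
  LOADING: "LOADING",
  RESULT: "RESULT",
  INFERENCE_DONE: "INFERENCE_DONE",
};

// Pure function: given the current UI state and one worker message,
// return the next UI state. Unknown messages leave the state untouched.
function reduceWorkerMessage(state, message) {
  switch (message.type) {
    case MessageTypes.DOWNLOADING:
      return { ...state, downloading: true };
    case MessageTypes.LOADING:
      return { ...state, loading: true };
    case MessageTypes.RESULT:
      return { ...state, output: message.results };
    case MessageTypes.INFERENCE_DONE:
      return { ...state, downloading: false, loading: false, finished: true };
    default:
      return state;
  }
}
```

In a React component, such a function could be wired up with something like `worker.addEventListener("message", (e) => setState((s) => reduceWorkerMessage(s, e.data)))`. Keeping the state transition pure makes it easy to test without a browser.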
- Clone the repo:
  git clone https://github.com/arnobt78/FreeScribe-Transcription-Translation-ML-App--ReactVite.git
  cd FreeScribe-Transcription-Translation-ML-App--ReactVite
- Install Node.js:
  Download and install from nodejs.org.
- Install dependencies:
  npm install
- Install Transformers.js:
  npm i @xenova/transformers
- Start the development server:
  npm run dev
  Open http://localhost:5173/ in your browser.
- Home Screen: Choose to record audio or upload an MP3/WAV file.
- Audio Processing: Once uploaded or recorded, the file is displayed. Click "Transcribe" to start.
- ML Inference: The app loads the Whisper model in a Web Worker and processes your audio.
- View & Translate: The transcribed text appears. Use the translation options to convert it into another language.
- Export or Copy: Download the text as a file or copy it to your clipboard.
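The export and copy step can be sketched with standard browser APIs. This is one possible approach, not necessarily how FreeScribe implements it; the function names and the default filename are hypothetical.

```javascript
// Build a plain-text Blob from the transcript (no DOM access needed).
function buildTextBlob(text) {
  return new Blob([text], { type: "text/plain" });
}

// Trigger a file download in the browser via a temporary object URL.
// Must run in a page context, since it uses `document`.
function downloadText(text, filename = "transcript.txt") {
  const url = URL.createObjectURL(buildTextBlob(text));
  const link = document.createElement("a");
  link.href = url;
  link.download = filename;
  link.click();
  URL.revokeObjectURL(url); // release the object URL once the download has started
}

// Copy the transcript to the clipboard. The Clipboard API requires
// a secure context (HTTPS or localhost).
async function copyText(text) {
  await navigator.clipboard.writeText(text);
}
```

A "Download" button would call `downloadText(output)` and a "Copy" button would call `copyText(output)`, where `output` is the transcribed or translated text.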
To add a new translation language, extend the LANGUAGES object in src/utils/presets.js:
export const LANGUAGES = {
  // ...existing entries
  "Spanish": "spa_Latn",
  // Add more as needed
};
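As an illustration of how such a map might be consumed, the sketch below turns it into a sorted list of options for a language dropdown. The `toSelectOptions` helper is hypothetical, and the French entry is an assumed example rather than something taken from the project.

```javascript
// Display name -> language code, in the same shape as the example above.
const LANGUAGES = {
  "Spanish": "spa_Latn", // from the example above
  "French": "fra_Latn",  // illustrative entry (assumption)
};

// Convert the map into { label, value } pairs sorted by display name,
// ready to render as <option> elements.
function toSelectOptions(languages) {
  return Object.entries(languages)
    .map(([label, value]) => ({ label, value }))
    .sort((a, b) => a.label.localeCompare(b.label));
}
```

Because the dropdown is derived from the object, adding a language only requires adding one key/value pair.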
The worker is initialized in App.jsx:
worker.current = new Worker(new URL('./utils/whisper.worker.js', import.meta.url), { type: 'module' });
worker.current.postMessage({
type: MessageTypes.INFERENCE_REQUEST,
audio,
model_name: 'openai/whisper-tiny.en'
});
The worker receives audio, runs the model, and sends back results via postMessage.
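The worker side of this exchange might look roughly like the sketch below. It is written under several assumptions: the result message types and the `createInferenceHandler` helper are hypothetical, and the model loader is passed in as a parameter so the logic can be exercised without downloading a model. The commented wiring assumes the `pipeline` function from @xenova/transformers, which for speech recognition resolves to an object with a `text` field.

```javascript
// Hypothetical message types; the real ones are defined in presets.js.
const MessageTypes = {
  INFERENCE_REQUEST: "INFERENCE_REQUEST",
  RESULT: "RESULT",
  INFERENCE_DONE: "INFERENCE_DONE",
};

// Build a message handler. `loadPipeline(modelName)` must resolve to an
// async function audio -> { text }, and `post` sends a message to the app.
function createInferenceHandler(loadPipeline, post) {
  let cached = null; // { name, transcriber }, reused across requests

  return async function handle(message) {
    if (message.type !== MessageTypes.INFERENCE_REQUEST) return;
    const { audio, model_name } = message;

    // Load the model only on first use or when a different model is requested.
    if (!cached || cached.name !== model_name) {
      cached = { name: model_name, transcriber: await loadPipeline(model_name) };
    }

    const output = await cached.transcriber(audio);
    post({ type: MessageTypes.RESULT, results: output.text });
    post({ type: MessageTypes.INFERENCE_DONE });
  };
}

// Possible wiring inside the worker file (assumption, not the app's code):
// import { pipeline } from "@xenova/transformers";
// const handle = createInferenceHandler(
//   (name) => pipeline("automatic-speech-recognition", name),
//   (msg) => self.postMessage(msg)
// );
// self.addEventListener("message", (event) => handle(event.data));
```

Caching the loaded model avoids repeating the expensive load step when the user transcribes several recordings in one session.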
- Transcription
- Translation
- Machine Learning
- React
- Vite
- TailwindCSS
- Web Worker
- OpenAI Whisper
- Speech Recognition
- @xenova/transformers
- In-browser ML
- Audio Processing
FreeScribe streamlines advanced speech-to-text and language translation directly in your browser, for free. Powered by modern frontend tools and the latest open-source ML models, it's a practical, privacy-respecting alternative to expensive SaaS solutions.
MIT License. © 2030 arnobt78