A TypeScript tool for batch transcribing audio/video files using the ElevenLabs Speech-to-Text API.
- Install dependencies: `npm install`
- Create a `.env` file with your ElevenLabs API key: run `cp .env.example .env`, then edit `.env` and add your API key
- Add audio/video files to the `to transcribe` folder
- Development mode (with TypeScript compilation): `npm run dev`
- Production mode (compile first, then run): `npm run build`, then `npm start`
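
As a rough illustration of the `.env` step, the sketch below shows one way a script could fail fast when the API key is missing. The variable name `ELEVENLABS_API_KEY` and the helper `requireApiKey` are assumptions made for this example; check `.env.example` for the name the tool actually reads.

```typescript
// Hypothetical helper: the variable name is an assumption, not taken from this project.
const API_KEY_VAR = "ELEVENLABS_API_KEY";

type Env = Record<string, string | undefined>;

// Returns the trimmed API key, or throws a descriptive error if it is missing or blank.
function requireApiKey(env: Env): string {
  const key = env[API_KEY_VAR]?.trim();
  if (!key) {
    throw new Error(`${API_KEY_VAR} is not set. Copy .env.example to .env and add your key.`);
  }
  return key;
}
```

In a Node script this would typically be called as `requireApiKey(process.env)` once the `.env` file has been loaded.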
- Supports all major audio/video formats (mp3, wav, m4a, flac, ogg, mp4, mov, avi, mkv)
- Outputs a JSON transcription file next to each source file, with the same base name
- Includes speaker diarization and audio event tagging
- Batch processes all files in the `to transcribe` folder
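
The file-selection and output-naming behavior described in this list can be sketched as follows. This is a minimal illustration, not the tool's actual code: the helper names `isSupported` and `outputPathFor` are hypothetical, and the extension list is copied from the formats named above.

```typescript
import * as path from "node:path";

// Extensions listed in this README; the real tool may accept more.
const SUPPORTED_EXTENSIONS = new Set([
  ".mp3", ".wav", ".m4a", ".flac", ".ogg", ".mp4", ".mov", ".avi", ".mkv",
]);

// True when the file's extension (case-insensitive) is in the supported set.
function isSupported(filePath: string): boolean {
  return SUPPORTED_EXTENSIONS.has(path.extname(filePath).toLowerCase());
}

// Maps a source file to a JSON file in the same folder with the same base name.
function outputPathFor(filePath: string): string {
  const { dir, name } = path.parse(filePath);
  return path.join(dir, `${name}.json`);
}
```

For example, a file at `to transcribe/interview.mp3` would map to `interview.json` in the same folder, while a file such as `notes.txt` would be skipped.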
- `npm run build` - Compile TypeScript to JavaScript
- `npm run dev` - Run in development mode with ts-node
- `npm start` - Run compiled JavaScript
- `npm run lint` - Check code style
- `npm run lint:fix` - Fix linting issues
- `npm run format` - Format code with Prettier
- `npm run check` - Run TypeScript check and linting
Each audio file generates a JSON file containing:
- `text`: Full transcription text
- `language_code`: Detected language
- `language_probability`: Confidence in language detection
- `words`: Array of word objects with timestamps and speaker information
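
To show how the output might be consumed, here is a hedged sketch that types the fields listed above and merges consecutive words from the same speaker into turns. The top-level field names come from this README; the per-word fields (`text`, `start`, `end`, `speaker_id`) and the helper `speakerTurns` are assumptions for illustration, so inspect a generated JSON file for the exact shape.

```typescript
// Per-word fields are assumed; only "timestamps and speaker information" is documented here.
interface TranscriptWord {
  text: string;
  start?: number; // assumed: start time in seconds
  end?: number; // assumed: end time in seconds
  speaker_id?: string; // assumed: diarization label
}

interface Transcript {
  text: string;
  language_code: string;
  language_probability: number;
  words: TranscriptWord[];
}

interface SpeakerTurn {
  speaker: string;
  text: string;
}

// Merges consecutive words spoken by the same speaker into a single turn.
function speakerTurns(words: TranscriptWord[]): SpeakerTurn[] {
  const turns: SpeakerTurn[] = [];
  for (const word of words) {
    const speaker = word.speaker_id ?? "unknown";
    const last = turns[turns.length - 1];
    if (last && last.speaker === speaker) {
      last.text += ` ${word.text}`;
    } else {
      turns.push({ speaker, text: word.text });
    }
  }
  return turns;
}
```

A caller could parse an output file with `JSON.parse`, cast the result to `Transcript`, and pass its `words` array to `speakerTurns` to get a readable dialogue view.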