gergomiklos/speech2type

Speech2Type

Voice typing from your terminal.

A simple CLI tool that gives you fast voice typing in every Mac app. Instant speech-to-text from your terminal to your cursor with one hotkey - works with Claude Code, Cursor, and any macOS application.

It just works:

npm install -g speech2type && s2t start


✨ Features

  • ⚡️ Real-time transcription: Your words appear instantly as you speak
  • 💻 Works everywhere: Claude Code, Cursor, Slack, Chrome, Lovable... any Mac app with text input
  • 🎯 Inline-typing: Text is inserted directly at the cursor’s position (no clipboard involved)
  • 💸 Completely free: Open source with free Deepgram API tier
  • 🌍 40+ languages: English, Spanish, French, German, Japanese, Chinese, and many more

Perfect for:

  • Developers: Vibe coding in any environment (Claude Code CLI, Cursor IDE, etc.)
  • Creators: Dictating in any text editor
  • Productivity: Quick voice input in Slack, email, and more

🚀 Quick Start

Installation

npm install -g speech2type

First Run

# Start the application
# s2t or s2t start
s2t start

# Follow the setup wizard to:
# 1. Get your FREE Deepgram API key from https://deepgram.com
# 2. Configure your language preference (default: English)
# 3. Set up your hotkey (default: ⌘;)
# 4. Grant microphone and accessibility permissions

Usage

  1. Start the app: Run s2t start in your terminal
  2. Position your cursor: Click where you want text to appear
  3. Press your hotkey: Default is ⌘; to start listening
  4. Speak: Your words will appear in real-time at your cursor
  5. Press hotkey again: Stop listening

That's it! You now have voice typing in every Mac app.

📋 Requirements

  • macOS 13+ with Apple Silicon
  • Node.js 18+
  • Deepgram API key (free tier available, no credit card required)
  • Xcode Command Line Tools (only for development)
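
The requirements above can be checked from the terminal before installing. The following is a minimal sketch, not part of Speech2Type itself: the helper names `major_version` and `check_min` are made up for illustration, and `uname -m` only reports `arm64` when the terminal runs natively (not under Rosetta).

```shell
# Extract the major version from strings such as "v18.19.0" or "13.6.1".
major_version() {
  printf '%s\n' "$1" | sed -e 's/^v//' -e 's/\..*$//'
}

# Print "ok" if the major version in $1 is at least $2, otherwise "too old".
check_min() {
  if [ "$(major_version "$1")" -ge "$2" ]; then echo "ok"; else echo "too old"; fi
}

# macOS 13+ on Apple Silicon (sw_vers only exists on macOS).
if command -v sw_vers >/dev/null 2>&1; then
  echo "macOS $(sw_vers -productVersion): $(check_min "$(sw_vers -productVersion)" 13)"
  echo "CPU architecture: $(uname -m) (Apple Silicon reports arm64)"
fi

# Node.js 18+
if command -v node >/dev/null 2>&1; then
  echo "Node.js $(node --version): $(check_min "$(node --version)" 18)"
else
  echo "Node.js not found"
fi
```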

System Permissions

Speech2Type requires three permissions that will be requested on first run:

  1. Microphone access: To capture your voice
  2. Accessibility access: To listen for global hotkeys
  3. Automation access (System Events): To inject text into other applications

Grant these permissions in System Settings → Privacy & Security to the terminal app from which you run Speech2Type.
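
If you prefer to jump straight to the relevant panes, macOS accepts deep links into System Settings. The sketch below is an assumption, not something Speech2Type ships: the `x-apple.systempreferences` anchors (`Privacy_Microphone`, `Privacy_Accessibility`, `Privacy_Automation`) are commonly used but may change between macOS releases, and the function names are illustrative.

```shell
# Build the System Settings deep link for a Privacy & Security pane.
# The anchor names are assumptions and may differ between macOS versions.
privacy_pane_url() {
  echo "x-apple.systempreferences:com.apple.preference.security?Privacy_$1"
}

# Open one pane in System Settings (macOS only, uses the built-in `open` command).
open_privacy_pane() {
  open "$(privacy_pane_url "$1")"
}

# List the deep links for the three permissions Speech2Type asks for.
for pane in Microphone Accessibility Automation; do
  echo "$pane: $(privacy_pane_url "$pane")"
done
```

For example, `open_privacy_pane Microphone` should bring up the Microphone list, where you can enable your terminal app.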

⚙️ Configuration

View Current Settings

s2t config

Configure Individual Settings

# Change hotkey combination (default: ⌘;)
s2t config --hotkey

# Select language for speech recognition (default: English)
s2t config --language

# Update your Deepgram API key (or set the DEEPGRAM_API_KEY environment variable)
s2t config --deepgram-api-key
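
As an alternative to the config command, the key can be supplied through the `DEEPGRAM_API_KEY` environment variable. A minimal sketch follows; the key value is a placeholder, and how Speech2Type prioritizes the variable over a stored key is not specified here.

```shell
# Placeholder value: replace it with the key copied from the Deepgram console.
export DEEPGRAM_API_KEY="your-deepgram-api-key"

# Confirm the variable is visible to child processes such as s2t.
if sh -c '[ -n "$DEEPGRAM_API_KEY" ]'; then
  echo "DEEPGRAM_API_KEY is visible to child processes"
fi

# Then start Speech2Type from the same shell session:
# s2t start
```

To keep the variable across sessions, add the `export` line to your shell profile (for example `~/.zshrc`, the default shell profile on recent macOS versions).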

Supported Languages

Speech2Type supports 40+ languages including:

  • Bulgarian: bg
  • Catalan: ca
  • Chinese (Mandarin, Simplified): zh, zh-CN, zh-Hans
  • Chinese (Mandarin, Traditional): zh-TW, zh-Hant
  • Chinese (Cantonese, Traditional): zh-HK
  • Czech: cs
  • Danish: da, da-DK
  • Dutch: nl
  • English: en, en-US, en-AU, en-GB, en-NZ, en-IN
  • Estonian: et
  • Finnish: fi
  • Flemish: nl-BE
  • French: fr, fr-CA
  • German: de
  • German (Switzerland): de-CH
  • Greek: el
  • Hindi: hi
  • Hungarian: hu
  • Indonesian: id
  • Italian: it
  • Japanese: ja
  • Korean: ko, ko-KR
  • Latvian: lv
  • Lithuanian: lt
  • Malay: ms
  • Norwegian: no
  • Polish: pl
  • Portuguese: pt, pt-BR, pt-PT
  • Romanian: ro
  • Russian: ru
  • Slovak: sk
  • Spanish: es, es-419
  • Swedish: sv, sv-SE
  • Thai: th, th-TH
  • Turkish: tr
  • Ukrainian: uk
  • Vietnamese: vi

See supported languages for the up-to-date full list.

🛠️ Development

Local Development Setup

# Clone the repository
git clone https://github.com/gergomiklos/speech2type.git
cd speech2type

# Install dependencies
npm install

# Build Swift components
npm run build

# Run in development mode
npm run dev

Project Structure

speech2type/
├── bin/                   # Executable scripts
│   └── speech2type        # Main CLI entry point
├── src/                   # JavaScript source code
│   ├── index.js           # Main application logic
│   ├── config.js          # Configuration management
│   └── services/          # Core services
│       ├── hotkey.js      # Global hotkey management
│       ├── transcriber.js # Deepgram integration
│       ├── typer.js       # Text injection via osascript
│       └── permission.js  # macOS permission handling
├── data/                  # Language data
│   └── languages.json     # Supported languages
└── swift/                 # Native Swift components
    ├── hotkey-manager.swift     # Global hotkey capture
    ├── mic-recorder.swift       # Audio recording
    └── permission-checker.swift # System permissions
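
To get a feel for what "text injection via osascript" means, here is a rough shell sketch of the general technique: asking System Events to send keystrokes to the focused app. It is an illustration under assumptions, not the actual contents of typer.js; the function names are made up, and multi-line text or unusual characters may need extra handling.

```shell
# Escape backslashes and double quotes so the text can sit inside
# an AppleScript string literal.
applescript_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Type the given text at the current cursor position via System Events.
# macOS only; the calling terminal needs Automation and Accessibility access.
type_text() {
  osascript -e "tell application \"System Events\" to keystroke \"$(applescript_escape "$1")\""
}
```

On a Mac, `sleep 3; type_text 'Hello from the terminal'` gives you three seconds to click into another app's text field before the text is typed.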

🔧 Troubleshooting

Common Issues

1. "Permission denied" errors

  • Grant microphone access to your terminal app in System Settings → Privacy & Security → Microphone

  • Grant accessibility access to your terminal app in System Settings → Privacy & Security → Accessibility

    Important: the terminal app must be the one that is running Speech2Type. Common terminal apps: Terminal (built-in macOS terminal), Cursor (integrated in-app terminals), Warp, VS Code...

2. "Command not found: s2t"

  • Reinstall globally: npm install -g speech2type
  • Check your PATH includes npm global binaries
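
One way to run the PATH check is sketched below. It assumes the usual npm layout on macOS, where global executables are linked into the `bin` directory under the npm prefix; `path_contains` is an illustrative helper name.

```shell
# Succeeds if directory $1 is one of the colon-separated entries in $2.
path_contains() {
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Compare npm's global bin directory against the current PATH.
if command -v npm >/dev/null 2>&1; then
  npm_bin="$(npm config get prefix)/bin"
  if path_contains "$npm_bin" "$PATH"; then
    echo "$npm_bin is on PATH"
  else
    echo "Add this line to ~/.zshrc: export PATH=\"$npm_bin:\$PATH\""
  fi
fi
```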

3. "Hotkey not working"

  • Check for conflicts with other applications
  • Reconfigure hotkey with: s2t config --hotkey
  • Ensure permissions are granted

4. "Text not appearing"

  • Ensure your Deepgram API key is correct
  • Check that your microphone is working
  • Check that you have an internet connection
  • Ensure permissions are granted
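
The API key and internet connection can be checked together with a single request. This sketch assumes the key is exported as `DEEPGRAM_API_KEY` and that Deepgram's `GET /v1/projects` endpoint accepts it with an `Authorization: Token` header; the function names are illustrative, and a key with restricted scopes might be rejected here even though it works for transcription.

```shell
# Translate an HTTP status code into a short diagnosis.
describe_status() {
  case "$1" in
    200) echo "API key accepted" ;;
    401|403) echo "API key rejected" ;;
    000) echo "no response (check your internet connection)" ;;
    *) echo "unexpected status $1" ;;
  esac
}

# Send one authenticated request to Deepgram and report the outcome.
# curl prints 000 as the status when no HTTP response was received.
check_deepgram_key() {
  status="$(curl -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: Token $DEEPGRAM_API_KEY" \
    https://api.deepgram.com/v1/projects)"
  describe_status "$status"
}
```

Run `check_deepgram_key` in the same shell where the variable is set.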

5. "Text not appearing in secure fields"

  • This is by design - secure input fields (passwords) don't accept simulated typing
  • Speech2Type works in regular text fields only

6. "How to get a Deepgram API key?"

  • Go to https://deepgram.com/

  • Sign up for a free account

  • Go to https://console.deepgram.com/

  • Click on "API Keys"

  • Click on "Create API Key"

  • Copy the API key

  • Run s2t config --deepgram-api-key and paste the API key

    The free tier includes more than 50 hours of credits. No credit card required.

FAQ: https://speech2type.com/faq

Debug Mode

Run with debug output for troubleshooting:

DEBUG=1 s2t start
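
When reporting a bug, it can help to save the debug output to a file. A small sketch, where the function name and the default log file name are hypothetical; review the log before sharing it in case it contains your API key or dictated text.

```shell
# Run Speech2Type with debug output and keep a copy of everything it prints.
# Both stdout and stderr are captured; the default file name is an assumption.
capture_debug_log() {
  log_file="${1:-s2t-debug.log}"
  DEBUG=1 s2t start 2>&1 | tee "$log_file"
}
```

Call it as `capture_debug_log` or pass a path, for example `capture_debug_log ~/Desktop/s2t.log`.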

Getting Help

🤝 Contributing

We welcome contributions! Here's how to get started:

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Make your changes: Follow the existing code style
  4. Test thoroughly: Ensure your changes work on macOS
  5. Submit a pull request: Describe your changes clearly

Development Guidelines

  • Code style: Follow existing JavaScript and Swift patterns
  • Testing: Test on multiple macOS versions when possible
  • Documentation: Update README for new features
  • Permissions: Be mindful of security and privacy implications

Why not Whisper?

Deepgram provides real-time streaming, higher accuracy, and wider language support at a reasonably low cost, with developer-friendly APIs. This allows Speech2Type to remain native and lightweight. (Speech2Type is not affiliated with Deepgram.)

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🌟 Support the Project

If Speech2Type helps your workflow, consider:

  • Starring the repository
  • Reporting bugs and suggesting features
  • Improving documentation
  • Contributing code
  • Sharing with others who might benefit

Speech2Type - If you can type there, you can speak there.

Official website: https://speech2type.com

Built with ❤️
