
What it is
A push-to-talk voice typing and routing app for macOS. Hold a hotkey, speak, and release: the text appears wherever you want. The killer feature is per-hotkey channel routing: different key combos send your voice to different destinations, each with its own transform pipeline.
Why it's interesting
This is basically a voice-first interface for multi-agent workflows. You can:
- Hold Cmd → paste text at cursor (basic dictation)
- Hold Cmd+1 → send structured task to Claude Code
- Hold Cmd+2 → post a message to Slack
- Hold Cmd+3 → create a Linear issue
- Hold Cmd+4 → generate a git commit message
Each channel has its own AI transform pipeline that cleans up your rambling speech into formatted output before routing it.
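If it helps to picture the routing, here's a minimal Python sketch of the idea: a table that maps each hotkey to a transform plus a destination. None of this is SpeechButton's actual code or API. `Channel`, `tidy`, `as_commit_message`, and `route` are made-up names, and the string-munging functions are crude stand-ins for the AI transforms.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Channel:
    name: str
    transform: Callable[[str], str]  # cleans up the raw transcript
    deliver: Callable[[str], str]    # hands the text to its destination


def tidy(text: str) -> str:
    """Stand-in for an AI transform: collapse whitespace, capitalize, end with punctuation."""
    cleaned = " ".join(text.split())
    if not cleaned:
        return cleaned
    cleaned = cleaned[0].upper() + cleaned[1:]
    return cleaned if cleaned.endswith((".", "!", "?")) else cleaned + "."


def as_commit_message(text: str) -> str:
    """Stand-in transform: a one-line commit subject, capped at 72 characters."""
    return tidy(text).rstrip(".")[:72]


# Hypothetical routing table: hotkey -> channel. The destinations just tag
# the text here; a real app would paste, call an API, and so on.
CHANNELS = {
    "cmd": Channel("dictation", tidy, lambda t: f"[paste] {t}"),
    "cmd+2": Channel("slack", tidy, lambda t: f"[slack] {t}"),
    "cmd+4": Channel("commit", as_commit_message, lambda t: f"[git] {t}"),
}


def route(hotkey: str, transcript: str) -> str:
    """Look up the channel for a hotkey, transform the transcript, deliver it."""
    channel = CHANNELS[hotkey.lower()]
    return channel.deliver(channel.transform(transcript))


print(route("Cmd+4", "  fix   race in audio buffer "))
# prints: [git] Fix race in audio buffer
```

The point of the shape is that the hotkey picks both the cleanup step and the destination, so the same rambling sentence can end up as a Slack message or a commit subject depending on which keys you held.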
Key specs
- 20ms capture latency (vs ~500ms for macOS Dictation) - never clips the first word
- 100% local - speech recognition runs on the Apple Neural Engine via Core ML
- AI transforms run locally too (or optionally in the cloud)
- Voice Activity Detection (VAD) - hands-free mode with auto-enter after 3s silence
- Voice fingerprint (v2.12) - transcribes only your voice, filters out other speakers and background audio
- Config as code - plain TOML config that AI agents can modify programmatically
- iPhone as wireless mic via Continuity
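For the hands-free mode, the "auto-enter after 3s silence" rule can be sketched roughly like this in Python. This is only an illustration under my own assumptions: the 20 ms frame size, the energy threshold, and the function name `auto_enter_frame` are all invented, and a plain energy threshold is a stand-in for whatever VAD the app really uses.

```python
FRAME_MS = 20            # assumed frame length, not taken from the app
SILENCE_MS = 3000        # auto-enter after 3 s of silence (from the spec list)
ENERGY_THRESHOLD = 0.01  # hypothetical cutoff between speech and silence


def auto_enter_frame(energies, threshold=ENERGY_THRESHOLD,
                     frame_ms=FRAME_MS, silence_ms=SILENCE_MS):
    """Return the index of the frame where auto-enter would fire, or None.

    `energies` is a sequence of per-frame energy values. Silence only
    counts once some speech has been heard, so leading silence never fires.
    """
    frames_needed = silence_ms // frame_ms
    heard_speech = False
    silent_run = 0
    for i, energy in enumerate(energies):
        if energy >= threshold:
            heard_speech = True
            silent_run = 0       # any speech resets the silence timer
        elif heard_speech:
            silent_run += 1
            if silent_run >= frames_needed:
                return i
    return None


# 1 s of leading silence, 2 s of speech, then 3 s of silence.
frames = [0.0] * 50 + [0.2] * 100 + [0.0] * 150
print(auto_enter_frame(frames))  # prints: 299
```

With 20 ms frames, 3 s of silence is 150 consecutive quiet frames, so the trigger lands on the 150th quiet frame after the last speech frame.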
Requirements
- macOS 14 Sonoma+
- Apple Silicon (M1+)
Install
brew install --cask speechbutton/tap/speechbutton
Pricing
| Plan | Price | Limits |
|---|---|---|
| Free | $0 | 5 min/day transcription |
| Pro Yearly | $69.99/year (about $5.83/month) | Unlimited |
| Pro Monthly | $7.99/month | Unlimited |
My take
This is what Superwhisper should have been. The channel routing concept is a game-changer for anyone running multiple AI agents. The TOML config means your agents can literally configure their own voice input pipeline. It's especially good for dictating tasks to Claude Code without switching context.
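To make the "agents can configure their own pipeline" point concrete, here's a rough Python sketch of an agent appending a channel to a TOML file. Big caveat: I'm guessing at the schema. The `[channels.<name>]` tables and the `hotkey`, `destination`, and `prompt` keys are invented for illustration, as are the function names, so check the app's own docs for the real format.

```python
import json
import re
from pathlib import Path


def render_channel(name, hotkey, destination, prompt):
    """Render one channel as a TOML table (hypothetical schema)."""
    if not re.fullmatch(r"[A-Za-z0-9_-]+", name):
        raise ValueError("channel name must be a bare TOML key")

    def quote(value):
        # For ordinary text, a JSON string literal is also a valid
        # TOML basic string (same quoting and backslash escapes).
        return json.dumps(value, ensure_ascii=False)

    lines = [
        f"[channels.{name}]",
        f"hotkey = {quote(hotkey)}",
        f"destination = {quote(destination)}",
        f"prompt = {quote(prompt)}",
    ]
    return "\n".join(lines) + "\n"


def add_channel(config_path, name, hotkey, destination, prompt):
    """Append a channel table to the config; return False if it already exists."""
    path = Path(config_path)
    existing = path.read_text(encoding="utf-8") if path.exists() else ""
    if f"[channels.{name}]" in existing:
        return False
    existing = existing.rstrip("\n")
    prefix = existing + "\n\n" if existing else ""
    block = render_channel(name, hotkey, destination, prompt)
    path.write_text(prefix + block, encoding="utf-8")
    return True


print(render_channel("standup", "cmd+5", "slack", "Summarize as bullets"))
```

Appending plain text keeps whatever comments and formatting are already in the file, which matters when an agent is editing a config a human also maintains.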