Voice to text,
instantly.

Press a shortcut, speak, and your words appear as text. Choose from multiple transcription engines, add optional AI post-processing, and keep all transcription local on your Mac. One-time purchase, no subscription.

Built for people who think faster than they type.

Features

Everything you need,
nothing you don't.

Multiple engines

Switch between Apple Speech, OpenAI Whisper, and NVIDIA Parakeet ONNX in one click.

Each engine has different strengths: Apple Speech for zero-setup, Whisper for multilingual accuracy, Parakeet for blazing-fast English.

AI-powered modes

Go beyond raw transcription. Each mode runs your text through an LLM with a custom prompt.

Connect any OpenAI-compatible provider or Anthropic, or use Claude CLI with your existing subscription.

Per-mode shortcuts

Assign a dedicated global shortcut to each mode. One keystroke to switch and start recording.

Configure shortcuts in Settings. Supports any key combo with ⌘, ⌃, ⌥, ⇧ modifiers.

Auto-paste

Transcribed text is placed on your clipboard and pasted into the active app automatically.

Works with any app that accepts ⌘V. Toggle auto-paste on or off in settings.

Privacy first

Audio never leaves your machine. All transcription happens entirely on-device.

Transcription uses no cloud APIs, and there is no data collection and no account. Your voice data stays on your Mac.

Zero latency

The audio engine stays pre-warmed in the background. Recording starts the instant you press the key.

No spin-up delay, no loading screens. Press the shortcut and speak immediately.

Audio feedback

Distinct sounds for start, stop, and transcription complete so you always know what's happening.

Pick from 14 macOS system sounds per action, or turn them off. You hear exactly when recording starts and when your text is ready.

AI providers

Connect OpenAI, Anthropic, Claude CLI, or any OpenAI-compatible API to power your modes.

Use your existing subscriptions. Each mode can use a different provider and model.

Fully configurable

Appearance theme, overlay toggle, auto-paste, Python venv path — everything is tuneable.

A native macOS settings window with live preview. No config files to edit.

Voice commands

Say 'mode cleanup' or 'mode translate' at the start of your recording to switch modes on the fly.

No buttons, no menus. Just speak the mode name and Spkn switches automatically before processing your text.

Searchable history

Every transcription is saved. Search through your history and copy any past transcript with one click.

Re-process old transcriptions with a different mode. Clean up something you dictated raw, or translate it later.

Notifications

Get a native macOS notification when your transcription is ready, even if you've switched to another app.

Never miss a completed transcription. The notification shows the beginning of your text.

Usage stats

Track your total transcriptions, recording time, words dictated, and average length.

See how much time voice-to-text saves you compared to typing.

App-aware modes

Assign a mode to each app. Spkn auto-switches when you dictate in Mail, Slack, Xcode, and more.

Create rules like: Mail → Formal email, Slack → Casual, Xcode → Code comment. Zero friction, zero menus.
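For the curious, the rule lookup can be pictured as a small table keyed by app name. This is an illustrative Python sketch, not Spkn's actual Swift code: the function name `mode_for_app`, the use of the app name as the key, and the fallback to Dictation are all assumptions.

```python
# Rules taken from the example above. Keying on the app name is an assumption;
# a real implementation might key on something else, such as a bundle identifier.
APP_RULES = {
    "Mail": "Formal email",
    "Slack": "Casual",
    "Xcode": "Code comment",
}


def mode_for_app(app_name: str, default_mode: str = "Dictation") -> str:
    """Return the mode assigned to the focused app, or the default if no rule exists."""
    return APP_RULES.get(app_name, default_mode)


print(mode_for_app("Slack"))   # mode chosen by the Slack rule
print(mode_for_app("Safari"))  # no rule, so the default mode is used
```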

Append mode

Dictate in multiple takes. New transcriptions are appended to the clipboard instead of replacing its contents.

Perfect for long documents, meeting notes, or filling forms. Build up text over multiple recordings.
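The append behavior can be sketched in a few lines. This Python snippet is only an illustration under stated assumptions: the function name `next_clipboard` and the single-space separator are hypothetical, and the real app may join takes differently.

```python
def next_clipboard(current: str, new_text: str, append: bool, separator: str = " ") -> str:
    """Return the clipboard contents after a new transcription arrives.

    With append mode on, the new take is added after the existing text.
    With append mode off (or an empty clipboard), the new take replaces it.
    The separator is an assumption made for this sketch.
    """
    if append and current:
        return current.rstrip() + separator + new_text.lstrip()
    return new_text


clipboard = ""
for take in ["Meeting notes:", "Ship on Friday.", "Ana owns QA."]:
    clipboard = next_clipboard(clipboard, take, append=True)
print(clipboard)
```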

Word & char count

See live word and character count in the overlay after each transcription.

Handy for tweets, bios, commit messages, or any text with a character limit.

Use cases

Works wherever you type

From quick messages to long-form writing.

💬

Slack messages

Dictate replies without typing

📧

Email drafts

Compose emails at the speed of thought

💻

Code comments

Document your code hands-free

📝

Meeting notes

Capture ideas in real-time

How it works

Three seconds to text

From any app, at any time.

Press your shortcut

Hit the global hotkey from any app — a text editor, browser, Slack, terminal. The menu bar lights up and a floating overlay appears. Or say a mode name to switch on the fly.

Recording

Speak naturally

Talk at your normal pace. The overlay shows the live audio level. Take as long as you need — there's no time limit.

Transcribed & pasted

Hey, can you review the pull request I just opened? I've refactored the auth middleware to fix the session bug.

Text appears

Press the shortcut again to stop. Your speech is transcribed, run through the active mode, and pasted into the focused app. Press Escape to cancel.

Modes

More than transcription

Every mode can be connected to an AI provider for automatic post-processing. Or create your own with a custom system prompt. Switch modes with shortcuts, from the menu, or just by saying 'mode [name]'.

🎤️

Dictation

D

Raw transcription with no processing. Your words appear exactly as spoken — perfect for drafting emails, notes, or messages.

Direct speech-to-text · No AI processing needed · Fastest mode

Clean up

C

Automatically fixes grammar, punctuation, and capitalization. Ideal for when you want polished text without editing.

Grammar correction · Proper punctuation · Keeps your language

🌐

Translate

T

Speak in any language and get translated text. Uses your configured LLM to produce natural, fluent translations.

Any source language · Natural translations · Powered by your LLM

📝

Summary

S

Dictate a long thought and get back a concise summary. Great for meeting notes, brainstorming, or capturing ideas quickly.

Concise output · Keeps key points · Same language

Create custom modes

Write your own system prompt, pick an AI provider and model, assign a shortcut, and choose an emoji. Your mode works exactly like the built-in ones.

Engines

Pick your engine

Three transcription backends, each with different trade-offs. Switch between them instantly in settings.

Apple Speech

Built-in

The macOS native speech recognition engine. Already on your Mac, no downloads required.

Zero setup · Works offline · Low memory usage · Multi-language

Best for: Quick notes & simple dictation

Whisper

OpenAI

OpenAI's state-of-the-art model via faster-whisper. Choose from Tiny to Large v3 depending on your accuracy/speed needs.

Best accuracy · 10+ languages · Multiple model sizes · Runs via Python

Best for: Multilingual & high-accuracy work

Parakeet ONNX

NVIDIA

NVIDIA's 1.1B parameter Parakeet model running natively through ONNX Runtime. No Python, no external dependencies.

Native Swift · No Python needed · 1.1B parameters · Fast inference

Best for: Fast English transcription

Comparison

How Spkn stacks up

See how Spkn compares to other macOS transcription tools.

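Tools compared: Spkn · macOS Dictation · Whisper Transcription · Superwhisper

Features compared: Runs locally · Multiple engines · AI modes · Custom shortcuts · Auto-paste · Voice commands

Price: Spkn $5 once · macOS Dictation Free · Whisper Transcription $3/mo · Superwhisper $10/mo
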
Customization

Make it yours

Global shortcuts

Configure record, stop, and cancel hotkeys. Assign per-mode shortcuts with any modifier combo.

Sound feedback

Pick different system sounds for start, stop, and transcription complete. Or turn them off entirely.

AI providers

Connect OpenAI, Anthropic, Claude CLI, or any OpenAI-compatible API. Use different models per mode.

Fine-tuning

Auto-paste toggle, overlay display, appearance theme, and Python venv path for Whisper.

FAQ

Frequently asked questions

Does my audio leave my Mac?

No. All transcription engines run entirely on your Mac, so your audio never leaves your machine. If you use an AI mode with a cloud provider (like OpenAI or Anthropic), only the transcribed text is sent for post-processing, never the audio itself.

Which engine should I choose?

Apple Speech is the easiest: zero setup, and it works offline. Whisper offers the best accuracy and supports 10+ languages. Parakeet ONNX is the fastest for English and runs natively without Python.

Do I need an API key?

Not for basic transcription. API keys are only needed if you want AI post-processing modes (Clean up, Translate, etc.). You can connect OpenAI, Anthropic, or any OpenAI-compatible local server like Ollama.

Does Spkn work in any app?

Yes. Spkn works system-wide with a global keyboard shortcut. After transcription, the text is pasted into whatever app is focused: text editors, browsers, Slack, email, terminal, anything that accepts ⌘V.

What are the system requirements?

Spkn requires macOS 14 Sonoma or later. It runs natively on both Apple Silicon and Intel Macs.

Can I create my own modes?

Absolutely. Create custom modes with your own system prompt, choose an AI provider and model, assign a dedicated keyboard shortcut, and pick an emoji icon. Your custom modes work exactly like the built-in ones.

How do voice commands work?

Instead of manually switching modes, you can say 'mode cleanup', 'mode translate', or any mode name at the beginning of your recording. Spkn detects the command, switches to that mode, and processes the rest of your speech accordingly.
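As a rough picture of that detection step, here is a minimal Python sketch. It is not Spkn's actual implementation (the app is written in Swift): the mode names in `KNOWN_MODES`, the function name `split_voice_command`, and the exact matching rule are assumptions made for illustration.

```python
import re

# Hypothetical mode identifiers; custom modes would be added to this set.
KNOWN_MODES = {"dictation", "cleanup", "translate", "summary"}

# "mode", then one word, then any punctuation or spaces trailing the command.
COMMAND = re.compile(r"\s*mode\s+([a-z]+)\b[\s,.:;!?]*", flags=re.IGNORECASE)


def split_voice_command(transcript: str, current_mode: str) -> tuple[str, str]:
    """Return (mode, text) for a transcript that may begin with 'mode <name>'.

    If the transcript does not start with a known mode command, the current
    mode is kept and the full transcript is returned unchanged (trimmed).
    """
    match = COMMAND.match(transcript)
    if match and match.group(1).lower() in KNOWN_MODES:
        return match.group(1).lower(), transcript[match.end():].strip()
    return current_mode, transcript.strip()


print(split_voice_command("Mode translate, good morning everyone", "dictation"))
print(split_voice_command("The mode of transport is the bus", "dictation"))
```

Only a command at the very start of the transcript is recognized in this sketch, so the word "mode" appearing later in a sentence is left untouched.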

Is Spkn open source?

Yes. Spkn is built with Swift and is fully open source. You can inspect the code, contribute, or modify it to fit your needs.

How much does Spkn cost?

Spkn is a one-time purchase of $5. No subscription, no hidden fees. You get all features and future updates included.

Your data stays on your Mac

Privacy is not a feature — it's the architecture.

No cloud processing

All transcription engines run locally on your Mac. Audio never leaves your device.

No accounts

No sign-up, no login, no email required. Download, pay, and use.

No telemetry

Zero analytics, zero tracking, zero data collection. What you say stays with you.

Ready to try?

One-time purchase. No subscription. Runs entirely on your Mac.

$5 one-time

30-day money-back guarantee

Buy Spkn — $5

v1.0.0 · Requires macOS 14 Sonoma or later · Universal binary (Apple Silicon + Intel)