Local voice-to-text for Windows. Hold a key, talk, let go — your words land wherever you're typing.
Everything runs on your machine. No cloud, no account, no sending audio anywhere. Murmur sits in your system tray and gives you a global hotkey that turns speech into text in any app — your editor, your browser, a chat window, whatever has focus.
```mermaid
graph LR
    A["🎤 Hold hotkey"] --> B["🎙️ Mic capture"]
    B --> C["📡 WebSocket"]
    C --> D["🧠 Transcription engine"]
    D --> E["💬 Partials stream to overlay"]
    E --> F["📋 Release → clipboard + paste"]
```
Audio flows from your mic through an AudioWorklet, gets sent as 16-bit PCM over a local WebSocket, and hits the transcription engine running on your GPU (or CPU). Partials stream back in real-time so you see words forming as you speak. When you release the key, the final transcription lands in your clipboard and gets pasted automatically.
The overlay is a transparent, always-on-top, click-through window — it shows up when you're recording and gets out of the way when you're not.
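The streaming path above can be sketched as a minimal client: binary frames of 16-bit little-endian PCM pushed over a local WebSocket, bracketed by JSON control messages. This is a hedged illustration only; the actual frame and message shapes are defined in docs/protocol.md, and `websockets` is a third-party package.

```python
# Sketch of the audio path: 16-bit PCM frames over a local WebSocket.
# Message shapes ({"type": "start"} etc.) are assumptions, not the real protocol.
import json
import math
import struct


def make_pcm_chunk(freq=440.0, rate=16000, ms=20):
    """One 20 ms chunk of 16-bit mono PCM (a sine wave standing in for mic audio)."""
    n = rate * ms // 1000
    samples = [int(32767 * 0.3 * math.sin(2 * math.pi * freq * i / rate))
               for i in range(n)]
    return struct.pack(f"<{n}h", *samples)  # little-endian signed 16-bit


async def stream_audio(uri="ws://127.0.0.1:51717"):
    import websockets  # third-party: pip install websockets
    async with websockets.connect(uri) as ws:
        await ws.send(json.dumps({"type": "start"}))  # JSON control frame (assumed)
        for _ in range(50):                           # ~1 s of audio
            await ws.send(make_pcm_chunk())           # binary audio frame
        await ws.send(json.dumps({"type": "stop"}))
```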
- Windows 10/11
- Bun
- Python 3.11+
- uv
- just
- CUDA-capable GPU recommended (driver 525+; CPU is supported but slower)
- Hold-to-talk or toggle mode — bind any key as your global hotkey
- Transparent overlay — live waveform and partial transcription while you speak
- Two engines, hot-swappable — switch between them without restarting
- Auto-paste — transcribed text goes straight to your clipboard and into the active field
- Post-processing — auto-append periods, spaces, or both
- Searchable history — every transcription saved locally in SQLite
- In-app server controls — start, stop, restart, stream logs, all from the settings panel
- External server mode — point Murmur at a remote server if you want
Murmur ships with two transcription engines. Both run locally and can be swapped at runtime from the settings panel.
| | Nemotron | Whisper |
|---|---|---|
| Model | `nvidia/nemotron-speech-streaming-en-0.6b` | `large-v3-turbo` (via faster-whisper) |
| Best for | English dictation, low latency | Multilingual, accuracy |
| Streaming | Native streaming architecture | Chunked re-transcription |
| Extras | — | Hotword boosting |
| VRAM | ~1.5 GB | ~3 GB |
Nemotron is the default. It's a 0.6B parameter model built for streaming — partials come back fast and the final result is usually identical to the last partial. Whisper is the fallback for non-English languages or when you need hotword support to nail domain-specific terms.
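Hot-swapping between the two engines can be pictured as a small registry that lazily constructs whichever engine is requested. This is an illustrative pattern only; the class and method names are assumptions, not Murmur's actual API.

```python
# Illustrative engine registry: engines are registered as factories and
# constructed on demand, so switching never requires a server restart.
class EngineRegistry:
    def __init__(self):
        self._factories = {}
        self.active = None

    def register(self, name, factory):
        """Register a zero-argument factory under an engine name."""
        self._factories[name] = factory

    def switch(self, name):
        """Construct the requested engine and make it the active one."""
        if name not in self._factories:
            raise KeyError(f"unknown engine: {name}")
        self.active = self._factories[name]()
        return self.active
```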
You need Windows 10/11 with Bun, Python 3.11+, uv, and just. A CUDA GPU is recommended but not required.
```shell
# Server
cd server
uv sync --extra all   # or: --extra nemotron / --extra whisper
just start
```

```shell
# App (separate terminal)
cd app
bun install
bun run dev
```

The app auto-detects a running server in dev mode. In production, it manages the server lifecycle itself.
> **Note**
> If you develop from WSL, run all uv/bun/just commands through PowerShell — not Linux. Running them from WSL replaces Windows binaries with Linux ones and breaks everything. See BUILDING.md.
```shell
cd server && uv sync --extra all
cd ../app && bun run package:win
```

`bun run package:win` produces a small nsis-web installer stub plus payloads (for example `.7z`, `.yml`, `.blockmap`) in `app/release/`. End users need internet to install (payload download) and to fetch models on first run.
Root-level helper:

```shell
just build
```

See BUILDING.md for full release and troubleshooting details.
App settings (hotkey, audio device, engine, post-processing, auto-paste) are configured through the UI.
Server settings use MURMUR_-prefixed environment variables and can also be changed at runtime from the app, which persists them to server/settings.json.
Server environment variables:

| Variable | Default | Description |
|---|---|---|
| `MURMUR_HOST` | `0.0.0.0` | Bind host |
| `MURMUR_PORT` | `51717` | Bind port |
| `MURMUR_ENGINE` | `nemotron` | Default engine (`nemotron` / `whisper`) |
| `MURMUR_NEMOTRON_MODEL` | `nvidia/nemotron-speech-streaming-en-0.6b` | Nemotron model |
| `MURMUR_NEMOTRON_DEVICE` | `auto` | Device (`auto`/`cuda`/`cpu`) |
| `MURMUR_WHISPER_MODEL` | `large-v3-turbo` | Whisper model |
| `MURMUR_WHISPER_DEVICE` | `auto` | Device (`auto`/`cuda`/`cpu`) |
| `MURMUR_WHISPER_COMPUTE_TYPE` | `auto` | Whisper precision mode |
| `MURMUR_MAX_SESSIONS` | `10` | Concurrent session cap |
| `MURMUR_LOG_LEVEL` | `INFO` | `DEBUG`/`INFO`/`WARNING`/`ERROR` |
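A settings loader for these variables might merge three layers: built-in defaults, the persisted `settings.json`, and the environment. This is a sketch under stated assumptions; the precedence order and merge logic here are guesses, not Murmur's actual behavior.

```python
# Hypothetical resolution of MURMUR_-prefixed settings.
# Assumed precedence (lowest to highest): defaults < settings.json < environment.
import json
import os
from pathlib import Path

DEFAULTS = {
    "MURMUR_HOST": "0.0.0.0",
    "MURMUR_PORT": "51717",
    "MURMUR_ENGINE": "nemotron",
    "MURMUR_LOG_LEVEL": "INFO",
}


def load_settings(settings_path="settings.json"):
    """Merge defaults, the persisted settings file, and environment overrides."""
    merged = dict(DEFAULTS)
    path = Path(settings_path)
    if path.exists():
        merged.update(json.loads(path.read_text()))
    for key in merged:
        if key in os.environ:
            merged[key] = os.environ[key]
    return merged
```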
```
app/      Electron desktop app (Svelte 5, TypeScript, Tailwind v4)
server/   Transcription server (FastAPI, faster-whisper, NeMo)
docs/     Protocol spec and technical docs
```
The app and server communicate over a custom WebSocket protocol on port 51717 — binary frames for audio, JSON frames for control and text. Full spec: docs/protocol.md
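A receiver of this mixed protocol has to branch on frame type before anything else. A minimal sketch, assuming text frames carry JSON control messages and binary frames carry raw 16-bit PCM (the function name is illustrative, not the real server code):

```python
# Dispatch a WebSocket frame by type: binary -> audio, text -> JSON control.
import json


def dispatch_frame(frame):
    """Return a (kind, detail) tuple describing the frame."""
    if isinstance(frame, (bytes, bytearray)):
        # Binary frame: raw 16-bit PCM, so each sample is 2 bytes.
        return ("audio", len(frame) // 2)
    # Text frame: JSON control/text message with an assumed "type" field.
    msg = json.loads(frame)
    return ("control", msg.get("type"))
```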