CapsLockX 2.0

Hold CapsLock or Space plus a key. Navigate without leaving the home row, dictate by voice, or let an LLM drive your computer.

The Story Behind CapsLockX

Growing up with the ThinkPad TrackPoint (the red dot), I became addicted to its efficiency. When I switched to a Surface Book, I missed that precise control. CapsLockX was born from that itch: a tool that lets you drive everything from the keyboard, including a simulated mouse with acceleration.

Version 2.0 is a ground-up rewrite in Rust, with a platform-agnostic core and per-OS adapters. The Rust core also unlocks features the original AutoHotkey-based 1.x couldn't reach: local voice dictation, a streaming LLM brainstorm panel, and an LLM agent that operates the UI on your behalf.

The core idea: heavy coding without touching the mouse. Keep your hands on the keyboard at all times — and when even that's too much, talk to your computer.

⚡ Productivity layer

CapsLock and Space become chord triggers. Hold either + a key for instant actions, with zero conflict with normal typing.
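
One way a chord trigger can coexist with normal typing is to hold back the trigger key until it is clear whether it was tapped alone or held with another key. The sketch below is a simplified illustration under that assumption, not CapsLockX's actual implementation; `Event`, `Output`, and `ChordState` are hypothetical names, and real timing rules (for example, fast typing across a held Space) are left out.

```rust
/// Hypothetical input events (illustrative names, not CapsLockX's API).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Event {
    TriggerDown,
    TriggerUp,
    KeyDown(char),
}

/// What the layer decides to emit for each incoming event.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Output {
    /// Swallow the event for now.
    Nothing,
    /// The trigger was tapped alone: pass it through as a normal key.
    PlainTrigger,
    /// An ordinary key typed while no trigger is held.
    PlainKey(char),
    /// A key pressed while the trigger is held: run the chord action.
    Chord(char),
}

#[derive(Debug, Default)]
pub struct ChordState {
    held: bool,
    used_in_chord: bool,
}

impl ChordState {
    pub fn feed(&mut self, ev: Event) -> Output {
        match ev {
            Event::TriggerDown => {
                // Ignore OS auto-repeat while the trigger is already held.
                if !self.held {
                    self.held = true;
                    self.used_in_chord = false;
                }
                Output::Nothing
            }
            Event::KeyDown(c) if self.held => {
                self.used_in_chord = true;
                Output::Chord(c)
            }
            Event::KeyDown(c) => Output::PlainKey(c),
            Event::TriggerUp => {
                // Only a trigger released without any chord counts as a tap.
                let tapped = self.held && !self.used_in_chord;
                self.held = false;
                if tapped { Output::PlainTrigger } else { Output::Nothing }
            }
        }
    }
}
```

In this model, tapping the trigger alone yields `PlainTrigger` on release, while pressing `h` during the hold yields `Chord('h')` and the release is swallowed.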

🎙️ Voice dictation

Space+V runs local speech-to-text (SenseVoice or whisper.cpp) with optional LLM polish, translation presets, and a TTS fallback chain.

🤖 LLM agent

Space+M or clx agent — describe a task and the agent drives the UI via the CLX command language (keystrokes, mouse, AX-tree waits, 60 Hz pixel reflexes).

🎯 Precise control

Physics-based mouse acceleration, vim-style cursor motion, and virtual-desktop / window-tiling commands tuned for daily use.

Core Features

🎯 No conflicts

CapsLock can stay as a normal CapsLock when tapped; only the chord (Space+CapsLock) locks you into CLX mode.

🖱️ Mouse simulation

WASD move with acceleration, QE clicks, RF scroll.

⌨️ Vim-like navigation

HJKL cursor, YUIO page, G Enter, T Delete, N/P Tab — anywhere.

🖥️ Virtual desktop control

Switch desktops with 1-9. Cycle / close / tile windows with Z / X / C.

💬 Brainstorm chat

Space+B opens a streaming overlay to Gemini / OpenAI / Anthropic / Ollama / MLX.

🧠 Configurable polish chain

Short voice commands are returned raw; longer dictation flows through a local MLX or LLM corrector for punctuation and cleanup.
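
The routing described above might look roughly like the following sketch. It assumes a word-count threshold decides what counts as "short" and that correctors are tried in order until one succeeds; the `Corrector` type, the `polish` function, and the `min_words` parameter are hypothetical and are not taken from the CapsLockX source.

```rust
/// A corrector takes a raw transcript and returns cleaned text, or an error
/// message if it is unavailable. Hypothetical signature for illustration.
pub type Corrector = fn(&str) -> Result<String, String>;

/// Return short transcripts untouched; send longer ones through the first
/// corrector in `chain` that succeeds, falling back to the raw text.
pub fn polish(raw: &str, min_words: usize, chain: &[Corrector]) -> String {
    // Assumption: "short" means fewer than `min_words` whitespace-separated words.
    if raw.split_whitespace().count() < min_words {
        return raw.to_string();
    }
    for corrector in chain {
        if let Ok(cleaned) = corrector(raw) {
            return cleaned;
        }
    }
    // Every corrector failed (or the chain is empty): keep the raw transcript.
    raw.to_string()
}
```

Falling back to the raw transcript means a missing model or an offline API degrades the output quality rather than losing the dictation.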

See it in Action

🖱️ Mouse Control

Control the cursor with WASD + QE/RF, with acceleration.

🪟 Window Management

Cycle, close, and tile windows with Z/X/C.

⌨️ Cursor Movement

Vim-style cursor motion anywhere with HJKL + YUIO.

📐 Window Arrangement

One-keystroke tiling and arrangement.

Documentation

Key Features

🪟 Window Management

  • Cycle and switch windows (Z)
  • Tile / arrange (C, with Shift for side-by-side)
  • Virtual desktops (1-9)
  • Multi-monitor aware
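
As an illustration of side-by-side tiling, the sketch below splits one monitor's rectangle into equal columns. It is a generic layout calculation under the assumption that each window gets one column; `Rect` and `tile_columns` are hypothetical names, and CapsLockX's own arrangement logic may differ.

```rust
/// A screen or window rectangle in pixels. Hypothetical type for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Rect {
    pub x: i32,
    pub y: i32,
    pub w: i32,
    pub h: i32,
}

/// Split `screen` into `n` side-by-side columns of near-equal width.
/// Leftover pixels are handed to the leftmost columns, one each.
pub fn tile_columns(screen: Rect, n: usize) -> Vec<Rect> {
    if n == 0 {
        return Vec::new();
    }
    let n = n as i32;
    let base = screen.w / n;
    let extra = screen.w % n;
    let mut x = screen.x;
    (0..n)
        .map(|i| {
            let w = base + if i < extra { 1 } else { 0 };
            let r = Rect { x, y: screen.y, w, h: screen.h };
            x += w;
            r
        })
        .collect()
}
```

Because the columns start at `screen.x`, the same function works for a second monitor whose origin is offset from the primary display.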

✍️ Text Editing

  • Vim cursor + page navigation
  • Modifier passthrough for selection / word jumps
  • Voice dictation directly into the focused field

🖱️ Mouse Control

  • Keyboard-based cursor movement with physics-based acceleration
  • Left / right click, vertical scroll
  • Per-axis speed configurable in preferences
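
A physics-style cursor can be modeled per axis as a velocity that grows while a direction key is held and decays after release. The following is a minimal sketch under that assumption; the `Axis` struct, its fields, and the constants used with it are hypothetical and do not reflect CapsLockX's actual tuning.

```rust
/// Hypothetical per-axis cursor state; all values are illustrative.
#[derive(Debug, Clone, Copy)]
pub struct Axis {
    /// Current speed in pixels per second (signed).
    pub velocity: f64,
    /// Acceleration in pixels per second squared while a key is held.
    pub accel: f64,
    /// Exponential decay rate (per second) applied after release.
    pub friction: f64,
    /// Speed cap in pixels per second; must be non-negative.
    pub max_speed: f64,
}

impl Axis {
    /// Advance the simulation by `dt` seconds.
    /// `dir` is -1.0, 0.0, or 1.0 depending on which key (if any) is held.
    /// Returns the cursor displacement in pixels for this step.
    pub fn step(&mut self, dir: f64, dt: f64) -> f64 {
        if dir != 0.0 {
            // Key held: speed builds up, so longer holds cover more distance.
            self.velocity += dir * self.accel * dt;
        } else {
            // Key released: glide to a stop instead of halting abruptly.
            self.velocity *= (-self.friction * dt).exp();
        }
        self.velocity = self.velocity.clamp(-self.max_speed, self.max_speed);
        self.velocity * dt
    }
}
```

Keeping one `Axis` for X and one for Y is one way to make speed configurable per axis: each gets its own `accel` and `max_speed`.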

🤖 AI features

  • Voice dictation (local STT, optional LLM polish)
  • Brainstorm chat overlay
  • LLM agent that drives the UI via AX-tree + screenshots

Platforms

macOS

✅ Available (Rust 2.0)

Apple Silicon & Intel · CGEventTap + AppKit, code-signed binary.

Linux

✅ Available (Rust 2.0)

Wayland & X11 · evdev + uinput.

Windows

✅ Available (1.x AutoHotkey)

The original 1.x build is stable. The Rust 2.0 Windows adapter is in progress.

Development

CapsLockX 2.0 is open source under GPL-3.0. The Rust workspace lives at snolab/CapsLockX with a platform-agnostic core (rs/core) and per-OS adapters (rs/adapters/{macos, linux, windows, browser}). The LLM agent's system prompt is editable at skills/clx-agent/SKILL.md — no rebuild needed.
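
The split between a platform-agnostic core and per-OS adapters can be pictured as a trait boundary. The sketch below is an assumption about the general shape only: `Adapter`, `handle_chord`, and `RecordingAdapter` are hypothetical names, not items from rs/core. The G and T bindings follow the key list on this page, and the HJKL directions follow the usual Vim convention.

```rust
/// What the core needs from each platform. Hypothetical trait for illustration;
/// a real adapter would inject events through the OS input APIs.
pub trait Adapter {
    fn send_key(&mut self, key: &str);
}

/// Platform-agnostic logic: translate a chord key into a target key and hand
/// it to whichever adapter is in use. Returns false for unbound keys.
pub fn handle_chord(adapter: &mut dyn Adapter, key: char) -> bool {
    let target = match key.to_ascii_lowercase() {
        'h' => "Left",
        'j' => "Down",
        'k' => "Up",
        'l' => "Right",
        'g' => "Enter",
        't' => "Delete",
        _ => return false,
    };
    adapter.send_key(target);
    true
}

/// In-memory adapter that records keys instead of sending them, which lets
/// the core logic be exercised without touching any OS.
#[derive(Debug, Default)]
pub struct RecordingAdapter {
    pub sent: Vec<String>,
}

impl Adapter for RecordingAdapter {
    fn send_key(&mut self, key: &str) {
        self.sent.push(key.to_string());
    }
}
```

With this shape, the key-mapping logic is written once and each OS only has to supply its own `send_key`.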

    # Build & run on macOS
    ./build.sh && clx   # forks to background
    clx -f              # foreground
    clx agent --prompt "open notes and write today's date"