Feb 28, 2026·improvement

Voice Boost

Voice Boost is a major upgrade to OnType's speech recognition pipeline. Using the latest MLX optimizations for Apple Silicon, it cuts transcription latency by about 30% across all supported models.

What's new

Faster inference: Optimized MLX graph execution reduces per-chunk processing time from ~180 ms to ~125 ms (about 30%) on M1 and later.
Streaming improvements: Reduced buffering delay means text appears on screen closer to real time.
Lower memory footprint: Model quantization improvements cut peak memory usage by 15%.
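
As a quick sanity check, the per-chunk timings above can be related to the headline latency figure with a few lines of Python. This is only an illustrative calculation on the approximate numbers quoted in this entry; the helper name `percent_reduction` is made up for the example and is not part of OnType.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


# Approximate per-chunk processing times quoted above, in milliseconds.
reduction = percent_reduction(180.0, 125.0)
print(f"{reduction:.1f}% lower per-chunk processing time")
```

Because both timings are approximate, the result should be read as "roughly 30%" rather than an exact figure.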

How it works

The new engine uses an optimized attention kernel that takes full advantage of the GPU and unified memory on Apple Silicon, where MLX runs. Combined with our improved voice activity detection, OnType now starts transcribing sooner after you begin speaking and finishes sooner after you stop.
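
To give a feel for what voice activity detection does, here is a minimal sketch of a generic energy-threshold detector in Python. It is not OnType's detector, whose design is not described in this entry. The function names, the 160-sample frame size, the 0.02 RMS threshold, and the three-frame minimum are all assumptions chosen for illustration.

```python
import math
from typing import Optional, Sequence


def frame_rms(frame: Sequence[float]) -> float:
    """Root-mean-square level of one frame of audio samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def first_speech_frame(
    samples: Sequence[float],
    frame_size: int = 160,    # assumed: 10 ms at 16 kHz
    threshold: float = 0.02,  # assumed RMS level separating speech from silence
    min_frames: int = 3,      # assumed: require a short run to ignore clicks
) -> Optional[int]:
    """Return the index of the first frame that starts a run of `min_frames`
    consecutive frames at or above `threshold`, or None if there is no such run."""
    run = 0
    for i in range(len(samples) // frame_size):
        frame = samples[i * frame_size:(i + 1) * frame_size]
        if frame_rms(frame) >= threshold:
            run += 1
            if run == min_frames:
                return i - min_frames + 1
        else:
            run = 0
    return None


# Five silent frames followed by five louder frames.
audio = [0.0] * 800 + [0.1, -0.1] * 400
print(first_speech_frame(audio))
```

In a sketch like this, a lower threshold or a shorter minimum run lets transcription start sooner but produces more false triggers on background noise.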

This update is available now. Update OnType to v1.2 and the new engine activates automatically.