Productivity · Voice-First Workflow

I haven't really touched my keyboard in months — and my output tripled.

A quiet shift is happening in how knowledge workers actually produce writing. The keyboard is losing. The microphone — backed by a new generation of AI dictation tools — is winning on speed, quality, and, increasingly, on how search itself works.

By Tarek · Riman Agency · 11 MIN READ · MONTRÉAL

3× Speaking is faster
than typing (Stanford)

40 Avg. WPM typed
vs 150 WPM spoken

$23B Voice recognition
market by 2030

70% Of voice queries
are conversational

Most days, the only thing I type on a keyboard is my password. Everything else — emails, client briefs, blog drafts, Slack messages, Notion docs, even this article — starts as speech. I dictate. An AI cleans it up. I review it. I move on.

That shift didn't happen because I'm chasing productivity hacks. It happened because the math stopped making sense. Typing at 40 words per minute when I can speak at 150 is the equivalent of driving 30 km/h on a highway — technically permitted, deeply inefficient, and increasingly out of step with how the rest of the system is moving.

This is the article I wish someone had shown me a year ago. It covers three things: the hard data on why voice input is quantitatively better, where the market is heading (with enough conviction that the major AI players are all repositioning around it), and the specific tool — Wispr Flow — that finally made voice-first writing feel like a superpower rather than a gimmick.

The math is embarrassing for the keyboard.

Let's start with numbers that are hard to argue with. The average conversational speech rate for English speakers is 150 words per minute, confirmed by the National Center for Voice and Speech. The average typing speed, according to Ratatype's large-sample data, sits at 41.4 WPM. Professional typists reach 65–80 WPM. Even they are losing the race.

The most rigorous head-to-head comparison remains the Stanford University study led by Sherry Ruan and James Landay, which measured real users typing versus dictating identical passages. The finding: speech was 3.0× faster than a keyboard for English (161.20 vs 53.46 WPM), with a 20.4% lower error rate. Mandarin showed a similar pattern — 2.8× faster, with a 63% lower error rate.¹

Words per minute, head to head

Speed of text entry · Higher is faster

Average typist (Ratatype)

41 WPM

Stanford — keyboard (study average)

53 WPM

Professional typist (upper range)

80 WPM

Speaking rate (NCVS average)

150 WPM

Stanford — speech (study average)

161 WPM

Wispr Flow, sustained (independent testing)

184 WPM

Independent testing of Wispr Flow documented sustained output of 150–184 WPM, roughly twice the speed of a professional typist and about four times the 41 WPM average, with AI-layered formatting removing the need for manual cleanup.²,³

Now translate that into the unit that actually matters: time. An average blog post is around 1,140 words. A single page of prose is about 500. A typical knowledge worker produces roughly 2,475 words of written output per workday.⁴

Keyboard · 40 WPM

25 min to write 1,000 words

Plus time for backspaces, reformatting, and the cognitive load of translating thought into fingertips.

Microphone · 150 WPM

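About 7 min to speak 1,000 words

About 18 minutes saved per 1,000 words. At the roughly 2,475 words a day cited above, that is close to four hours a week of raw text entry, before counting the backspacing and reformatting that typing adds.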
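If you want to rerun this comparison with your own numbers, the arithmetic fits in a few lines of Python. This is a minimal sketch covering raw text entry only; the function name, the constants, and the five-day workweek are illustrative assumptions, not figures from any of the sources.

```python
def minutes_to_produce(words: int, wpm: float) -> float:
    """Minutes needed to produce `words` at a steady rate of `wpm`."""
    return words / wpm


TYPING_WPM = 40        # typing rate used in the comparison above
SPEAKING_WPM = 150     # speaking rate used in the comparison above
WORDS_PER_DAY = 2475   # daily written output cited above
WORKDAYS_PER_WEEK = 5  # assumption: a five-day workweek

typed = minutes_to_produce(1000, TYPING_WPM)
spoken = minutes_to_produce(1000, SPEAKING_WPM)
saved_per_1000 = typed - spoken

# Scale the per-1,000-word saving up to a full week of output.
weekly_words = WORDS_PER_DAY * WORKDAYS_PER_WEEK
weekly_hours_saved = saved_per_1000 * weekly_words / 1000 / 60

print(f"typed: {typed:.1f} min, spoken: {spoken:.1f} min")
print(f"saved per 1,000 words: {saved_per_1000:.1f} min")
print(f"saved per week: {weekly_hours_saved:.1f} h")
```

Swap `TYPING_WPM` for your own typing speed to see how the gap changes. Faster typists save less, but the spoken pass stays ahead for anyone typing below their speaking rate.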

The compounding effect is what sold me. McKinsey's widely cited research pegs knowledge workers at 28% of the workweek spent on email — roughly 11.2 hours, or 580 hours per year.⁵ Cutting input time by two-thirds doesn't just save minutes; it reshapes what's possible in a day.

What my day actually looks like now.

My keyboard-to-microphone ratio has flipped from about 90/10 to roughly 15/85. Here's the honest breakdown of what I still type vs. what I speak:

What I still type

Code. Syntax, brackets, variable names. Dictating code is a fight you don't want.
Passwords, 2FA codes, URLs. Anything where a single character matters more than speed.
Spreadsheet cells. Numbers and formulas. Voice adds friction here.
Quick one-line replies. "yes", "on it", "thanks": a few keystrokes are faster than the mic.

What I now speak

Every email over three sentences. Which is most of them.
Client briefs, proposals, SOWs. The hardest part of writing is starting. Speaking removes the blank page.
Blog posts, LinkedIn posts, article outlines. I dictate a rough pass at walking speed, then edit.
Slack and Teams messages longer than a line. Tone comes through better when I talk.
Notes during calls. I mute myself, speak my observations into Wispr, and they land formatted in my Notion inbox.
Prompts to Claude and ChatGPT. Long, detailed prompts are the whole game with LLMs. Typing them is the bottleneck.

"The keyboard is optimized for a kind of writing no one actually does anymore — slow, sequential, error-free first drafts. Real writing in 2026 is iterative and conversational. Voice matches that shape."

— Observation from 90 days of voice-first work

The market is voting with its dollars.

This isn't one writer's preference becoming a habit. It's a global realignment of how humans interact with machines — and the numbers make that clear.

Global speech & voice recognition market

USD billions · 2024–2030 forecast

19.1% CAGR through 2030. MarketsandMarkets projects the global speech and voice recognition market to more than double, from $9.66B in 2025 to $23.11B in 2030. Grand View Research's broader voice-and-speech recognition forecast reaches $53.67B by 2030 at 14.6% CAGR.⁶,⁷
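A quick way to sanity-check a market forecast like this is to compound the starting figure yourself. The sketch below does that in Python with the MarketsandMarkets numbers quoted above; the helper names are illustrative, and a five-year window (2025 to 2030) is assumed.

```python
def compound(start_value: float, cagr: float, years: int) -> float:
    """Value after compounding `start_value` at `cagr` for `years` years."""
    return start_value * (1 + cagr) ** years


def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Annual growth rate that takes `start_value` to `end_value` in `years` years."""
    return (end_value / start_value) ** (1 / years) - 1


START_2025 = 9.66  # USD billions, as quoted above
END_2030 = 23.11   # USD billions, as quoted above
YEARS = 5          # assumption: 2025 to 2030 counted as five compounding years

projected = compound(START_2025, 0.191, YEARS)
growth_multiple = END_2030 / START_2025
rate = implied_cagr(START_2025, END_2030, YEARS)

print(f"projected 2030 value: ${projected:.2f}B")
print(f"growth multiple: {growth_multiple:.2f}x")
print(f"implied CAGR: {rate:.1%}")
```

If the compounded figure and the headline figure disagree by more than rounding, the likeliest explanation is that the forecast uses a different base year than the one assumed here.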

Adoption is equally lopsided. There are now an estimated 8.4 billion active voice assistants worldwide, more devices than humans on the planet. Over 1 billion voice searches happen every month. Juniper Research and industry analysts project voice commerce to reach $80 billion in 2026 and $164 billion by 2028.⁸,⁹,¹⁰

Voice assistants in active use, worldwide

Billions of devices · 2020–2024

Voice assistants in use doubled in four years, passing the global human population in 2023. When every phone, speaker, car, and wearable can take voice input, voice becomes the default interface by gravity.⁸

Keywords are becoming a legacy format.

This is the part marketers don't want to hear, so I'll be blunt: the keyword-centric model of SEO that defined the last fifteen years is being dismantled in real time. And voice is one of the forces dismantling it.

Traditional search rewards short, stripped queries — "best dictation app mac". Voice and AI search reward conversation — "What's the best dictation app for a marketing agency owner who writes client briefs all day on a Mac?". Those are different queries, and they surface different content.

The data points to the shift:

Around 70% of voice searches use natural, conversational language, not keyword fragments.¹¹
Only 1.71% of voice search results include the exact query in their title tag, meaning classic keyword-in-title optimization is nearly irrelevant to voice results.¹²
76% of voice queries are local or "near me" intent — a completely different optimization target than volume keywords.¹²
Google AI Mode reached 75 million daily active users, and first-position CTR has collapsed on queries that trigger AI Overviews.¹³
An estimated 90% of websites will need some form of voice/AI optimization by end of 2026.¹¹

How the same intent is expressed, then and now

Traditional keyword query → Modern conversational query

Keyword-era

"montreal marketing agency"

3 words · no context · transactional

→

Voice / AI-era

"who's a good Montreal marketing agency that actually understands local Quebec SEO for restaurants?"

Full intent · constraints · audience

What wins in that world isn't keyword density. It's depth, structure, and answer-readiness — content that can be extracted into a response because it's clear, specific, and sourced. In other words: the kind of content you produce much faster when you can just say it out loud instead of typing it.

That's the connection most people miss. Voice isn't just a faster way to generate text. It's the format that matches how humans now ask — and therefore how the next decade of content needs to sound.

Why Wispr Flow, specifically.

I've tested most of the serious options — Apple's built-in dictation, Dragon, Superwhisper, AquaVoice, Willow. Here's the honest take on why Wispr Flow is what stuck.

The old generation of dictation software transcribed audio. That was all. You said "comma," it typed a comma. You said "new paragraph," it started a new paragraph. It was faster than typing for some people and a nightmare for most — because your brain had to think in speech and punctuation and formatting at the same time. The cognitive load canceled the speed gain.

The new generation — led by Wispr Flow — is structurally different. Multiple AI layers run simultaneously: one transcribes, another strips filler words ("um," "uh," "like"), another applies punctuation and paragraph breaks intelligently, another handles backtracking (when you say "meet Tuesday — wait, Wednesday," it just writes "Wednesday"), and another adapts the tone of the output to the app you're in. A casual Slack message and a formal email get different treatment from the same spoken input.³
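To make the difference between transcription and cleanup concrete, here is a toy sketch in Python of two of those passes: filler removal and backtracking. It is not how Wispr Flow works. The description above points to AI models, while this sketch uses a hand-written word list and a single regular expression. The names `FILLERS`, `strip_fillers`, `apply_backtrack`, and `clean_up` are all hypothetical.

```python
import re

# Hypothetical filler list. "like" is deliberately left out: a word list
# cannot tell a filler "like" from a meaningful one, which is exactly the
# kind of judgment a language model is needed for.
FILLERS = ("um", "uh", "er")


def strip_fillers(text: str) -> str:
    """Remove standalone filler words plus the comma and spaces after them."""
    pattern = r"\b(?:" + "|".join(FILLERS) + r")\b,?\s*"
    cleaned = re.sub(pattern, "", text, flags=re.IGNORECASE)
    return re.sub(r"\s{2,}", " ", cleaned).strip()


def apply_backtrack(text: str) -> str:
    """Replace the single word before a spoken 'wait,' with the word after it."""
    return re.sub(r"\b\w+,?\s+wait,\s*(\w+)", r"\1", text, flags=re.IGNORECASE)


def clean_up(text: str) -> str:
    """Run both toy passes in sequence: fillers first, then backtracking."""
    return apply_backtrack(strip_fillers(text))


print(clean_up("Um, let's meet Tuesday, wait, Wednesday"))
```

Even this toy version shows why the old "say comma, get comma" model felt so heavy. Rules like these only handle one-word corrections and a fixed list of fillers, so anything messier still lands on the speaker.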

What this looks like in practice

Universal surface. Works in Gmail, Notion, Slack, Google Docs, Figma, every code editor, every chat box, every LLM prompt field. It's OS-level, not app-specific. Currently the only major dictation tool available simultaneously on Mac, Windows, iOS, and Android.³
Context-aware formatting. Output is clean prose, not a transcript. Sentence case, paragraph breaks, em-dashes where appropriate.
Command Mode. Speak an instruction — "rewrite this in a warmer tone" / "make this a bulleted list" / "translate to French" — and the selected text is edited in place.
100+ languages, including code-switching mid-sentence. I work in both French and English in Montréal, and this feature alone sold me.
Custom dictionary. Add client names, brand names, technical jargon once. It remembers.
Security posture. SOC 2 Type II + HIPAA eligibility on all plans, with a zero-retention Privacy Mode for sensitive work. That matters when you're dictating client strategy.¹⁴

Wispr raised a $30M Series A from Menlo Ventures in mid-2025 and has raised over $80M total, which helps explain why the product ships with a level of polish most dictation tools lack.³ It's free for 2,000 words per week, with a 14-day Pro trial of all paid features.


How to actually make the switch.

The thing that kills most people's voice-first transition isn't the tool — it's the first three days. Dictation feels awkward. You sound weird to yourself. You second-guess every sentence. Here's the ramp that worked for me and for everyone I've onboarded at the agency:

Day 1 — Start in low-stakes channels

Dictate your Slack messages. That's it. Nothing client-facing, nothing long-form. Just the messages where a typo or clunky phrasing genuinely doesn't matter. You need your mouth to remember it's allowed to form sentences.

Days 2–5: Move to email replies

Dictate responses to emails. Let the AI format. Review before sending. You'll catch yourself saying "um" out loud and the software will silently delete it. That's the moment you realize it's not transcription — it's ghostwriting.

Week 2 — First drafts only

Dictate the first draft of anything longer than 200 words — blog posts, briefs, proposals. Then edit by keyboard. You get the speed of voice on the generative pass and the precision of typing on the refinement pass. This is the sweet spot for most writers.

Week 3 onward — Your brain catches up

By week three, voice starts to feel like typing used to feel: invisible. You stop thinking about the tool and start thinking about the thought. That's when the 3× speed gain stops being a statistic and starts being your actual day.

Stop typing. Start talking.

The two-minute experiment that will probably change how you work.

Install Wispr Flow. Dictate one email. Send it. That's the whole test. Either you'll feel it immediately or you won't — but you'll never wonder about it again. The free tier gives you 2,000 words per week, the Pro trial runs 14 days, and no credit card is required to start.

Start free on Wispr Flow · Mac · Windows · iOS · Android · Affiliate link supports this publication

The Voice-First Shift · 2026 Data Brief

The keyboard is losing.

A one-page field guide to the numbers behind voice-first work

01 Speed · The head-to-head

41

Avg. typist WPM
Ratatype, large-sample data

150

Avg. speaking WPM
NCVS / conversational English

3.0×

Speech vs keyboard speed
Stanford University study

02 Time · Per 1,000 words

Keyboard · 40 WPM

25 min

Voice · 150 WPM

About 7 min

You reclaim about 18 minutes per 1,000 words. At roughly 2,475 words a day, that is close to four hours across a typical knowledge-worker week.

03 Market · Where the money is going

$23B

Voice recognition market by 2030
MarketsandMarkets · 19.1% CAGR

8.4B

Active voice assistants
More than humans on Earth

1B+

Voice searches per month
Growing 18% YoY

04 Search · The keyword is dying

70%

Of voice queries use natural language, not keywords

76%

Of voice queries have local "near me" intent

1.71%

Keyword overlap between voice results & title tags

05 Wispr Flow · What to try

Documented speed

150–184 WPM

Sustained, with AI formatting applied

Free tier

2,000 words/wk

No credit card · 14-day Pro trial

Works on Mac, Windows, iOS, Android · 100+ languages · SOC 2 Type II & HIPAA-eligible · Context-aware formatting in every app

The uncomfortable conclusion.

If you're a writer, marketer, founder, lawyer, doctor, consultant — anyone whose output is words — you are currently competing against people who are producing 2–3× your volume at the same quality, because they stopped typing. The gap will widen.

The keyboard won't disappear. But its role is shrinking to what it was always good at: precision editing, code, structured data. For generation — the actual act of turning thought into text — the microphone has already won the argument. Everything after this is just adoption.

The only real question is whether you spend the next six months doing the work, or whether you spend it watching someone else do it faster.

