How do I transcribe voice memos on iPhone for free?

Updated May 14, 2026

Free voice memo transcription on iPhone in 2026 has gotten dramatically better with iOS 18 and especially iOS 26. Here's what's free and what isn't.

Apple Voice Memos (iOS 18+) — free, on-device, decent

If you're on iOS 18 or later, Voice Memos transcribes your recordings on-device using Apple's speech recognition (on supported iPhone models and languages). To view a transcript:

  • Open the Voice Memos app.
  • Tap any recording.
  • Tap the transcript icon (looks like text lines, near the playback controls).
  • The transcript appears synced to playback. Tap any word to jump to it.
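
One way to picture a transcript that is synced to playback is as a list of words, each carrying its start time. The sketch below is illustrative only: `TimedWord`, `TRANSCRIPT`, and the function names are hypothetical, and this is not Apple's internal format.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedWord:
    text: str
    start: float  # seconds from the start of the recording


# Hypothetical transcript data; a real app would get timings from its speech engine.
TRANSCRIPT = [
    TimedWord("remember", 0.0),
    TimedWord("to", 0.6),
    TimedWord("call", 0.8),
    TimedWord("the", 1.2),
    TimedWord("dentist", 1.4),
    TimedWord("tomorrow", 2.1),
]


def seek_time_for_word(words, index):
    """Tapping word `index` seeks playback to that word's start time."""
    return words[index].start


def word_at_time(words, playback_seconds):
    """Return the index of the word to highlight at a given playback position."""
    starts = [w.start for w in words]
    # bisect_right finds the first start time greater than the position;
    # the word being spoken is the one just before it.
    return max(bisect_right(starts, playback_seconds) - 1, 0)


def find_word(words, query):
    """Return the start time of every occurrence of `query` (case-insensitive)."""
    q = query.lower()
    return [w.start for w in words if w.text.lower() == q]
```

Here `seek_time_for_word` models "tap a word to jump to it", `word_at_time` models highlighting the current word during playback, and `find_word` shows why a timed transcript also makes a recording searchable.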

Quality is solid for clear English speech and mediocre for accents, multiple speakers, or background noise.

Apple Notes voice recording (iOS 18+): free, on-device

Notes has a voice recording feature with auto-transcription. The transcription quality is similar to Voice Memos but the transcripts are searchable within Notes itself, which is a big advantage.

Némos (free) — on-device + searchable + multi-device

Némos records voice notes and runs Apple's on-device Speech framework for transcription. The advantages over Voice Memos: transcripts are fully searchable across your whole library, they sync to iPad, Mac, and Apple Watch, and you can organize recordings with tags and folders. The free tier covers unlimited recordings.

Whisper-based open-source tools (free, off-device)

For long recordings (meetings of an hour or more) or non-English audio, OpenAI's open-source Whisper model gives very high accuracy. You can use it through apps like MacWhisper, which runs it locally on a Mac, or AudioPen's free tier (iOS). Trade-off: running Whisper on the iPhone itself takes non-trivial setup.

Free vs paid in 2026 — what's worth paying for?

  • Speaker diarization (telling speakers apart in a meeting) — paid features in Otter, Granola, Notta.
  • Real-time transcription during the call — Otter and Granola, $15-25/mo.
  • Punctuation and paragraph cleanup — most free tools dump everything in one paragraph.
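
To make "paragraph cleanup" concrete, here is a deliberately naive sketch that breaks a one-paragraph transcript into fixed-size paragraphs. It assumes the transcript already contains sentence-ending punctuation. The function name and the three-sentence default are hypothetical choices, and paid tools likely use more sophisticated signals than this.

```python
import re


def split_into_paragraphs(transcript, sentences_per_paragraph=3):
    """Group sentences into fixed-size paragraphs.

    Assumes the transcript already has sentence-ending punctuation.
    """
    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", transcript.strip()) if s]
    paragraphs = [
        " ".join(sentences[i:i + sentences_per_paragraph])
        for i in range(0, len(sentences), sentences_per_paragraph)
    ]
    # Separate paragraphs with a blank line.
    return "\n\n".join(paragraphs)
```

A transcript with no punctuation at all comes back as a single paragraph, which is why punctuation restoration and paragraphing tend to be sold together.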

For everyday voice memos, the free Apple/Némos combo is enough for 95% of users. Save the paid tools for actual meetings.

## Why this question gets asked so often

Voice memo transcription was mostly a paid feature until iOS 18 shipped on-device transcription in September 2024, a watershed moment that wiped out the value proposition of cheaper paid transcription apps. The pre-2024 landscape included Otter ($16.99/mo), Rev ($0.25/min), Trint ($60/mo), Descript ($24/mo), and several lesser-known apps that charged for what Apple now does for free. Google search volume for "free voice transcription iPhone" jumped 480% between September 2024 and January 2025 as users discovered the new feature.

The question keeps trending because Apple put the transcript button in a non-obvious place: you have to tap the recording, then tap the transcript icon. App Store reviews of Voice Memos consistently mention surprise: "Wait, this was always free?"

On-device transcription is also genuinely difficult. Whisper-quality output requires 2-4 GB of model weights and significant Neural Engine time, and Apple invested heavily to make it work on consumer hardware.

## The deeper story

Apple's voice transcription pipeline uses a tiered approach: lightweight wake-word detection (Always-On Processor), audio recording (Audio framework), and transcription (Speech framework, now backed by a 1.2B parameter speech model on iOS 26+ devices). The transcription model is roughly equivalent in quality to OpenAI's Whisper "small" model but runs on-device at 0.4x real-time, meaning processing takes about 0.4 times the audio's duration (a 60-second recording transcribes in ~24 seconds). For comparison, cloud-based services like Otter run larger server-side models, comparable to Whisper "large" variants, that reach about 0.1x real-time but require server round-trips.

The accuracy gap is most noticeable on accented English (Apple's model trained heavily on US/UK/Australian/Indian English; less so on Nigerian, Pakistani, or Singaporean English) and on overlapping speakers. Tiago Forte's 2024 Building a Second Brain (BASB) workflow calls voice memos "the highest-leverage capture format" because speaking is 3x faster than typing, and transcription turns that speed into searchable text.
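
The real-time figures above reduce to one multiplication: processing time is audio duration times the real-time factor. The sketch below works through that arithmetic for the two factors quoted in this article (0.4x and 0.1x). The function name is hypothetical, and the numbers are the article's, not measurements.

```python
def transcription_seconds(audio_seconds, real_time_factor):
    """Processing time = audio duration x real-time factor."""
    if audio_seconds < 0 or real_time_factor <= 0:
        raise ValueError("duration must be >= 0 and real-time factor must be > 0")
    return audio_seconds * real_time_factor


if __name__ == "__main__":
    for label, seconds in [("60-second memo", 60), ("1-hour meeting", 3600)]:
        for rtf in (0.4, 0.1):
            t = transcription_seconds(seconds, rtf)
            print(f"{label} at {rtf}x real-time: about {t:.0f} s")
```

At 0.4x, a one-hour meeting works out to roughly 24 minutes of processing, versus roughly 6 minutes at 0.1x (before any upload time).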

## Edge cases and gotchas

  • Apple Watch dictation vs full recording: dictation captures text-only; full recording captures audio AND transcribes. Different use cases.
  • Lossless audio recordings: these don't transcribe better than standard quality. The frequencies that carry speech sit well within what standard-quality compression already preserves.
  • Bilingual recordings: Apple's Speech framework only transcribes the dominant language. Code-switching content (e.g., Spanglish) loses the secondary language.
  • Background music: heavy music can confuse the model, especially when lyrics overlap speech.
  • Phone call recording: iOS 18.1+ lets you record calls with on-device transcription in supported regions. An audio announcement tells everyone on the call that it is being recorded. Recordings and transcripts are saved in the Notes app, not Voice Memos.
  • Whisper-quality fans: MacWhisper Pro ($29 one-time) gives near-perfect on-device transcription on Mac, useful for long recordings.

## What competitors say

  • Otter ($16.99/mo) was the category leader for years; iOS 18 transcription has eaten significant market share. Their differentiator now is real-time meeting transcription with speaker diarization.
  • Granola ($25/mo) targets the meeting-summary niche with LLM post-processing.
  • Notta ($14.99/mo) offers similar features with translation.
  • MacWhisper runs OpenAI Whisper locally on Mac and is best for long-form journalism and podcast workflows.
  • Apple Notes matches Voice Memos' on-device transcription, with the added benefit of searchability within Notes.
  • Bear doesn't natively record audio.
  • Notion doesn't natively record audio (requires third-party integration).
  • Obsidian uses community plugins for audio capture.
  • Némos layers semantic search on top of Apple's Speech framework, so finding voice memos by concept (not just keyword) works.

## The 2026 verdict

If you're on iOS 18 or later, free on-device transcription via Voice Memos or Apple Notes is good enough for everyday use. Buy a paid tool only if you need speaker diarization, real-time transcription during meetings, or specialized accuracy for professional content. The category leaders (Otter, Granola, Notta) all face the same pressure: Apple's free tier now does 85% of what they charge $15-25/mo for. Expect the meeting-transcription space to consolidate around LLM-based summarization and multi-speaker handling, since that's where the durable moat remains.
