
transcribe

Transcribe audio to text using AI speech-to-text models. Supports OpenAI (whisper-1, gpt-4o-transcribe, gpt-4o-mini-transcribe), ElevenLabs (scribe_v1), Deepgram (nova-3), Groq (whisper-large-v3, whisper-large-v3-turbo, distil-whisper-large-v3-en), and Cartesia (ink-whisper). Outputs JSON with text, timestamps, language, and duration.

Usage

egaki transcribe [audio]

Arguments

Argument  Required  Description
[audio]   No        Path to the audio file to transcribe. Omit it when reading audio with --stdin

Options

Option               Default  Description
-m, --model [model]  -        Transcription model ID. If omitted, shows an interactive picker (or uses the default model in non-TTY mode)
-o, --output [path]  -        Output file path for the JSON result
--language [lang]    -        ISO 639-1 language hint (e.g. en, es, fr). Improves accuracy when the language is known
--stdin              -        Read audio from stdin instead of a file path
--stdout             -        Write only the plain text transcript to stdout (no JSON metadata)

Global Options

Option         Default  Description
-h, --help     -        Display this message
-v, --version  -        Display version number

Examples

# Transcribe an audio file with default model
egaki transcribe recording.mp3
# Use a specific model
egaki transcribe interview.wav -m gpt-4o-transcribe
# Groq for fast, cheap transcription
egaki transcribe podcast.mp3 -m whisper-large-v3
# Get plain text only
egaki transcribe recording.mp3 --stdout
# Save JSON result to file
egaki transcribe recording.mp3 -o transcript.json
# Pipe audio from another command
ffmpeg -i video.mp4 -f mp3 - | egaki transcribe --stdin -m whisper-1
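
The single-file examples above can be combined into a batch job. The following is a minimal sketch, not part of egaki itself: transcribe_dir is a hypothetical helper name, the choice of whisper-1 as a fallback model is arbitrary, and the sketch assumes egaki exits with a non-zero status when a transcription fails. It uses only the -m and -o flags documented above and passes the model explicitly so that no interactive picker is needed.

```shell
# Hypothetical helper: transcribe every .mp3 in a directory,
# writing one JSON result next to each input file.
transcribe_dir() {
  dir="$1"
  model="${2:-whisper-1}"   # assumption: fall back to a model ID listed above

  for f in "$dir"/*.mp3; do
    # Skip the literal pattern when the glob matches nothing.
    [ -e "$f" ] || continue

    out="${f%.mp3}.json"

    # Do not re-transcribe files that already have a result.
    if [ -e "$out" ]; then
      continue
    fi

    # Assumption: a non-zero exit status signals a failed transcription.
    egaki transcribe "$f" -m "$model" -o "$out" || echo "failed: $f" >&2
  done
}
```

For example, transcribe_dir ./recordings whisper-large-v3 would write ./recordings/NAME.json for each ./recordings/NAME.mp3 that has no result yet.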