transcribe

Transcribe audio to text using AI speech-to-text models. Supports OpenAI (whisper-1, gpt-4o-transcribe, gpt-4o-mini-transcribe), ElevenLabs (scribe_v1), Deepgram (nova-3), Groq (whisper-large-v3, whisper-large-v3-turbo, distil-whisper-large-v3-en), and Cartesia (ink-whisper). Outputs JSON with text, timestamps, language, and duration.

Usage

egaki transcribe [audio]

Arguments

Argument	Required	Description
`[audio]`	No	Path to the audio file to transcribe. Omit it when reading audio with `--stdin`

Options

Option	Default	Description
`-m, --model [model]`	-	Transcription model ID. If omitted, shows an interactive picker (or uses the default model in non-TTY mode)
`-o, --output [path]`	-	Output file path for the JSON result
`--language [lang]`	-	ISO 639-1 language hint (e.g. en, es, fr). Improves accuracy when the spoken language is known in advance
`--stdin`	-	Read audio from stdin instead of a file path
`--stdout`	-	Write only the plain text transcript to stdout (no JSON metadata)

Global Options

Option	Default	Description
`-h, --help`	-	Display this message
`-v, --version`	-	Display version number

Examples

# Transcribe an audio file with default model

egaki transcribe recording.mp3

# Use a specific model

egaki transcribe interview.wav -m gpt-4o-transcribe

# Groq for fast, cheap transcription

egaki transcribe podcast.mp3 -m whisper-large-v3

# Get plain text only

egaki transcribe recording.mp3 --stdout

# Save JSON result to file

egaki transcribe recording.mp3 -o transcript.json

# Pipe audio from another command

ffmpeg -i video.mp4 -f mp3 - | egaki transcribe --stdin -m whisper-1
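
The single-file examples above can be extended to a whole directory. The following is a rough sketch only: the function name `transcribe_dir`, the `.mp3` glob, and the `whisper-1` fallback are assumptions chosen for illustration, and it uses nothing beyond the `-m` and `-o` options listed above.

```shell
# Hypothetical helper (not part of egaki): transcribe every .mp3 in a
# directory, writing one JSON result per recording (a.mp3 -> a.json).
transcribe_dir() {
  dir="$1"
  model="${2:-whisper-1}"    # assumed fallback model for this sketch

  for audio in "$dir"/*.mp3; do
    # Skip the unexpanded pattern when the directory has no .mp3 files.
    [ -e "$audio" ] || continue
    egaki transcribe "$audio" -m "$model" -o "${audio%.mp3}.json"
  done
}
```

For example, `transcribe_dir ./recordings whisper-large-v3` would process each recording in turn. Passing the model explicitly avoids the interactive picker when the loop runs in a terminal.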
