Translation
shh can transcribe audio in one language and translate it to another, all in a single command.
Basic Usage
Use the --translate flag with your target language:
# Transcribe French audio and translate to English
shh --translate English
# Transcribe English audio and translate to Spanish
shh --translate Spanish
Default Translation Language
Set a default translation language to avoid typing --translate every time:
# Set default translation language
shh config set default_translation_language English
# Now all recordings auto-translate to English
shh
# Override default when needed
shh --translate French
This is useful when you frequently translate to the same language.
Supported Languages
You can translate to any language supported by OpenAI's GPT models. Common examples:
- English
- Spanish (Español)
- French (Français)
- German (Deutsch)
- Italian (Italiano)
- Portuguese (Português)
- Chinese (中文)
- Japanese (日本語)
- Korean (한국어)
- Arabic (العربية)
- Russian (Русский)
- Hindi (हिन्दी)
And many more. Language names are typically given in English, though other forms also work (see Language Names below).
How It Works
- The Whisper API transcribes the audio in its original language
- PydanticAI translates the transcription into the target language
- The result appears in your terminal and on the clipboard
Language Detection
Whisper automatically detects the source language. You don't need to specify it - just provide the target language for translation.
Combining with Styles
Translation works seamlessly with formatting styles:
# Casual translation
shh --style casual --translate French
# Business translation
shh --style business --translate English
The order of operations:
- Transcribe (Whisper)
- Translate (if --translate is specified)
- Format (if --style is not neutral)
This ensures natural phrasing in the target language.
Examples
Meeting Notes (French → English)
Record a French meeting and get English business-formatted notes:
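A command along these lines would do it, using the flags documented above:

```shell
# French speech in, English business-formatted notes out
shh --style business --translate English
```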
Voice Memo (English → Spanish)
Quick personal note translated to Spanish:
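For example:

```shell
# English speech in, Spanish text out (Whisper auto-detects the source)
shh --translate Spanish
```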
Interview Transcription (Chinese → English)
Transcribe a Chinese interview to English:
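For example:

```shell
# Chinese speech in, English transcript out
shh --translate English
```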
Language Names
You can use various forms of language names:
# These all work
shh --translate English
shh --translate french
shh --translate Español
shh --translate 中文
The AI model understands common language names in multiple forms.
Quality and Accuracy
Whisper Transcription
Whisper supports 90+ languages with high accuracy. Some languages perform better than others:
- Excellent: English, Spanish, French, German, Italian, Portuguese
- Good: Chinese, Japanese, Korean, Russian, Arabic, Hindi
- Variable: less common languages (quality depends heavily on audio clarity)
Translation Quality
Translation uses OpenAI's gpt-4o-mini, which provides:
- Contextually aware translations
- Natural phrasing in target language
- Preservation of meaning and tone
- Handling of idioms and cultural context
Best Practices
Clear Audio
Translation quality depends on transcription accuracy. Ensure:
- Minimal background noise
- Clear pronunciation
- Good microphone placement
Specify Style
Combining styles with translation produces better results:
# ✅ Good - clear intent
shh --style business --translate English
# ⚠️ Acceptable - but less polished
shh --translate English
Review Output
AI translation is good but not perfect. Review important translations:
- Technical terms may need correction
- Cultural context might be lost
- Idioms may not translate directly
Common Use Cases
Multilingual Teams
Team members speak different languages:
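For example, each member could set a shared target language once:

```shell
# Everyone's notes come out in English, regardless of spoken language
shh config set default_translation_language English
shh
```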
Learning Languages
Practice speaking and get translations:
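For example:

```shell
# Speak in English, see how it reads in French
shh --translate French
```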
Content Creation
Transcribe and translate interviews, podcasts, or videos:
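For example:

```shell
# Polished English transcript of a recorded interview
shh --style business --translate English
```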
Travel Notes
Quick voice notes while traveling, translated to your native language:
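For example:

```shell
# Dictate in whatever language is convenient; read it back in English
shh --translate English
```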
Technical Details
API Calls
Translation requires two API calls:
- Whisper API for transcription
- GPT-4o-mini API for translation
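At the HTTP level, the two calls look roughly like this. Note that shh itself goes through the OpenAI SDK and PydanticAI rather than raw curl, so this sketch is only illustrative; the endpoints and model IDs are OpenAI's public API:

```shell
# 1. Transcribe the recording (Whisper)
curl https://api.openai.com/v1/audio/transcriptions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F file=@recording.wav -F model=whisper-1

# 2. Translate the transcription (gpt-4o-mini)
curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini",
       "messages": [{"role": "user",
                     "content": "Translate to English: <transcription>"}]}'
```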
Cost: Slightly higher than transcription alone, but still cost-effective with gpt-4o-mini.
Language Detection
Whisper automatically detects the source language. If you need to know what language was detected, check the Whisper API response (not currently exposed in CLI, but available in the code).
Privacy
Both Whisper and GPT API calls are processed by OpenAI. Audio and transcriptions are sent to OpenAI servers. Review OpenAI's privacy policy if privacy is a concern.
Troubleshooting
Translation Not Working
If translation fails:
- Check your API key has GPT access
- Verify the language name is correct
- Ensure you have sufficient API credits
Poor Translation Quality
If translations are inaccurate:
- Check the source transcription for errors
- Use the --style flag for better context
- Ensure clear audio quality
- Try rephrasing sentences that translate poorly
Wrong Language Detected
If Whisper detects the wrong source language:
- Speak more clearly
- Reduce background noise
- Ensure sufficient audio duration