Voice Test Bench
Setups
Conversation
Anam
History
New setup
Name
Provider
OpenAI Realtime (speech-to-speech)
xAI Grok (realtime voice)
Sesame CSM (pipeline)
GPT + ElevenLabs (pipeline)
Google Gemini Live (realtime voice)
Anam (video avatar, full-stack)
Model
gpt-realtime-2 — reasoning S2S (default)
gpt-realtime-1.5 — best general voice
gpt-realtime (2025-08-28)
gpt-realtime-mini — cost-efficient
Voice
marin
cedar
alloy
ash
coral
sage
shimmer
verse
Reasoning effort (gpt-realtime-2 only)
minimal
low
medium
high
xhigh
Input noise reduction
none
near_field (headset / handset)
far_field (laptop / room mic)
Turn detection
server_vad (volume-based)
semantic_vad (model decides end-of-turn)
VAD threshold (server_vad; higher = less sensitive)
VAD silence duration in ms (server_vad)
VAD eagerness (semantic_vad)
auto
low
medium
high
Input transcription model
gpt-4o-transcribe
gpt-4o-mini-transcribe
whisper-1
Transcription language hint (e.g. en) — blank = auto
Re-transcribe user audio offline after Stop (more accurate)
System prompt
You are a helpful, concise voice assistant. Keep replies short and natural for speech.
Create
Saved setups
No setups yet.
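
The turn-detection fields in the form above depend on the selected mode: the threshold and silence fields apply to server_vad, and the eagerness field applies to semantic_vad. Below is a minimal sketch of how those form values might be turned into one config object. The type `TurnDetectionChoice`, the function `buildTurnDetection`, the input field names, and the assumed 0 to 1 threshold range are all hypothetical, and the output keys (`type`, `threshold`, `silence_duration_ms`, `eagerness`) are an assumption about what the provider expects, so check them against the provider's documentation.

```typescript
// Hypothetical form values for the "Turn detection" section; names are illustrative.
type TurnDetectionChoice =
  | { mode: "server_vad"; threshold: number; silenceMs: number }
  | { mode: "semantic_vad"; eagerness: "auto" | "low" | "medium" | "high" };

// Maps the form values to a turn-detection object.
// Output key names are assumed, not confirmed by the form itself.
function buildTurnDetection(
  choice: TurnDetectionChoice
): Record<string, string | number> {
  if (choice.mode === "server_vad") {
    // Assumption: the threshold is a fraction between 0 and 1.
    if (choice.threshold < 0 || choice.threshold > 1) {
      throw new RangeError("threshold is assumed to lie in [0, 1]");
    }
    if (!Number.isInteger(choice.silenceMs) || choice.silenceMs < 0) {
      throw new RangeError("silenceMs must be a non-negative integer");
    }
    return {
      type: "server_vad",
      threshold: choice.threshold,
      silence_duration_ms: choice.silenceMs,
    };
  }
  // Only the eagerness setting applies when the model decides end-of-turn.
  return { type: "semantic_vad", eagerness: choice.eagerness };
}

// Example values only; they are not defaults taken from the form.
const serverVad = buildTurnDetection({ mode: "server_vad", threshold: 0.5, silenceMs: 500 });
const semanticVad = buildTurnDetection({ mode: "semantic_vad", eagerness: "auto" });
console.log(JSON.stringify(serverVad));
console.log(JSON.stringify(semanticVad));
```

Modelling the two modes as a discriminated union means the server_vad-only fields cannot be set together with eagerness, which mirrors how the form labels each field with the mode it belongs to.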