Supported Voice Styles

Choose from our collection of voice styles to personalize your digital human’s voice in the Real-time Digital Human API. Each voice style has unique characteristics suited for different use cases and scenarios. Click the play button to preview each voice style.

Accent and Dialect Support

You can control accents and dialects by specifying them in your instructions (system prompt). The model matches the accent or dialect you specify while using the selected voice style. For example, to fix the dialect for a consistent persona:

instructions: `## Language

Respond only in Argentine Spanish.`

For more details on language and dialect control, see the Building Effective Prompts guide.

How to Use

After establishing the WebSocket connection, send the session configuration as a realtime.input_config message from the onopen event handler:

socket.onopen = () => {
  const config = {
    voice: 'echo' // OpenAI voice styles: alloy, ash, ballad, cedar, coral, echo, marin, sage, shimmer, verse
  };
  
  // Send configuration
  socket.send(JSON.stringify({
    type: 'realtime.input_config',
    data: { content: JSON.stringify(config) }
  }));
};
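As a minimal sketch, the voice style and the dialect instructions could be combined into a single configuration message. The helper name buildInputConfigMessage is hypothetical, and placing instructions next to voice in the same config object is an assumption based on the example above, not confirmed API behavior. The voice list is copied from the comment in the configuration example.

```javascript
// Voice names as listed in the comment of the configuration example above.
const SUPPORTED_VOICES = [
  'alloy', 'ash', 'ballad', 'cedar', 'coral',
  'echo', 'marin', 'sage', 'shimmer', 'verse'
];

// Hypothetical helper: returns the string to pass to socket.send().
function buildInputConfigMessage(voice, instructions) {
  // Fail early on a misspelled voice name instead of sending it to the server.
  if (!SUPPORTED_VOICES.includes(voice)) {
    throw new Error(`Unsupported voice style: ${voice}`);
  }
  const config = { voice };
  // Assumption: instructions is a sibling field of voice in the config object.
  if (instructions) {
    config.instructions = instructions;
  }
  // Same envelope as the example above: the config is serialized into data.content.
  return JSON.stringify({
    type: 'realtime.input_config',
    data: { content: JSON.stringify(config) }
  });
}

const message = buildInputConfigMessage(
  'echo',
  '## Language\n\nRespond only in Argentine Spanish.'
);
// Inside socket.onopen: socket.send(message);
```

Validating the voice name on the client keeps typos from reaching the server, but the list has to be updated by hand if the supported styles change.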

Note

This page lists voice styles supported by the Real-time Digital Human API. For voice styles supported by the Video Synthesis API, see the Video Synthesis Voice Styles page.
