realtime.response.audio.delta

{
  "type": "realtime.response.audio.delta",
  "data": {
    "delta": "base64-encoded-audio-chunk"
  }
}

Triggered for each audio chunk as the AI generates speech. This event is optional: handle it only if you receive audio over WebSocket instead of WebRTC. For most applications, WebRTC is the recommended method for audio/video streaming.

Event Properties

type

string

Event type. Always "realtime.response.audio.delta" for this event.

data

object

Event data object containing audio information.

data.delta

string

Base64-encoded audio chunk. Example: "base64-encoded-audio-chunk"
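Because data.delta arrives as a base64 string, it usually has to be turned into raw bytes before anything else can be done with it. The following is a minimal sketch of that step only. The helper name decodeAudioDelta is hypothetical, and the sketch makes no assumption about the audio format (sample rate, encoding) of the decoded bytes, which this page does not specify.

```javascript
// Hypothetical helper (not part of the NavTalk API): decode a base64
// string into raw bytes. atob is a global in browsers and in Node.js 16+.
function decodeAudioDelta(delta) {
    const binary = atob(delta); // each character holds one byte (0-255)
    const bytes = new Uint8Array(binary.length);
    for (let i = 0; i < binary.length; i++) {
        bytes[i] = binary.charCodeAt(i);
    }
    return bytes;
}
```

How the resulting bytes should be interpreted depends on the audio format configured for the session.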

Usage Example

const NavTalkMessageType = Object.freeze({
    REALTIME_RESPONSE_AUDIO_DELTA: "realtime.response.audio.delta",
    // ... other event types
});

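// Dispatches an incoming message by its event type.
async function handleReceivedMessage(data) {
    const nav_data = data.data;

    switch (data.type) {
        case NavTalkMessageType.REALTIME_RESPONSE_AUDIO_DELTA:
            // Guard against a missing data object or an empty delta
            if (nav_data?.delta) {
                // processAudioChunk is not defined in this snippet; supply your own
                processAudioChunk(nav_data.delta);
            }
            break;
    }
}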
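The handler above calls processAudioChunk without defining it. One possible shape for it, sketched here under assumptions, is to decode each chunk and queue it until the application is ready to consume the audio. The names pendingChunks and drainAudio are hypothetical and not part of the NavTalk API, and the sketch treats the audio as opaque bytes because this page does not state the format.

```javascript
// Hypothetical queue of decoded chunks (assumption, not a NavTalk API).
const pendingChunks = [];

// One possible processAudioChunk: decode the base64 delta and queue it.
function processAudioChunk(delta) {
    const bytes = Uint8Array.from(atob(delta), (ch) => ch.charCodeAt(0));
    pendingChunks.push(bytes);
}

// Join all queued chunks into one contiguous byte array, then clear the queue.
function drainAudio() {
    const total = pendingChunks.reduce((sum, chunk) => sum + chunk.length, 0);
    const merged = new Uint8Array(total);
    let offset = 0;
    for (const chunk of pendingChunks) {
        merged.set(chunk, offset);
        offset += chunk.length;
    }
    pendingChunks.length = 0;
    return merged;
}
```

Queuing keeps the message handler fast. The application can call drainAudio whenever it is ready to hand the accumulated bytes to a player or decoder.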

For most applications, WebRTC is recommended for audio/video streaming.
