Triggered when server-side voice activity detection (VAD) detects that the user has started speaking. Handle this event to stop playback of any in-progress AI response when the user interrupts it.

Event Properties

type (string)
Event type. Always "realtime.input_audio_buffer.speech_started" for this event.

data (object)
Event data object. May be empty or contain additional metadata.

{
  "type": "realtime.input_audio_buffer.speech_started",
  "data": {}
}
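
A client usually has to pick this event out of a stream of other messages. The sketch below shows one way to do that, assuming messages arrive as JSON text frames. `parseSpeechStarted` and `SPEECH_STARTED` are hypothetical names used for illustration, not part of any NavTalk SDK, and treating a missing `data` field as an empty object is a defensive assumption rather than documented behavior.

```javascript
// Hypothetical helper (not a NavTalk API): returns a normalized event object
// when the raw message is a speech-started event, otherwise null.
const SPEECH_STARTED = "realtime.input_audio_buffer.speech_started";

function parseSpeechStarted(rawMessage) {
    let event;
    try {
        event = JSON.parse(rawMessage);
    } catch (err) {
        return null; // not valid JSON
    }
    if (event === null || typeof event !== "object" || event.type !== SPEECH_STARTED) {
        return null; // some other message
    }
    // "data" may be empty; a missing or non-object value is normalized to {} (assumption).
    const data = event.data !== null && typeof event.data === "object" ? event.data : {};
    return { type: event.type, data };
}
```

Returning null for anything else lets the caller fall through to its other handlers without throwing on malformed input.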

Usage Example

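const NavTalkMessageType = Object.freeze({
    REALTIME_SPEECH_STARTED: "realtime.input_audio_buffer.speech_started",
    // ... other event types
});

// Playback state used by the handler below. In a real client these would be
// shared with your audio/video playback code.
let audioQueue = [];
let isPlaying = false;
let playVideo = false;

async function handleReceivedMessage(data) {
    switch (data.type) {
        case NavTalkMessageType.REALTIME_SPEECH_STARTED:
            console.log("Speech started detected by server.");
            // stopCurrentAudioPlayback() is your own function that halts the audio currently playing.
            stopCurrentAudioPlayback();
            // Discard queued audio so the interrupted response does not resume.
            audioQueue = [];
            isPlaying = false;
            playVideo = false;
            break;
    }
}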
When the user starts speaking while the AI is responding, this event signals the interruption. Stop audio playback and clear the audio queue so that the interrupted response does not keep playing over the user.
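
The stop-and-clear step described above can be sketched as a small controller. This is a minimal illustration under stated assumptions, not NavTalk's implementation: `PlaybackController`, `enqueue`, and `interrupt` are hypothetical names, and `currentSource` is assumed to be any object with a `stop()` method, such as a Web Audio `AudioBufferSourceNode`.

```javascript
// Hypothetical playback state holder; all names are illustrative.
class PlaybackController {
    constructor() {
        this.audioQueue = [];      // audio chunks waiting to be played
        this.currentSource = null; // object with stop(), e.g. an AudioBufferSourceNode
        this.isPlaying = false;
    }

    enqueue(chunk) {
        this.audioQueue.push(chunk);
    }

    // Call this when the speech_started event arrives.
    // Returns the number of queued chunks that were discarded.
    interrupt() {
        if (this.currentSource) {
            try {
                this.currentSource.stop();
            } catch (err) {
                // A Web Audio source throws if stop() is called before start();
                // there is nothing to halt in that case, so the error is ignored.
            }
            this.currentSource = null;
        }
        const dropped = this.audioQueue.length;
        this.audioQueue = [];
        this.isPlaying = false;
        return dropped;
    }
}
```

Keeping the queue, the active source, and the playing flag in one object means a single `interrupt()` call resets all three together, which avoids a stale chunk starting to play after the user has begun speaking.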