FAQ

Account and Authentication

How do I obtain an API Key?

Visit the console at console.navtalk.ai. After registering and logging in, you can generate your API key (called the License elsewhere in this FAQ) on the “API Key Management” page.

Does the License have an expiration date? Can it be reset?

The License is valid indefinitely. If you believe it has been compromised, you can reset it immediately in the console.

Quick Start Questions

How can I quickly call the one-time synthesis interface?

Please refer to the “API Call Overview” section in the documentation. You only need to provide video_url and audio_url to start generating the video. An example response:

{
  "status": "started",
  "task_id": "xxxxx"
}
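
As a rough sketch of the request and response handling described above: the helper names buildSynthesisRequest and parseStartResponse are hypothetical, the example.com URLs are placeholders, and the endpoint URL and HTTP method are deliberately left out because this FAQ does not state them (see the “API Call Overview” section for those).

```javascript
// Hypothetical helper: builds the JSON body for a one-time synthesis request.
// Only the video_url and audio_url field names come from the FAQ.
function buildSynthesisRequest(videoUrl, audioUrl) {
  if (!videoUrl || !audioUrl) {
    throw new Error('Both video_url and audio_url are required');
  }
  return JSON.stringify({ video_url: videoUrl, audio_url: audioUrl });
}

// Hypothetical helper: extracts task_id from a response shaped like the example above.
function parseStartResponse(responseText) {
  const parsed = JSON.parse(responseText);
  if (parsed.status !== 'started' || !parsed.task_id) {
    throw new Error(`Unexpected response: ${responseText}`);
  }
  return parsed.task_id;
}

const body = buildSynthesisRequest(
  'https://example.com/avatar.mp4',
  'https://example.com/speech.wav'
);
const taskId = parseStartResponse('{"status":"started","task_id":"xxxxx"}');
console.log(body, taskId);
```

Keep the returned task_id: it is what you would later use to look up the result of the task.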

What are the minimum steps required for the real-time digital person to connect for the first time?

Establish a WebSocket connection to wss://transfer.navtalk.ai/wss/v2/realtime-chat (include the license and name parameters in the URL).
Wait for the conversation.connected.success event, which contains the session ID and ICE servers.
Optionally send conversation history via conversation.item.create messages.
Capture microphone audio and send it via realtime.input_audio_buffer.append.
Receive AI response text/audio stream/video stream (WebRTC through the same connection).
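
The first two steps above can be sketched roughly as follows. The endpoint and the license and name query parameters are taken from step 1; buildRealtimeUrl and dispatchMessage are hypothetical helpers, and the assumption that incoming messages are JSON text with a top-level type field is not confirmed by this FAQ.

```javascript
// Builds the realtime WebSocket URL with the license and name query parameters.
function buildRealtimeUrl(license, name) {
  const url = new URL('wss://transfer.navtalk.ai/wss/v2/realtime-chat');
  url.searchParams.set('license', license);
  url.searchParams.set('name', name);
  return url.toString();
}

// Routes one incoming message to a handler keyed by its event type.
// Assumption: each message is JSON text with a top-level "type" field.
function dispatchMessage(rawMessage, handlers) {
  const message = JSON.parse(rawMessage);
  const handler = handlers[message.type];
  if (typeof handler === 'function') handler(message);
  return message.type;
}

const realtimeUrl = buildRealtimeUrl('YOUR_LICENSE', 'YOUR_NAME');
console.log(realtimeUrl);

// In a browser, the pieces would be wired together roughly like this:
//   const socket = new WebSocket(realtimeUrl);
//   socket.onmessage = (event) => dispatchMessage(event.data, {
//     'conversation.connected.success': (msg) => { /* begin WebRTC setup */ },
//   });
```

Using URL and searchParams rather than string concatenation means that license or name values containing special characters are percent-encoded automatically.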

You can download our complete code and run it directly.

Real-time WebSocket Connection Issues

What should I do if the WebSocket connection fails?

Please check:

Is the license valid?
Is the WebSocket address correct? The connection steps in this FAQ use wss://transfer.navtalk.ai/wss/v2/realtime-chat.
Does Chrome allow microphone access?
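
Part of this checklist can be automated before the socket is opened. The following is a minimal sketch: checkRealtimeUrl is a hypothetical helper, and it only verifies that the address uses wss:// and carries a non-empty license parameter. It cannot tell whether the license itself is valid.

```javascript
// Hypothetical pre-flight check run before opening the socket.
// Returns a list of problems; an empty list means nothing obvious is wrong.
function checkRealtimeUrl(candidateUrl) {
  let url;
  try {
    url = new URL(candidateUrl);
  } catch {
    return ['The address is not a valid URL'];
  }
  const problems = [];
  if (url.protocol !== 'wss:') {
    problems.push(`Expected a wss:// address but got ${url.protocol}//`);
  }
  if (!url.searchParams.get('license')) {
    problems.push('The license parameter is missing or empty');
  }
  return problems;
}

console.log(checkRealtimeUrl('ws://example.com/realtime'));
```

Microphone permission has to be checked in the browser itself, for example by calling navigator.mediaDevices.getUserMedia({ audio: true }) and handling a rejected promise.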

Do I need to configure WebRTC to get the digital person's video on the webpage?

Yes. WebRTC is the only way to display the video. After the WebSocket connects, also establish a WebRTC video channel and bind its stream to a video tag for playback.
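
The binding step might look roughly like the sketch below. attachRemoteVideo is a hypothetical helper. It relies only on the standard ontrack handler of RTCPeerConnection and the srcObject property of a video element. The signaling exchange (offer, answer, ICE candidates) is not shown because this FAQ does not describe its message format.

```javascript
// Binds remote media from a peer connection to a video element.
// peerConnection is expected to be an RTCPeerConnection and videoElement an
// HTMLVideoElement; plain objects with the same properties also work for testing.
function attachRemoteVideo(peerConnection, videoElement) {
  peerConnection.ontrack = (event) => {
    const [stream] = event.streams;
    // Assign only when a stream is present and not already bound.
    if (stream && videoElement.srcObject !== stream) {
      videoElement.srcObject = stream;
    }
  };
}

// Browser usage (iceServers would come from the connection event):
//   const pc = new RTCPeerConnection({ iceServers });
//   attachRemoteVideo(pc, document.querySelector('video'));
```

In most browsers the video tag also needs the autoplay attribute (and playsinline on mobile) for playback to start without a user gesture.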

Character and Behavior Settings

How do I specify the character settings and greeting of the digital person?

Please set this in the prompt field of the realtime.input_config message, for example:

socket.onopen = () => {
  const config = {
    voice: 'cedar',
    prompt: `You are a gentle psychological counselor.
Please respond in zh-CN.
Please greet with "Hello, I am your intelligent assistant."`,
    tools: []  // Optional: Function calling tools (OpenAI models only)
  };
  
  socket.send(JSON.stringify({
    type: 'realtime.input_config',
    data: { content: JSON.stringify(config) }
  }));
};

Can I specify the tone of the digital person? What are the options?

Yes. Set it with the voice field, for example voice: "nova". The following 9 voices are listed: nova, alloy, shimmer, coral, echo, ballad, ash, sage, verse. See Voice Styles for complete descriptions and audio previews.

Context and Memory Issues

How can I make the digital person remember the user's history of conversations?

Two methods are supported:

Embed conversation context in the prompt field of realtime.input_config to simulate full context.
Use conversation.item.create to send historical messages (only supports user messages) after receiving the realtime.session.created event.
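
The first method can be sketched as follows. buildPromptWithHistory and buildInputConfigMessage are hypothetical helpers, and the { role, text } shape of a history entry is an assumption made for illustration. The message envelope mirrors the realtime.input_config example earlier in this FAQ.

```javascript
// Folds earlier turns into the prompt text so the model sees them as context.
// Assumption: each history entry looks like { role: 'user', text: '...' }.
function buildPromptWithHistory(basePrompt, history) {
  if (history.length === 0) return basePrompt;
  const transcript = history
    .map((turn) => `${turn.role}: ${turn.text}`)
    .join('\n');
  return `${basePrompt}\n\nPrevious conversation:\n${transcript}`;
}

// Wraps voice and prompt in a realtime.input_config message.
function buildInputConfigMessage(voice, prompt) {
  return JSON.stringify({
    type: 'realtime.input_config',
    data: { content: JSON.stringify({ voice, prompt }) },
  });
}

const promptWithHistory = buildPromptWithHistory('You are a guide.', [
  { role: 'user', text: 'Hi' },
  { role: 'assistant', text: 'Hello' },
]);
console.log(buildInputConfigMessage('cedar', promptWithHistory));
```

Because the whole history travels inside the prompt, long conversations may need to be trimmed or summarized before they are embedded.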

Why can't the AI remember the previous conversation?

Please confirm:

Does your realtime.input_config message include contextual content in the prompt field?
Did you send conversation history using conversation.item.create after receiving realtime.session.created?

Function Call Issues

Why is there no response after configuring the function call?

Confirm that the tools parameter has been correctly registered.
Check that you are listening for the response.function_call_arguments.done event.
Check that your backend correctly returns function_call_output.

After the function call result is pushed, why is there no response from the AI?

After sending the result, make sure you also send a response.create message. The message must be serialized to a string before sending:

socket.send(JSON.stringify({ type: "response.create" }));
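
Putting the function call steps together, a handler might look roughly like this sketch. The event names response.function_call_arguments.done, function_call_output, and response.create come from this FAQ. The field names (name, call_id, arguments) and the conversation.item.create wrapper are borrowed from the OpenAI Realtime event format and are assumptions here, as are the handleFunctionCall helper and the get_weather tool.

```javascript
// Hypothetical local tool implementations, keyed by tool name.
const toolImplementations = {
  get_weather: ({ city }) => ({ city, forecast: 'sunny' }),
};

// Handles one parsed server event; returns true if a tool was executed.
// Assumption: the done event carries name, call_id, and a JSON string in arguments.
function handleFunctionCall(socket, event, tools) {
  if (event.type !== 'response.function_call_arguments.done') return false;
  const tool = tools[event.name];
  if (typeof tool !== 'function') return false;
  const result = tool(JSON.parse(event.arguments));
  // Assumed envelope for returning the tool result.
  socket.send(JSON.stringify({
    type: 'conversation.item.create',
    item: {
      type: 'function_call_output',
      call_id: event.call_id,
      output: JSON.stringify(result),
    },
  }));
  // Ask the model to continue now that the result is available.
  socket.send(JSON.stringify({ type: 'response.create' }));
  return true;
}

// Browser usage:
//   socket.onmessage = (e) =>
//     handleFunctionCall(socket, JSON.parse(e.data), toolImplementations);
```

Sending both messages from one place makes it harder to forget the response.create step, which is the usual reason the AI stays silent after a tool result.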

Media Interface Call Issues

How long will it take to receive results after synthesizing video and audio?

Synthesis generally completes within 5 to 30 seconds. Poll the query_status interface at a regular interval until you receive:

{
  "status": "done",
  "video_url": "xxx"
}
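
A polling loop along these lines is one way to wait for the result. This is a sketch under assumptions: pollUntilDone is a hypothetical helper, the 2-second interval and 30-attempt limit are arbitrary defaults, and fetchStatus is a function you supply, because this FAQ does not give the query_status endpoint or its parameters.

```javascript
// Polls until the task reports "done", then resolves to its video_url.
// fetchStatus is a caller-supplied async function that calls query_status
// and resolves to the parsed JSON response.
async function pollUntilDone(fetchStatus, { intervalMs = 2000, maxAttempts = 30 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    const result = await fetchStatus();
    if (result.status === 'done') return result.video_url;
    // Wait before the next attempt, but not after the last one.
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Task not done after ${maxAttempts} attempts`);
}

// Usage, with a hypothetical queryStatus(taskId) that wraps the HTTP call:
//   const videoUrl = await pollUntilDone(() => queryStatus(taskId));
```

The attempt limit keeps the loop from running forever if a task never reaches the done status.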

Can I upload files directly?

We recommend uploading audio and video files to publicly accessible cloud storage and passing their URLs in the call. If you need to use the platform’s upload feature, please log in to the console to get the upload link.
