Video Synthesis API

The NavTalk Video Synthesis API supports 9 different methods for generating digital human videos, categorized into three main types: image-driven, video-driven, and built-in character-driven.

Endpoint

All requests are sent to the same endpoint:

POST https://app.navtalk.ai/generate

API Call Overview

The following table provides a quick comparison of all 9 supported methods:

Method	Visual Source	Audio Source	Use Case
① Video + Audio URL	Video	Audio URL	Re-dub existing videos
② Video + Audio Base64	Video	Audio Base64	Re-dub with local audio
③ Video + Text (TTS)	Video	Text	Re-dub with TTS
④ Image + Text (TTS)	Image	Text	Create talking head from photo
⑤ Image + Audio URL	Image	Audio URL	Sync image with audio
⑥ Image + Audio Base64	Image	Audio Base64	Sync image with local audio
⑦ Built-in Character + Audio URL	Built-in	Audio URL	Use preset character with audio
⑧ Built-in Character + Audio Base64	Built-in	Audio Base64	Use preset character with local audio
⑨ Built-in Character + Text (TTS)	Built-in	Text	Use preset character with TTS

Detailed Examples

① Video + Audio URL

curl -X POST "https://app.navtalk.ai/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "license": "sk-xxx",
    "video_url": "https://example.com/video.mp4",
    "audio_url": "https://example.com/audio.mp3"
  }'

② Video + Audio Base64

curl -X POST "https://app.navtalk.ai/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "license": "sk-xxx",
    "video_url": "https://example.com/video.mp4",
    "audio_base64": "base64-audio-data"
  }'

③ Video + Text (TTS)

curl -X POST "https://app.navtalk.ai/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "license": "sk-xxx",
    "video_url": "https://example.com/video.mp4",
    "content": "Welcome to NavTalk.",
    "voice": "nova"
  }'

④ Image + Text (TTS)

curl -X POST "https://app.navtalk.ai/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "license": "sk-xxx",
    "image_url": "https://example.com/photo.jpg",
    "content": "Welcome to NavTalk.",
    "voice": "echo"
  }'

⑤ Image + Audio URL

curl -X POST "https://app.navtalk.ai/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "license": "sk-xxx",
    "image_url": "https://example.com/photo.jpg",
    "audio_url": "https://example.com/audio.mp3"
  }'

⑥ Image + Audio Base64

curl -X POST "https://app.navtalk.ai/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "license": "sk-xxx",
    "image_url": "https://example.com/photo.jpg",
    "audio_base64": "base64-audio-data"
  }'

⑦ Built-in Character + Audio URL

curl -X POST "https://app.navtalk.ai/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "license": "sk-xxx",
    "character_name": "navtalk.Leo",
    "audio_url": "https://example.com/audio.mp3"
  }'

⑧ Built-in Character + Audio Base64

curl -X POST "https://app.navtalk.ai/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "license": "sk-xxx",
    "character_name": "navtalk.Leo",
    "audio_base64": "base64-audio-data"
  }'

⑨ Built-in Character + Text (TTS)

curl -X POST "https://app.navtalk.ai/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "license": "sk-xxx",
    "character_name": "navtalk.Leo",
    "content": "Welcome to NavTalk.",
    "voice": "fable"
  }'

Request Parameters

license

string

required

API authorization key obtained from the NavTalk dashboard. Example: "sk-xxx"

video_url

string

Public URL to a video file in MP4 or MOV format. Required for video-driven methods (methods ①, ②, ③). The video must be publicly accessible via HTTP/HTTPS. Example: "https://example.com/video.mp4"

image_url

string

Public URL to an image file. Required for image-driven methods (methods ④, ⑤, ⑥). The image must be publicly accessible via HTTP/HTTPS. Example: "https://example.com/photo.jpg"

character_name

string

Built-in character name. Required for built-in character methods (methods ⑦, ⑧, ⑨). Available characters include navtalk.Leo; see Available Avatars for the complete list. Example: "navtalk.Leo"

audio_url

string

Public URL to an audio file in MP3 or WAV format. Use this when you have a pre-recorded audio file. The audio must be publicly accessible via HTTP/HTTPS. Example: "https://example.com/audio.mp3"

audio_base64

string

Base64-encoded audio data. Use this when you want to send audio data directly without hosting it online. Example: "base64-audio-data"
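
A minimal sketch of producing an audio_base64 value from a local file. It assumes the API accepts standard Base64 of the raw file bytes with no "data:" URI prefix, which this page does not confirm. The helper name encode_audio_file is illustrative only.

```python
import base64
from pathlib import Path


def encode_audio_file(path: str) -> str:
    """Return the file's raw bytes as a standard Base64 ASCII string.

    Assumption: the API accepts plain Base64 with no "data:" URI prefix.
    """
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")
```

The returned string can be placed directly in the audio_base64 field of the request body.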

content

string

Text content for text-to-speech (TTS) synthesis. The API will convert this text to speech using the specified voice style. Example: "Welcome to NavTalk. This is my first digital human video!"

voice

string

Voice style for text-to-speech synthesis. Required when using the content parameter. Supported voices: alloy, ash, ballad, coral, echo, fable, onyx, nova, sage, shimmer, verse. See Voice Styles for complete descriptions and audio previews. Example: "echo"

Parameter Combinations: Choose one visual source and one audio source:

Video-driven: video_url + (audio_url OR audio_base64 OR content)
Image-driven: image_url + (audio_url OR audio_base64 OR content)
Built-in character: character_name + (audio_url OR audio_base64 OR content)
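
The combination rules above can be checked on the client before a request is sent. The following is a sketch only: validate_payload is a hypothetical helper, and this page does not describe how the server reacts to a missing or duplicated source, so the check may be stricter than the API itself.

```python
VISUAL_KEYS = ("video_url", "image_url", "character_name")
AUDIO_KEYS = ("audio_url", "audio_base64", "content")


def validate_payload(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the combination is valid."""
    problems = []
    if not payload.get("license"):
        problems.append("license is required")
    visual = [k for k in VISUAL_KEYS if payload.get(k)]
    audio = [k for k in AUDIO_KEYS if payload.get(k)]
    if len(visual) != 1:
        problems.append(f"expected exactly one visual source, got {visual}")
    if len(audio) != 1:
        problems.append(f"expected exactly one audio source, got {audio}")
    # voice is documented as required whenever content (TTS) is used.
    if "content" in audio and not payload.get("voice"):
        problems.append("voice is required when content is set")
    return problems
```

For example, a body with image_url, content, and voice passes, while a body that sets both video_url and image_url is reported as having two visual sources.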

Response Handling

All requests are processed asynchronously. The API returns a task_id that you use to query the status.

Submit Request:

curl -X POST "https://app.navtalk.ai/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "license": "sk-xxx",
    "character_name": "navtalk.Leo",
    "content": "Welcome to NavTalk.",
    "voice": "echo"
  }'

Response:

{
  "status": "started",
  "task_id": "14cb760f-05ac-4fd3-a82c-e841f2f005d0"
}

status

string

Initial status when the task is created. Always "started" in the initial response.

task_id

string

Unique identifier for the generation task. Use this to query the task status.
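
The same submission can be made from Python with only the standard library. This is a sketch under the assumption that the response body always has the shape shown above. The function names are illustrative and error handling is omitted.

```python
import json
import urllib.request

GENERATE_URL = "https://app.navtalk.ai/generate"


def build_generate_request(payload: dict) -> urllib.request.Request:
    """Build (but do not send) the JSON POST request for the generate endpoint."""
    return urllib.request.Request(
        GENERATE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(payload: dict, timeout: float = 30.0) -> str:
    """Send the request and return the task_id from the JSON response."""
    request = build_generate_request(payload)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    return body["task_id"]
```

Separating request construction from sending keeps the payload encoding easy to inspect without contacting the server.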

Query Status: Use the task_id to check processing status:

curl -X GET "https://api.navtalk.ai/query_status?license=YOUR_LICENSE&task_id=14cb760f-05ac-4fd3-a82c-e841f2f005d0"

Response:

{
  "status": "done",
  "video_url": "https://easyaistorageaccount.blob.core.windows.net/easyai/uploadFiles/2025/05/09/xxx.mp4"
}

status

string

Current status of the task. Possible values:

started: Task created and processing
processing: Video composition in progress
done: Completed successfully, video URL available
failed: Generation failed, check error message

video_url

string

Public URL to the generated video file. Only present when status is "done". The video URL is publicly accessible and can be embedded directly in web pages, mobile apps, or downloaded for offline use.

Generation typically takes 10-30 seconds. Keep videos under 30 seconds for faster processing.
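
Because processing is asynchronous, a client usually polls the status endpoint until the task reaches done or failed. The loop below is one possible sketch. The 3-second interval and 120-second timeout are arbitrary choices rather than documented limits, and the shape of a failed response is not specified on this page, so it is surfaced as-is.

```python
import json
import time
import urllib.parse
import urllib.request

QUERY_URL = "https://api.navtalk.ai/query_status"


def fetch_status(license_key: str, task_id: str) -> dict:
    """GET the query_status endpoint and return the parsed JSON body."""
    query = urllib.parse.urlencode({"license": license_key, "task_id": task_id})
    with urllib.request.urlopen(f"{QUERY_URL}?{query}", timeout=30) as resp:
        return json.load(resp)


def wait_for_video(fetch, interval: float = 3.0, timeout: float = 120.0) -> str:
    """Call fetch() until status is done or failed, or until timeout elapses.

    fetch is any zero-argument callable returning the status dict, which
    keeps the loop testable without network access.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        status = result.get("status")
        if status == "done":
            return result["video_url"]
        if status == "failed":
            # The failure payload's shape is not documented; surface it whole.
            raise RuntimeError(f"generation failed: {result}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task still {status!r} after {timeout} s")
        time.sleep(interval)
```

A typical call would be wait_for_video(lambda: fetch_status("sk-xxx", task_id)), where task_id comes from the submit response.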

Advanced Parameters

NavTalk supports optional advanced parameters for fine-tuning face cropping, mouth openness, and blending. These parameters are inherited from MuseTalk and should be used only when needed.

bbox_shift

number

default: 0

Vertical shift of the face crop box. Positive values shift the crop downward (making the mouth more open), while negative values shift upward (making the mouth less open). Range: [-9, 9]. Example: 0

extra_margin

number

default: 10

Pixels of extra margin added around the face crop. Increases buffer area to prevent clipping of chin, hair, or jaw. Range: [0, 50]. Example: 10

parsing_mode

string

default: "jaw"

Defines how facial regions, especially around the jawline, are parsed and blended. Options: "jaw" or "raw". Example: "jaw"

left_cheek_width

number

default: 90

Pixel width for blending region on the left cheek. Adjust wider to soften seam visibility. Range: [50, 150]. Example: 90

right_cheek_width

number

default: 90

Pixel width for blending region on the right cheek. Functions the same as left_cheek_width. Range: [50, 150]. Example: 90

These parameters are optional. Default values work well for most cases. Only adjust if you observe issues like face crop being too tight/loose or visible seams along the cheeks.
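
The documented ranges can be checked on the client before any advanced parameters are added to a request. This is a sketch with a hypothetical helper name, check_advanced. It also assumes the advanced parameters are sent as top-level fields of the same JSON body, which this page does not state explicitly.

```python
# Ranges and options copied from the parameter descriptions above.
NUMERIC_RANGES = {
    "bbox_shift": (-9, 9),
    "extra_margin": (0, 50),
    "left_cheek_width": (50, 150),
    "right_cheek_width": (50, 150),
}
PARSING_MODES = ("jaw", "raw")


def check_advanced(params: dict) -> list[str]:
    """Return a list of out-of-range or unknown advanced parameters."""
    problems = []
    for name, value in params.items():
        if name == "parsing_mode":
            if value not in PARSING_MODES:
                problems.append(f"parsing_mode must be one of {PARSING_MODES}")
        elif name in NUMERIC_RANGES:
            low, high = NUMERIC_RANGES[name]
            if not low <= value <= high:
                problems.append(f"{name}={value} is outside [{low}, {high}]")
        else:
            problems.append(f"unknown advanced parameter: {name}")
    return problems
```

An empty result means every supplied value falls inside its documented range.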
