Reference Materials Requirements

Video Requirements

Quality Requirements:

Resolution: 1920x1080 (1080p) or lower

Frame Rate: 24fps minimum, 30fps recommended

Format: MP4, MOV, or other standard video formats

Duration: At least 2 seconds

File Size: Under 50MB recommended for faster upload
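The quality requirements above can be checked before upload. The following is a minimal sketch, not part of any official SDK: `VideoInfo` and `check_video` are hypothetical names, and the metadata is assumed to come from whatever probing tool you already use. It also assumes that "1920x1080 or lower" bounds the long and short sides of the frame regardless of orientation, and that 50MB means 50 × 1024 × 1024 bytes. Hard limits are reported as errors and recommendations as warnings.

```python
from dataclasses import dataclass
from pathlib import Path

# Limits taken from the quality requirements above.
MAX_LONG_SIDE, MAX_SHORT_SIDE = 1920, 1080
MIN_FPS = 24.0
RECOMMENDED_FPS = 30.0
MIN_DURATION_S = 2.0
RECOMMENDED_MAX_BYTES = 50 * 1024 * 1024  # assumption: 50MB read as 50 MiB

# Assumption: only the formats the requirements name explicitly.
NAMED_EXTENSIONS = {".mp4", ".mov"}


@dataclass
class VideoInfo:
    """Hypothetical metadata record; fill it from any probing tool."""
    filename: str
    width: int
    height: int
    fps: float
    duration_s: float
    size_bytes: int


def check_video(info: VideoInfo) -> tuple[list[str], list[str]]:
    """Return (errors, warnings) for one reference video."""
    errors: list[str] = []
    warnings: list[str] = []

    # Assumption: the resolution cap applies to either orientation.
    long_side = max(info.width, info.height)
    short_side = min(info.width, info.height)
    if long_side > MAX_LONG_SIDE or short_side > MAX_SHORT_SIDE:
        errors.append(f"resolution {info.width}x{info.height} exceeds 1920x1080")

    if info.fps < MIN_FPS:
        errors.append(f"frame rate {info.fps} fps is below the 24 fps minimum")
    elif info.fps < RECOMMENDED_FPS:
        warnings.append(f"frame rate {info.fps} fps is below the recommended 30 fps")

    if info.duration_s < MIN_DURATION_S:
        errors.append(f"duration {info.duration_s} s is shorter than 2 s")

    # Other standard formats are allowed, so an unnamed extension is only a warning.
    if Path(info.filename).suffix.lower() not in NAMED_EXTENSIONS:
        warnings.append("format is not MP4 or MOV; confirm it is a standard video format")

    if info.size_bytes > RECOMMENDED_MAX_BYTES:
        warnings.append("file is larger than the recommended 50MB")

    return errors, warnings
```

For example, `check_video(VideoInfo("clip.mp4", 1280, 720, 30.0, 5.0, 10_000_000))` returns two empty lists, while a 24 fps clip passes with a single warning about the recommended frame rate.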

Content Requirements:

The person's face is clearly visible throughout and faces the camera (front-facing or at a slight angle)

Close-up to medium shot (face should occupy a significant portion of the frame)

Stable camera work with consistent, good lighting

Face remains in focus throughout

Single person only (avoid multiple people)

Recommendations

For optimal training results:

Include natural expressions, subtle head movements, and eye contact with the camera

Person speaking or making mouth movements (ideal but not required)

Video that can naturally loop from end to beginning for smoother synthesis results

The model is still being optimized for complex facial details such as beards. Portraits with beards or other complex facial hair may appear blurry or distorted in the synthesis results, so we recommend portraits with little or no facial hair.

AI-Generated Video Support

You can use AI-generated videos of characters as training material. These videos must meet the same quality requirements as regular video footage (1920x1080 or lower, clear facial features, consistent appearance, good lighting, stable video).

AI-generated videos are useful for creating avatars for fictional characters, testing workflows, creating consistent digital personas, and prototyping.
