Core Features
Multiple Input Methods
Use an image, a video, or a system preset character as the visual source, with 9 calling combinations in total
Flexible Audio Options
Use text-to-speech (TTS) with multiple voice styles, an audio URL, or Base64-encoded audio to drive lip synchronization
Asynchronous Processing
Task-based processing with status polling, so video generation scales without holding a request open
High-Quality Output
Professional-grade video synthesis with accurate lip sync and natural expressions
System Characters
Use built-in digital human characters or create custom avatars for your brand
Easy Integration
Simple RESTful API with standard JSON requests and responses
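
The asynchronous, task-based flow described above could be consumed with a polling loop along the following lines. This is only a sketch under stated assumptions: the status values ("pending", "processing", "succeeded", "failed"), the "status" and "video_url" fields, and the function names are hypothetical and not taken from the API reference. The HTTP call is injected as a function so the loop can be read and exercised without knowing the real endpoint.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical terminal status values; the real API may use different names.
TERMINAL_STATES = {"succeeded", "failed"}


def wait_for_task(
    fetch_status: Callable[[str], Dict[str, Any]],
    task_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll fetch_status(task_id) until a terminal status or the timeout.

    fetch_status stands in for whatever request returns the task's JSON
    status document; its shape here is an assumption.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch_status(task_id)
        if result.get("status") in TERMINAL_STATES:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout_s}s")
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated responses in place of real HTTP calls.
    responses = iter([
        {"status": "pending"},
        {"status": "processing"},
        {"status": "succeeded", "video_url": "https://example.com/out.mp4"},
    ])
    final = wait_for_task(lambda _id: next(responses), "task-123", interval_s=0.0)
    print(final["status"])
```

In real use, fetch_status would wrap an HTTP GET against the task-status endpoint and return the parsed JSON body. A fixed interval keeps the sketch short; a production client might prefer a backoff between polls.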