Must-Read Guide
Notes
Uploading Your Own Avatar
For the best results, follow these recommendations:
- Include a clear portrait
- Both video and images are supported; video is preferred
Uploading Your Own Voice
Record or upload 60 seconds of audio, and your voice style will become available quickly. For better quality, follow these suggestions:
- Clear vocals
- No distracting background noise
- At least 20 seconds in duration
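The duration suggestion above can be checked locally before uploading. The following is a minimal sketch, assuming the recording is an uncompressed WAV file; the helper names `wav_duration_seconds` and `is_long_enough` are hypothetical and not part of the service. Other formats such as MP3 would need a third-party decoder.

```python
import wave

MIN_SECONDS = 20.0  # minimum duration suggested in the list above


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file in seconds (frames / sample rate)."""
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
    return frames / float(rate)


def is_long_enough(path: str, min_seconds: float = MIN_SECONDS) -> bool:
    """Return True if the recording meets the suggested minimum duration."""
    return wav_duration_seconds(path) >= min_seconds
```

This only checks length; clarity and background noise still have to be judged by listening.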
Creation Methods
There are three ways to create a digital human:
- Use recorded audio as the spoken content for the digital human. (The audio file should contain your voice and the spoken text.) Example:
json
{
  "audio_url": "https://cdn.gptbest.vip/file/cdn/20250107/RFDQkBGVrHFydzGzzvgXGQmbuPhicK.mp3",
  "avatar_url": "https://cdn.gptbest.vip/file/cdn/20250107/W213dUwMKyxZzXVvW0wUETJZ8KiUDB.mp4"
}
- Create a voice style and provide the read-aloud text and its language. (The audio file only needs to capture your timbre; it does not have to contain the exact text.) Example:
json
{
  "create_voice": {
    "audio_url": "https://example.com/voice.mp3",
    "language": "cn",
    "text": "Hello, I am a digital human!"
  },
  "avatar_url": "https://example.com/avatar.jpg"
}
- Use a preset voice style and provide read-aloud text and language. Query the default voice list to obtain voice_engine_id and audio_id. Example:
json
{
  "use_voice": {
    "voice_engine_id": "nina",
    "audio_id": "b92d7e1c40d0447eb563a13ec92d9042",
    "language": "en",
    "text": "Hello, world!"
  },
  "avatar_url": "https://example.com/avatar.mp4"
}
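The request bodies for the creation methods listed above differ only in how the voice is specified. The sketch below builds each body as a Python dictionary and serializes it to JSON. The function names are hypothetical helpers introduced for illustration; the field names are taken from the examples above. The request endpoint and authentication are not described in this section, so sending the request is left out.

```python
import json
from typing import Any, Dict


def payload_from_audio(audio_url: str, avatar_url: str) -> Dict[str, Any]:
    """Body for the first method: recorded audio is used as the spoken content."""
    return {"audio_url": audio_url, "avatar_url": avatar_url}


def payload_with_new_voice(
    voice_audio_url: str, language: str, text: str, avatar_url: str
) -> Dict[str, Any]:
    """Body for the second method: create a voice style from a timbre sample."""
    return {
        "create_voice": {
            "audio_url": voice_audio_url,
            "language": language,
            "text": text,
        },
        "avatar_url": avatar_url,
    }


def payload_with_preset_voice(
    voice_engine_id: str, audio_id: str, language: str, text: str, avatar_url: str
) -> Dict[str, Any]:
    """Body for the third method: use a preset voice from the default voice list."""
    return {
        "use_voice": {
            "voice_engine_id": voice_engine_id,
            "audio_id": audio_id,
            "language": language,
            "text": text,
        },
        "avatar_url": avatar_url,
    }


if __name__ == "__main__":
    body = payload_with_preset_voice(
        "nina",
        "b92d7e1c40d0447eb563a13ec92d9042",
        "en",
        "Hello, world!",
        "https://example.com/avatar.mp4",
    )
    # Serialize to the JSON text that would be sent as the request body.
    print(json.dumps(body, indent=2))
```

Keeping one builder per method makes it harder to mix fields from different methods, for example sending both `audio_url` at the top level and a `use_voice` object in the same body.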