Must-Read Guide

Notes

Uploading Your Own Avatar

For the best results, follow these recommendations:

  • Include a clear portrait
  • Both video and image files are supported; video is preferred

Uploading Your Own Voice

Record or upload 60 seconds of audio, and your voice style will become available quickly. For better quality, follow these suggestions:

  • Clear vocals
  • No distracting background noise
  • At least 20 seconds in duration
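
The duration suggestion can be checked locally before uploading. The sketch below is only an illustration: it assumes the recording is a WAV file (Python's standard library cannot read MP3), and the helper names `wav_duration_seconds` and `is_long_enough` are hypothetical, not part of the service. The 20-second threshold comes from the list above.

```python
import wave

# Threshold taken from the "at least 20 seconds" suggestion above.
MIN_SECONDS = 20.0


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file in seconds (frames / sample rate)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_long_enough(path: str, min_seconds: float = MIN_SECONDS) -> bool:
    """Hypothetical pre-upload check: True if the recording meets the minimum length."""
    return wav_duration_seconds(path) >= min_seconds
```

This only measures length; clarity and background noise still have to be judged by listening.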

Creation Methods

There are three ways to create a digital human:

  1. Use recorded audio as the spoken content for the digital human. (The audio file must contain the exact speech to be delivered, in your voice.) Example:
json
{
    "audio_url": "https://cdn.gptbest.vip/file/cdn/20250107/RFDQkBGVrHFydzGzzvgXGQmbuPhicK.mp3",
    "avatar_url": "https://cdn.gptbest.vip/file/cdn/20250107/W213dUwMKyxZzXVvW0wUETJZ8KiUDB.mp4"
}
  2. Create a voice style and provide the read-aloud text and its language. (The audio file only needs to capture your timbre; it does not have to contain the text being read.) Example:
json
{
    "create_voice": {
        "audio_url": "https://example.com/voice.mp3",
        "language": "cn",
        "text": "Hello, I am a digital human!"
    },
    "avatar_url": "https://example.com/avatar.jpg"
}
  3. Use a preset voice style and provide the read-aloud text and its language. Query the default voice list first to obtain the voice_engine_id and audio_id values. Example:
json
{
    "use_voice": {
        "voice_engine_id": "nina",
        "audio_id": "b92d7e1c40d0447eb563a13ec92d9042",
        "language": "en",
        "text": "Hello, world!"
    },
    "avatar_url": "https://example.com/avatar.mp4"
}
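
The three request bodies above can also be assembled in code. The following is a minimal sketch, not an official client: the function names are hypothetical, the field names are copied from the JSON examples, and no endpoint or HTTP call is shown because this guide does not name one.

```python
import json


def recorded_audio_payload(audio_url: str, avatar_url: str) -> dict:
    """Method 1: the audio file already contains the spoken content."""
    return {"audio_url": audio_url, "avatar_url": avatar_url}


def cloned_voice_payload(voice_sample_url: str, language: str, text: str,
                         avatar_url: str) -> dict:
    """Method 2: create a voice style from a sample, then read the given text."""
    return {
        "create_voice": {
            "audio_url": voice_sample_url,
            "language": language,
            "text": text,
        },
        "avatar_url": avatar_url,
    }


def preset_voice_payload(voice_engine_id: str, audio_id: str, language: str,
                         text: str, avatar_url: str) -> dict:
    """Method 3: use a preset voice identified by voice_engine_id and audio_id."""
    return {
        "use_voice": {
            "voice_engine_id": voice_engine_id,
            "audio_id": audio_id,
            "language": language,
            "text": text,
        },
        "avatar_url": avatar_url,
    }


if __name__ == "__main__":
    # Values taken from the preset-voice example above.
    body = preset_voice_payload(
        "nina",
        "b92d7e1c40d0447eb563a13ec92d9042",
        "en",
        "Hello, world!",
        "https://example.com/avatar.mp4",
    )
    print(json.dumps(body, indent=4))
```

Keeping one builder per method makes it harder to mix fields from different methods in a single request by accident.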