Audio Generating
This API is used to generate speech audio from input text, with controllable speech speed, volume, and emotion.
POST https://api.vidu.com/ent/v2/audio-tts

Request headers

| Field | Value | Description |
|---|---|---|
| Content-Type | application/json | Data exchange format |
| Authorization | Token {your api key} | Replace {your api key} with your API key |
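
As an illustration of the endpoint and headers above, the following Python sketch builds (but does not send) the POST request using only the standard library. The function name `build_request` and the constant `API_URL` are hypothetical helpers introduced here, not part of the Vidu API, and the body is treated as an arbitrary dictionary.

```python
import json
import urllib.request

# Endpoint as documented above.
API_URL = "https://api.vidu.com/ent/v2/audio-tts"


def build_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build the POST request for the endpoint (hypothetical helper)."""
    headers = {
        "Content-Type": "application/json",
        # The documented scheme is the word "Token" followed by the API key.
        "Authorization": f"Token {api_key}",
    }
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(API_URL, data=data, headers=headers, method="POST")
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out so the sketch can be run without a key or network access.
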

Request body

| Field | Type | Required | Description |
|---|---|---|---|
| text | String | Required | The text to synthesize into speech. 1. Length must be less than 10,000 characters. 2. Paragraph breaks are marked by newline characters. 3. Pause control: custom pauses can be inserted between text segments. Usage: insert `<#x#>` in the text, where x is the pause duration in seconds, range [0.01, 99.99], with up to two decimal places. A pause marker must sit between two pronounceable text segments, and markers cannot be used consecutively. Example: `Hello<#2#>I am vidu<#2#>Nice to meet you` |
| voice_setting_voice_id | String | Required | Voice ID used for synthesis. See the Voice List for all available voices. |
| voice_setting_speed | Float | Optional | Speech speed. Range [0.5, 2], default 1.0 (normal speed); 0.5 is the slowest and 2 is the fastest. |
| voice_setting_volume | Int | Optional | Volume level. Range [0, 10], default 0 (normal volume). Higher values increase the volume. |
| voice_setting_pitch | Int | Optional | Pitch of the synthesized audio. Range [-12, 12], default 0 (original pitch). |
| voice_setting_emotion | String | Optional | Emotion of the synthesized voice. 1. Possible values: ["happy", "sad", "angry", "fearful", "disgusted", "surprised", "calm"], corresponding to 7 emotions: happy, sad, angry, fearful, disgusted, surprised, and neutral ("calm" selects the neutral emotion). 2. The model automatically matches an appropriate emotion to the text, so manual selection is usually unnecessary. |
| payload | String | Optional | Pass-through parameter. It is not processed and is used only to carry data through to the response. Max length: 1,048,576 characters. |
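
The constraints in the table above can be checked on the client before a request is sent. The sketch below is one possible way to do that in Python; the helper names `pause`, `join_with_pauses`, and `build_body` are hypothetical, the voice ID in the usage note is a placeholder rather than a real entry from the Voice List, and the server may apply further validation not reflected here.

```python
from typing import Optional, Sequence

# Emotion values listed in the request body table.
EMOTIONS = {"happy", "sad", "angry", "fearful", "disgusted", "surprised", "calm"}


def pause(seconds: float) -> str:
    """Return a pause marker such as <#2#> for a duration in seconds."""
    seconds = round(seconds, 2)  # markers allow at most two decimal places
    if not 0.01 <= seconds <= 99.99:
        raise ValueError("pause duration must be within [0.01, 99.99] seconds")
    # Drop trailing zeros so 2.0 becomes "2" and 0.5 stays "0.5".
    return "<#" + f"{seconds:.2f}".rstrip("0").rstrip(".") + "#>"


def join_with_pauses(segments: Sequence[str], seconds: float) -> str:
    """Join segments with exactly one pause marker between each pair."""
    if any(not s.strip() for s in segments):
        raise ValueError("every segment must contain pronounceable text")
    return pause(seconds).join(segments)


def build_body(
    text: str,
    voice_id: str,
    speed: Optional[float] = None,
    volume: Optional[int] = None,
    pitch: Optional[int] = None,
    emotion: Optional[str] = None,
    payload: Optional[str] = None,
) -> dict:
    """Assemble the JSON body, including optional fields only when given."""
    if len(text) >= 10_000:
        raise ValueError("text must be shorter than 10,000 characters")
    body = {"text": text, "voice_setting_voice_id": voice_id}
    if speed is not None:
        if not 0.5 <= speed <= 2:
            raise ValueError("speed must be within [0.5, 2]")
        body["voice_setting_speed"] = speed
    if volume is not None:
        if not 0 <= volume <= 10:
            raise ValueError("volume must be within [0, 10]")
        body["voice_setting_volume"] = volume
    if pitch is not None:
        if not -12 <= pitch <= 12:
            raise ValueError("pitch must be within [-12, 12]")
        body["voice_setting_pitch"] = pitch
    if emotion is not None:
        if emotion not in EMOTIONS:
            raise ValueError(f"unknown emotion: {emotion!r}")
        body["voice_setting_emotion"] = emotion
    if payload is not None:
        if len(payload) > 1_048_576:
            raise ValueError("payload must not exceed 1,048,576 characters")
        body["payload"] = payload
    return body
```

For instance, `join_with_pauses(["Hello", "I am vidu", "Nice to meet you"], 2)` reproduces the example string from the table, and `build_body(text, "example_voice_id", speed=1.5)` returns a dictionary containing only the three fields that were supplied. Omitting the optional fields leaves the documented defaults to the server.
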

Response body

| Field | Type | Description |
|---|---|---|
| task_id | String | Task ID generated by Vidu |
| state | String | Processing state of the task: queueing (task is in the queue), success (generation succeeded), failed (task failed) |
| file_url | String | URL of the generated audio file |
| credits | Int | Number of credits consumed by this request |
| payload | String | Pass-through parameter value from the request |
| created_at | String | Task creation time |

Response example

{ "task_id": "your_task_id_here", "state": "success", "file_url": "your_file_url_here", "credits": , "payload": "", "created_at": "2025-01-01T15:41:31.968916Z" }
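
A client typically needs to read the `state` field before using `file_url`. The following Python sketch parses a response of the shape documented above; the names `TtsTask`, `parse_task`, and `audio_url` are hypothetical, and treating `file_url` as usable only when `state` is `success` is an assumption rather than documented behavior.

```python
import json
from dataclasses import dataclass
from typing import Optional

# The three states listed in the response table.
KNOWN_STATES = {"queueing", "success", "failed"}


@dataclass
class TtsTask:
    task_id: str
    state: str
    file_url: Optional[str]
    credits: Optional[int]
    payload: str
    created_at: str


def parse_task(raw: str) -> TtsTask:
    """Parse a JSON response string into a TtsTask (hypothetical helper)."""
    data = json.loads(raw)
    state = data["state"]
    if state not in KNOWN_STATES:
        raise ValueError(f"unexpected state: {state!r}")
    return TtsTask(
        task_id=data["task_id"],
        state=state,
        file_url=data.get("file_url") or None,  # empty string becomes None
        credits=data.get("credits"),
        payload=data.get("payload", ""),
        created_at=data.get("created_at", ""),  # kept as the raw string
    )


def audio_url(task: TtsTask) -> Optional[str]:
    """Return the audio URL only for a successful task (assumption)."""
    return task.file_url if task.state == "success" else None
```

Keeping `created_at` as a string avoids guessing at the timestamp format beyond the single example shown.
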