Audio Generation

This API generates speech audio from input text, with controllable speech speed, volume, pitch, and emotion.

POST https://api.vidu.com/ent/v2/audio-tts

Request Header

| Field | Value | Description |
| --- | --- | --- |
| Content-Type | application/json | Data exchange format |
| Authorization | Token {your api key} | Replace {your api key} with your API key |

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| text | String | Required | The text to be synthesized into speech. (1) Length must be less than 10,000 characters. (2) Paragraph breaks are marked by newline characters. (3) Pause control: insert `<#x#>` between text segments to add a custom pause, where x is the pause duration in seconds, range [0.01, 99.99], with up to two decimal places. Pause markers must be placed between two pronounceable text segments and cannot be used consecutively. Example: `Hello<#2#>I am vidu<#2#>Nice to meet you` |
| voice_setting_voice_id | String | Required | Voice ID used for synthesis. See the Voice List for all available voices. |
| voice_setting_speed | Float | Optional | Speech speed. Range [0.5, 2], default 1.0 (normal speed). 0.5 is the slowest and 2 is the fastest. |
| voice_setting_volume | Int | Optional | Volume level. Range [0, 10], default 0 (normal volume). Higher values increase the volume. |
| voice_setting_pitch | Int | Optional | Pitch of the synthesized audio. Range [-12, 12], default 0 (original pitch). |
| voice_setting_emotion | String | Optional | Emotion of the synthesized voice. Possible values: "happy", "sad", "angry", "fearful", "disgusted", "surprised", "calm", corresponding to seven emotions: happy, sad, angry, fearful, disgusted, surprised, and neutral. The model automatically matches an appropriate emotion to the text, so manual selection is usually unnecessary. |
| payload | String | Optional | Pass-through parameter. It is not processed and is used for data transmission only. Max length: 1,048,576 characters. |
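
The request parameters above can be checked on the client before a call is made. The following is a minimal sketch, not an official client: the helper names `check_pauses`, `build_body`, and `build_request` are hypothetical, "pronounceable text" is approximated as any non-blank text, and the limits are copied from the parameter descriptions in this section.

```python
import json
import re
import urllib.request

API_URL = "https://api.vidu.com/ent/v2/audio-tts"
EMOTIONS = {"happy", "sad", "angry", "fearful", "disgusted", "surprised", "calm"}
MARKER_RE = re.compile(r"<#(.*?)#>")        # any pause marker <#x#>
NUMBER_RE = re.compile(r"\d+(\.\d{1,2})?")  # x: at most two decimal places


def check_pauses(text):
    """Raise ValueError if a pause marker breaks the documented rules."""
    prev_end, found = 0, False
    for marker in MARKER_RE.finditer(text):
        found = True
        value = marker.group(1)
        if not NUMBER_RE.fullmatch(value) or not 0.01 <= float(value) <= 99.99:
            raise ValueError(f"bad pause duration: {value!r}")
        # Assumption: "pronounceable text" is approximated as non-blank text.
        # An empty gap also catches two consecutive markers.
        if not text[prev_end:marker.start()].strip():
            raise ValueError("pause marker needs text before it")
        prev_end = marker.end()
    if found and not text[prev_end:].strip():
        raise ValueError("pause marker needs text after it")


def build_body(text, voice_id, speed=None, volume=None, pitch=None,
               emotion=None, payload=None):
    """Validate the parameters and return the JSON body as a dict."""
    if not text or len(text) >= 10_000:
        raise ValueError("text must be non-empty and under 10,000 characters")
    check_pauses(text)
    body = {"text": text, "voice_setting_voice_id": voice_id}
    ranges = {
        "voice_setting_speed": (speed, 0.5, 2),
        "voice_setting_volume": (volume, 0, 10),
        "voice_setting_pitch": (pitch, -12, 12),
    }
    for name, (value, low, high) in ranges.items():
        if value is None:
            continue  # optional field left out, server default applies
        if not low <= value <= high:
            raise ValueError(f"{name} must be in [{low}, {high}]")
        body[name] = value
    if emotion is not None:
        if emotion not in EMOTIONS:
            raise ValueError(f"unknown emotion: {emotion!r}")
        body["voice_setting_emotion"] = emotion
    if payload is not None:
        if len(payload) > 1_048_576:
            raise ValueError("payload exceeds 1,048,576 characters")
        body["payload"] = payload
    return body


def build_request(api_key, body):
    """Build the POST request with the two documented headers."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Token {api_key}",
        },
        method="POST",
    )


# Sending is left to the caller, for example:
#   with urllib.request.urlopen(build_request(api_key, body)) as resp:
#       result = json.load(resp)
```

Optional fields are omitted from the body when they are not set, so the documented server-side defaults apply. The voice ID is passed through unchecked because the Voice List is not reproduced in this section.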

Response Body

| Field | Type | Description |
| --- | --- | --- |
| task_id | String | Task ID generated by Vidu |
| state | String | Current processing state of the task: `queueing` (task is in the queue), `success` (generation succeeded), `failed` (task failed) |
| file_url | String | URL of the generated audio file |
| credits | Int | Number of credits consumed by this request |
| payload | String | Pass-through parameter value from the request |
| created_at | String | Task creation time |

Response Example

{
  "task_id": "your_task_id_here",
  "state": "success",
  "file_url": "your_file_url_here",
  "credits": ,
  "payload": "",
  "created_at": "2025-01-01T15:41:31.968916Z"
}
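
A response of this shape could be handled as sketched below. The names `TtsResult`, `parse_result`, and `audio_url` are hypothetical. The sketch assumes that `created_at` always uses the ISO 8601 form shown in the example and that `file_url` and `credits` may be missing while a task is not yet successful. This section does not describe how to re-query a task that is still `queueing`, so that step is left out.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

KNOWN_STATES = ("queueing", "success", "failed")


@dataclass
class TtsResult:
    task_id: str
    state: str
    file_url: Optional[str]
    credits: Optional[int]
    payload: str
    created_at: datetime


def parse_result(raw: str) -> TtsResult:
    """Parse a JSON response body into a TtsResult."""
    data = json.loads(raw)
    if data["state"] not in KNOWN_STATES:
        raise ValueError(f"unexpected state: {data['state']!r}")
    # Assumption: timestamps look like 2025-01-01T15:41:31.968916Z (UTC).
    created = datetime.fromisoformat(data["created_at"].replace("Z", "+00:00"))
    return TtsResult(
        task_id=data["task_id"],
        state=data["state"],
        file_url=data.get("file_url"),
        credits=data.get("credits"),
        payload=data.get("payload", ""),
        created_at=created,
    )


def audio_url(result: TtsResult) -> Optional[str]:
    """Return the audio URL on success, None while queued; raise on failure."""
    if result.state == "failed":
        raise RuntimeError(f"task {result.task_id} failed")
    if result.state == "queueing":
        return None
    return result.file_url
```

Returning `None` for a queued task lets the caller decide how and when to check again, while a failed task raises so that it cannot be mistaken for a pending one.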