GETTING STARTED

Vidu currently provides several model families, including ViduQ3, ViduQ2, ViduQ1, and Vidu2.0, covering video generation, image generation, audio generation, and direct audio-video output. Their parameters and capabilities are listed below:

| Model | viduq3-pro | viduq3-turbo |
| --- | --- | --- |
| Overview | Simultaneous audio and video output; intelligent shot cutting | Simultaneous audio and video output; intelligent shot cutting |
| Resolution | 540p, 720p, 1080p | 540p, 720p, 1080p |
| Frame Rate | 24 fps | 24 fps |
| Duration | 1-16 s | 1-16 s |
| Image to Video | ✔️ | ✔️ |
| Reference to Video | - | - |
| Start End to Video | ✔️ | ✔️ |
| Text to Video | ✔️ | ✔️ |
| Templates | - | - |

| Model | viduq2-pro-fast | viduq2-turbo | viduq2-pro | viduq2 | viduq1 | viduq1-classic |
| --- | --- | --- | --- | --- | --- | --- |
| Overview | Fast, low price, stable results | New model, fast | New model, rich details | New model, rich details | Clear visuals | Stable, cinematic camera motion |
| Resolution | 720p, 1080p | 540p, 720p, 1080p | 540p, 720p, 1080p | 540p, 720p, 1080p | 1080p | 1080p |
| Frame Rate | 24 fps | 24 fps | 24 fps | 24 fps | 24 fps | 24 fps |
| Duration | 1-10 s | 1-10 s | 1-10 s | 1-10 s | 5 s | 5 s |
| Image to Video | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| Reference to Video | - | - | ✔️ (supports reference videos) | ✔️ | ✔️ | - |
| Start End to Video | ✔️ | ✔️ | ✔️ | - | ✔️ | ✔️ |
| Text to Video | - | - | - | ✔️ | ✔️ | - |
| Templates | - | - | - | - | ✔️ | - |

| Model | vidu2.0 |
| --- | --- |
| Overview | Fast generation speed |
| Resolution | 360p, 720p, 1080p |
| Frame Rate | 32 fps |
| Duration | 4 s, 8 s |
| Image to Video | ✔️ |
| Reference to Video | ✔️ |
| Start End to Video | ✔️ |
| Text to Video | - |
| Templates | ✔️ |
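
The video model tables above can be turned into a small lookup for choosing a model that fits a request. The following is a minimal sketch, not part of the Vidu API: the mode labels, the `VideoModel` class, and `find_models` are hypothetical names introduced here, and durations are assumed to be whole seconds. Only the values themselves come from the tables.

```python
from dataclasses import dataclass

# Hypothetical labels for the table rows; these are not official API identifiers.
I2V, REF, START_END, T2V, TEMPLATES = (
    "image_to_video", "reference_to_video", "start_end_to_video",
    "text_to_video", "templates",
)


@dataclass(frozen=True)
class VideoModel:
    resolutions: frozenset
    durations: frozenset  # allowed lengths in whole seconds (assumption)
    modes: frozenset


def _m(resolutions, durations, modes):
    """Build one table column from a space-separated resolution string."""
    return VideoModel(frozenset(resolutions.split()),
                      frozenset(durations), frozenset(modes))


ALL_RES = "540p 720p 1080p"

# Values transcribed from the capability tables.
VIDEO_MODELS = {
    "viduq3-pro": _m(ALL_RES, range(1, 17), {I2V, START_END, T2V}),
    "viduq3-turbo": _m(ALL_RES, range(1, 17), {I2V, START_END, T2V}),
    "viduq2-pro-fast": _m("720p 1080p", range(1, 11), {I2V, START_END}),
    "viduq2-turbo": _m(ALL_RES, range(1, 11), {I2V, START_END}),
    "viduq2-pro": _m(ALL_RES, range(1, 11), {I2V, REF, START_END}),
    "viduq2": _m(ALL_RES, range(1, 11), {I2V, REF, T2V}),
    "viduq1": _m("1080p", {5}, {I2V, REF, START_END, T2V, TEMPLATES}),
    "viduq1-classic": _m("1080p", {5}, {I2V, START_END}),
    "vidu2.0": _m("360p 720p 1080p", {4, 8}, {I2V, REF, START_END, TEMPLATES}),
}


def find_models(mode, resolution, duration_s):
    """Return the names of models whose table column allows this request."""
    return sorted(
        name for name, m in VIDEO_MODELS.items()
        if mode in m.modes
        and resolution in m.resolutions
        and duration_s in m.durations
    )


print(find_models(T2V, "1080p", 12))  # ['viduq3-pro', 'viduq3-turbo']
```

A 12-second text-to-video request matches only the viduq3 models because viduq2 stops at 10 s and viduq1 is fixed at 5 s. A check like this catches unsupported combinations before a request is sent, but the tables remain the source of truth.
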

| Capability/Model | viduq1 | viduq2 |
| --- | --- | --- |
| Text to Image | - | ✔️ |
| Image Edit | - | ✔️ |
| Reference to Image | ✔️ | ✔️ |
| Templates | ✔️ | ✔️ |
| Resolution | 1080p | 1080p, 2K, 4K |
| Supported Image Count | 1-7 images | 0-7 images |
| Aspect Ratio | 16:9, 9:16, 1:1 | 16:9, 9:16, 1:1, 3:4, 4:3, 21:9, 2:3, 3:2, auto |
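
The image-generation limits above (resolution, input image count, aspect ratio) lend themselves to a simple pre-flight check. This is a sketch under stated assumptions: `IMAGE_LIMITS` and `check_image_request` are hypothetical names, not part of the Vidu API, and only the limit values are taken from the table.

```python
# Limits transcribed from the image capability table.
IMAGE_LIMITS = {
    "viduq1": {
        "resolutions": {"1080p"},
        "image_count": (1, 7),
        "aspect_ratios": {"16:9", "9:16", "1:1"},
    },
    "viduq2": {
        "resolutions": {"1080p", "2K", "4K"},
        "image_count": (0, 7),
        "aspect_ratios": {"16:9", "9:16", "1:1", "3:4", "4:3",
                          "21:9", "2:3", "3:2", "auto"},
    },
}


def check_image_request(model, resolution, aspect_ratio, num_images):
    """Return a list of problems; an empty list means the request fits the table."""
    limits = IMAGE_LIMITS[model]
    problems = []
    if resolution not in limits["resolutions"]:
        problems.append(f"resolution {resolution} is not listed for {model}")
    if aspect_ratio not in limits["aspect_ratios"]:
        problems.append(f"aspect ratio {aspect_ratio} is not listed for {model}")
    low, high = limits["image_count"]
    if not low <= num_images <= high:
        problems.append(f"{model} accepts {low}-{high} input images, got {num_images}")
    return problems


# viduq2 allows zero input images (text to image) at 4K and 21:9.
print(check_image_request("viduq2", "4K", "21:9", 0))  # []
```

The same request against viduq1 would report three problems, since that model is listed with 1080p only, three aspect ratios, and at least one input image.
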

| Capability | Supported |
| --- | --- |
| Text to Audio | ✔️ |
| Timing to Audio | ✔️ |
| Text to Speech | ✔️ |
| Voice Clone | ✔️ |
| Duration | 2-10 s |

| Capability | Supported | Description |
| --- | --- | --- |
| Lip Sync | ✔️ | Generate lip-sync videos driven by text or audio |
| Digital Human | ✔️ | Generate digital-human speech or broadcast videos from an image |
| Video Extension | ✔️ | Extend video duration by 1-7 s |
| Multi-Frame | ✔️ | Generate long, high-quality videos using multiple keyframes |
| Audio-Video Direct Output | ✔️ | Directly generate synchronized audio-video output |
| Smart Super-Resolution - Premium | ✔️ | Enhance video resolution; supports 1080p, 2K, and 4K |
| Templates | ✔️ | Stitch multiple effects into a standard video template |
| Recommended Prompts | ✔️ | Recommend suitable prompts based on 1-7 input images |