
Viduq2 (Image Generation)
API Overview
The Vidu Q2 Image Generation Model is a professional-grade image generation tool that offers fast generation, precise replication, and flexible editing. Built on an officially optimized algorithm architecture, it converts text prompts and reference images into high-quality images across a wide range of styles and creative scenarios.
Core Image Generation Functional Modules
The Vidu Q2 Image Generation Model encompasses three core modules: Reference-Based Image Generation, Text-to-Image Generation, and Image Editing. Each module is backed by clear official performance data:
- Reference-Based Image Generation: Creates derivative images from one or more reference images. It can render the same subject in various fixed aspect ratios, precisely replicate composition and motion with the subject in different positions, and handle scene transitions. It can also accurately reproduce the composition of sketches or line drawings and generate different shot types of the same subject.
- Text-to-Image Generation: Generates continuous scenes from simple prompts and covers over a hundred art styles, both mainstream and niche, such as Chinese traditional painting, Japanese manga, American comics, and retro styles. It keeps character appearances consistent across different shots, such as wide shots and close-ups.
- Image Editing: Supports local additions, replacements, and removals, as well as style transformations and adjustments to season and time. During editing, it keeps the main subject consistent and the background structure stable.
Key Performance Indicators
Official figures cover generation efficiency, output quality, and creative freedom:
- Generation Speed: Generates a single image in as little as 5 seconds, a significant improvement over the previous version.
- Output Resolution: Supports 4K ultra-high-definition output, meeting the demands of professional creative materials.
- Bulk Generation Capability: Accepts up to 7 reference images at once in Reference-Based Image Generation, streamlining batch creation workflows such as comics and picture books.
- Aspect Ratio Adaptability: Can generate images in multiple aspect ratios from the same reference image, adapting to various material specifications for advertising, e-commerce, short dramas, anime, and other scenarios.
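The limits above can be checked on the client side before anything is sent. The following is a minimal sketch only: the function names, the dictionary keys, the `"viduq2"` identifier (taken from the page title), and the ratio strings are all hypothetical placeholders rather than the official request schema, which is defined in the official documentation. The `"auto"` and `"4K"` defaults echo the label on the demonstration prompt; treating `"auto"` as an aspect-ratio setting is an assumption.

```python
from typing import Dict, List

MAX_REFERENCE_IMAGES = 7  # limit stated in this overview


def build_reference_request(prompt: str, images: List[str],
                            aspect_ratio: str = "auto",
                            resolution: str = "4K") -> Dict[str, object]:
    """Validate inputs and return a request body as a plain dict.

    All key names below are hypothetical placeholders, not the official schema.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if not 1 <= len(images) <= MAX_REFERENCE_IMAGES:
        raise ValueError(
            f"expected 1 to {MAX_REFERENCE_IMAGES} reference images, got {len(images)}"
        )
    return {
        "model": "viduq2",  # assumed identifier, taken from the page title
        "prompt": prompt,
        "images": list(images),
        "aspect_ratio": aspect_ratio,
        "resolution": resolution,
    }


def requests_for_ratios(prompt: str, images: List[str],
                        ratios: List[str]) -> List[Dict[str, object]]:
    """Build one request body per aspect ratio from the same references."""
    return [build_reference_request(prompt, images, aspect_ratio=r) for r in ratios]


# Usage: the ratio strings here are illustrative values only.
bodies = requests_for_ratios("a fox in the rain", ["fox.png"], ["16:9", "1:1"])
print([b["aspect_ratio"] for b in bodies])  # ['16:9', '1:1']
```

Keeping validation separate from any network call makes the 7-image limit easy to test locally, and the same reference list can be reused for each target ratio.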
Core Advantages and Application Scenarios
The model demonstrates clear advantages in consistency and workflow integration, making it applicable across multiple fields:
- Consistency: Compared with the previous version, it improves semantic understanding, style support, aesthetic quality, and flexibility in referencing elements, maintaining subject consistency while allowing greater creative freedom.
- Ease of Use: Enables photo editing and creation through text prompts, eliminating the need for multiple tools and reducing the barrier to professional-level creation.
- Application Scenarios: Officially adapted for fields such as short dramas and anime, advertising and e-commerce, general entertainment, film and television production, and cultural tourism and education. It supports specific needs such as character IP development, continuous scene creation, and mass comic production.
Effect Demonstration
Prompt (Reference-Based Image Generation, 5 images, 4K, auto): "Short-focus fisheye, upward-looking extreme close-up shot, @Image 1 wearing @Image 3 on its head, draped with @Image 2 on its body, holding @Image 5 in its hand and putting it into its mouth, set in @Image 4 where it's raining, with @Image 1’s eyes facing the camera."
For reference, see the official documentation: https://platform.vidu.cn/docs/reference-to-image
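The demonstration prompt refers to its reference images with `@Image N` placeholders. A small pre-flight check can catch a prompt that mentions an image that was never supplied. This is a sketch under assumptions: the placeholder pattern is inferred from the demonstration prompt alone, the numbering is assumed to start at 1, and the helper names are hypothetical.

```python
import re
from typing import List

# Pattern inferred from the demonstration prompt, e.g. "@Image 3".
IMAGE_REF = re.compile(r"@Image (\d+)")


def referenced_indices(prompt: str) -> List[int]:
    """Return the distinct image numbers mentioned in the prompt, sorted."""
    return sorted({int(n) for n in IMAGE_REF.findall(prompt)})


def missing_references(prompt: str, image_count: int) -> List[int]:
    """Return referenced image numbers that fall outside 1..image_count."""
    return [i for i in referenced_indices(prompt) if i < 1 or i > image_count]


# Usage with a shortened version of the demonstration prompt.
demo = ("@Image 1 wearing @Image 3 on its head, draped with @Image 2, "
        "holding @Image 5, set in @Image 4")
print(referenced_indices(demo))      # [1, 2, 3, 4, 5]
print(missing_references(demo, 5))   # []
print(missing_references(demo, 3))   # [4, 5]
```

Running the check before submission reports a mismatch between the prompt and the uploaded images without spending a generation request.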