happyhorse-1.0-r2v

Alibaba Group’s next-generation cutting-edge AI video generation model
2026-04-27
Video Generation
Pricing: starting from $0.156/second


API Overview

HappyHorse-1.0 (also known as “Joyful Horse”) is a next-generation AI video generation model launched by the ATH Innovation Business Unit of Alibaba Group. It is presented as the world’s first open-source model to jointly model text, video, and audio natively, trained from scratch. HappyHorse-1.0 debuted in early April 2026 and reached the top of the Artificial Analysis AI Video Arena rankings with a score of 1333 Elo, ahead of Seedance 2.0, Kling 3.0, Veo 3, and Sora 2 Pro, making it the highest-ranked open-source video generation model on that leaderboard.

───────────────────────────────────────────────────────────────────

Core Capabilities

True Native Audio-Video Joint Modeling: Traditional AI video models generate the video first and stitch a voiceover on afterwards. HappyHorse-1.0 instead uses a 15-billion-parameter, 40-layer single-stream self-attention Transformer that places text, video, and audio tokens in the same sequence for joint pre-training. Because both modalities are generated together, visual motion and sound rhythm stay closely synchronized, giving integrated audiovisual generation.

Outstanding Physical Consistency and Narrative Capability: When handling complex commercial shooting requirements, the model simulates physical laws well. Whether the task is cinematic transitions in a short film or a coordinated multi-camera shoot, it follows intricate text instructions while keeping the scene physically consistent. Its strength in multi-camera control and adherence to long instructions makes it a “productivity powerhouse” for professional creators.

Ultimate Generation Efficiency: Thanks to DMD-2 distillation, HappyHorse-1.0 can generate high-quality video in just 8 denoising steps, which significantly reduces inference cost. It takes about 38 seconds to generate 1080p HD footage. Combined with its availability to the open-source community, this makes it suitable both for rigorous commercial storytelling and for creators who need frequent creative iterations.

Benchmarking Leadership in Open Source: HappyHorse-1.0 is fully open source under the Apache 2.0 license, which permits commercial use, and so challenges the hold of closed-source models on high-performance video generation. It has already been deployed on Alibaba’s Bailian platform and gives developers worldwide access to top-tier AI video creation, earning it a reputation as a milestone open-source release in the field of video generation.


API Reference (2)

API Description                    Request Method    Stability
R2V (Reference-Generated Video)    POST              Stable
Tasks (Get Task Results)           GET               Stable
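
The two operations listed above (a POST that creates an R2V task and a GET that fetches task results) suggest a submit-then-poll workflow. The following is a minimal sketch of that pattern, not the documented client. The endpoint URLs and response schema are not given on this page, so the HTTP calls are passed in as caller-supplied functions, and the `status` field with its `"succeeded"` and `"failed"` values is a hypothetical placeholder.

```python
import time
from typing import Any, Callable, Dict


def run_r2v_task(
    submit: Callable[[Dict[str, Any]], str],
    fetch: Callable[[str], Dict[str, Any]],
    payload: Dict[str, Any],
    poll_interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Submit a task, then poll until it finishes or the timeout expires.

    submit: wraps the POST call and returns a task id (hypothetical shape).
    fetch:  wraps the GET call and returns the task record as a dict.
    """
    task_id = submit(payload)
    deadline = time.monotonic() + timeout
    while True:
        result = fetch(task_id)
        status = result.get("status")  # field name is an assumption
        if status == "succeeded":
            return result
        if status == "failed":
            raise RuntimeError(f"task {task_id} failed: {result}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish in {timeout}s")
        sleep(poll_interval)
```

In real use, `submit` and `fetch` would wrap authenticated HTTP requests against the endpoints shown under View Details in the console. Given the roughly 38-second generation time quoted above, a poll interval of a few seconds seems reasonable.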

API Pricing

Model                              Description           302.AI Price
wan2.7-r2v(Reference1-to-video)    720p                  $0.156/second
wan2.7-r2v(Reference1-to-video)    1080p                 $0.276/second
Tasks                              Fetch Task Results    Free
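
As a rough illustration of the per-second prices listed above, the sketch below estimates the cost of a clip. It assumes billing is strictly linear in output duration, with no minimum charge or rounding rule, which this page does not confirm. The function name `estimate_cost` is purely illustrative.

```python
from decimal import Decimal

# Per-second prices in USD, copied from the pricing table.
PRICE_PER_SECOND = {
    "720p": Decimal("0.156"),
    "1080p": Decimal("0.276"),
}


def estimate_cost(resolution: str, seconds: float) -> Decimal:
    """Estimate the USD cost of a clip, assuming linear per-second billing."""
    if resolution not in PRICE_PER_SECOND:
        raise ValueError(f"unknown resolution: {resolution!r}")
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    # Convert via str() so float inputs such as 2.5 stay exact in Decimal.
    return PRICE_PER_SECOND[resolution] * Decimal(str(seconds))


print(estimate_cost("720p", 5))    # 5 s at 720p
print(estimate_cost("1080p", 10))  # 10 s at 1080p
```

Under this assumption, a 5-second 720p clip costs $0.78 and a 10-second 1080p clip costs $2.76. Fetching task results is free, so polling adds no cost.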