stable-audio-2.0


Text-to-audio technology and models launched by Stability AI
2025-09-29
Audio-Video Processing
Pricing: starting from $0.02 per credit


API Overview

Stable Audio generates high-quality music and sound effects from text descriptions, up to three minutes long, in 44.1 kHz stereo. Refer to our prompt guide to learn how to write effective prompts for optimal generation results.

Stable Audio 2.5: Fast, High-Quality, Long-Form Music and Audio Generation
Our state-of-the-art audio generation model produces works up to 3 minutes long in 44.1 kHz stereo. Stable Audio 2.5 supports text-to-audio, audio-to-audio, and audio inpainting workflows, allowing creators to upload sounds and transform them into new instruments, styles, or genres using natural language prompts. It is well suited to music production, cinematic sound design, and mixing.

Stable Audio 2.0: High-Quality Audio Generation
Built specifically for text-to-audio and audio-to-audio tasks, Stable Audio 2.0 also generates up to 3 minutes of 44.1 kHz stereo audio. It is suited to creative ideation, music demos, and atmospheric soundscapes, and is optimized for professional creators who want detailed, longer outputs from simple prompts.


Stable Audio 2.0 typically consumes 20–23 credits per generation, equivalent to $0.40–$0.46 USD.

Stable Audio 2.5 consumes a fixed 20 credits per generation, or $0.40 USD.
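
The dollar figures above follow from the listed price of $0.02 per credit. As a minimal sketch of that arithmetic (the function name `credits_to_usd` is illustrative and not part of any API):

```python
from decimal import Decimal

# Price per credit as listed on this page.
PRICE_PER_CREDIT_USD = Decimal("0.02")


def credits_to_usd(credits: int) -> Decimal:
    """Convert a credit count to US dollars at the listed per-credit price."""
    if credits < 0:
        raise ValueError("credits must be non-negative")
    return PRICE_PER_CREDIT_USD * credits


if __name__ == "__main__":
    # Credit counts quoted on this page for a single generation.
    examples = [
        ("Stable Audio 2.0 (low end)", 20),
        ("Stable Audio 2.0 (high end)", 23),
        ("Stable Audio 2.5 (fixed)", 20),
    ]
    for label, credits in examples:
        print(f"{label}: {credits} credits = ${credits_to_usd(credits)} USD")
```

`Decimal` is used instead of `float` so that currency amounts are not affected by binary rounding.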


API Reference (2)

API Description | Request Method | Stability | Parameter Description
Text-to-Audio (Text-generated Music) | POST | Stable | View Details
Audio-to-Audio (Reference-based Music Generation) | POST | Stable | View Details
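
This page does not list the endpoint URL or the request parameters, so the following is only a hedged sketch of how a client might validate a text-to-audio request before sending it. The class name `TextToAudioRequest` and the field names `prompt` and `duration` are assumptions made for illustration; the only constraint taken from this page is the three-minute maximum length. Check the parameter description for each endpoint before relying on any of these names.

```python
from dataclasses import dataclass

# "Up to three minutes" per the API overview on this page.
MAX_DURATION_SECONDS = 180


@dataclass(frozen=True)
class TextToAudioRequest:
    """Hypothetical container for a text-to-audio request (names are assumed)."""

    prompt: str
    duration_seconds: int

    def to_form_fields(self) -> dict[str, str]:
        """Validate the request and return it as string form fields.

        The keys "prompt" and "duration" are illustrative assumptions,
        not confirmed parameter names.
        """
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if not 0 < self.duration_seconds <= MAX_DURATION_SECONDS:
            raise ValueError(
                f"duration must be between 1 and {MAX_DURATION_SECONDS} seconds"
            )
        return {"prompt": self.prompt, "duration": str(self.duration_seconds)}


if __name__ == "__main__":
    request = TextToAudioRequest(
        prompt="warm analog synth pad, slow tempo", duration_seconds=60
    )
    print(request.to_form_fields())
```

Validating locally avoids spending credits on a request that would exceed the documented maximum length.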

API Pricing

Model | Description | 302.AI Price
Text-to-Audio (Text-generated Music) | - | $0.02/credit
Audio-to-Audio (Reference-based Music Generation) | - | $0.02/credit