Qwen3-TTS-Flash (Speech Synthesis)

Speech synthesis model from the Qwen series
2025-09-24
Audio-Video Processing
Model capability: audio
Input: $15 / 1M characters
Output: Free

API Overview

Qwen3-TTS-Flash is a speech synthesis model from the Qwen series. It offers 17 voices and supports multiple languages and dialects.

Supported languages and dialects: Chinese (Mandarin, Beijing, Shanghai, Sichuan, Nanjing, Shaanxi, Minnan, Tianjin, Cantonese), English, Spanish, Russian, Italian, French, Korean, Japanese, German, Portuguese.

Reference: https://bailian.console.aliyun.com/?spm=5176.28197581.0.0.1e7d29a4JqZcpM&tab=doc#/doc/?type=model&url=2879134

API Reference (1)

API Description: Qwen3-TTS-Flash (Speech Synthesis)
Request Method: POST
Stability: Stable
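
The reference entry above states only that the API is called with POST; the endpoint and parameter list are not shown on this page. As a rough sketch of what a call might look like, the snippet below builds (but does not send) a JSON POST request. The URL, the API key, the `Authorization` header scheme, the model identifier, and the field names `text`, `voice`, and `language` are all hypothetical placeholders, not confirmed values; check the parameter description for the real ones.

```python
import json
import urllib.request

# Placeholder values: the real endpoint and credentials are not given on this page.
API_URL = "https://api.example.com/qwen3-tts-flash"  # hypothetical endpoint
API_KEY = "YOUR_API_KEY"  # hypothetical credential


def build_tts_request(text, voice, language, url=API_URL, api_key=API_KEY):
    """Build (but do not send) a POST request for speech synthesis."""
    if not text:
        raise ValueError("text must not be empty")
    # Field names below are assumptions made for illustration only.
    payload = {
        "model": "qwen3-tts-flash",  # assumed model identifier
        "text": text,
        "voice": voice,
        "language": language,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


req = build_tts_request("Hello, world.", voice="example-voice", language="English")
print(req.get_method(), req.full_url)
```

Building the request separately from sending it makes the payload easy to inspect; once the real endpoint and fields are known, the request could be passed to `urllib.request.urlopen`.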

API Pricing

Model: Qwen3-TTS-Flash
Description: -
Official Price: Input $15 / 1M characters, Output 0 / 1M characters
302.AI Price (original price): Input $15 / 1M characters, Output Free
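
To make the listed rate concrete, here is a small cost estimate based on the $15 per 1M input characters price above. It is a sketch under one assumption: each Unicode code point in the text counts as one billed character. The page does not state how characters are counted, so the actual billing rule may differ. The function name `estimate_input_cost` is illustrative.

```python
from decimal import Decimal

# Input price listed above, in USD per one million characters.
PRICE_PER_MILLION_CHARS = Decimal("15")


def estimate_input_cost(text: str) -> Decimal:
    """Estimate the USD cost of synthesizing `text`.

    Assumes one billed character per Unicode code point (len(text)),
    which this page does not confirm.
    """
    return PRICE_PER_MILLION_CHARS * len(text) / Decimal(1_000_000)


# Example: a 200,000-character script.
script = "a" * 200_000
print(estimate_input_cost(script))
```

At this rate a 200,000-character script comes to $3, and since output is listed as free, the input text is the only billed quantity.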