speech-2.8-turbo

High-performance text-to-speech model launched by MiniMax
2026-02-06
Audio-Video Processing
Pricing: starting from $30/1M characters

Stability: Stable

API Overview

MiniMax Speech 2.8 Turbo is a text-to-speech API from MiniMax, positioned as broadcast-grade voice synthesis with support for emotional expression and onomatopoeic sounds. It targets voice generation across many scenarios, with an emphasis on natural-sounding output, fine-grained control, and low cost.

  • Key Upgrades: Supports over 17 preset voices and custom cloned voices; adds native parsing capabilities for emotional control and onomatopoeic words (e.g., (laughs), (sighs)).
  • Applicable Scenarios: Audiobook production, video dubbing, podcast creation, educational materials, in-game NPC voiceovers, and accessibility content narration.
  • Product Value: Offers fine-grained audio control (speed, pitch, volume, sampling rate, bitrate) and delivers ready-to-use, production-quality audio output.
  • Cost Advantage: Only 0.03 PTC per thousand characters—more cost-effective than most high-fidelity TTS services.
  • Technical Features: Supports English number normalization (english_normalization) and pronunciation dictionaries (pronunciation_dict), ensuring accurate pronunciation of brand names and technical terms.
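
A request body that uses the two options named above might be assembled as in the following sketch. Only the model name speech-2.8-turbo, the voice name Deep_Voice_Man, and the option names english_normalization and pronunciation_dict come from this page. The voice_setting nesting, the top-level placement of english_normalization, the "tone" key, and the "term/replacement" entry format are assumptions to check against the API's parameter description. No request is sent.

```python
import json


def build_t2a_payload(text, voice_id, pronunciations=None, english_normalization=True):
    """Assemble a request body for a synchronous T2A call (field layout is assumed)."""
    payload = {
        "model": "speech-2.8-turbo",
        "text": text,
        # Assumed nesting: the page names these controls but not their JSON layout.
        "voice_setting": {"voice_id": voice_id, "speed": 1.0, "pitch": 0, "vol": 1.0},
        # Assumed placement: the page names the option, not where it lives.
        "english_normalization": english_normalization,
    }
    if pronunciations:
        # Assumed entry format: "term/replacement" strings under a "tone" key.
        payload["pronunciation_dict"] = {
            "tone": [f"{term}/{spoken}" for term, spoken in pronunciations.items()]
        }
    return payload


payload = build_t2a_payload(
    "Order 1,024 units from Acme.",
    "Deep_Voice_Man",
    pronunciations={"Acme": "ak-mee"},  # illustrative respelling, not a documented format
)
print(json.dumps(payload, indent=2))
```

Keeping payload construction in a pure function makes it easy to inspect the body before wiring it to an HTTP client.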

───────────────────────────────────────────────────────────────────

Core Capabilities

  • 🎙️ Rich Voice Library: Over 17 preset voices cover various genders, ages, and styles (such as Deep_Voice_Man, Lively_Girl, Abbess); custom cloned voices can also be integrated.
  • 💬 Support for Onomatopoeic Expressions: Natively recognizes 22 onomatopoeic words, including (laughs), (coughs), (gasps), and (sighs), making the speech more human-like.
  • 😊 Emotion Control: Allows specifying emotion modes such as happy, calm, etc., to match the emotional tone of the content.
  • 🎛️ Full Parameter Control: Freely adjust speed, pitch, volume, audio format (e.g., MP3/WAV), sampling rate, bitrate, and channel.
  • 🔤 Precise Pronunciation Customization: Define the pronunciation of proper nouns through the pronunciation_dict; enable english_normalization to optimize the reading of English numbers and dates.
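
Because onomatopoeic words are written inline as parenthesized tags, a script can be checked before submission. The sketch below is a hypothetical pre-flight helper, not part of the API: it knows only the four tags named on this page (the full list of 22 is not given here), so "unlisted" means "not among those four", not "unsupported".

```python
import re

# Only the four tags named on this page; the remaining supported tags are unknown here.
KNOWN_TAGS = {"laughs", "coughs", "gasps", "sighs"}
TAG_PATTERN = re.compile(r"\(([a-z]+)\)")


def split_sound_tags(text):
    """Return (listed, unlisted) parenthesized lowercase words found in text."""
    found = TAG_PATTERN.findall(text)
    listed = [tag for tag in found if tag in KNOWN_TAGS]
    unlisted = [tag for tag in found if tag not in KNOWN_TAGS]
    return listed, unlisted


script = "Well (sighs) that was close. (laughs) Let's try again (whistles)."
listed, unlisted = split_sound_tags(script)
print(listed)    # ['sighs', 'laughs']
print(unlisted)  # ['whistles']
```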

API Reference (4)

API Description | Request Method | Stability
T2A (Speech Generation - Synchronous) | POST | Stable
T2A (Asynchronous Long-form Text-to-Speech Generation) | POST | Stable
T2A (Status Inquiry) | GET | Stable
Files (Audio File Download) | GET | Stable
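
The asynchronous, status-inquiry, and file-download entries suggest a submit, poll, then download workflow. The polling step could look like the sketch below. The endpoint URLs are not listed on this page, so the status call is passed in as a function, and the status strings "processing", "success", and "failed" are hypothetical placeholders for whatever the status-inquiry endpoint actually returns.

```python
import time


def wait_for_task(get_status, interval_s=2.0, max_attempts=30, sleep=time.sleep):
    """Poll get_status() until it reports a terminal state or attempts run out."""
    for attempt in range(max_attempts):
        status = get_status()
        if status in ("success", "failed"):  # hypothetical terminal states
            return status
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError("task did not finish within the polling budget")


# Stand-in for a GET to the status-inquiry endpoint: two pending replies, then done.
responses = iter(["processing", "processing", "success"])
result = wait_for_task(lambda: next(responses), sleep=lambda seconds: None)
print(result)  # success
```

Injecting the status function and the sleep function keeps the loop testable without network access; a real caller would pass a function that issues the GET request and would download the audio file once the task succeeds.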

API Pricing

Model | Description | 302.AI Price
speech-2.8-turbo | T2A (Speech Generation - Synchronous) | $30/1M characters
speech-2.8-turbo | Asynchronous Long-form Text-to-Speech Generation | $30/1M characters
T2A | Status Inquiry | Free
Files (Audio File Download) | - | Free
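
At $30 per 1M characters, the cost of a job can be estimated from the length of its text. The sketch below assumes that billable characters equal the Python string length, which may differ from how the service actually counts characters such as tags, whitespace, or multi-byte text.

```python
from decimal import Decimal

PRICE_PER_MILLION_USD = Decimal("30")  # listed price: $30/1M characters


def estimate_cost_usd(text):
    """Estimate the synthesis cost, assuming one billable character per len() unit."""
    return Decimal(len(text)) * PRICE_PER_MILLION_USD / Decimal(1_000_000)


chapter = "a" * 250_000  # stand-in for a 250,000-character manuscript
print(estimate_cost_usd(chapter))  # 7.5
```

Decimal arithmetic avoids floating-point rounding noise when many small jobs are summed.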