
MiniMax-M2.5-highspeed
API Overview
MiniMax M2.5 is MiniMax’s flagship large language model, positioned as a frontier agentic model “built for real-world productivity.” It matches or exceeds state-of-the-art performance on complex tasks such as programming, tool invocation, search, and office work, while keeping costs very low and inference efficiency high.
- Performance Breakthrough: It scores 80.2% on SWE-Bench Verified, 51.3% on Multi-SWE-Bench, and 76.3% on BrowseComp, outperforming both open-source and mainstream closed-source models across the board. It also completes tasks 37% faster than M2.1.
- Architecture Optimization: Hundreds of thousands of reinforcement learning iterations in real environments improve its ability to decompose complex tasks and reduce token consumption during reasoning, enabling efficient agentic execution.
- Cost Revolution: Generating continuously at 100 tokens per second for one hour costs very little. The 50 TPS version costs only 1/10 to 1/20 as much as Opus, Gemini 3 Pro, or GPT-5, making it possible to “build and operate agents with virtually unlimited economic scalability.”
- Inference Speed: It delivers high-speed inference at 100 TPS, nearly twice as fast as mainstream models, and supports parallel tool invocation, significantly reducing end-to-end task completion time.
- Real-World Productivity Integration: It has been deeply integrated into MiniMax Agent, covering functions such as R&D, product management, sales, HR, and finance. M2.5 completes 30% of all tasks autonomously and generates 80% of newly submitted code.
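To make the throughput figures above concrete, the following minimal sketch converts a sustained generation rate into tokens per hour and an hourly cost. The price used here is a hypothetical placeholder, not a published rate, and the function names are illustrative only.

```python
SECONDS_PER_HOUR = 3600


def tokens_per_hour(tokens_per_second: float) -> int:
    """Tokens produced by one stream generating continuously for an hour."""
    return int(tokens_per_second * SECONDS_PER_HOUR)


def hourly_cost(tokens_per_second: float, price_per_million_tokens: float) -> float:
    """Output-token cost of one hour of continuous generation."""
    return tokens_per_hour(tokens_per_second) / 1_000_000 * price_per_million_tokens


# HYPOTHETICAL price, for illustration only; substitute the published rate.
EXAMPLE_PRICE = 2.0  # currency units per 1M output tokens

for tps in (50, 100):
    print(
        f"{tps} TPS -> {tokens_per_hour(tps):,} tokens/hour, "
        f"cost {hourly_cost(tps, EXAMPLE_PRICE):.2f} at the example price"
    )
```

At 100 TPS, one hour of continuous output is 360,000 tokens, so the hourly cost scales directly with whatever per-token price applies.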
───────────────────────────────────────────────────────────────────
Core Capabilities
💻 Think and Build Like an Architect
It proactively decomposes and plans features, structure, and UI design before writing code, showing native spec-writing behavior. It supports more than 10 languages (including Python, Java, Go, Rust, and TypeScript) and full-stack development for Web, Android, iOS, Windows, and Mac, covering the whole workflow from initial system design (0 to 1) through final code review (90 to 100).
🔍 Expert-Level Search and Tool Invocation
It performs strongly on BrowseComp, Wide Search, and the in-house RISE (Real Professional Search Evaluation) benchmark. Compared to M2.1, it uses about 20% fewer search rounds, reaching results by a more direct path.
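The overview credits parallel tool invocation with shorter end-to-end task times. The sketch below illustrates the general idea only: independent tool calls issued concurrently finish in roughly the time of the slowest call rather than the sum of all calls. The tool names and latencies are hypothetical stand-ins, and this is not the MiniMax API.

```python
import asyncio
import time


async def call_tool(name: str, latency_s: float) -> str:
    """Stand-in for a real tool call; sleeps to simulate I/O latency."""
    await asyncio.sleep(latency_s)
    return f"{name}: done"


# Hypothetical tools and latencies, chosen only for illustration.
TOOL_CALLS = [("web_search", 0.10), ("read_file", 0.10), ("run_tests", 0.10)]


async def run_sequential() -> list:
    """Await each tool call before starting the next one."""
    return [await call_tool(name, latency) for name, latency in TOOL_CALLS]


async def run_parallel() -> list:
    """Start all tool calls at once and wait for all of them to finish."""
    calls = (call_tool(name, latency) for name, latency in TOOL_CALLS)
    return list(await asyncio.gather(*calls))


def timed(coro_fn):
    """Run an async function to completion and return (results, seconds)."""
    start = time.perf_counter()
    results = asyncio.run(coro_fn())
    return results, time.perf_counter() - start


seq_results, seq_seconds = timed(run_sequential)
par_results, par_seconds = timed(run_parallel)
print(f"sequential: {seq_seconds:.2f}s, parallel: {par_seconds:.2f}s")
```

Both strategies return the same results in the same order; only the wall-clock time differs, which is why the benefit grows with the number of independent calls per step.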
📑 High-End Office Delivery Capability
Its training data was co-built with experts from fields such as finance, law, and social sciences, significantly improving the quality of deliverables in scenarios like Word formatting, PPT editing, and Excel financial modeling. In the internal GDPval-MM evaluation, it achieves an average win rate of 59.0%.
⚡ Ultimate Efficiency and Cost Advantage
Compared with M2.1, the average SWE-Bench Verified run time drops from 31.3 minutes to 22.8 minutes, and token consumption falls from 3.72M to 3.52M. Its speed is roughly on par with Claude Opus 4.6.
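As a quick check on these figures, the short calculation below derives the relative changes from the quoted numbers. It suggests that the "37% faster" claim in the overview is a throughput gain (old time divided by new time), which corresponds to roughly a 27% reduction in run time. The helper names are illustrative.

```python
def speedup_percent(old_minutes: float, new_minutes: float) -> float:
    """Throughput gain: how much more work fits in the same time."""
    return (old_minutes / new_minutes - 1) * 100


def reduction_percent(old: float, new: float) -> float:
    """Relative decrease from the old value to the new value."""
    return (old - new) / old * 100


print(round(speedup_percent(31.3, 22.8), 1))    # 37.3 (% faster)
print(round(reduction_percent(31.3, 22.8), 1))  # 27.2 (% less time)
print(round(reduction_percent(3.72, 3.52), 1))  # 5.4 (% fewer tokens)
```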
🛠️ Native Agentic Framework Support
Built on the in-house Forge Agent RL framework, it uses the CISPO algorithm and process reward mechanisms. Training on hundreds of thousands of real agent scaffolds gives it strong generalization across environments and tools.
───────────────────────────────────────────────────────────────────
Selected Test Data
