inclusionAI/Ring-flash-2.0

inclusionAI/Ring-flash-2.0

A high-performance thinking model launched by Alibaba
2025-09-19
LLM
Model capability: thinking
Input:
$0.143/1M tokens
Output:
$0.572/1M tokens
Bulk order? Contact your manager for exclusive deals

API Overview

Ring-flash-2.0 is a high-performance reasoning model deeply optimized based on Ling-flash-2.0-base. It adopts a Mixture-of-Experts (MoE) architecture with a total parameter count of 100B, yet only 6.1B parameters are activated during each inference. By leveraging the novel icepop algorithm, the model addresses the instability challenges inherent in MoE large models during reinforcement learning (RL) training, enabling its advanced reasoning capabilities to steadily improve even over extended training periods. Ring-flash-2.0 has achieved remarkable breakthroughs across multiple demanding benchmarks, including math competitions, code generation, and logical reasoning—outperforming state-of-the-art dense models with fewer than 40B parameters while rivaling larger-scale open-source MoE models and proprietary high-performance reasoning systems. Despite its focus on complex reasoning tasks, the model also excels in creative writing and other applications. Moreover, thanks to its efficient architectural design, Ring-flash-2.0 delivers robust performance alongside rapid inference speeds, significantly reducing the deployment costs of reasoning models in high-concurrency scenarios.

Playground

Log in to explore more features! Click to Log In

API Analytics

API Reference (1)

API DescriptionAPI EndpointRequest MethodStabilityParameter Description
Chat(SiliconFlow)
POST
Stable
View Details

API Pricing

$
ModelDescriptionContextOfficial Price302.AI Price

inclusionAI/Ring-flash-2.0

-
128000

Input$0.143 / 1M tokens
Output$0.572 / 1M tokens

Input$0.143/ 1M tokens
Output$0.572/ 1M tokens
Original Price