
inclusionAI/Ling-flash-2.0
API Overview
Ling-flash-2.0 is the third model in the Ling 2.0 architecture series, released by Ant Group's Bailing team. It is a Mixture-of-Experts (MoE) model with 100 billion total parameters, of which only 6.1 billion are activated per token (4.8 billion excluding embedding parameters). Despite this lightweight activation footprint, Ling-flash-2.0 performs on par with, or better than, 40-billion-parameter dense models and larger MoE models on multiple authoritative benchmarks. The model is designed to explore a more efficient path than the prevailing assumption that "larger models equate to more parameters," and it relies on its architectural design and training strategies to do so.
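To put the parameter counts above in proportion, the short sketch below works through the arithmetic. It uses only the figures quoted in the overview; the function name and dictionary keys are hypothetical. Two assumptions are made: the "non-word-vector" figure is read as non-embedding parameters, and the ratio of dense to activated parameters is treated as a rough proxy for relative size, not as a measured speed or cost difference.

```python
# Figures quoted in the overview, in billions of parameters.
TOTAL_PARAMS_B = 100.0          # total MoE parameters
ACTIVE_PARAMS_B = 6.1           # parameters activated per token
ACTIVE_NON_EMBEDDING_B = 4.8    # activated parameters excluding embeddings (assumed reading)
DENSE_BASELINE_B = 40.0         # size of the dense models it is compared against


def activation_summary(total_b, active_b, active_non_embedding_b, dense_b):
    """Derive simple ratios from the quoted parameter counts (hypothetical helper)."""
    return {
        # Share of all parameters that a single token actually uses.
        "active_fraction": active_b / total_b,
        # Activated parameters attributed to embeddings, by subtraction.
        "active_embedding_b": active_b - active_non_embedding_b,
        # How many times larger the dense baseline is than the activated set.
        "dense_to_active_ratio": dense_b / active_b,
    }


summary = activation_summary(
    TOTAL_PARAMS_B, ACTIVE_PARAMS_B, ACTIVE_NON_EMBEDDING_B, DENSE_BASELINE_B
)

print(f"Activated share of total parameters: {summary['active_fraction']:.1%}")
print(f"Activated embedding parameters: {summary['active_embedding_b']:.1f}B")
print(f"Dense baseline vs. activated parameters: {summary['dense_to_active_ratio']:.1f}x")
```

Under these assumptions, each token touches about 6.1% of the model's parameters, about 1.3 billion of the activated parameters are embeddings, and the 40-billion-parameter dense baseline is about 6.6 times larger than the activated set.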