
Qwen/Qwen3-Next-80B-A3B-Thinking
API Overview
Qwen3-Next-80B-A3B-Thinking is a next-generation foundation model released by Alibaba's Tongyi Qianwen team and designed for complex reasoning tasks. It is built on the Qwen3-Next architecture, which combines a hybrid attention mechanism (Gated DeltaNet with Gated Attention) and a highly sparse Mixture of Experts (MoE) structure to improve training and inference efficiency. The model has 80 billion parameters in total but activates only about 3 billion of them during inference, which significantly reduces computational cost. On long-context tasks involving more than 32K tokens, its throughput is over 10 times that of the Qwen3-32B model. This "Thinking" version is optimized for challenging multi-step tasks such as mathematical proofs, code synthesis, logical analysis, and planning, and by default it outputs its reasoning process in a structured "chain-of-thought" format. In terms of performance, it outperforms higher-cost models such as Qwen3-32B-Thinking and also surpasses Gemini-2.5-Flash-Thinking on multiple benchmarks.
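To make the "80 billion total, about 3 billion active" figure concrete, the following is a minimal toy sketch of sparse MoE routing in plain Python. It is not the Qwen3-Next implementation: the function names, the four toy experts, and the choice of `k = 2` are assumptions made purely for illustration, and only the 80B and 3B figures come from the description above. With those figures, roughly 3.75% of the parameters take part in each forward pass.

```python
import math

# Figures taken from the model description (in billions of parameters).
TOTAL_PARAMS_B = 80
ACTIVE_PARAMS_B = 3  # approximate


def active_fraction(active_b, total_b):
    """Share of parameters that participate in one forward pass."""
    return active_b / total_b


def top_k_route(router_logits, k):
    """Pick the k highest-scoring experts and softmax-normalize their scores.

    Hypothetical helper: real routers differ in normalization and tie-breaking.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    peak = max(router_logits[i] for i in chosen)  # subtract max for stability
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


def moe_forward(x, experts, router_logits, k):
    """Run only the selected experts and mix their outputs by routing weight."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))


# Toy experts: each one just scales its input (illustrative only).
toy_experts = [lambda x, s=s: x * s for s in (1, 2, 3, 4)]

if __name__ == "__main__":
    print(f"active share: {active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B):.2%}")
    print(moe_forward(10.0, toy_experts, [0.1, 2.0, -1.0, 2.0], k=2))
```

In this sketch the unselected experts are never called, which is the source of the compute savings the description refers to: cost scales with the number of active experts rather than with the total parameter count.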