Qwen/Qwen3-Next-80B-A3B-Thinking

Qwen/Qwen3-Next-80B-A3B-Thinking

A large language model from Alibaba designed for complex reasoning tasks
2025-09-10
LLM
Model capability: thinkingModel capability: function_call
Input:
$0.143/1M tokens
Output:
$0.572/1M tokens
Bulk order? Contact your manager for exclusive deals

API Overview

Qwen3-Next-80B-A3B-Thinking is the next-generation foundational model released by Alibaba's Tongyi Qianwen team, specifically designed for complex reasoning tasks. Built on the innovative Qwen3-Next architecture, which integrates a hybrid attention mechanism (Gated DeltaNet combined with Gated Attention) and a highly sparse Mixture of Experts (MoE) structure, this model aims to deliver unparalleled training and inference efficiency. As a sparse model with a total of 80 billion parameters, it activates only about 3 billion parameters during inference, significantly reducing computational costs. When handling long-context tasks involving more than 32K tokens, its throughput is over 10 times higher than that of the Qwen3-32B model. This "Thinking" version is optimized specifically for tackling challenging multi-step tasks such as mathematical proofs, code synthesis, logical analysis, and planning—and it defaults to outputting the reasoning process in a structured "chain-of-thought" format. In terms of performance, it not only outperforms costlier models like Qwen3-32B-Thinking but also surpasses Gemini-2.5-Flash-Thinking across multiple benchmark tests.

Playground

Log in to explore more features! Click to Log In

API Analytics

API Reference (1)

API DescriptionAPI EndpointRequest MethodStabilityParameter Description
Chat(SiliconFlow)
POST
Stable
View Details

API Pricing

$
ModelDescriptionContextOfficial Price302.AI Price

Qwen/Qwen3-Next-80B-A3B-Thinking

-
256000

Input$0.143 / 1M tokens
Output$0.572 / 1M tokens

Input$0.143/ 1M tokens
Output$0.572/ 1M tokens
Original Price