sophnet/Qwen3-Next-80B-A3B-Thinking

sophnet/Qwen3-Next-80B-A3B-Thinking

Next-generation efficient mixture-of-experts (MoE) inference model
2025-09-24
LLM
Model capability: thinking
Input:
$0.143/1M tokens
Output:
$1.43/1M tokens
Bulk order? Contact your manager for exclusive deals

API Overview

Qwen3-Next-80B-A3B-Thinking is the next-generation efficient Mixture-of-Experts (MoE) inference model released by Alibaba’s Tongyi Lab. Its core positioning is as a new-generation general-purpose thinking engine characterized by “high intelligence density, low activation cost, and strong chain-of-thought capabilities.”

  • Advanced MoE Architecture: With a total of 80 billion parameters, it activates only 3 billion parameters (A3B), significantly reducing computational overhead and inference latency while maintaining powerful overall performance.
  • Deep Integration with Thinking Mode: It natively supports automatic step-by-step reasoning (Chain-of-Thought), enabling planning, verification, and integrated output for complex problems.
  • Ultra-long Context of 128K Tokens: It supports extremely long text inputs, making it ideal for multi-document analysis, complex instruction following, and deep dialogue scenarios.
  • Multi-language and Code Enhancement: It excels in tasks involving Chinese, English, and mainstream programming languages, balancing natural language understanding with structured generation.

───────────────────────────────────────────────────────────────────

Core Capabilities

🧠 Efficient Deep Reasoning: With computational costs close to those of a 7-billion-parameter dense model, it achieves logic decomposition and multi-hop question-answering capabilities at the 80-billion-parameter scale. ⚡ Ultimate Energy Efficiency: Featuring few activated parameters and high throughput, it is well-suited for high-concurrency agents, customer service systems, or edge deployments on mobile devices. 🧩 Agent-Ready Architecture: It supports Function Calling and structured outputs, serving as a core engine for autonomous task planning and tool invocation. 🌍 Expert-Level Bilingual Expression in Chinese and English: In scenarios such as technical writing, policy analysis, and creative generation, it produces content that is rigorous, fluent, and contextually appropriate.

Playground

Log in to explore more features! Click to Log In

API Analytics

API Reference (1)

API DescriptionAPI EndpointRequest MethodStabilityParameter Description
Chat(SophNet)
POST
Stable
View Details

API Pricing

$
ModelDescriptionContextOfficial Price302.AI Price

sophnet/Qwen3-Next-80B-A3B-Thinking

-
128000

Input$0.143 / 1M tokens
Output$1.43 / 1M tokens

Input$0.143/ 1M tokens
Output$1.43/ 1M tokens
Original Price