Qwen/Qwen3-30B-A3B-Thinking-2507

Qwen/Qwen3-30B-A3B-Thinking-2507

Efficient Mixture-of-Experts (MoE) Inference Model
2025-07-30
LLM
Model capability: thinkingModel capability: function_call
Input:
$0.1/1M tokens
Output:
$0.4/1M tokens
Bulk order? Contact your manager for exclusive deals

API Overview

Qwen3-30B-A3B-Thinking-2507 is an efficient Mixture-of-Experts (MoE) inference model released by Alibaba’s Tongyi Lab. Its core positioning is as a “lightweight deep-thinking engine,” specifically designed for scenarios that require multi-step logical reasoning but are constrained by computing resources.

  • Exquisite MoE Architecture: With a total parameter count of approximately 30 billion, it activates only 3 billion parameters (A3B), achieving advanced reasoning capabilities at a computational cost comparable to that of a dense model with around 7 billion parameters.
  • Native Integration of Thinking Mode: Automatically enables Chain-of-Thought reasoning, enabling step-by-step decomposition of complex problems, intermediate verification, and integration of conclusions.
  • Ultra-long Context of 128K Tokens: Supports long-text inputs, making it suitable for multi-document analysis, complex instruction parsing, and deep dialogue scenarios.
  • Multilingual and Code-Enhanced Performance: Delivers robust performance in Chinese, English, and mainstream programming language tasks, balancing natural language understanding with structured output generation.

───────────────────────────────────────────────────────────────────

Core Capabilities

🧠 Autonomous Step-by-Step Reasoning: When faced with composite tasks such as “explaining quantum entanglement and simulating it using Python,” it first explains the underlying principles and then generates runnable code.

High Energy-Efficiency Inference: The low number of activated parameters results in high throughput and low latency, making it ideal for edge devices, highly concurrent agents, or cost-sensitive applications.

🧩 Tool-Compatibility Friendly: It can call calculators, code interpreters, or search modules to verify intermediate results, ensuring the reliability of the final output.

🛡️ Safe and Controllable Output: Equipped with built-in content filtering mechanisms, supports audit logs and format constraints, meeting enterprise compliance deployment requirements.

Playground

Log in to explore more features! Click to Log In

API Analytics

API Reference (1)

API DescriptionAPI EndpointRequest MethodStabilityParameter Description
Chat(SiliconFlow)
POST
Stable
View Details

API Pricing

$
ModelDescriptionContextOfficial Price302.AI Price

Qwen/Qwen3-30B-A3B-Thinking-2507

-
256000

Input$0.1 / 1M tokens
Output$0.4 / 1M tokens

Input$0.1/ 1M tokens
Output$0.4/ 1M tokens
Original Price