qwen3-vl-235b-a22b-thinking

qwen3-vl-235b-a22b-thinking

The flagship multimodal mixture-of-experts (MoE) model launched by Tongyi Qianwen
2025-09-24
LLM
Model capability: imageModel capability: thinking
Input:
$0.286/1M tokens
Output:
$2.86/1M tokens
Bulk order? Contact your manager for exclusive deals

API Overview

Qwen3-VL-235B-A22B-Thinking is a super-large-scale multimodal reasoning model launched by Alibaba’s Tongyi Lab, with a core mission as a “deep-thinking engine for vision and language,” specifically designed for challenging, multi-step visual-textual joint reasoning tasks.

  • Exclusive Optimization for Thinking Mode: Enhances chain-of-thought reasoning capabilities based on standard VL models, supporting automatic step-by-step thinking, intermediate verification, and result integration.
  • High Efficiency with MoE Architecture: With a total of 235 billion parameters, only 22 billion are activated, significantly reducing inference costs while maintaining top-tier performance.
  • Leading in Complex Visual Tasks: Capable of parsing high-information-density images such as charts, GUI interfaces, and technical drawings, and performing logical reasoning in conjunction with contextual information.
  • Long-Context Multimodal Fusion: Supports mixed inputs of images, videos, PDFs, and ultra-long texts, enabling deep cross-modal correlation analysis.

───────────────────────────────────────────────────────────────────

Core Capabilities

🧠 Autonomous Step-by-Step Reasoning: For tasks such as “calculating profit margins from financial report screenshots and comparing them with industry averages,” it can automatically break down the process into multiple steps—including recognition, calculation, retrieval, and summarization.

👁️ Advanced Visual Understanding: Not only does it understand image content but also reasons about relationships among elements, such as interface interaction logic, data trends, and spatial structures.

🧩 Tool-Coordinated Closed Loop: It can call calculators, code interpreters, or search tools to verify intermediate results, ensuring that the final output is accurate and reliable.

📊 Professional Scenario Adaptation: In fields requiring rigorous visual analysis—such as finance, scientific research, and engineering—it provides AI-assisted decision-making capabilities approaching expert-level expertise.

Playground

Log in to explore more features! Click to Log In

API Analytics

API Reference (1)

API DescriptionAPI EndpointRequest MethodStabilityParameter Description
Chat (Tongyi Qianwen-OCR)
POST
Stable
View Details

API Pricing

$
ModelDescriptionContextOfficial Price302.AI Price

qwen3-vl-235b-a22b-thinking

-
126976

Input$0.286 / 1M tokens
Output$2.86 / 1M tokens

Input$0.286/ 1M tokens
Output$2.86/ 1M tokens
Original Price