Qwen/Qwen3-VL-30B-A3B-Thinking

Qwen/Qwen3-VL-30B-A3B-Thinking

A visual language model from Alibaba
2025-09-23
LLM
Model capability: imageModel capability: thinkingModel capability: function_call
Input:
$0.1/1M tokens
Output:
$0.4/1M tokens
Bulk order? Contact your manager for exclusive deals

API Overview

Qwen3-VL is the most powerful visual language model in the Qwen series to date. The model has undergone a comprehensive upgrade, featuring enhanced text understanding and generation, deeper visual perception and reasoning capabilities, extended context length, improved spatial and video dynamic understanding, and stronger agent interaction abilities. This "Thinking" version, with its enhanced reasoning capabilities, is built on a Mixture-of-Experts (MoE) architecture and excels at tasks such as interacting with graphical user interfaces of PC/mobile devices, generating code from images, and performing advanced multimodal reasoning in STEM fields. It natively supports a context length of 256K and boasts expanded OCR capabilities that cover 32 languages.

Playground

Log in to explore more features! Click to Log In

API Analytics

API Reference (1)

API DescriptionAPI EndpointRequest MethodStabilityParameter Description
Chat(SiliconFlow)
POST
Stable
View Details

API Pricing

$
ModelDescriptionContextOfficial Price302.AI Price

Qwen/Qwen3-VL-30B-A3B-Thinking

-
64000

Input$0.1 / 1M tokens
Output$0.4 / 1M tokens

Input$0.1/ 1M tokens
Output$0.4/ 1M tokens
Original Price