qwen2.5-32b-instruct

qwen2.5-32b-instruct

The 3.2-billion-parameter version of the Qwen2.5 series
2024-09-19
LLM
Model capability: function_call
Input:
$0.29/1M tokens
Output:
$0.86/1M tokens
Bulk order? Contact your manager for exclusive deals

API Overview

Qwen2.5 is a series of general-purpose large language models launched by Alibaba, featuring "ultra-long context support up to 128K tokens" and "structured data processing capabilities." With a deeply optimized inference architecture, it provides industry-leading solutions for handling complex tasks.

  • Comprehensive Performance Leadership: The flagship model Qwen2.5-72B outperforms competitors such as Llama-3.1-70B and Mistral-Large-V2 in benchmark tests including programming (LiveCodeBench 55.5), mathematics (MATH 83.1), and general abilities (MMLU-Pro 71.1).
  • Ultra-Long Context Support: Natively supports a context of 128K tokens, and with YaRN technology, this can be extended to 131K tokens, making it easy to handle ultra-long text tasks.
  • Structured Data Processing: Significantly enhances the ability to understand structured data such as tables, supports JSON-formatted outputs, and is well-suited for enterprise-level data interaction scenarios.
  • Multi-Language Coverage: Supports over 29 languages (including Chinese, English, French, Spanish, Arabic, and more), with localized optimizations that improve cross-language understanding accuracy.
  • Open Source and Open Access: The entire series of models has been open-sourced on Hugging Face and ModelScope Community, offering API calls and local deployment options, and supporting mainstream frameworks such as SGLang and vLLM.

───────────────────────────────────────────────────────────────────

Core Capabilities

🚀 Performance Leap: Outperforms top competitors in programming, mathematics, and multi-language tasks, setting a new benchmark for enterprise-level AI applications.

📏 Ultra-Long Text Processing: Natively supports a context of 128K tokens, which can be extended to 131K tokens via YaRN technology.

🛠️ Structured Output: Supports JSON-formatted outputs, making it easy to parse structured data such as tables and enhancing enterprise data interaction efficiency.

🌐 Multi-Language Expert: Deeply optimized for low-resource languages such as Chinese and Arabic, improving cross-language understanding accuracy by 15%.

Cost-Effective Architecture: Achieves “large-scale strength with high performance” using 72 billion parameters, reducing the entry barrier for enterprise AI applications by 60%.

Playground

Log in to explore more features! Click to Log In

API Analytics

API Reference (1)

API DescriptionAPI EndpointRequest MethodStabilityParameter Description
Chat(Qwen2.5)
POST
Stable
View Details

API Pricing

$
ModelDescriptionContextOfficial Price302.AI Price

qwen2.5-32b-instruct

-
128000

Input$0.29 / 1M tokens
Output$0.86 / 1M tokens

Input$0.29/ 1M tokens
Output$0.86/ 1M tokens
Original Price