qwen2.5-72b-instruct

qwen2.5-72b-instruct

Qwen2.5-72B-Instruct is Alibaba’s flagship open-source language model, offering industry-leading solutions for handling complex tasks through a deeply optimized inference architecture.
2024-09-19
LLM
Model capability: function_call
Input:
$0.58/1M tokens
Output:
$1.72/1M tokens
Bulk order? Contact your manager for exclusive deals

API Overview

Qwen2.5-72B-Instruct is Alibaba’s flagship open-source language model, primarily positioned as a high-performance, large-scale enterprise-grade model optimized for long-text processing and structured output, supporting a context length of 128K.

  • Performance Leap: The 72B-parameter model outperforms Llama-3.1-70B in 12 authoritative benchmarks including MMLU and MATH, with inference speeds twice as fast as its peers and costs only one-fourth as much.
  • Applicable Scenarios: It is well-suited for high-frequency interactive applications such as financial risk control, code generation, and multilingual translation, supporting JSON-structured outputs and the execution of complex system instructions.
  • Multimodal Capabilities: It supports over 29 languages (including Chinese, English, French, Japanese, Korean, and more), with a 30% improvement in understanding structured data such as tables.
  • Competitive Comparison: In the HumanEval code benchmark, it scores 86.6 points (CodeQwen1.5 scored 86.0), and achieves an 88.2% task completion rate on MBPP (Llama3.1-70B scored 84.2%).

───────────────────────────────────────────────────────────────────

Core Capabilities

⚡ Ultra-High-Speed Inference: Featuring a proprietary KV cache optimization technology, with response latency below 50ms.

📊 Long-Text Processing: Supports 128K context understanding and 8K continuous text generation, improving the efficiency of complex report processing by 50%.

🔑 Structured Output: Achieves a 92.3% accuracy rate in JSON generation and processes tabular data three times faster than industry averages.

🌍 Multilingual Coverage: Seamlessly switches among over 29 languages; in mixed Chinese-English scenarios, its F1 score reaches 89.7 (SDXL scores 85.2).

🛠️ Tool Ecosystem: Natively compatible with vLLM/Ollama tool calls.

Playground

Log in to explore more features! Click to Log In

API Analytics

API Reference (1)

API DescriptionAPI EndpointRequest MethodStabilityParameter Description
Chat(Qwen2.5)
POST
Stable
View Details

API Pricing

$
ModelDescriptionContextOfficial Price302.AI Price

qwen2.5-72b-instruct

-
128000

Input$0.58 / 1M tokens
Output$1.72 / 1M tokens

Input$0.58/ 1M tokens
Output$1.72/ 1M tokens
Original Price