Pro/Qwen/Qwen2.5-7B-Instruct

Pro/Qwen/Qwen2.5-7B-Instruct

The 700-million-parameter version of the Qwen2.5 series
2024-09-19
LLM
Model capability: function_call
Input:
$0.05/1M tokens
Output:
$0.05/1M tokens
Bulk order? Contact your manager for exclusive deals

API Overview

Qwen2.5 is a general-purpose large language model series launched by Alibaba, featuring "ultra-long context support of 128K tokens" and "structured data processing capabilities." With a deeply optimized inference architecture, it provides industry-leading solutions for handling complex tasks.

  • Comprehensive performance leadership: The flagship model Qwen2.5-72B outperforms competitors such as Llama-3.1-70B and Mistral-Large-V2 in benchmark tests including programming (LiveCodeBench 55.5), mathematics (MATH 83.1), and general abilities (MMLU-Pro 71.1).
  • Ultra-long context support: Natively supports a context of 128K tokens, and with YaRN technology, this can be extended to 131K tokens, making it easy to handle ultra-long text tasks.
  • Structured data processing: Significantly enhances the ability to understand structured data such as tables, supports JSON-formatted output, and is well-suited for enterprise-level data interaction scenarios.
  • Multi-language coverage: Supports over 29 languages (including Chinese, English, French, Spanish, Arabic, and more), with localized optimizations that improve cross-language understanding accuracy.
  • Open-source and open: The entire series of models has been open-sourced on Hugging Face and ModelScope Community, offering API access and local deployment options, and supporting mainstream frameworks such as SGLang and vLLM.

───────────────────────────────────────────────────────────────────

Core Capabilities

🚀 Performance leap: Outperforms top competitors in programming, mathematics, and multi-language tasks, setting a new benchmark for enterprise-level AI applications.

📏 Ultra-long text processing: Natively supports a context of 128K tokens, which can be extended to 131K tokens via YaRN technology.

🛠️ Structured output: Supports JSON-formatted output, making it easy to parse structured data such as tables and enhancing enterprise data interaction efficiency.

🌐 Multi-language expert: Deeply optimized for low-resource languages such as Chinese and Arabic, improving cross-language understanding accuracy by 15%.

Cost-effective architecture: Achieves “large-scale strength with high performance” using 72 billion parameters, reducing the entry barrier for enterprise AI applications by 60%.

Playground

Log in to explore more features! Click to Log In

API Analytics

API Reference (1)

API DescriptionAPI EndpointRequest MethodStabilityParameter Description
Chat(SiliconFlow)
POST
Stable
View Details

API Pricing

$
ModelDescriptionContextOfficial Price302.AI Price

Pro/Qwen/Qwen2.5-7B-Instruct

-
32000

Input$0.05 / 1M tokens
Output$0.05 / 1M tokens

Input$0.05/ 1M tokens
Output$0.05/ 1M tokens
Original Price