qwen2.5-7b-instruct-1m

qwen2.5-7b-instruct-1m

Compared to Qwen2, Qwen2.5 has acquired significantly more knowledge and achieved substantial improvements in programming and mathematical abilities. It supports a context length of 1 million tokens.
2025-01-30
LLM
Model capability: function_call
Input:
$0.072/1M tokens
Output:
$0.143/1M tokens
Bulk order? Contact your manager for exclusive deals

API Overview

Qwen2.5-7B-Instruct-1M is a lightweight, long-context text instruction fine-tuned model from Alibaba’s Tongyi Qwen2.5 series. Its core positioning is as a “low-threshold, ultra-long-text processing assistant,” achieving 1 million-token-level context support with a lightweight parameter count, striking a balance between long-text capabilities and deployment costs.

  • Ultra-long-context support: Natively supports up to 1,010,000 tokens as input and can generate texts of up to 8,192 tokens, easily handling ultra-long documents such as lengthy reports and codebases.
  • Lightweight and efficient architecture: With a total of 7.62 billion parameters, it adopts a Transformer architecture combined with the GQA attention mechanism, integrating optimization techniques such as RoPE and SwiGLU.
  • Full capability compatibility: It inherits extensive knowledge reserves, supports multiple languages (29+), and maintains strong foundational coding and mathematical abilities, while enhancing its capacity for long-text information extraction and logical coherence.
  • Flexible application scenarios: Suitable for summarizing long documents, handling ultra-long conversations, and performing code audits; maintains stable accuracy within 262,144 tokens, and is adaptable to small-to-medium-sized hardware configurations.

───────────────────────────────────────────────────────────────────

Core Capabilities

📚 Ultra-long-text parsing: Accurately extracts key information, logical relationships, and core conclusions from texts spanning up to 1 million tokens.

🧠 Long-context reasoning: Based on cross-paragraph contextual information, it performs question answering, analysis, and summarization while preserving fundamental logical capabilities.

🌍 Multilingual adaptation: Supports translation, interpretation, and analysis of long texts in over 29 languages.

📊 Structured processing: Understands ultra-long tables and multi-module documents, generating integrated analytical conclusions.

Playground

Log in to explore more features! Click to Log In

API Analytics

API Reference (1)

API DescriptionAPI EndpointRequest MethodStabilityParameter Description
Chat(Qwen2.5)
POST
Stable
View Details

API Pricing

$
ModelDescriptionContextOfficial Price302.AI Price

qwen2.5-7b-instruct-1m

-
1000000

Input$0.072 / 1M tokens
Output$0.143 / 1M tokens

Input$0.072/ 1M tokens
Output$0.143/ 1M tokens
Original Price