deepseek-ai/DeepSeek-R1-Distill-Qwen-32B

deepseek-ai/DeepSeek-R1-Distill-Qwen-32B

DeepSeek has launched a large-scale distillation language model that combines high-performance inference with cost-effectiveness, based on the Qwen-32B architecture.
2025-02-04
LLM
Model capability: function_call
Input:
$0.18/1M tokens
Output:
$0.18/1M tokens
Bulk order? Contact your manager for exclusive deals

API Overview

DeepSeek-R1-Distill-Qwen-32B is a large-scale distilled language model launched by DeepSeek, with a core focus on achieving a balance between high-performance inference and cost-effective deployment. Based on the Qwen-32B architecture, this model was trained through distillation using reinforcement learning data from DeepSeek-R1, enabling it to deliver inference capabilities close to those of ultra-large-scale models while maintaining a moderate number of parameters.

  • Outstanding performance: On benchmarks for mathematics (MATH), coding (HumanEval), and general reasoning, its performance surpasses that of Llama-3.1-70B, and even rivals some of Mixtral-8x22B. It stands out as one of the top open-source distilled models available today.
  • Superior inference capability: Thanks to the high-quality distillation data from DeepSeek-R1, this model excels in logical reasoning and complex problem-solving, demonstrating “thinking” abilities comparable to those of large-scale models.
  • Cost-effective: Compared to MoE models with hundreds of billions of parameters (such as DeepSeek-R1), this model has lower inference costs and requires less GPU memory, making it ideal for enterprises and individual developers looking to achieve strong inference capabilities at a lower cost.
  • Bilingual advantage: Inheriting the Qwen series’ excellent native support for both Chinese and English, this model can handle complex bilingual tasks seamlessly.

───────────────────────────────────────────────────────────────────

Core Capabilities

🚀 Ultra-high throughput: Compared to full-featured large models at the same performance level, this model offers faster inference speeds and lower latency, making it well-suited for applications with strict response-time requirements.

🧠 Deep structured reasoning: It performs exceptionally well on mathematical proofs and logical deduction tasks, capable of handling complex structured data and multi-step reasoning processes.

⌨️ Intelligent code generation: Equipped with powerful programming capabilities, it can understand complex algorithmic logic and assist developers in code generation and debugging.

📉 Low-cost deployment: As a dense model, its deployment threshold is significantly lower than that of hundred-billion-parameter MoE models. A single GPU with 80GB of memory (such as A100 or H100) can easily deploy this model.

Playground

Log in to explore more features! Click to Log In

API Analytics

API Reference (1)

API DescriptionAPI EndpointRequest MethodStabilityParameter Description
Chat(SiliconFlow)
POST
Stable
View Details

API Pricing

$
ModelDescriptionContextOfficial Price302.AI Price

deepseek-ai/DeepSeek-R1-Distill-Qwen-32B

-
64000

Input$0.18 / 1M tokens
Output$0.18 / 1M tokens

Input$0.18/ 1M tokens
Output$0.18/ 1M tokens
Original Price