deepseek-ai/DeepSeek-R1-Distill-Qwen-14B

deepseek-ai/DeepSeek-R1-Distill-Qwen-14B

The distillation-based open-source language model launched by DeepSeek is primarily positioned for "high-performance, lightweight inference."
2025-02-04
LLM
Model capability: function_call
Input:
$0.1/1M tokens
Output:
$0.1/1M tokens
Bulk order? Contact your manager for exclusive deals

API Overview

DeepSeek-R1-Distill-Qwen-14B is a distilled open-source language model released by DeepSeek, primarily designed for **high-performance, lightweight inference**. Based on the Qwen-14B architecture, this model was trained via distillation using reinforcement learning data from DeepSeek-R1, aiming to deliver inference capabilities close to those of large models at a lower cost.

  • Strong performance: In multiple benchmark tests, it outperforms both the native Qwen-14B and Llama-3.1-14B, achieving the effect of “using a distilled small model to rival flagship large models.”
  • Outstanding inference capability: Thanks to the distillation data from DeepSeek-R1, this model excels in complex tasks such as mathematical reasoning and code generation, with significantly enhanced logical thinking abilities.
  • High cost-effectiveness: As a 14B-parameter model, its inference costs are much lower than those of MoE models with tens of billions of parameters (such as DeepSeek-R1 itself), making it an ideal choice for users seeking cost-efficient solutions.
  • Bilingual optimization: Inheriting the Qwen series’ excellent support for both Chinese and English, it can handle bilingual tasks seamlessly.

───────────────────────────────────────────────────────────────────

Core Capabilities

⚡ Ultra-fast response: With a moderate model size and rapid inference speed, it’s suitable for deployment on consumer-grade GPUs or cloud servers, meeting the demands of low-latency applications.

🧠 Deep reasoning: It performs exceptionally well in mathematical benchmarks such as GSM8K and MATH, capable of solving complex logical and mathematical problems.

⌨️ Code generation: Trained on high-quality code datasets, it scores highly in tests like HumanEval and can assist developers in programming and debugging.

📉 Low-cost deployment: Compared to full-fledged large models with hundreds of billions of parameters, this distilled model maintains high performance while dramatically reducing hardware resource consumption and operational maintenance costs.

Playground

Log in to explore more features! Click to Log In

API Analytics

API Reference (1)

API DescriptionAPI EndpointRequest MethodStabilityParameter Description
Chat(SiliconFlow)
POST
Stable
View Details

API Pricing

$
ModelDescriptionContextOfficial Price302.AI Price

deepseek-ai/DeepSeek-R1-Distill-Qwen-14B

-
64000

Input$0.1 / 1M tokens
Output$0.1 / 1M tokens

Input$0.1/ 1M tokens
Output$0.1/ 1M tokens
Original Price