deepseek/deepseek-v3.2

deepseek/deepseek-v3.2

The open-source large model with a multi-expert mixture-of-experts (MoE) architecture launched by DeepSeek.
2025-12-02
LLM
Model capability: function_call
Input:
$0.286/1M tokens
Output:
$0.429/1M tokens
Bulk order? Contact your manager for exclusive deals

API Overview

DeepSeek-V3.2 is the flagship open-source general-purpose language model launched by DeepSeek (DeepSeek), with a core focus on delivering exceptional performance—surpassing comparable dense models—while maintaining extremely low inference costs through state-of-the-art matrix multiplication optimization and a multi-expert mixture-of-experts (MoE) architecture.

  • Outstanding cost-effectiveness: Open-sourced under the MIT License, with fully public weights, allowing commercial use. API call costs are exceptionally low, making it one of the most cost-efficient flagship models currently available on the market.
  • Top-tier performance: It comprehensively outperforms Llama-3.1/3.2-405B in multiple benchmarks such as MMLU and MATH-500, achieving top-level inference and coding capabilities, with overall performance approaching that of GPT-4o.
  • Ultra-large-scale architecture: Featuring a MoE architecture with 2.168 trillion total parameters and 416 billion activated parameters, it supports a context length of 256k, enabling it to handle massive amounts of information and complex tasks.
  • Efficient inference: Leveraging FP8 quantization technology and extreme matrix multiplication (GEMM) optimizations, it delivers ultra-fast inference speeds, supporting smooth interactions even in high-concurrency scenarios.
  • Multi-language capabilities: Particularly outstanding in Chinese and English tasks, while also demonstrating strong multilingual understanding and generation abilities, making it well-suited for globalized application scenarios.

───────────────────────────────────────────────────────────────────

Core Capabilities

⚡ Extreme Matrix Optimization

Extreme inference optimization based on FP8 and GEMM. Through deep refinement of underlying operators, it achieves extremely high computational density and throughput, enabling ultra-large-scale models to run efficiently even with limited computing power.

🧠 Powerful Mixture-of-Experts

Adopting a 2.168T MoE architecture with up to 416B activated parameters, it maintains enormous model capacity while requiring only a small number of experts to be activated to complete tasks, striking the perfect balance between “large model” and “low cost.”

🌐 Superior Inference and Coding

It sets new SOTA records in MATH-500 and code-generation tasks. Equipped with top-notch logical reasoning and coding capabilities, it can deliver high-quality solutions whether tackling complex mathematical proofs or full-stack software development.

Playground

Log in to explore more features! Click to Log In

API Analytics

API Reference (1)

API DescriptionAPI EndpointRequest MethodStabilityParameter Description
Chat(PPIO)
POST
Stable
View Details

API Pricing

$
ModelDescriptionContextOfficial Price302.AI Price

deepseek/deepseek-v3.2

-
163840

Input$0.286 / 1M tokens
Output$0.429 / 1M tokens

Input$0.286/ 1M tokens
Output$0.429/ 1M tokens
Original Price