
baidu/ernie-4.5-21b-a3b-thinking
API Overview
ERNIE-4.5-21B-A3B-Thinking is a 21-billion-parameter open-source mixture-of-experts model released by Baidu, with a core focus on a high-performance, lightweight inference engine built on Baidu’s self-developed PaddlePaddle framework. It is specifically designed for complex tasks such as logical reasoning, mathematical problem-solving, and academic analysis.
- Efficient Sparse Architecture: Adopting the MoE architecture, it has a total of 21 billion parameters yet activates only 3 billion parameters per token, significantly reducing computational overhead and achieving extremely high parameter efficiency (performance comparable to larger models with more parameters).
- Ultra-long Context Support: Supporting a context window of up to 128K tokens, this model has been specially trained to handle massive amounts of information reliably, greatly reducing hallucination issues and making it ideal for complex, long-text tasks.
- Self-developed Framework Foundation: Unlike mainstream models, it is trained and optimized based on Baidu’s self-developed PaddlePaddle framework—a model adopted globally only by Baidu and Google, offering strong technological independence.
───────────────────────────────────────────────────────────────────
Core Capabilities
🧠 Deep Logical and Mathematical Reasoning: Equipped with an efficient tool-call capability, it is specially designed for logical reasoning and mathematical problem-solving, delivering outstanding performance in benchmarks such as BBH and CMATH—approaching or even surpassing industry-leading competitors.
🚀 Exclusive MoE Sparse Activation Architecture: With a total of 21 billion parameters yet activating only 3 billion parameters per token, it maintains high performance while dramatically reducing computational costs, striking the perfect balance between high performance and low cost.
📜 Precise Handling of Massive Long Texts: Supporting a context window of up to 128K tokens, this model has undergone specialized academic-level training, enabling it to handle complex long-text tasks (such as academic analysis) reliably and effectively avoiding information loss and hallucinations.
🛠️ Powerful Tool Calling and Integration: It supports structured function calls and integration with external APIs, perfectly adapting to scenarios such as program synthesis, symbolic reasoning, and multi-agent workflows, offering exceptional scalability.
🌐 Chinese-English Bilingual and Multimodal Compatibility: Optimized for both Chinese and English, it boasts excellent compatibility with multimodal tasks. Built on the PaddlePaddle framework, it ensures efficient hardware adaptation and broad applicability for developers worldwide.
Playground
Log in to explore more features! Click to Log In