
baidu/ernie-4.5-0.3b
API Overview
ERNIE-4.5-0.3B is a lightweight dense language model released by Baidu. As the smallest open-source model in the Wenxin 4.5 series, its core positioning is as a low-resource, high-efficiency general-purpose language understanding and generation engine, suitable for edge-side deployment and low-cost inference scenarios.
- Ultra-lightweight: With only 300 million parameters, it can run in resource-constrained environments such as mobile phones and embedded devices.
- High-efficiency inference: Optimized based on the PaddlePaddle framework, it supports ultra-long contexts (32K tokens) and delivers rapid responses.
- Ready-to-use: Available in both PyTorch and PaddlePaddle formats, and can be deployed with a single line of code using FastDeploy.
- Protocol compatibility: Its API interface is compatible with OpenAI, enabling seamless integration into existing LLM application ecosystems.
- Completely open-source: Released under the Apache 2.0 license, it supports both commercial and academic use, and comes with the ERNIEKit fine-tuning toolchain.
───────────────────────────────────────────────────────────────────
Core Capabilities
📱 Edge-friendly: With an extreme compression of 300 million parameters and low memory usage, it’s ideal for deployment on mobile devices and edge devices.
⚡ Long-context support: Up to 32,768 tokens, easily handling scenarios such as long document summarization and conversation history.
🔌 OpenAI compatibility: After deployment with FastDeploy, it provides a standard OpenAI API, allowing zero-cost migration of existing applications.
🛠️ Full-stack toolchain: ERNIEKit supports post-training techniques such as LoRA, DPO, and quantization, enabling rapid customization of domain-specific models.
🔓 Commercially friendly and open-source: Released under the Apache 2.0 license, with no usage restrictions, supporting private deployment and secondary development.
Playground
Log in to explore more features! Click to Log In