
Baichuan-M3
API Overview
Baichuan-M3 is the flagship medical-enhanced large language model from Baichuan Intelligence, positioned as “high-reliability medical AI driven by clinical decision-making process modeling.” It is designed to deliver trustworthy consultation and decision-support capabilities for real-world clinical scenarios.
- Key Upgrades: Compared with Baichuan-M2, it achieves a 28-percentage-point improvement on the HealthBench-Hard benchmark, scoring 44.4 points and surpassing GPT-5.2.
- Applicable Scenarios: Medical education, health consultations, clinical decision support, end-to-end electronic health record simulation, and diagnostic assistance.
- Product Value: It proactively collects key information and builds verifiable reasoning paths, significantly reducing ambiguous recommendations and ineffective responses.
- Competitor Comparison: It outperforms OpenAI’s latest models across all four dimensions: HealthBench, HealthBench-Hard, SCAN-bench, and hallucination rate.
- Test Data: In the clinical consultation phase of SCAN-bench, it leads the next-best model by 12.4 points; its hallucination rate without tool assistance is lower than that of GPT-5.2.
───────────────────────────────────────────────────────────────────
Core Capabilities
🩺 Fully Functional Consultation Capability: The only model ranked first in all three SCAN-bench dimensions: clinical consultation, laboratory tests, and disease diagnosis.
🧠 Low Hallucination and High Reliability: Powered by the Fact-Aware RL framework, it verifies medical claims in real time, ensuring that answers are based on authoritative evidence.
⚡ Efficient Inference Deployment: W4 quantization requires only 26% of the memory, and Gated Eagle3 speculative decoding accelerates inference by 96%, supporting large-scale deployment.
🏆 The World’s Strongest Medical Model: Ranked No. 1 overall in the HealthBench benchmark, validated through 5,000 rounds of high-fidelity conversations constructed by 262 practicing physicians.
🔬 Clinical Process Modeling: Using SPAR segmented reinforcement learning, it trains the consultation process as four distinct stages: medical history collection → differential diagnosis → examinations → final diagnosis.
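The four-stage consultation flow named above can be pictured as a simple ordered sequence. The sketch below is only an illustration of that ordering: the `Stage` enum, `next_stage`, and `walk` are hypothetical names introduced here, and nothing in it reflects how SPAR or Baichuan-M3 is actually implemented.

```python
from enum import Enum
from typing import List, Optional


class Stage(Enum):
    """Consultation stages, in the order listed in the overview (hypothetical type)."""
    HISTORY_COLLECTION = "medical history collection"
    DIFFERENTIAL_DIAGNOSIS = "differential diagnosis"
    EXAMINATIONS = "examinations"
    FINAL_DIAGNOSIS = "final diagnosis"


# Enum members iterate in definition order, which matches the stated flow.
STAGE_ORDER: List[Stage] = list(Stage)


def next_stage(current: Stage) -> Optional[Stage]:
    """Return the stage that follows `current`, or None after the final stage."""
    idx = STAGE_ORDER.index(current)
    return STAGE_ORDER[idx + 1] if idx + 1 < len(STAGE_ORDER) else None


def walk(start: Stage = Stage.HISTORY_COLLECTION) -> List[str]:
    """List the stage names visited from `start` through the final diagnosis."""
    path: List[str] = []
    stage: Optional[Stage] = start
    while stage is not None:
        path.append(stage.value)
        stage = next_stage(stage)
    return path


if __name__ == "__main__":
    print(" → ".join(walk()))
```

A strictly linear order is assumed here for simplicity. A real consultation may loop back, for example returning to history collection after an examination result, which this sketch does not model.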
───────────────────────────────────────────────────────────────────
Selected Test Data
