
Dots.OCR
The OCR large model independently deployed by the 302.AI team
2025-08-23
Pricing:
Bulk order? Contact your manager for exclusive deals
API Overview
dots.ocr is a powerful, multilingual document parser that unifies layout detection and content recognition within a single vision-language model while maintaining good reading order. Despite its compact 1.7B-parameter LLM foundation, it achieves state-of-the-art(SOTA) performance.
- Powerful Performance: dots.ocr achieves SOTA performance for text, tables, and reading order on OmniDocBench, while delivering formula recognition results comparable to much larger models like Doubao-1.5 and gemini2.5-pro.
- Multilingual Support: dots.ocr demonstrates robust parsing capabilities for low-resource languages, achieving decisive advantages across both layout detection and content recognition on our in-house multilingual documents benchmark.
- Unified and Simple Architecture: By leveraging a single vision-language model, dots.ocr offers a significantly more streamlined architecture than conventional methods that rely on complex, multi-model pipelines. Switching between tasks is accomplished simply by altering the input prompt, proving that a VLM can achieve competitive detection results compared to traditional detection models like DocLayout-YOLO.
- Efficient and Fast Performance: Built upon a compact 1.7B LLM, dots.ocr provides faster inference speeds than many other high-performing models based on larger foundations.
API Console
Log in to explore more features! Click to Log In
API Reference (2)
API Pricing
$¥ 円 ₽