doubao-seed-1-8-251215

doubao-seed-1-8-251215

Model optimized for multi-modal agent scenarios. Enhanced agent capabilities, upgraded multi-modal understanding, and more flexible context management.
2025-12-19
LLM
Model capability: imageModel capability: thinkingModel capability: function_call
Input:
$0.1143/1M tokensstarting from
Output:
$0.286/1M tokensstarting from
Bulk order? Contact your manager for exclusive deals

API Overview

Doubao-Seed-1.8 (DouBao Large Model 1.8) is ByteDance’s flagship multimodal language model, primarily positioned as an all-purpose reasoning engine for the Agent era. It has been deeply optimized for complex task planning, tool invocation, and multimodal understanding.

  • Comprehensively enhanced Agent capabilities: Significantly upgraded tool invocation, complex instruction following, and OS-level operational abilities, enabling autonomous completion of cross-app, multi-step tasks.
  • Breakthrough in video understanding: Supports parsing of up to 1280 frames per video in a single session, allowing for fast browsing at low frame rates combined with focused analysis at high frame rates—ideal for applications such as surveillance and quality inspection.
  • Intelligent context management: Natively supports dynamic clearing of low-value historical information, ensuring stable execution of long-running tasks without crashes.
  • Leading multimodal understanding: Outperforms top global models in multiple public benchmarks; ranks first globally in the BrowserComp agent evaluation.

───────────────────────────────────────────────────────────────────

Core Capabilities

🧠 Autonomous task planning: When faced with complex, multi-scenario demands, it can automatically decompose tasks, invoke tools, and integrate results.

👁️ Upgraded multimodal understanding: Greatly enhances foundational visual comprehension capabilities, enabling low-frame-rate understanding of ultra-long videos. Additionally, it shows improvements in video motion understanding and complex spatial comprehension abilities.

📄 Deep document collaboration: Simultaneously parses over 10 types of long texts—including email attachments, strategic documents, and industry reports—to generate structured decision recommendations.

🧩 Native Agent architecture: From perception and reasoning to execution, the entire process is completed within a single model, avoiding distortion and latency caused by stitching together multiple modules.

Playground

Log in to explore more features! Click to Log In

API Analytics

API Reference (1)

API DescriptionAPI EndpointRequest MethodStabilityParameter Description
Chat (ByteDance Doubao)
POST
Stable
View Details

API Pricing

$
ModelDescriptionContextOfficial Price302.AI Price

doubao-seed-1-8-251215

The price of input length [0, 32]k and output length [0, 0.2]k
256000

Input$0.1143 / 1M tokens
Output$0.286 / 1M tokens

Input$0.1143/ 1M tokens
Output$0.286/ 1M tokens
Original Price

doubao-seed-1-8-251215

The price of input length [0, 32] k and output length (0.2,+∞) k
256000

Input$0.1143 / 1M tokens
Output$1.143 / 1M tokens

Input$0.1143/ 1M tokens
Output$1.143/ 1M tokens
Original Price

doubao-seed-1-8-251215

The price of input length (32, 128] k
256000

Input$0.1715 / 1M tokens
Output$2.286 / 1M tokens

Input$0.1715/ 1M tokens
Output$2.286/ 1M tokens
Original Price

doubao-seed-1-8-251215

The price of input length (128, 256] k
256000

Input$0.343 / 1M tokens
Output$3.43 / 1M tokens

Input$0.343/ 1M tokens
Output$3.43/ 1M tokens
Original Price