
gemini-2.5-flash-deepsearch
Possesses efficient retrieval and independent research capabilities.
2025-06-08
Input:
$5/1M tokens
Output:
$25/1M tokens
Bulk order? Contact your manager for exclusive deals
API Overview
Basic Information
- Gemini 2.5 Flash belongs to the Gemini model family and is an optimized version designed for “high efficiency, low latency, and low cost” tasks.
- It supports multiple input modalities, including text, images, audio, and video.
- This model offers an exceptionally large context window—capable of handling long texts or long-context inputs of up to approximately 1,000,000 tokens, making it ideal for analyzing larger or more complex datasets or documents.
Core Features
- High Speed and Low Cost: Designed to be an “efficient workhorse for everyday tasks,” it strikes a fine balance between response speed and resource consumption, making it highly suitable for applications that require high volume, frequent use, or real-time performance.
- Multi-modal Input Understanding: Beyond just text, it can understand and process images, audio, video, and mixed-modal inputs, supporting cross-modal tasks.
- Tool/Plugin Integration (Tool use / function calling): In conversations or tasks, it can call external tools or custom functions, making it well-suited for agent-based workflows—for example, retrieving information, executing code, or integrating third-party data.
- Flexible Control of “Thinking Budget”: Developers can set a “thinking budget”—controlling the number of tokens the model uses for internal reasoning—thus allowing flexible trade-offs between response speed and result quality.
Technical Highlights
- Long Context Capability: A 1,000,000-token context window means it can handle extremely long documents, dialogues, or multi-hour video content without losing critical information from earlier parts of the context due to its length.
- Multi-modal + Real-time Performance: By combining text, image, audio, and video input/output capabilities, it excels in scenarios such as summarization, chat, data extraction, subtitle generation, and automatic captioning, while maintaining low latency and reasonable costs.
- Well-suited for Agent-Based Workflows / Multi-step Reasoning: Although Flash is a “lightweight / efficient” version, it still retains strong reasoning, tool-call, and multi-step logical processing capabilities. In many benchmark tests, its performance on tasks such as reasoning, multimodal processing, and code generation comes close to that of its “Pro” counterpart.
- Cost-effective and Scalable Deployment: Compared to the premium Pro model, Flash places greater emphasis on resource utilization and cost control, making it ideal for scalable, frequently called, or scenarios sensitive to response speed and operational costs—a highly cost-effective choice for practical deployment.
Gemini‑2.5‑Flash‑DeepSearch introduces a step-by-step reasoning mechanism that performs internal thinking before providing answers, thereby enhancing logical consistency and accuracy and demonstrating agent-like task execution capabilities, helping enterprises build complex automated workflows.
Playground
Log in to explore more features! Click to Log In
API Analytics
API Reference (1)
API Pricing
$¥ 円 ₽