
grok-4-fast-reasoning
Grok-4-Fast’s deep reasoning mode, used for step-by-step chain-of-thought analysis.
2025-09-23
Input:
$0.2/1M tokens
Output:
$0.5/1M tokens
Bulk order? Contact your manager for exclusive deals
API Overview
Grok 4 Fast is a high-performance inference model launched by xAI, designed to deliver cutting-edge performance for both enterprise and consumer applications. It places particular emphasis on improving token efficiency, driving the miniaturization and acceleration of AI development, and making high-quality inference capabilities accessible to more users and developers.
- Ultimate Cost Efficiency: In inference benchmarks, Grok 4 Fast outperforms Grok 3 Mini, reducing token costs. While matching the performance of Grok 4, it uses 40% fewer thinking tokens and achieves a cost reduction of 98% compared to Grok 4.
- Native Tools and Top-Tier Search: It supports code execution, web browsing, and other features, with real-time data search and analysis capabilities. It leads in multilingual search benchmarks.
- State-of-the-Art General Post-Training: It ranks at the top in multiple benchmarks, especially excelling in the LMArena search arena, where it outperforms other similar models.
- Unified Architectural Design: Adopting a unified model architecture, it can handle both inference and fast-response tasks without distinguishing between task types, reducing latency and token costs while supporting real-time application scenarios.
- Ultra-Large Context Window: It supports a context window of up to 2 million tokens, enabling it to process large-scale texts in a single go and ensuring stability and coherence in processing.
───────────────────────────────────────────────────────────────────
Core Capabilities
⚡ Multi-Platform and Version Compatibility
- Unrestricted Access for All Users: Available on grok.com, as well as in iOS and Android apps. Free users enjoy unlimited access, with support for both quick and automatic modes, enhancing the search and information-query experience.
- Developer-Friendly Version: Provides API access, including both inference and non-inference versions, supporting a context window of up to 2 million tokens. Developers can adjust computing resources according to their needs.
🛠️ Performance in Key Scenarios
- Academic and Mathematical Reasoning: It performs exceptionally well in multiple high-difficulty benchmarks, demonstrating outstanding ability to handle complex mathematical and academic tasks.
- Real-Time Information Retrieval and Analysis: It can handle real-time data-dependent tasks, such as multi-round searches and information integration, efficiently completing large-scale computations and validations.
Playground
Log in to explore more features! Click to Log In
API Analytics
API Reference (4)
API Pricing
$¥ 円 ₽