grok-4-fast-reasoning

grok-4-fast-reasoning

Grok-4-Fast’s deep reasoning mode, used for step-by-step chain-of-thought analysis.
2025-09-23
LLM
Model capability: imageModel capability: thinkingModel capability: function_call
Input:
$0.2/1M tokens
Output:
$0.5/1M tokens
Bulk order? Contact your manager for exclusive deals

API Overview

Grok 4 Fast is a high-performance inference model launched by xAI, designed to deliver cutting-edge performance for both enterprise and consumer applications. It places particular emphasis on improving token efficiency, driving the miniaturization and acceleration of AI development, and making high-quality inference capabilities accessible to more users and developers.

  • Ultimate Cost Efficiency: In inference benchmarks, Grok 4 Fast outperforms Grok 3 Mini, reducing token costs. While matching the performance of Grok 4, it uses 40% fewer thinking tokens and achieves a cost reduction of 98% compared to Grok 4.
  • Native Tools and Top-Tier Search: It supports code execution, web browsing, and other features, with real-time data search and analysis capabilities. It leads in multilingual search benchmarks.
  • State-of-the-Art General Post-Training: It ranks at the top in multiple benchmarks, especially excelling in the LMArena search arena, where it outperforms other similar models.
  • Unified Architectural Design: Adopting a unified model architecture, it can handle both inference and fast-response tasks without distinguishing between task types, reducing latency and token costs while supporting real-time application scenarios.
  • Ultra-Large Context Window: It supports a context window of up to 2 million tokens, enabling it to process large-scale texts in a single go and ensuring stability and coherence in processing.

───────────────────────────────────────────────────────────────────

Core Capabilities

⚡ Multi-Platform and Version Compatibility

  • Unrestricted Access for All Users: Available on grok.com, as well as in iOS and Android apps. Free users enjoy unlimited access, with support for both quick and automatic modes, enhancing the search and information-query experience.
  • Developer-Friendly Version: Provides API access, including both inference and non-inference versions, supporting a context window of up to 2 million tokens. Developers can adjust computing resources according to their needs.

🛠️ Performance in Key Scenarios

  • Academic and Mathematical Reasoning: It performs exceptionally well in multiple high-difficulty benchmarks, demonstrating outstanding ability to handle complex mathematical and academic tasks.
  • Real-Time Information Retrieval and Analysis: It can handle real-time data-dependent tasks, such as multi-round searches and information integration, efficiently completing large-scale computations and validations.

Playground

Log in to explore more features! Click to Log In

API Analytics

API Reference (4)

API DescriptionAPI EndpointRequest MethodStabilityParameter Description
Chat(grok)
POST
Stable
View Details
Chat(grok-vision)
POST
Stable
View Details
Asynchronous request to chat
POST
Stable
View Details
Async Get Result
GET
Stable
View Details

API Pricing

$
ModelDescriptionContextOfficial Price302.AI Price

grok-4-fast-reasoning

-
2000000

Input$0.2 / 1M tokens
Output$0.5 / 1M tokens

Input$0.2/ 1M tokens
Output$0.5/ 1M tokens
Original Price