grok-4-1-fast-non-reasoning

grok-4-1-fast-non-reasoning

The non-inference mode of Grok-4-1-Fast, designed for real-time response scenarios that require extremely low latency.
2025-11-20
LLM
Model capability: imageModel capability: function_call
Input:
$0.2/1M tokens
Output:
$0.5/1M tokens
Bulk order? Contact your manager for exclusive deals

API Overview

Grok 4.1 Fast is xAI’s flagship tool-call model, launched on November 19, 2025. Paired with the Agent Tools API, it is primarily designed to deliver powerful long-context reasoning in an efficient and cost-effective manner.

  • Outstanding Performance: It excels in benchmark tests such as τ² - bench Telecom and Berkeley Function Calling v4, demonstrating stable multi-turn long-context performance and halving the hallucination rate.
  • Powerful Capabilities: With a 2-million-token context window and integrated with the Agent Tools API, it enables parallel invocation of multiple tools without requiring developers to manage infrastructure.
  • Cost-Effective: It boasts high tool-call accuracy and low inference costs, making it more cost-efficient than competitors in research-oriented scenarios.

───────────────────────────────────────────────────────────────────

Core Capabilities

📊 Precise Testing: Achieves 72% accuracy in the Berkeley Function Calling v4 benchmark and leads in the τ² - bench Telecom score, showcasing its exceptional strength.

💪 Long-Text Processing: With a 2-million-token context window, it delivers high accuracy in multi-turn long-context tasks, effortlessly handling complex task planning.

🛠️ Rich Tool Ecosystem: The Agent Tools API provides a wide range of tools, supporting real-time search, file retrieval, code execution, and more—just a few lines of code are all it takes to get started.

💰 Low Costs: In research-oriented scenarios, its average cost is lower, and the unit price for cached input tokens is highly affordable.

Playground

Log in to explore more features! Click to Log In

API Analytics

API Reference (4)

API DescriptionAPI EndpointRequest MethodStabilityParameter Description
Chat(grok)
POST
Stable
View Details
Chat(grok-vision)
POST
Stable
View Details
Asynchronous request to chat
POST
Stable
View Details
Async Get Result
GET
Stable
View Details

API Pricing

$
ModelDescriptionContextOfficial Price302.AI Price

grok-4-1-fast-reasoning

-
2000000

Input$0.2 / 1M tokens
Output$0.5 / 1M tokens

Input$0.2/ 1M tokens
Output$0.5/ 1M tokens
Original Price