gpt-4.1-nano-2025-04-14

gpt-4.1-nano-2025-04-14

Fastest, most cost-efficient version of GPT-4.1
2025-04-14
LLM
Model capability: imageModel capability: function_call
Input:
$0.1/1M tokens
Output:
$0.4/1M tokens
Bulk order? Contact your manager for exclusive deals

API Overview

GPT-4.1 nano model, version 2025-04-14

GPT‑4.1 nano is the ultra‑lightweight member of the GPT‑4.1 family, designed to maximize speed and cost efficiency rather than absolute peak capability. Compared with the flagship GPT‑4.1, it trades some top‑end reasoning and coding power for dramatically lower price and latency, making it an ideal default for high‑traffic, real‑time workloads. Both models share a 1M‑token context window and a June 2024 knowledge cutoff, but GPT‑4.1 targets “highest overall performance” and complex agentic work, whereas GPT‑4.1 nano is optimized for “fastest and cheapest production use.”

On academic benchmarks, GPT‑4.1 nano reaches 80.1% on MMLU and 50.3% on GPQA, clearly outperforming GPT‑4o mini and even scoring 9.8% on Aider polyglot coding—strong results for such a small model. However, compared with GPT‑4.1, its ceiling on difficult tasks like SWE‑bench, multi‑step reasoning, and complex function calling is noticeably lower. As a result, GPT‑4.1 nano shines on classification, autocomplete, lightweight conversation, and rule‑driven workflows where you need “fast and good enough,” while leaving the hardest problems to GPT‑4.1 or reasoning‑focused models.

Pricing highlights the gap even more: GPT‑4.1 is billed at $2.00 / $8.00 per 1M input/output tokens, whereas GPT‑4.1 nano is just $0.10 / $0.40—around one‑twentieth of the flagship’s cost. With 75% discounts on cached input and no surcharge for long‑context requests, GPT‑4.1 nano enables cheap 1M‑token applications at scale. Combined with an optimized inference stack that returns the first token in under ~5 seconds for many 128K‑token queries, GPT‑4.1 nano is a strong fit for embedded intelligence, massive background workloads, low‑value but high‑frequency tasks, and latency‑sensitive front‑end features, complementing GPT‑4.1 in a “flagship + nano” tiered architecture.

Playground

Log in to explore more features! Click to Log In

API Analytics

API Reference (15)

API DescriptionAPI EndpointRequest MethodStabilityParameter Description
Chat(Talk)
POST
Stable
View Details
Chat (gpt-4o Image Analysis)
POST
Stable
View Details
Chat (gpt-4o Structured Output)
POST
Stable
View Details
Chat (gpt-4o function call)
POST
Stable
View Details
Chat (gpt-4-plus image analysis)
POST
Unstable
View Details
Chat (gpt-4-plus image generation)
POST
Unstable
View Details
Chat (gpts model)
POST
Unstable
View Details
Chat (chatgpt-4o-latest)
POST
Stable
View Details
Chat (o1 Series Model)
POST
Unstable
View Details
Chat(o3 Series Model)
POST
Unstable
View Details
Chat(gpt-4o audio model)
POST
Stable
View Details
Chat(gpt-4o-image-generation modify image)
POST
Stable
View Details
o4
POST
Stable
View Details
Responses
POST
Stable
View Details
Responses(Deep-Research)
POST
Stable
View Details

API Pricing

$
ModelDescriptionContextOfficial Price302.AI Price

gpt-4.1-nano-2025-04-14

-
1000000

Input$0.1 / 1M tokens
Output$0.4 / 1M tokens

Input$0.1/ 1M tokens
Output$0.4/ 1M tokens
Original Price