1. Function Introduction
Generate one or more high-quality images based on text prompts, supporting:
- GPT-Image Series Models (gpt-image-1.5 / gpt-image-1 / gpt-image-1-mini)
- Fine-grained control over size, quality, transparent background, streaming output, and more
- Built-in content moderation with adjustable strictness
- Synchronous and asynchronous modes to suit different business scenarios
2. Request Parameters
| Field | Type | Required | Description | Remarks |
| --- | --- | --- | --- | --- |
| prompt | string | ✅ | Description text | Length ≤ 32k (gpt-image-1.5) |
| model | string | ✅ | Model | Options: gpt-image-1.5 / gpt-image-1 / gpt-image-1-mini |
| n | int | | Number of images to generate | 1-10; dall-e-3 only supports 1 |
| size | string | | Dimensions | 1024x1024, 1536x1024 (landscape), 1024x1536 (portrait), auto. With auto, the model picks the most suitable aspect ratio based on the prompt. |
| quality | string | | Quality | low / medium / high / auto |
| background | string | | Background | transparent / opaque / auto |
| output_format | string | | Output format | png / jpeg / webp |
| output_compression | int | | Compression level | 0-100; only effective for jpeg/webp |
| moderation | string | | Moderation strictness | low (lenient) / auto |
| stream | bool | | Streaming output | Only supported by the GPT-Image series |
| partial_images | int | | Number of partial images to stream | 0-3; 0 = only return the final image |
| async | query | | Whether to run asynchronously | Pass ?async=true to receive a task_id |
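
The parameter table above can be turned into a small request builder. The following is a minimal sketch in Python using only the standard library. The endpoint URL and the function name `build_generation_request` are placeholders chosen for illustration, since this document does not state the real URL; only the value ranges and enum values come from the table.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint: this document does not state the real URL.
BASE_URL = "https://api.example.com/v1/images/generations"

MODELS = {"gpt-image-1.5", "gpt-image-1", "gpt-image-1-mini"}
SIZES = {"1024x1024", "1536x1024", "1024x1536", "auto"}
QUALITIES = {"low", "medium", "high", "auto"}


def build_generation_request(prompt, model, n=1, size="auto",
                             quality="auto", use_async=False):
    """Validate the arguments and return (url, json_body)."""
    if model not in MODELS:
        raise ValueError(f"unsupported model: {model}")
    if not 1 <= n <= 10:
        raise ValueError("n must be between 1 and 10")
    if size not in SIZES:
        raise ValueError(f"unsupported size: {size}")
    if quality not in QUALITIES:
        raise ValueError(f"unsupported quality: {quality}")
    if len(prompt) > 32000:
        raise ValueError("prompt exceeds 32000 characters")

    # async travels as a query parameter, not as a body field.
    query = "?" + urlencode({"async": "true"}) if use_async else ""
    body = {"prompt": prompt, "model": model, "n": n,
            "size": size, "quality": quality}
    return BASE_URL + query, json.dumps(body)


url, body = build_generation_request(
    "A baby sea otter playing in the ocean waves", "gpt-image-1.5",
    n=2, size="1536x1024", use_async=True)
print(url)
print(body)
```

Validating on the client side avoids spending a round trip on a request that the table already rules out.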
3. Precautions
- ✅ Keep the prompt within the character limit of the selected model
- 📏 Different models support different size and quality values, so confirm compatibility in advance
- 🔒 Sensitive content is blocked by the system; keeping the moderation parameter at its default value, auto, is recommended
- 📊 Pricing is calculated from the tokens of the generated images; different models and quality levels consume different numbers of tokens (see the Pricing section for details)
- 🔄 The progress of asynchronous tasks can be polled via the task query interface
4. Price
| Model | Text Input | Text Output | Image Output | Remarks |
| --- | --- | --- | --- | --- |
| gpt-image-1 | 5 PTC/1M Tokens | | 40 PTC/1M Tokens | High-quality model |
| gpt-image-1-mini | 2 PTC/1M Tokens | | 8 PTC/1M Tokens | Efficient and economical model |
| gpt-image-1.5 | 5 PTC/1M Tokens | 10 PTC/1M Tokens | 32 PTC/1M Tokens | Balanced model |
The final price is based on the number of tokens consumed, as returned in the response to the request.
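
As a rough illustration of how the per-token prices combine, here is a minimal sketch in Python. The token counts are made-up values, the function name `estimate_cost` is hypothetical, and treating an empty Text Output cell as a price of 0 is an assumption rather than something the table states.

```python
# Prices in PTC per 1M tokens, copied from the generation price table.
# An empty "Text Output" cell is treated as 0 here (an assumption).
PRICES = {
    "gpt-image-1":      {"text_in": 5.0, "text_out": 0.0,  "image_out": 40.0},
    "gpt-image-1-mini": {"text_in": 2.0, "text_out": 0.0,  "image_out": 8.0},
    "gpt-image-1.5":    {"text_in": 5.0, "text_out": 10.0, "image_out": 32.0},
}


def estimate_cost(model, text_in=0, text_out=0, image_out=0):
    """Return the estimated cost in PTC for the given token counts."""
    price = PRICES[model]
    total = (text_in * price["text_in"]
             + text_out * price["text_out"]
             + image_out * price["image_out"])
    return total / 1_000_000


# Illustrative usage numbers, not real measurements.
cost = estimate_cost("gpt-image-1", text_in=50, image_out=4000)
print(f"{cost:.6f} PTC")
```

With these sample numbers the image output tokens dominate: 4000 × 40 / 1,000,000 = 0.16 PTC, against 0.00025 PTC for the text input.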
gpt-image-1: Price Reference

Request Parameters
Header Parameters
Authorization string Optional
Example Value: Bearer {{YOUR_API_KEY}}
Query Parameters
response_format string Optional
Example Value: url
Request Body application/json
prompt string Required
A text description of the required image. The more detailed the description, the better the expected result.
Character length limits:
- GPT-Image series models: maximum 32000 characters
- DALL-E-2: maximum 1000 characters
- DALL-E-3: maximum 4000 characters
Writing suggestions:
- Include the subject (e.g., “baby sea otter”), action (e.g., “playing”), scene (e.g., “in the ocean waves”)
- Specify the style (e.g., “watercolor style”), lighting (e.g., “soft light”), and color (e.g., “blue and white”)
- Avoid vague descriptions; use specific words to improve generation accuracy.
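
The writing suggestions above can be applied mechanically. The sketch below assembles a prompt from the suggested components and checks it against the character limits listed earlier. The function name `compose_prompt` and the comma-joined layout are illustrative choices, not something the API requires.

```python
# Character limits from the list above, keyed by model family.
PROMPT_LIMITS = {"gpt-image": 32000, "dall-e-2": 1000, "dall-e-3": 4000}


def compose_prompt(subject, action, scene, style=None, lighting=None,
                   color=None, family="gpt-image"):
    """Join the suggested prompt components and enforce the length limit."""
    parts = [f"{subject} {action} {scene}"]
    # Optional modifiers are appended only when provided.
    parts += [p for p in (style, lighting, color) if p]
    prompt = ", ".join(parts)
    limit = PROMPT_LIMITS[family]
    if len(prompt) > limit:
        raise ValueError(f"prompt is {len(prompt)} characters, limit is {limit}")
    return prompt


prompt = compose_prompt(
    "baby sea otter", "playing", "in the ocean waves",
    style="watercolor style", lighting="soft light", color="blue and white")
print(prompt)
```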

size enum<string> Optional
The generated image size varies depending on the model:
- GPT-Image series: 1024x1024, 1536x1024 (landscape), 1024x1536 (portrait), auto (automatic adaptation)
- DALL-E-2: 256x256, 512x512, 1024x1024
- DALL-E-3: 1024x1024, 1792x1024 (landscape), 1024x1792 (portrait)
Enum Value: 1024x1024, 1536x1024, 1024x1536, auto

background enum<string> Optional
Image background settings, applicable only to GPT-Image series models:
- transparent: Transparent background (PNG/WEBP formats only)
- opaque: Opaque background
- auto: Automatic selection (default), chosen according to the scene described in the prompt
Note: When transparent is selected, output_format must be set to png or webp
Enum Value: transparent, opaque, auto

moderation enum<string> Optional
Content moderation level, controlling the compliance of generated images:
- low: lenient moderation, allowing slight creative expression
- auto: automatic moderation (default), balancing compliance and creativity

n integer Optional
Number of images generated:
- Value range: 1-10 (GPT-Image/DALL-E-2)
- DALL-E-3 only supports n=1
Note: The more images generated, the more tokens are consumed and the longer the generation time

quality enum<string> Optional
The generated image quality. ‘auto’ (default) will automatically select the best quality for the given model.
- GPT-Image model: Supports high, medium, and low modes.
- DALL-E-3: Supports hd and standard image quality.
- DALL-E 2: Only standard can be selected.
Enum Value: auto, high, medium, low

model enum<string> Required
Enum Value: gpt-image-1, gpt-image-1-mini, gpt-image-1.5

output_compression integer Optional
Image compression level (GPT-Image only, default 100):
- Value range: 0-100 (percentage)
- 0: No compression (largest file size, best quality)
- 100: Maximum compression (smallest file size, quality may be compromised)
Only applicable to webp/jpeg formats; png format does not support compression

output_format enum<string> Optional
Image output format (GPT-Image only):
- png: Supports transparent backgrounds, lossless compression
- jpeg: Good compatibility, smaller file size
- webp: Balances compression and quality, supports transparency
When selecting transparent background, png or webp format must be used
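
The background, output_format, and output_compression parameters constrain each other, so it can help to check them together before sending a request. This is a minimal sketch of such a check; the function name `check_output_options` is hypothetical, and the rules encoded are only the ones stated above.

```python
def check_output_options(background, output_format, output_compression=None):
    """Return a list of rule violations; an empty list means the combination is valid."""
    problems = []
    # A transparent background needs a format with an alpha channel.
    if background == "transparent" and output_format not in ("png", "webp"):
        problems.append("transparent background requires output_format png or webp")
    if output_compression is not None:
        if not 0 <= output_compression <= 100:
            problems.append("output_compression must be within 0-100")
        # Compression is documented as applying to jpeg and webp only.
        if output_format not in ("jpeg", "webp"):
            problems.append("output_compression only applies to jpeg or webp")
    return problems


print(check_output_options("transparent", "jpeg"))
print(check_output_options("transparent", "webp", output_compression=80))
```

Returning a list rather than raising on the first problem lets a caller report every conflicting option at once.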

partial_images integer Optional
Number of partial images generated (GPT-Image only):
- Value range: 0-3
- 0: Do not return partial images; return the complete image after generation (default)
- 1-3: Return partial image previews in stages, and finally return the complete image
Only effective when stream=true; suitable for scenarios that require fast preview

stream boolean Optional
Enable streaming generation (GPT-Image only):
- false (default): Return all data at once after generation
- true: Return image data in stages as streaming events
Combined with the partial_images parameter, progressive image preview can be achieved
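
The relationship between stream and partial_images can be sketched as follows. The helper names are hypothetical, and rejecting partial_images without stream=true is a client-side choice made here for clarity; the document only says the parameter has no effect in that case. The wire format of the streaming events is not described in this document, so it is not modeled.

```python
def stream_options(stream=False, partial_images=0):
    """Validate and return the streaming-related body fields."""
    if not 0 <= partial_images <= 3:
        raise ValueError("partial_images must be within 0-3")
    if partial_images and not stream:
        # Documented as having no effect without streaming; flagged here
        # so that the mistake is visible to the caller.
        raise ValueError("partial_images only takes effect when stream=true")
    return {"stream": stream, "partial_images": partial_images}


def max_image_payloads(partial_images):
    """Upper bound per image: the previews plus the final complete image."""
    return partial_images + 1


options = stream_options(stream=True, partial_images=2)
print(options, max_image_payloads(options["partial_images"]))
```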
1. Function Introduction
Upload the original image and the mask image (optional), combine them with text prompts, and let the AI model replace the content of the specified area in the image to generate an edited image.
Supports the GPT-Image series and DALL-E 2 models; suitable for scenarios such as image restoration, content replacement, and style adjustment.
- Precise regional editing: Specify the editing area through a mask image to achieve local content replacement
- Multi-model adaptation: Supports GPT-Image series (efficient and flexible) and DALL-E 2 (classic and stable)
- Rich parameter control: Customizable image size, quality, output format, compression level, etc.
- Batch generation: Supports generating 1-10 edited images at once to meet diverse needs
- Transparent background support: The GPT-Image model can generate images with transparent backgrounds, suitable for design scenarios
2. Request Parameters
| Field | Type | Required | Description | Remarks |
| --- | --- | --- | --- | --- |
| image | file | ✅ | Original image file | ≤ 50 MB; supports PNG/JPEG/WebP |
| mask | file | | Mask image | Must be the same size as the original image and contain an alpha channel |
| prompt | string | ✅ | Edit prompt | Length ≤ 32k (gpt-image-1.5) |
| model | string | ✅ | Model | gpt-image-1.5 / gpt-image-1 / gpt-image-1-mini |
| n | int | | Number of output images | 1-10 |
| size | string | | Output size | Same as the generation interface, see the size table |
| quality | string | | Quality | low / medium / high / auto |
| background | string | | Background | transparent / opaque / auto |
| input_fidelity | string | | Input fidelity | low / high; high better preserves faces and logos |
| output_format | string | | Output format | png / jpeg / webp |
| output_compression | int | | Compression level | 0-100; only for jpeg/webp |
| moderation | string | | Moderation strictness | low / auto |
| stream | bool | | Streaming output | Only for the GPT-Image series |
| partial_images | int | | Number of partial images to stream | 0-3 |
| async | query | | Whether to run asynchronously | ?async=true returns task_id |
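
Because this interface takes multipart/form-data rather than JSON, text parameters and files are sent as separate parts. The sketch below only prepares the two groups and does not send anything. The function name `build_edit_form` is hypothetical, and encoding booleans as lowercase `true`/`false` is an assumption about what the server expects.

```python
def _form_value(value):
    """Multipart text fields are strings; booleans become 'true'/'false' (assumed)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def build_edit_form(image_path, prompt, model, mask_path=None, **options):
    """Return (fields, files): text parts and file parts for an edit request."""
    fields = {"prompt": prompt, "model": model}
    for name, value in options.items():
        if value is not None:  # omit parameters left unset
            fields[name] = _form_value(value)
    files = {"image": image_path}
    if mask_path is not None:
        files["mask"] = mask_path
    return fields, files


fields, files = build_edit_form(
    "photo.png", "Replace the sky with a sunset", "gpt-image-1.5",
    mask_path="mask.png", n=2, input_fidelity="high", stream=False)
print(fields)
print(files)
```

An HTTP client that supports multipart uploads can then open each path in `files` in binary mode and send both groups in one request.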
3. Precautions
- ✅ The original image and mask image must meet the format and size restrictions of the corresponding model (see parameter description for details)
- 🎭 The mask image must have the same dimensions as the original image, with the transparent area (alpha=0) being the editing area
- 🔒 Editing sensitive content is not supported, and the system will conduct content moderation
- ⚠️ Apifox does not support online debugging of this API; test it by calling it from code
- 🚫 Uploading infringing or illegal images is prohibited
4. Price
| Model | Text Input | Image Input | Image Output |
| --- | --- | --- | --- |
| gpt-image-1 | 5 PTC/1M Tokens | 10 PTC/1M Tokens | 40 PTC/1M Tokens |
| gpt-image-1-mini | 2 PTC/1M Tokens | 2.5 PTC/1M Tokens | 8 PTC/1M Tokens |
| gpt-image-1.5 | 5 PTC/1M Tokens | 8 PTC/1M Tokens | 32 PTC/1M Tokens |
The final price is based on the number of tokens consumed, as returned in the response to the request.
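
Editing adds an image input price on top of text input and image output. To compare the three models, the sketch below applies the same usage to each row of the edit price table. The token counts are invented for illustration, and real requests will likely consume different numbers of tokens on different models; `edit_cost` is a hypothetical helper.

```python
# (text input, image input, image output) in PTC per 1M tokens,
# copied from the edit price table.
EDIT_PRICES = {
    "gpt-image-1":      (5.0, 10.0, 40.0),
    "gpt-image-1-mini": (2.0, 2.5, 8.0),
    "gpt-image-1.5":    (5.0, 8.0, 32.0),
}


def edit_cost(model, text_in, image_in, image_out):
    """Return the estimated cost in PTC of one edit request."""
    p_text, p_image_in, p_image_out = EDIT_PRICES[model]
    total = text_in * p_text + image_in * p_image_in + image_out * p_image_out
    return total / 1_000_000


# Illustrative token counts, identical for every model.
usage = {"text_in": 100, "image_in": 1000, "image_out": 4000}
for model in EDIT_PRICES:
    print(model, round(edit_cost(model, **usage), 6))
```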
gpt-image-1: Price Reference

Request Parameters
Header Parameters
Authorization string Optional
Example Value: Bearer {{YOUR_API_KEY}}
Content-Type string Optional
Example Value: multipart/form-data
Query Parameters
response_format string Optional
Example Value: url
Request Body multipart/form-data

image file Required
Original image to be edited (required):
- GPT-Image Series (gpt-image-1/1-mini/1.5):
  - Supported formats: PNG, WEBP, JPG
  - Single file size: ≤50MB
  - Quantity limit: up to 16 images
- DALL-E 2:
  - Supported format: PNG only (square)
  - Single image size: ≤4MB
  - Quantity limit: 1 image only
Note: When uploading multiple images, the mask only applies to the first image
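
The GPT-Image upload limits above can be checked locally before a request is made. This is a minimal sketch that works on file names and sizes only. The helper name `check_input_images` is hypothetical, interpreting "50MB" as 50 × 1024 × 1024 bytes is an assumption, and accepting the `.jpeg` extension alongside `.jpg` is a convenience.

```python
import os

MAX_FILES = 16
MAX_BYTES = 50 * 1024 * 1024  # assumes "MB" means MiB; the document does not say
ALLOWED_EXTENSIONS = {".png", ".webp", ".jpg", ".jpeg"}


def check_input_images(images):
    """Return a list of problems for a list of (filename, size_in_bytes) pairs."""
    problems = []
    if not images:
        problems.append("at least one image is required")
    if len(images) > MAX_FILES:
        problems.append(f"too many images: {len(images)} > {MAX_FILES}")
    for name, size in images:
        ext = os.path.splitext(name)[1].lower()
        if ext not in ALLOWED_EXTENSIONS:
            problems.append(f"{name}: unsupported format {ext or '(none)'}")
        if size > MAX_BYTES:
            problems.append(f"{name}: larger than 50 MB")
    return problems


print(check_input_images([("product.png", 2_000_000), ("logo.gif", 10_000)]))
```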

prompt string Required
Text description of the editing effect (required):
- Describe details such as the content, style, and color to be replaced
- Character length limit:
  - GPT-Image Series: Up to 32,000 characters
  - DALL-E 2: Up to 1,000 characters
- Writing suggestion: describe the edit clearly and specifically, in the context of the original image scene

model string Required
Supported values: gpt-image-1, gpt-image-1-mini, gpt-image-1.5
Example Value: gpt-image-1.5

mask file Optional
Mask Image (Optional):
- Function: The transparent area (alpha=0) indicates the region of the original image to be edited
- Requirements:
  - Format: PNG only (supports transparency)
  - Size: ≤4MB (all models)
  - Dimensions: Must be exactly the same as the first original image
  - Quantity: 1 mask only; when there are multiple original images, it applies only to the first one
- Applicable scenarios: Precisely locating the editing area, such as replacing a specific object in an image.
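
To make the mask requirements concrete, the sketch below writes an RGBA PNG that is opaque everywhere except for a rectangle with alpha=0, which marks the region to edit. It uses only the standard library so the file layout stays visible; an imaging library would normally be more convenient. The function names and the rectangular shape are illustrative assumptions.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(tag, data):
    """Build one PNG chunk: length, type, data, CRC over type and data."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def make_mask_png(width, height, box):
    """Return PNG bytes: opaque black, with alpha=0 inside box (left, top, right, bottom)."""
    left, top, right, bottom = box
    if not (0 <= left <= right <= width and 0 <= top <= bottom <= height):
        raise ValueError("box must lie inside the image")
    opaque, clear = b"\x00\x00\x00\xff", b"\x00\x00\x00\x00"
    # Each scanline starts with filter byte 0 (no filtering).
    plain_row = b"\x00" + opaque * width
    edit_row = (b"\x00" + opaque * left + clear * (right - left)
                + opaque * (width - right))
    raw = b"".join(edit_row if top <= y < bottom else plain_row
                   for y in range(height))
    # 8 bits per channel, colour type 6 = RGBA, which carries the alpha channel.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 6, 0, 0, 0)
    return (PNG_SIGNATURE + _chunk(b"IHDR", ihdr)
            + _chunk(b"IDAT", zlib.compress(raw)) + _chunk(b"IEND", b""))


def png_size(data):
    """Read (width, height) from the IHDR chunk of PNG bytes."""
    return struct.unpack(">II", data[16:24])


mask = make_mask_png(64, 64, (16, 16, 48, 48))
print(png_size(mask), len(mask), "bytes")
```

Calling `png_size` on both the original image (if it is a PNG) and the mask gives a quick check that their dimensions match before uploading.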

quality string Optional
Quality level of the generated image:
- GPT-Image Series: Supports auto/high/medium/low
- DALL-E 2: Only supports standard
Example Value: auto

size string Optional
Size of the generated image:
- GPT-Image Series: 1024x1024, 1536x1024, 1024x1536, auto
- DALL-E 2: 256x256, 512x512, 1024x1024
Example Value: 1024x1024

n integer Optional
Number of edited images generated:
- Value range: 1-10
- The greater the quantity, the more tokens are consumed and the longer the generation time
Example Value: 1

background string Optional
Image background settings (only applicable to GPT-Image model):
- transparent: Transparent background (only supported in PNG/WEBP formats)
- opaque: Opaque background
- auto: Set automatically based on the prompt and the original image
Example Value: auto

output_format string Optional
Output image format (only applicable to GPT-Image model):
- When choosing a transparent background, you need to use the png or webp format
Example Value: png

output_compression string Optional
Image compression level (only applicable to GPT-Image model):
- Value range: 0-100 (percentage)
- 0: No compression (largest file size, best quality)
- 100: Maximum compression (smallest file size, quality may be compromised)
- Only applicable to webp or jpeg formats; png format does not support compression
Example Value: 100

partial_images integer Optional
Number of partial images generated (only applicable to GPT-Image model):
- Value range: 0-3
- 0: Do not return partial images; return the complete image after generation is completed (default)
- 1-3: Return partial image previews in stages, and finally return the complete image
- Only takes effect when stream=true

stream boolean Optional
Whether to enable streaming generation (only applicable to GPT-Image model):
- false (default): Return all data at once after generation is completed
- true: Return image data in stages as streaming events