wavespeed-ai/image-captioner

High-precision image understanding and description models
2025-11-21
Image Processing
Pricing: $0.001/call
Stability: Stable

API Overview

image-captioner is a high-precision image caption generator that produces detailed, human-like descriptions from images. It’s ideal for content understanding, accessibility, dataset annotation, SEO, and multimodal AI workflows. It offers an out-of-the-box REST inference API with optimal performance, no cold starts, and an affordable pricing model.


Key Features:

1. Generates accurate and natural image descriptions

2. Supports detailed object recognition and scene understanding

3. Ideal for labeling, accessibility (alt-text), and visual search

4. Works in automated workflows and REST API pipelines


Usage Steps:

1. Upload an image

2. Set “detail_level” (default: “medium”)

3. Optionally fill in the “focus” field to specify the area of primary interest

4. Set “enable_sync_mode” (default: false). When false, a second API call is required to retrieve the task result; when true, the request waits until the description has been generated and uploaded before returning the response
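
The steps above can be sketched as a small Python helper that assembles the request body. This is only an illustration: the field names detail_level, focus, and enable_sync_mode come from the steps above, while the "image" field name, the example URL, and the helper name build_caption_request are assumptions that this page does not confirm. Check the parameter details of the API before relying on them.

```python
import json


def build_caption_request(image_url, detail_level="medium", focus=None,
                          enable_sync_mode=False):
    """Assemble a request body for the captioning call (hypothetical helper).

    The "image" key is an assumed field name; detail_level, focus and
    enable_sync_mode are the parameters named in the usage steps.
    """
    payload = {
        "image": image_url,                    # assumed field name
        "detail_level": detail_level,          # defaults to "medium"
        "enable_sync_mode": enable_sync_mode,  # defaults to false
    }
    # "focus" is optional, so it is only sent when the caller provides it.
    if focus:
        payload["focus"] = focus
    return payload


if __name__ == "__main__":
    body = build_caption_request(
        "https://example.com/portrait.jpg",    # placeholder image URL
        focus="the person's face",
        enable_sync_mode=True,
    )
    # Serialize to JSON, ready to be sent as the body of the POST request.
    print(json.dumps(body, indent=2))
```

Leaving focus out of the body when it is empty keeps the request aligned with the step that marks the field as optional.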


Price: $0.001 per request


Result: A young woman with wavy brown hair gazes directly at the camera in a dimly lit, cool-toned environment. Her piercing blue eyes and neutral expression convey intensity against the blurred, shadowy background. The moody lighting emphasizes her features, creating a contemplative and mysterious atmosphere.

API Reference (2)

API Description               API Endpoint  Request Method  Stability  Parameter Description
wavespeed-ai/image-captioner                POST            Stable     View Details
Wavespeed Retrieval Task                    GET             Stable     View Details
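
When enable_sync_mode is left at false, the POST call only creates a task, and the GET retrieval API listed above is then used to fetch the result. The following is a minimal polling sketch under stated assumptions: the response fields "status" and "output" and the status values "completed" and "failed" are hypothetical, and fetch_task stands for a caller-supplied function that performs the GET request, since the endpoint URL is not shown on this page.

```python
import time


def wait_for_caption(fetch_task, task_id, interval_s=1.0, max_attempts=30,
                     sleep=time.sleep):
    """Poll a retrieval function until the task finishes (illustrative only).

    fetch_task(task_id) must return a dict describing the task. The keys
    "status" and "output" and the status strings are assumptions.
    """
    for _ in range(max_attempts):
        task = fetch_task(task_id)
        status = task.get("status")
        if status == "completed":
            return task.get("output")
        if status == "failed":
            raise RuntimeError(f"captioning task {task_id} failed")
        # Still running: wait before asking again.
        sleep(interval_s)
    raise TimeoutError(f"task {task_id} not finished after {max_attempts} polls")


if __name__ == "__main__":
    # Stand-in for the real GET call: reports "processing" once, then done.
    fake_responses = iter([
        {"status": "processing"},
        {"status": "completed", "output": "A young woman with wavy brown hair."},
    ])
    caption = wait_for_caption(lambda _id: next(fake_responses), "task-123",
                               interval_s=0.0)
    print(caption)
```

Passing the retrieval function in as an argument keeps the loop independent of any particular HTTP client and makes it easy to exercise without network access.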

API Pricing

Model                         Description                   302.AI Price
wavespeed-ai/image-captioner  wavespeed-ai/image-captioner  $0.001/call
Wavespeed Retrieval Task      -                             Free
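
Because the retrieval call is listed as free, the cost of a batch depends only on the number of captioning calls. A rough estimate in Python, assuming the listed $0.001/call applies uniformly with no bulk discount (estimate_cost is a hypothetical helper, not part of the API):

```python
from decimal import Decimal

# Price per captioning call, taken from the pricing table above.
PRICE_PER_CALL = Decimal("0.001")


def estimate_cost(num_calls):
    """Return the estimated cost in USD for num_calls captioning calls.

    Retrieval calls are not counted because they are listed as free.
    """
    if num_calls < 0:
        raise ValueError("num_calls must be non-negative")
    # Decimal avoids the rounding drift that binary floats would introduce.
    return PRICE_PER_CALL * num_calls


if __name__ == "__main__":
    # Example: captioning a dataset of 25,000 images.
    print(f"${estimate_cost(25_000)}")
```

Under these assumptions, captioning 25,000 images comes to $25.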