
wavespeed-ai/image-captioner
API Overview
image-captioner is a high-precision image caption generator that produces detailed, human-like descriptions from images. It’s ideal for content understanding, accessibility, dataset annotation, SEO, and multimodal AI workflows. It offers an out-of-the-box REST inference API with optimal performance, no cold starts, and an affordable pricing model.
Key Features:
1. Generates accurate and natural image descriptions
2. Supports detailed object recognition and scene understanding
3. Ideal for labeling, accessibility (alt-text), and visual search
4. Works in automated workflows and REST API pipelines
Usage Steps:
1. Upload an image
2. Select a “detail_level” (default: “medium”)
3. Optionally, fill in the “focus” field to specify the area of primary interest
4. Set “enable_sync_mode” (default: false). When false, a second API call is required to retrieve the task result; when true, the request waits until the description has been generated and uploaded before returning the response
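The steps above can be sketched as a small helper that assembles the request body. This is a minimal illustration, not the documented request schema: the parameter names “detail_level”, “focus”, and “enable_sync_mode” come from this page, while the helper name build_caption_request and the “image” field name are assumptions made for the example. The endpoint URL and authentication are deliberately left out because they are not stated here.

```python
import json
from typing import Optional


def build_caption_request(
    image_url: str,
    detail_level: str = "medium",
    focus: Optional[str] = None,
    enable_sync_mode: bool = False,
) -> dict:
    """Assemble a JSON-serializable body for an image-captioner request.

    The "image" key is an assumed field name; check the API reference
    for the actual name of the image input.
    """
    if not image_url:
        raise ValueError("image_url must be a non-empty string")

    payload = {
        "image": image_url,                    # assumed field name
        "detail_level": detail_level,          # defaults to "medium"
        "enable_sync_mode": enable_sync_mode,  # defaults to False
    }
    # "focus" is optional, so it is only sent when the caller provides it.
    if focus:
        payload["focus"] = focus
    return payload


if __name__ == "__main__":
    body = build_caption_request(
        "https://example.com/portrait.jpg",  # placeholder image URL
        focus="the subject's face",
        enable_sync_mode=True,
    )
    print(json.dumps(body, indent=2))
```

With “enable_sync_mode” left at false, the caller would still need the second API call described in step 4 to fetch the finished description.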
Price: $0.001 per request
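For budgeting, the per-request price can be turned into a quick estimate. The sketch below assumes each captioned image is billed as exactly one request at the listed price; the function name estimate_cost is hypothetical, and whether the follow-up result retrieval call is billed is not stated on this page.

```python
from decimal import Decimal

# Listed price per request, kept as a Decimal to avoid float rounding.
PRICE_PER_REQUEST_USD = Decimal("0.001")


def estimate_cost(num_requests: int) -> Decimal:
    """Return the estimated cost in USD for a number of caption requests."""
    if num_requests < 0:
        raise ValueError("num_requests must be non-negative")
    return PRICE_PER_REQUEST_USD * num_requests


if __name__ == "__main__":
    # Example: annotating a 10,000-image dataset.
    print(f"${estimate_cost(10_000)}")
```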
Result: A young woman with wavy brown hair gazes directly at the camera in a dimly lit, cool-toned environment. Her piercing blue eyes and neutral expression convey intensity against the blurred, shadowy background. The moody lighting emphasizes her features, creating a contemplative and mysterious atmosphere.