pixtral-large-2411

pixtral-large-2411

Built on Mistral Large 2, with frontier-level image understanding capabilities
2024-11-18
LLM
Model capability: imageModel capability: function_call
Input:
$2.2/1M tokens
Output:
$6.6/1M tokens
Bulk order? Contact your manager for exclusive deals

API Overview

Pixtral Large Instruct 2411 is Mistral AI’s flagship multimodal language model, primarily positioned as a professional-grade vision-language engine that delivers “high-precision image-and-text understanding plus efficient reasoning.”

  • Native multimodal architecture: Directly processes mixed inputs of images at arbitrary resolutions and long texts, without the need for additional visual encoders
  • Ultra-long context support: Supports up to 128K tokens of context, enabling simultaneous parsing of multiple high-resolution images and lengthy documents
  • Leading performance in complex visual tasks: Outperforms most open-source and closed-source models in challenging tasks such as chart comprehension, interface parsing, and multi-image comparison
  • Multilingual image-and-text capabilities: Supports image description, question answering, and content generation in mainstream languages including English, Chinese, French, and German

───────────────────────────────────────────────────────────────────

Core Capabilities

👁️ Pixel-level image-and-text alignment: Accurately locates text, icons, and table regions within images and performs semantic reasoning based on contextual information

📊 Structured visual parsing: Automatically extracts buttons, menus, and data charts from screenshots, converting them into actionable UI descriptions or code

🌍 Cross-language visual understanding: Understands Chinese posters, German manuals, or Arabic interfaces and accurately interprets their content in the corresponding languages

🧩 Agent-ready design: Supports combined image-and-text instructions (e.g., “Write an e-commerce product detail page based on these three product images”), seamlessly integrating into automated workflows

Playground

Log in to explore more features! Click to Log In

API Analytics

API Reference (1)

API DescriptionAPI EndpointRequest MethodStabilityParameter Description
Chat(Pixtral-Large-2411multimodal)
POST
Stable
View Details

API Pricing

$
ModelDescriptionContextOfficial Price302.AI Price

pixtral-large-2411

-
128000

Input$2 / 1M tokens
Output$6 / 1M tokens

Input$2.2/ 1M tokens
Output$6.6/ 1M tokens
10%