llama-4-maverick

llama-4-maverick

Flagship-level Mixture-of-Experts (MoE) Multimodal Open-Source Model
2025-05-07
LLM
Model capability: function_call
Input:
$1/1M tokens
Output:
$1/1M tokens
Bulk order? Contact your manager for exclusive deals

API Overview

Llama-4-Maverick is Meta’s flagship multi-modal language model featuring a hybrid expert (MoE) architecture, primarily positioned as a top-tier AI engine with “ultra-large-scale capabilities plus native text-and-image understanding.”

  • Ultra-large MoE architecture: With approximately 400 billion total parameters and 17 billion activated parameters, it integrates 128 experts, striking a balance between maximum capability and inference efficiency.
  • Native multi-modal support: Through an early fusion architecture, it directly processes both text and image inputs, enabling true joint text-and-image reasoning.
  • Remarkable context length: The instruction-tuned version supports up to 1 million tokens of context, effortlessly handling ultra-long content such as entire books or lengthy video scripts.
  • Unprecedentedly rich training data: Pre-trained on 40 trillion tokens, covering 200 languages and specially optimized for 12 major languages.
  • Production-ready deployment: Supports BF16/FP8 formats and is compatible with Transformers, TGI, and vLLM—ready to use out-of-the-box.

───────────────────────────────────────────────────────────────────

Core Capabilities

👁️ Deep visual semantic understanding: It not only recognizes image content but also performs cross-modal reasoning by integrating text—for example, analyzing charts, interpreting interfaces, and comparing image differences.

🧠 Ultra-long-range logical coherence: Leveraging the NoPE layer and iRoPE architecture, it maintains precise positional awareness and information correlation even in contexts spanning millions of tokens.

🌍 True mastery of global languages: From Hindi to Arabic, it supports multilingual text-and-image generation and comprehension, producing outputs that align with local cultural contexts.

🧩 Native agent architecture: It enables sophisticated multi-modal instruction parsing and tool collaboration, providing a powerful foundation for next-generation AI agents.

Playground

Log in to explore more features! Click to Log In

API Analytics

API Reference (1)

API DescriptionAPI EndpointRequest MethodStabilityParameter Description
Chat(LLaMA4)
POST
Stable
View Details

API Pricing

$
ModelDescriptionContextOfficial Price302.AI Price

llama-4-maverick

-
128000

Input$1 / 1M tokens
Output$1 / 1M tokens

Input$1/ 1M tokens
Output$1/ 1M tokens
Original Price