claude-3-haiku-20240307

claude-3-haiku-20240307

A faster visual text model that delivers a seamless human-machine interaction experience in near real time.
2024-03-07
LLM
Model capability: image
Input:
$0.25/1M tokens
Output:
$1.25/1M tokens
Bulk order? Contact your manager for exclusive deals

API Overview

Basic Information

  • Claude 3 Haiku is the "lightweight/ultra-fast" model in Anthropic's Claude 3 family (which includes Claude 3 Opus, Sonnet, and Haiku).
  • It is positioned as the fastest and most cost-effective model option, offering exceptional value for money at its intelligence level.
  • According to the model card, it supports a context window of up to 200K tokens.
  • In terms of pricing, the input token cost is approximately $0.25 per million input tokens, while the output token cost is $1.25 per million output tokens.

Key Features

  • Ultra-Fast Response: Haiku is one of the quickest models in the Claude 3 family, capable of generating responses in remarkably short timeframes. Officially, it delivers near-"real-time" responses when handling lightweight queries.
  • High Throughput: Reports indicate that Haiku can process around 21,000 tokens per second under specific conditions—such as when prompts are shorter than 32K tokens.
  • Advanced Context Understanding: With a context capacity of 200K tokens, Haiku can handle extremely long and complex information inputs with ease.
  • Visual Capabilities: Despite being a lightweight model, Haiku retains the ability to process images, including charts, photos, and technical diagrams.
  • Reduced Rejection Rates: Compared to earlier models, Haiku demonstrates improved contextual understanding of guardrail prompts, resulting in fewer unnecessary rejections.

Technical Highlights

  • Efficient Cost Structure: Haiku's pricing strategy makes it ideal for large-scale, high-frequency use cases—particularly applications where speed and cost efficiency are critical.
  • Strong Recall Capability: The entire Claude 3 family excels in "long-context — recalling information." While Opus offers the strongest recall capability, Haiku still benefits significantly from this architectural design.
  • Training and Safety Mechanisms: Leveraging Anthropic's Constitutional AI training approach and safety mechanisms designed to minimize inappropriate content, both Haiku and the broader Claude 3 family have been enhanced in terms of reliability and security.
  • Multi-Language Fluency: Haiku can handle multiple languages—including Spanish, Japanese, French, and others—with remarkable fluency, making it highly versatile for multilingual applications.

Applicable Scenarios

  • Real-Time Customer Interactions: For instance, customer service chatbots, real-time FAQ responses, or user support systems can all benefit from Haiku's rapid response capabilities.
  • Content Moderation / Content Interruption Detection: Due to its fast response and low costs, Haiku is well-suited for real-time risk assessment of user-generated content, such as content moderation tasks.
  • Lightweight Text Processing: Examples include extracting unstructured data or pulling key insights from documents—for applications like logistics or inventory management.
  • Translation and Multilingual Interaction: Ideal for quick translations, cross-language conversations, or other multilingual tasks, thanks to Haiku's support for multiple languages and swift response times.

Playground

Log in to explore more features! Click to Log In

API Analytics

API Reference (7)

API DescriptionAPI EndpointRequest MethodStabilityParameter Description
Chat(Talk)
POST
Stable
View Details
Chat(Analyze image)
POST
Stable
View Details
Chat(Function Call)
POST
Stable
View Details
Messages(Original format)
POST
Stable
View Details
Messages(Function Call)
POST
Stable
View Details
Messages(Thinking mode)
POST
Stable
View Details
Messages(128k output)
POST
Stable
View Details

API Pricing

$
ModelDescriptionContextOfficial Price302.AI Price

claude-3-haiku-20240307

Cache write: $0.5 / 1M tokens, Cache read: $0.05 /1M tokens
200000

Input$0.25 / 1M tokens
Output$1.25 / 1M tokens

Input$0.25/ 1M tokens
Output$1.25/ 1M tokens
Original Price