glm-5-turbo

GLM-5-Turbo is a foundation model deeply optimized for the OpenClaw scenario.
2026-03-16
LLM
Model capabilities: thinking, function_call
Input: starting from $0.72 / 1M tokens
Output: starting from $3.2 / 1M tokens
Bulk order? Contact your account manager for exclusive deals.
Stability: Stable

API Overview

GLM-5-Turbo is a high-performance, optimized version of the flagship GLM-5 model, designed for business scenarios that demand ultra-fast inference and high-frequency agent execution. It retains GLM-5's strengths in long-form logical reasoning and system engineering while delivering a significant boost in inference speed through architectural optimization. It is an ideal choice for building automated agent workflows, low-latency real-time interactive applications, and large-scale programming collaboration tasks.
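As a quick orientation, a chat request to glm-5-turbo can be sketched as a JSON payload. This follows the common OpenAI-style chat schema; the field names ("messages", "temperature", "stream") are assumptions and should be checked against the provider's API reference.

```python
import json

# A minimal chat-completion payload for glm-5-turbo, sketched against the
# common OpenAI-style schema. Field names here are assumptions, not
# confirmed by this page -- verify them in the official API reference.
payload = {
    "model": "glm-5-turbo",
    "messages": [
        {"role": "system", "content": "You are a coding assistant."},
        {"role": "user", "content": "Summarize this stack trace."},
    ],
    "temperature": 0.7,
    "stream": False,  # set True for token-by-token streaming, if supported
}

# Serialize for the request body.
body = json.dumps(payload)
```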

Core Capabilities


- Ultra-Fast Inference Speed: Deeply optimized for agentic invocation scenarios, dramatically reducing inference latency so the model can execute multi-step tasks with fast responses.
- Agent Workflow Optimization: Tuned to improve response efficiency for tool use and multi-stage, long-cycle tasks, and compatible with agent orchestration frameworks such as OpenClaw, enabling second-level feedback on complex tasks.
- Leading Cost-Effectiveness: Delivers processing performance close to GLM-5 while consuming fewer resources, making it a practical option for large-scale AI automation without compromising business quality.
- Programming Collaboration Accelerator: In coding agent applications such as Claude Code and OpenCode, it significantly improves the iteration speed of code generation and debugging, boosting developer productivity.
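The function_call capability listed above is typically exercised by declaring tools in the widely used JSON-Schema "tools" format. The sketch below uses that format on the assumption that glm-5-turbo follows it; the tool itself (get_build_status) is hypothetical.

```python
# Declaring a callable tool for glm-5-turbo's function_call capability,
# using the common JSON-Schema "tools" format. The tool name and the exact
# request field names are assumptions -- verify against the API reference.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_build_status",  # hypothetical example tool
            "description": "Return the CI status for a given branch.",
            "parameters": {
                "type": "object",
                "properties": {
                    "branch": {
                        "type": "string",
                        "description": "Name of the branch to check.",
                    },
                },
                "required": ["branch"],
            },
        },
    }
]

# Attach the tool declarations to a chat request.
request = {
    "model": "glm-5-turbo",
    "messages": [{"role": "user", "content": "Is main green?"}],
    "tools": tools,
}
```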

API Reference (1)

API Description: Chat (Zhipu GLM Multimodal)
Request Method: POST
Stability: Stable
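Since the Chat endpoint above is a POST call, the request can be assembled with Python's standard library alone. This sketch builds the request without sending it; the URL and API key are placeholders, not the real endpoint.

```python
import json
import urllib.request

# Build (but do not send) the POST request described by the reference row.
# Both values below are placeholders -- substitute the real Chat endpoint
# and an API key from your provider dashboard.
API_URL = "https://api.example.com/v1/chat/completions"  # placeholder
API_KEY = "sk-..."  # placeholder

payload = {
    "model": "glm-5-turbo",
    "messages": [{"role": "user", "content": "Hello"}],
}

req = urllib.request.Request(
    API_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    },
    method="POST",
)
# Sending would be: response = urllib.request.urlopen(req)
```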

API Pricing

Model | Description | Context | Official Price | 302.AI Price
glm-5-turbo | Input length [0, 32k] | 200,000 | Input: $0.72 / 1M tokens, Output: $3.2 / 1M tokens | Input: $0.72 / 1M tokens, Output: $3.2 / 1M tokens
glm-5-turbo | Input length [32k, 200k] | 200,000 | Input: $1.1 / 1M tokens, Output: $3.8 / 1M tokens | Input: $1.1 / 1M tokens, Output: $3.8 / 1M tokens
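The tiered rates above can be turned into a simple per-request cost estimate. This sketch assumes the tier is selected by the prompt (input) length, with prompts up to 32k tokens billed at the [0, 32k] rates; that reading of the table should be verified against the provider's billing documentation.

```python
# Estimate the cost of one glm-5-turbo request under the tiered pricing
# above. Assumption: the billing tier is chosen by input length -- prompts
# of up to 32k tokens use the [0, 32k] rates, longer prompts the
# [32k, 200k] rates. Verify this reading against the billing docs.
TIERS = [
    # (max input tokens, $ per 1M input tokens, $ per 1M output tokens)
    (32_000, 0.72, 3.2),
    (200_000, 1.1, 3.8),
]

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for a single request."""
    for max_input, in_rate, out_rate in TIERS:
        if input_tokens <= max_input:
            return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000
    raise ValueError("input exceeds the 200k-token context window")

# e.g. a 10k-token prompt with a 2k-token reply:
# 10_000 * 0.72/1M + 2_000 * 3.2/1M = 0.0072 + 0.0064 = 0.0136 USD
```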