zai-org/glm-4.7-flash

An efficient variant in Zhipu AI's GLM-4.7 series of language models, balancing performance and efficiency.
2026-01-23
LLM
Model capabilities: thinking, function_call
Input: $0.0715/1M tokens
Output: $0.429/1M tokens

API Overview

GLM-4.7-Flash is a free, high-performance text language model launched by Zhipu AI, positioned as a “lightweight flagship that strikes a balance between state-of-the-art performance and high efficiency at the 30B parameter level.” It targets agentic coding, deep research, and high-frequency collaboration scenarios.

  • Key Upgrades: As an efficient version of the GLM-4.7 series, it achieves leading performance among open-source models of the same size in benchmarks such as SWE-bench Verified and τ²-Bench.
  • Applicable Scenarios: Agentic Coding (end-to-end runnable code generation), Deep Research (multi-source information integration), automatic frontend/PPT generation, role-playing creation, intelligent customer service, and more.
  • Product Value: Supports up to 200K context tokens and 128K output tokens, significantly reducing development and deployment costs for complex tasks.
  • Enhanced Capabilities: Substantially stronger coding, including autonomous requirement breakdown and integration across multiple technology stacks; improved frontend aesthetics bring generated PPT/webpage output closer to “ready-to-use” quality.
  • Improved Interaction Experience: Maintains context stability throughout multi-turn conversations, continuously clarifies objectives when dealing with complex issues, and behaves more like a “problem-solving partner.”

───────────────────────────────────────────────────────────────────

Core Capabilities

Efficient Long-Context Processing: A 200K input window plus intelligent context caching keeps long conversations coherent without losing earlier context.
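
As a rough illustration of the limits quoted on this page (200K context tokens, 128K output tokens), a client might validate a request before sending it. This is a minimal sketch: the function name `check_token_budget` is hypothetical, and it assumes the two limits apply independently, since the page does not say whether output tokens count against the context window.

```python
# Limits quoted on this page. Whether output tokens also count against the
# 200K window is not stated, so the two checks are independent (assumption).
MAX_CONTEXT_TOKENS = 200_000
MAX_OUTPUT_TOKENS = 128_000


def check_token_budget(input_tokens: int, max_output_tokens: int) -> list[str]:
    """Return a list of problems; an empty list means the request fits."""
    problems = []
    if input_tokens > MAX_CONTEXT_TOKENS:
        problems.append(
            f"input of {input_tokens} tokens exceeds the "
            f"{MAX_CONTEXT_TOKENS}-token window"
        )
    if max_output_tokens > MAX_OUTPUT_TOKENS:
        problems.append(
            f"requested output of {max_output_tokens} tokens exceeds the "
            f"{MAX_OUTPUT_TOKENS}-token cap"
        )
    return problems


if __name__ == "__main__":
    print(check_token_budget(150_000, 8_000))
    print(check_token_budget(250_000, 130_000))
```

Token counts would have to come from a tokenizer or from the usage figures the API returns; this sketch only compares numbers that the caller already has.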

🧠 Deep Thinking Mode: Thinking mode can be enabled so the model reasons before it acts (“think first, then act”), improving accuracy on complex tasks.

🛠️ Powerful Tool Collaboration: Natively supports Function Call and MCP protocols, enabling flexible invocation of external tools and data sources.
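
To make the function-call flow concrete, the sketch below declares one tool and dispatches a tool call to a local Python function. It assumes the widely used OpenAI-style shapes (a `tools` list with JSON Schema parameters, and a reply carrying `function.name` plus JSON-encoded `function.arguments`); this page does not confirm the exact format. The `get_weather` tool and the `dispatch_tool_call` helper are hypothetical.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical local tool; a real one would query a weather service."""
    return f"Sunny in {city}"


# Tool declaration in the OpenAI-style schema (assumed, not confirmed here).
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

# Maps tool names the model may request to local callables.
REGISTRY = {"get_weather": get_weather}


def dispatch_tool_call(tool_call: dict) -> dict:
    """Run the requested tool and build the follow-up 'tool' message."""
    name = tool_call["function"]["name"]
    if name not in REGISTRY:
        raise KeyError(f"model requested unknown tool: {name}")
    # Arguments arrive as a JSON string in this assumed format.
    args = json.loads(tool_call["function"]["arguments"])
    result = REGISTRY[name](**args)
    return {"role": "tool", "tool_call_id": tool_call["id"], "content": result}


if __name__ == "__main__":
    sample_call = {
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
    }
    print(dispatch_tool_call(sample_call))
```

The returned message would be appended to the conversation and sent back so the model can use the tool result in its final answer.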

🎨 Frontend Aesthetic Upgrade: More aesthetically pleasing layouts, color schemes, and component styles; sensible defaults reduce the need for repeated manual adjustment.

📊 Structured Output: Supports formats such as JSON, facilitating direct system parsing and seamless integration into automated workflows.
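
Even when JSON output is requested, it is prudent to validate the reply before passing it into an automated workflow. The following is a minimal sketch: the helper name `parse_json_reply` and the example keys are hypothetical, and the fence-stripping step is a defensive assumption about how a model might wrap its reply, not documented behavior.

```python
import json

# Built at runtime so this example contains no literal fence markers.
FENCE = "`" * 3


def parse_json_reply(text: str, required_keys: list[str]) -> dict:
    """Parse a model reply as a JSON object and check that required keys exist."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        lines = cleaned.splitlines()[1:]  # drop the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence line
        cleaned = "\n".join(lines)
    data = json.loads(cleaned)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"reply is missing keys: {missing}")
    return data


if __name__ == "__main__":
    reply = '{"sentiment": "positive", "score": 0.9}'
    print(parse_json_reply(reply, ["sentiment", "score"]))
```

Failing loudly on missing keys keeps malformed replies out of downstream systems; a caller could catch the error and retry the request.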

💬 Streaming Real-Time Responses: Supports streaming output, returning results incrementally as they are generated for low-latency interaction.
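
A client consuming streamed output typically reassembles the text from incremental chunks. The sketch below assumes the common OpenAI-style server-sent-events format (`data:` lines carrying JSON chunks with `choices[].delta.content`, terminated by `data: [DONE]`); this page does not specify the wire format, and the helper name `iter_text_deltas` is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_text_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from SSE lines in the assumed OpenAI-style format."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel in the assumed format
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            piece = choice.get("delta", {}).get("content")
            if piece:
                yield piece


if __name__ == "__main__":
    # Canned lines standing in for a live HTTP response body.
    sample = [
        'data: {"choices": [{"delta": {"content": "Hel"}}]}',
        "",
        'data: {"choices": [{"delta": {"content": "lo"}}]}',
        "data: [DONE]",
    ]
    for fragment in iter_text_deltas(sample):
        print(fragment, end="", flush=True)
    print()
```

In a real client, `lines` would be the line iterator of a streaming HTTP response, and each fragment could be rendered as soon as it arrives.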

───────────────────────────────────────────────────────────────────

Test Data

  • In mainstream benchmarks such as SWE-bench Verified and τ²-Bench, GLM-4.7-Flash reaches state-of-the-art levels among open-source models of the same size.
  • It offers leading frontend and backend development capabilities and performs strongly in internal programming tests.
  • It has strong general-purpose capabilities and is recommended for Chinese writing, translation, long-text processing, emotional expression, and role-playing.

API Reference (1)

  • API Description: Chat (PPIO)
  • Request Method: POST
  • Stability: Stable

API Pricing

  • Model: zai-org/glm-4.7-flash
  • Description: -
  • Context: 200000
  • Official Price: Input $0.0715 / 1M tokens; Output $0.429 / 1M tokens
  • 302.AI Price: Input $0.0715 / 1M tokens; Output $0.429 / 1M tokens (Original Price)
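
The per-token prices above translate into request cost with simple arithmetic. The sketch below uses the listed rates; the function name `estimate_cost` is hypothetical, and the token counts are illustrative inputs rather than measured usage.

```python
from decimal import Decimal

# Rates listed on this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("0.0715")
OUTPUT_PRICE_PER_M = Decimal("0.429")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated cost in USD for one request."""
    input_cost = Decimal(input_tokens) * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    return (input_cost + output_cost) / TOKENS_PER_M


if __name__ == "__main__":
    # 200,000 input tokens and 10,000 output tokens:
    # 0.2 * 0.0715 + 0.01 * 0.429 = 0.0143 + 0.00429 = 0.01859 USD
    print(estimate_cost(200_000, 10_000))
```

`Decimal` is used instead of floats so that small per-request amounts add up without binary rounding error when many requests are totaled.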