Z.AI iconZ.AI: GLM 4.5 Air

Model Type

Proprietary model icon

Proprietary Model

API access only

Recommended Use Cases

Text Generation

Try GLM 4.5 Air

GLM-4.5 Air is Z.AI's efficient agent model, offering strong performance at 106B total parameters with only 12B active—designed for teams needing agent capabilities without the full GLM-4.5 footprint.

GLM-4.5-Air delivers exceptional performance among 100B parameter-scale models, establishing itself as a leading model in its parameter category. — Z.AI

Overview

Released July 28, 2025 alongside GLM-4.5, the Air variant uses the same MoE architecture but with a more compact design: 106B total parameters with 12B active (vs GLM-4.5's 355B/32B). It supports the same hybrid thinking modes and agent capabilities while requiring significantly fewer resources.

Key Capabilities

  • 106B total / 12B active parameters (MoE)
  • 128K context window
  • Hybrid thinking modes: Same as GLM-4.5
  • Agent-native design: Planning, tool use, multi-step execution
  • Faster inference: Higher throughput than full GLM-4.5
  • Generation rate: 100+ tokens/second

Performance

GLM-4.5 Air ranked 6th overall across benchmarks, outperforming many models of similar or larger scale:

  • Competitive with larger models on agent and reasoning tasks
  • Strong tool-calling and function execution
  • Efficient token consumption

When to Use GLM-4.5 Air

Choose GLM-4.5 Air when you need:

  • Agent capabilities with lower resource requirements
  • High-throughput inference for real-time applications
  • Cost-effective deployment at scale
  • Similar architecture to GLM-4.5 but smaller footprint
  • Free-tier API access for development and testing

Choose GLM-4.5 (full) when you need:

  • Maximum capability within the 4.5 generation
  • Complex reasoning tasks requiring more parameters

Choose GLM-4.7 Flash when you need:

  • Even smaller footprint (30B/3B active)
  • Consumer GPU deployment
  • Coding-focused workloads

Role in Series

GLM efficient models:

  1. GLM-4-9B (Apr 2025): 9B dense, specialized tasks
  2. GLM-4.5 Air (Jul 2025): 106B/12B MoE, balanced efficiency (this model)
  3. GLM-4.7 Flash (Jan 2026): 30B/3B MoE, coding-focused

Links