Z.AI: GLM-4.5 Air
Model Type
Open-weights model (MIT license)
Available via the Z.AI API or self-hosted deployment
GLM-4.5 Air is Z.AI's efficient agent model, offering strong performance at 106B total parameters with only 12B active—designed for teams needing agent capabilities without the full GLM-4.5 footprint.
GLM-4.5-Air delivers exceptional performance among 100B parameter-scale models, establishing itself as a leading model in its parameter category. — Z.AI
Overview
Released July 28, 2025 alongside GLM-4.5, the Air variant uses the same MoE architecture but with a more compact design: 106B total parameters with 12B active (vs GLM-4.5's 355B/32B). It supports the same hybrid thinking modes and agent capabilities while requiring significantly fewer resources.
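As a sketch of how the hybrid thinking modes are typically toggled when calling the model through an OpenAI-compatible chat endpoint — note that the endpoint URL, model ID, and `thinking` parameter below are assumptions for illustration, not taken from this page; check Z.AI's API documentation for the current values:

```python
import json

# Assumed endpoint and model ID -- verify against Z.AI's API docs.
API_URL = "https://api.z.ai/api/paas/v4/chat/completions"
MODEL_ID = "glm-4.5-air"

def build_chat_request(prompt: str, thinking: bool = True) -> dict:
    """Build an OpenAI-style chat request body, toggling the assumed
    `thinking` parameter between deliberate-reasoning and instant-response
    modes."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        # Assumed switch for the hybrid thinking modes.
        "thinking": {"type": "enabled" if thinking else "disabled"},
    }

body = build_chat_request("Plan a three-step data migration.", thinking=True)
print(json.dumps(body, indent=2))
# To send: POST `body` to API_URL with an "Authorization: Bearer <key>" header.
```

The same request shape works for the full GLM-4.5; only the model ID changes.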
Key Capabilities
- 106B total / 12B active parameters (MoE)
- 128K context window
- Hybrid thinking modes: same as GLM-4.5 — a thinking mode for complex reasoning and tool use, and a non-thinking mode for instant responses
- Agent-native design: Planning, tool use, multi-step execution
- Faster inference: Higher throughput than full GLM-4.5
- Generation rate: 100+ tokens/second
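The agent-native design above centers on planning and tool use. A minimal sketch of exposing a tool to the model in the OpenAI-style function-calling schema follows; the `get_weather` tool, its parameter schema, and the model ID are hypothetical examples, not part of this page:

```python
import json

# Hypothetical tool definition in the OpenAI function-calling schema.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool name
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

def build_agent_request(task: str) -> dict:
    """Chat request exposing the tool so the model can decide to call it
    as part of multi-step execution."""
    return {
        "model": "glm-4.5-air",  # assumed model ID
        "messages": [{"role": "user", "content": task}],
        "tools": [WEATHER_TOOL],
        "tool_choice": "auto",  # let the model plan when to invoke the tool
    }

req = build_agent_request("Should I bring an umbrella in Beijing today?")
print(json.dumps(req, indent=2))
```

When the model elects to call the tool, the response carries a `tool_calls` entry whose arguments the caller executes before returning the result in a follow-up message.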
Performance
In Z.AI's 12-benchmark evaluation spanning agentic, reasoning, and coding tasks, GLM-4.5 Air ranked 6th overall, outperforming many models of similar or larger scale:
- Competitive with larger models on agent and reasoning tasks
- Strong tool-calling and function execution
- Efficient token consumption
When to Use GLM-4.5 Air
Choose GLM-4.5 Air when you need:
- Agent capabilities with lower resource requirements
- High-throughput inference for real-time applications
- Cost-effective deployment at scale
- Similar architecture to GLM-4.5 but smaller footprint
- Free-tier API access for development and testing
Choose GLM-4.5 (full) when you need:
- Maximum capability within the 4.5 generation
- Complex reasoning tasks requiring more parameters
Choose GLM-4.7 Flash when you need:
- Even smaller footprint (30B/3B active)
- Consumer GPU deployment
- Coding-focused workloads
Role in Series
Efficiency-focused models in the GLM series:
- GLM-4-9B (Apr 2025): 9B dense, specialized tasks
- GLM-4.5 Air (Jul 2025): 106B/12B MoE, balanced efficiency (this model)
- GLM-4.7 Flash (Jan 2026): 30B/3B MoE, coding-focused