Z.AI
Z.AI (formerly Zhipu AI) is a Chinese AI company building the GLM family of foundation models, with a focus on agentic capabilities, coding, and reasoning. The company rebranded internationally as Z.AI in July 2025 and, in January 2026, became the first major LLM company to go public via an IPO in Hong Kong.
> "The first-principles approach to measuring AGI is to integrate more general intelligent capabilities without losing existing ones. GLM-4.5 is our first complete realization of this concept." — Zhang Peng, CEO of Z.AI
Company Background
Founded in 2019 as a spinoff from Tsinghua University, Z.AI has grown into one of China's "AI Tiger" companies. The company is backed by Alibaba, Tencent, Meituan, Ant Group, Xiaomi, and HongShan. OpenAI has identified Z.AI as one of the few global companies capable of building competitive models.
Z.AI was the first among Chinese AI companies to sign the Frontier AI Safety Commitments and is listed in Stanford's AI Index Report 2025 as developing "notable AI models."
Current Models
| Model | Released | Parameters | Context | Best For |
|---|---|---|---|---|
| GLM-5 | Feb 2026 | 744B / 40B active | 200K | Flagship agentic engineering |
| GLM-4.7 | Dec 2025 | 355B / 32B active | 200K | Production coding workflows |
| GLM-4.7 Flash | Jan 2026 | 30B / 3B active | 128K | Lightweight local deployment |
| GLM-4.6 | Sep 2025 | 357B / 32B active | 200K | Balanced coding and reasoning |
| GLM-4.5 | Jul 2025 | 355B / 32B active | 128K | Agent-native applications |
| GLM-4.5 Air | Jul 2025 | 106B / 12B active | 128K | Efficient agent tasks |
| GLM-4-32B | Apr 2025 | 32B (dense) | 128K | Cost-effective general use |
Model Selection Guide
For maximum capability:
- GLM-5: Best open-weight model for complex systems engineering and long-horizon agentic tasks
For production coding:
- GLM-4.7: Optimized for multi-step coding workflows with Claude Code, Cline, Roo Code
- GLM-4.7 Flash: Budget-friendly option for local deployment on consumer GPUs
For balanced performance:
- GLM-4.6: Strong coding with 200K context; the first GLM release to run on Chinese domestic chips
- GLM-4.5: Native agent capabilities with thinking modes
For efficiency:
- GLM-4.5 Air: 106B total parameters with only 12B active, for strong performance at lower cost
- GLM-4-32B: Dense architecture for simpler deployment
Key Technologies
Mixture-of-Experts (MoE): Most GLM models use an MoE architecture, routing each token to only a small subset of experts so that just a fraction of the total parameters is active per forward pass, which cuts inference cost.
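To make the routing idea concrete, here is a minimal toy sketch of top-k MoE routing in plain Python. The experts, gate weights, and dimensions are invented for illustration; this is not Z.AI's implementation, only the general technique of scoring experts with a gate, keeping the top-k, and renormalizing their weights.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(x, experts, gate_weights, top_k=2):
    """Route input x to the top_k experts by gate score and
    combine their outputs, renormalizing the selected weights."""
    # Gate logits: one score per expert (here a simple dot product).
    logits = [sum(wi * xi for wi, xi in zip(w, x)) for w in gate_weights]
    probs = softmax(logits)
    # Keep only the top_k experts; the rest stay inactive this pass.
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    out = [0.0] * len(x)
    for i in chosen:
        y = experts[i](x)  # only chosen experts actually run
        out = [o + (probs[i] / norm) * yi for o, yi in zip(out, y)]
    return out, chosen

# Toy setup: 4 "experts", each a simple elementwise transform.
experts = [
    lambda x: [2 * v for v in x],
    lambda x: [v + 1 for v in x],
    lambda x: [-v for v in x],
    lambda x: [v * v for v in x],
]
gate_weights = [[0.5, 0.1], [0.2, 0.9], [0.1, 0.1], [0.3, 0.4]]
out, active = moe_forward([1.0, 2.0], experts, gate_weights, top_k=2)
print(active)  # only 2 of the 4 experts were activated
```

The key property is that compute scales with `top_k`, not with the total number of experts, which is why a 355B-parameter model can run with only ~32B parameters active.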
Interleaved Thinking: Models think before every response and tool call, improving instruction following and generation quality.
Preserved Thinking: In coding scenarios, models retain thinking blocks across turns, avoiding repeated reasoning and information loss.
Turn-level Thinking Control: Enable or disable reasoning per turn to balance accuracy vs. latency.
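Per-turn control can be sketched as a field on an OpenAI-compatible chat-completions payload. The `thinking: {"type": ...}` shape below follows the style of Z.AI's published API, but treat the exact parameter name and values as an assumption and check the current API reference before use.

```python
# Hedged sketch: toggling reasoning per turn via a request payload.
# The "thinking" field shape is assumed from Z.AI's API style and
# may differ in the current API version.
def build_request(messages, model="glm-4.7", think=True):
    return {
        "model": model,
        "messages": messages,
        # Enable reasoning on hard turns; disable it on latency-
        # sensitive turns such as short follow-ups.
        "thinking": {"type": "enabled" if think else "disabled"},
    }

hard_turn = build_request(
    [{"role": "user", "content": "Refactor this module and explain the plan."}],
    think=True,
)
quick_turn = build_request(
    [{"role": "user", "content": "Which file did you just edit?"}],
    think=False,
)
print(hard_turn["thinking"], quick_turn["thinking"])
```

An agent loop would typically enable thinking for planning and tool-selection turns and disable it for trivial confirmations, trading a little accuracy for lower latency.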
DeepSeek Sparse Attention: Integrated in GLM-5 to reduce deployment costs while maintaining long-context performance.
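The idea behind sparse attention can be illustrated with a toy top-k variant: each query attends to only the k highest-scoring keys rather than all of them. This is a simplified stand-in, not DeepSeek's actual DSA (which uses a separate lightweight indexer to select tokens); all names and values here are illustrative.

```python
import math

def sparse_attention(q, keys, values, k=2):
    """Score every key, but run the softmax and weighted sum over
    only the top-k keys, skipping the rest."""
    scores = [sum(qi * ki for qi, ki in zip(q, key)) / math.sqrt(len(q))
              for key in keys]
    top = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
    m = max(scores[i] for i in top)
    weights = {i: math.exp(scores[i] - m) for i in top}
    z = sum(weights.values())
    out = [0.0] * len(values[0])
    for i in top:
        out = [o + (weights[i] / z) * vi for o, vi in zip(out, values[i])]
    return out, top

keys = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
values = [[1.0], [2.0], [3.0], [4.0]]
out, attended = sparse_attention([1.0, 0.0], keys, values, k=2)
print(attended)  # only 2 of the 4 keys were used
```

In a long-context model the payoff is that per-query cost grows with k rather than with the full sequence length, which is how sparse attention lowers deployment cost at 200K-token contexts.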
Other Products
- AutoGLM: Smartphone AI agent using voice commands
- CogVideoX: Text-to-video generation
- CodeGeeX: Code assistant
- GLM-4.5V / GLM-4.6V: Vision-language models