Z.AI

Z.AI (formerly Zhipu AI) is a Chinese AI company building the GLM family of foundation models, with a focus on agentic capabilities, coding, and reasoning. The company rebranded internationally as Z.AI in July 2025 and became the first of the major LLM companies to go public, listing in Hong Kong in January 2026.

"The first-principles approach to measuring AGI is to integrate more general intelligent capabilities without losing existing ones. GLM-4.5 is our first complete realization of this concept." — Zhang Peng, CEO of Z.AI

Company Background

Founded in 2019 as a spinoff from Tsinghua University, Z.AI has grown into one of China's "AI Tiger" companies. The company is backed by Alibaba, Tencent, Meituan, Ant Group, Xiaomi, and HongShan. OpenAI has identified Z.AI as one of the few global companies capable of building competitive models.

Z.AI was the first among Chinese AI companies to sign the Frontier AI Safety Commitments and is listed in Stanford's AI Index Report 2025 as developing "notable AI models."

Current Models

Model         | Released | Parameters        | Context | Best For
GLM-5         | Feb 2026 | 744B / 40B active | 200K    | Flagship agentic engineering
GLM-4.7       | Dec 2025 | 355B / 32B active | 200K    | Production coding workflows
GLM-4.7 Flash | Jan 2026 | 30B / 3B active   | 128K    | Lightweight local deployment
GLM-4.6       | Sep 2025 | 357B / 32B active | 200K    | Balanced coding and reasoning
GLM-4.5       | Jul 2025 | 355B / 32B active | 128K    | Agent-native applications
GLM-4.5 Air   | Jul 2025 | 106B / 12B active | 128K    | Efficient agent tasks
GLM-4-32B     | Apr 2025 | 32B (dense)       | 128K    | Cost-effective general use

Model Selection Guide

For maximum capability:

  • GLM-5: Best open-weight model for complex systems engineering and long-horizon agentic tasks

For production coding:

  • GLM-4.7: Optimized for multi-step coding workflows with Claude Code, Cline, Roo Code
  • GLM-4.7 Flash: Budget-friendly option for local deployment on consumer GPUs

For balanced performance:

  • GLM-4.6: Strong coding with 200K context, first to run on Chinese domestic chips
  • GLM-4.5: Native agent capabilities with thinking modes

For efficiency:

  • GLM-4.5 Air: 106B parameters with only 12B active—strong performance at lower cost
  • GLM-4-32B: Dense architecture for simpler deployment

Key Technologies

Mixture-of-Experts (MoE): Most GLM models use MoE architecture, activating only a fraction of total parameters per inference for efficiency.
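The routing idea can be sketched in a few lines of Python. This is a toy illustration of top-k gating, not GLM's actual router: the "experts" here are simple scaling functions standing in for expert feed-forward networks.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(x, experts, router_scores, top_k=2):
    """Route input x to the top_k experts with the highest router scores,
    combining their outputs weighted by renormalized gate probabilities.
    Only top_k of len(experts) expert networks run per token."""
    gates = softmax(router_scores)
    top = sorted(range(len(experts)), key=lambda i: gates[i], reverse=True)[:top_k]
    total = sum(gates[i] for i in top)  # renormalize selected gates to sum to 1
    return sum(gates[i] / total * experts[i](x) for i in top)

# Toy experts: each just scales its input differently
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
y = moe_forward(10.0, experts, router_scores=[0.1, 2.0, 0.3, 1.5], top_k=2)
```

With `top_k=2`, only two of the four experts execute for this input; in a model like GLM-4.7 the same principle means roughly 32B of 355B parameters are active per token.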

Interleaved Thinking: Models think before every response and tool call, improving instruction following and generation quality.

Preserved Thinking: In coding scenarios, models retain thinking blocks across turns, avoiding repeated reasoning and information loss.

Turn-level Thinking Control: Enable or disable reasoning per turn to balance accuracy vs. latency.
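In practice this is a per-request toggle. The sketch below builds a chat-completions-style payload; the `thinking` field name and `{"type": ...}` shape are assumptions modeled on common conventions for GLM-style APIs, so verify against the official Z.AI API docs before use.

```python
def build_request(messages, model="glm-4.7", thinking=True):
    """Build a chat-completions payload with a per-turn reasoning toggle.
    The `thinking` field name/shape is an assumption, not confirmed API."""
    return {
        "model": model,
        "messages": messages,
        "thinking": {"type": "enabled" if thinking else "disabled"},
    }

# Hard turn: accept extra latency for reasoning accuracy
slow = build_request([{"role": "user", "content": "Debug this race condition"}],
                     thinking=True)
# Trivial turn: skip reasoning to cut latency and token cost
fast = build_request([{"role": "user", "content": "Rename the variable"}],
                     thinking=False)
```

The point is that the toggle is per turn, not per session: an agent can enable reasoning for a planning step and disable it for a routine tool-result acknowledgement.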

DeepSeek Sparse Attention: Integrated in GLM-5 to reduce deployment costs while maintaining long-context performance.
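The core idea can be illustrated with a toy sketch, assuming nothing about the actual implementation: each query attends over only the top-k highest-scoring keys instead of the full context (the real system uses a cheap learned indexer to select them, which is what saves compute at long context lengths).

```python
import math

def sparse_attention(q, keys, values, top_k=2):
    """Attend over only the top_k keys with the highest dot-product score,
    rather than all keys. A simplified stand-in for sparse attention;
    here the selection score is the attention score itself."""
    scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    idx = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Softmax over the selected subset only
    m = max(scores[i] for i in idx)
    w = [math.exp(scores[i] - m) for i in idx]
    s = sum(w)
    w = [x / s for x in w]
    # Weighted sum of the selected values
    dim = len(values[0])
    out = [0.0] * dim
    for wi, i in zip(w, idx):
        for d in range(dim):
            out[d] += wi * values[i][d]
    return out

keys = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0], [-1.0, 0.0]]
values = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
out = sparse_attention([1.0, 0.0], keys, values, top_k=2)
```

Cost per query drops from O(sequence length) to O(k), which is why the technique matters for 200K-token contexts.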

Other Products

  • AutoGLM: Smartphone AI agent using voice commands
  • CogVideoX: Text-to-video generation
  • CodeGeeX: Code assistant
  • GLM-4.5V / GLM-4.6V: Vision-language models
