Skip to content
Learn · Guide · Zhipu AI (China)

GLM (Z.ai)

Zhipu AI's open-weight GLM family, now led by the agentic coding model GLM-5.2: a permissively licensed, self-hostable model built for long-horizon, repository-scale software engineering.

Zhipu AI (China)8 min readz.ai

What is GLM (Z.ai)?

GLM is the family of large language models from Zhipu AI (which presents its assistant and API under the Z.ai brand), a Chinese lab spun out of Tsinghua University. Its current flagship, GLM-5.2, is an open-weight model built for agentic, repository-scale software engineering rather than chat.

GLM-5.2 is a Mixture-of-Experts model (around 753B total parameters with roughly 40B active per token) released under the permissive MIT license on Hugging Face. It pairs a usable one-million-token context window with a dual thinking-effort system (High and Max modes), so it can plan and execute long tool-using runs across a whole codebase.

On public coding benchmarks the model is competitive with frontier closed models at a fraction of the cost, which is the reason to evaluate it: capable agentic and coding ability you can host yourself, weighed against the governance questions of a China-based hosted service for sensitive data.

Strengths

What it's best for

  • Agentic software engineering: long, tool-using runs that plan and edit across many files.
  • Repository-scale work, where the one-million-token context holds a large codebase in view.
  • Self-hosting: the MIT-licensed weights let you run inference entirely in your own environment.
  • Cost-sensitive teams: API pricing lands well below the frontier closed models for similar work.
  • Tuning the effort: High mode for everyday tasks, Max mode for the hardest reasoning.
Limits

Where it falls short

  • Sensitive or regulated data on the hosted service, which runs on China-based infrastructure. Self-hosting the open weights avoids this.
  • Topics subject to Chinese content restrictions on the hosted assistant.
  • Teams wanting Western enterprise support and a mature consumer feature set.
How to use it

Ways in

Use the Z.ai chat assistant for the hosted experience. For building, Zhipu's API is the path to GLM-5.2, and most code that targets an OpenAI-compatible endpoint adapts with a base-URL and model-name change.

For full control, download the MIT-licensed weights from Hugging Face and self-host. Plan for the compute: a 753B-parameter Mixture-of-Experts model needs serious GPU memory even with only 40B active per token.

How to use it

Getting the most out of it

Treat it as an agent, not a chatbot: give it the goal, the tools it can call, and the relevant files, then let it plan and execute the steps. The long context is there to hold real repository structure, so include it.

Choose the thinking effort deliberately. Use High mode for routine changes and Max mode for the hardest reasoning, where the extra compute pays off.

For data-sensitive work, prefer the open weights over the hosted service and confirm the MIT license terms cover your use.

Pricing

What GLM (Z.ai) costs

Approximate, in USD, as of June 2026. Prices change often. Confirm on the official site before you rely on them.

Z.ai assistant

$0

Free chat assistant, subject to limits.

Open weights

$0 (self-host)

GLM-5.2 weights are published under the MIT license on Hugging Face; you pay only your own compute.

API

~$0.95-2 / 1M in, ~$3-6 / 1M out

Usage-based on Zhipu's platform, roughly 80-90% below the leading closed models for comparable work. Confirm current rates on the official site.

Visit the official GLM (Z.ai) site
Try it

Example prompts

Copy these into GLM (Z.ai) as starting points, then adapt them to your task.

Agentic coding runCopy prompt
Here is the repository. You can read, edit, and run files. Implement this feature end to end, list every file you changed and why, and run the tests before you finish.
Long-context reviewCopy prompt
I have pasted the whole module. Trace how data flows from the API route to the database, and flag any place where an error is swallowed silently.
Swap-in testCopy prompt
Adapt this OpenAI-style API call to use the Zhipu (GLM-5.2) endpoint, changing only the base URL and model name.
Governance checkCopy prompt
We are evaluating GLM-5.2 for an internal tool that touches customer data. List the questions to resolve before using the hosted API, and what changes if we self-host the MIT-licensed weights.
FAQ

GLM (Z.ai)
common questions.

Direct answers to the questions we get asked the most. If yours isn't covered, write to the team.

Work with SDEN

Putting AI into production?

We help teams choose the right models and ship them securely, self-hosted when data demands it. And we hand you the keys to run them in-house.