China’s latest AI model GLM-4.6 is finally here after the success of GLM-4.5 [Full Report]

China's AI scene is on a rocket right now. You remember how, just a few days ago, we talked about DeepSeek, right?

And now boom, there is another headline: China’s latest AI model GLM-4.5.

The speed at which large language models are emerging from China is truly staggering, making the global AI race feel like a daily event.

The company at the center of this newest frenzy is Z.ai, formerly known as Zhipu AI, which has just released what many are calling the most disruptive model of the year, GLM-4.5.


This model is not just powerful; it is strategically designed for efficiency, affordability, and open deployment. Z.ai’s move is creating a major disturbance in the global agentic AI market, forcing rivals in Silicon Valley and elsewhere to rethink their pricing and accessibility strategies.

Z.ai’s approach emphasizes that the future of artificial intelligence does not just belong to the company with the biggest model, but to the company with the most efficient and accessible model. This efficiency is the true innovation driving the model’s current momentum, and it marks a critical turning point for open-source technology globally.

The Rise of GLM-4.5

GLM-4.5 has taken the AI community by storm, and to understand why, we need to look at both the model itself and the company that built it. This model is much more than a simple iteration; it represents a fundamental shift in how large language models are constructed and priced for global deployment.

What GLM-4.5 Is and Who Built It

The story starts with Z.ai, which was originally founded as Zhipu AI in 2019. The company has deep academic roots, having originated from the technological achievements developed at Tsinghua University, one of China’s top research institutions. This strong research foundation laid the groundwork for the rapid development cycles we see today. The company’s founders are Tang Jie and Li Juanzi, and the current CEO guiding this growth is Zhang Peng.

The company quickly gained massive financial backing, raising more than 1.5 billion dollars from influential Chinese investors, including technology giants like Alibaba and Tencent, as well as firms like Qiming Venture Partners. This massive funding accelerated the startup’s transformation from a research lab into one of China’s most prominent AI contenders. By 2024, Z.ai was recognized by investors as one of China’s “AI Tiger” companies, achieving the position of the third largest large language model market player in the country. This level of strategic investment shows that Z.ai’s mission is about more than just profit; it is about building national competence to rival global leaders in frontier technology, which explains why the company can afford such an aggressive, low-cost strategy.

GLM-4.5 itself is a highly advanced large language model designed specifically for intelligent agents. It uses a sophisticated architecture known as Mixture of Experts, or MoE. The MoE design works by having a large team of specialized subnetworks, or “experts,” available for processing information. Although the total model size is immense, at 355 billion parameters, only a small portion of that, specifically 32 billion parameters, is actively engaged for any single input or task. This sparsity allows the model to leverage the deep knowledge of a huge model without incurring the prohibitive computational cost of running every parameter for every task.
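To make the sparsity idea concrete, here is a minimal NumPy sketch of top-k expert routing. The gating weights, expert count, and dimensions are toy values for illustration, not Z.ai’s actual implementation:

```python
import numpy as np

def moe_forward(x, experts, gate_weights, top_k=2):
    """Sparse MoE layer: score every expert, run only the top_k,
    and mix their outputs with softmaxed gate probabilities."""
    scores = x @ gate_weights                       # one score per expert
    top = np.argsort(scores)[-top_k:]               # indices of chosen experts
    probs = np.exp(scores[top]) / np.exp(scores[top]).sum()
    return sum(p * experts[i](x) for p, i in zip(probs, top))

# Toy demo: 8 tiny linear "experts", but only 2 run per input
rng = np.random.default_rng(0)
dim, n_experts = 4, 8
experts = [(lambda W: (lambda v: v @ W))(rng.normal(size=(dim, dim)))
           for _ in range(n_experts)]
gate = rng.normal(size=(dim, n_experts))
out = moe_forward(rng.normal(size=dim), experts, gate)
print(out.shape)  # (4,)
```

The key property is in the last line of `moe_forward`: only `top_k` of the expert functions ever execute, which is why a 355B-parameter model can serve a request at roughly the cost of a 32B-parameter one.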

The core goal of GLM-4.5 is to unify three major capabilities that define modern AI agents: complex reasoning, code generation, and advanced tool use. Crucially, the GLM-4.5 series is released under the permissive MIT open-source license. This is a huge deal because it means that developers and companies everywhere can download and use the weights of the model without licensing fees or platform restrictions, unlocking tremendous potential for commercial application and custom development worldwide.

What really sets GLM-4.5 apart is not just its size, but the clever way Z.ai engineered its problem-solving process. The model introduces a core innovation that Z.ai calls the dynamic “thinking mode”.

Large language models often struggle when asked to complete extremely complex, multi-step tasks, such as solving novel programming problems or performing long, deductive reasoning chains. This is because they typically try to generate the final answer in one shot. Z.ai’s dynamic hybrid reasoning model tackles this problem head on. It is able to switch automatically between a fast, direct response mode, which is suitable for simple questions, and the deliberate, step-by-step reasoning mode, which is activated for complex tasks.

When the model enters the deliberate reasoning process, or “thinking mode,” it systematically breaks down the complex request into smaller, manageable subtasks. It then decides which external tools or functions are necessary to solve those subtasks, executes them, and finally synthesizes all the results into a coherent solution. This explicit breakdown and systematic tool use dramatically improves accuracy on difficult tasks and provides transparency, allowing users to see the model’s logical steps, similar to how a human researcher shows their work. This step-by-step approach is what is often referred to as agentic AI, and it delivers more accurate and efficient outcomes by approaching problems sequentially.
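The decompose, call, and synthesize loop described above can be sketched in a few lines. Everything here is hypothetical scaffolding (the message format, the scripted toy model, and the `add` tool are invented for illustration); it only shows the control flow of switching between a direct answer and tool-using steps:

```python
def run_agent(task, llm, tools):
    """Minimal agentic loop: the model either answers directly
    (fast mode) or emits tool calls (thinking mode); tool results
    are fed back into the history until the model finalizes."""
    history = [{"role": "user", "content": task}]
    while True:
        step = llm(history)                  # model decides the next action
        if step["type"] == "answer":         # direct response: we are done
            return step["content"]
        result = tools[step["tool"]](**step["args"])  # run the chosen tool
        history.append({"role": "tool", "content": result})

# Toy stand-ins: a scripted "model" that calls a calculator once, then answers
def toy_llm(history):
    if not any(m["role"] == "tool" for m in history):
        return {"type": "tool_call", "tool": "add", "args": {"a": 2, "b": 3}}
    return {"type": "answer", "content": f"The sum is {history[-1]['content']}"}

answer = run_agent("What is 2 + 3?", toy_llm, {"add": lambda a, b: a + b})
print(answer)  # The sum is 5
```

A real thinking-mode model replaces `toy_llm` with the LLM itself, but the transparency benefit is visible even here: the `history` list is exactly the “shown work” a user can inspect.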

This specialized design has made GLM-4.5 an exceptional foundation model for agentic applications, which are becoming the next wave in AI innovation. The model supports native function calling and has recorded a very high average tool calling success rate, hitting 90.6 percent when tested against similar models like Qwen3-Coder. This focus on robustness in agent workflows means it is reliable for high-stakes applications, such as end-to-end full-stack development, where the model can automatically produce complete web applications from the front end to backend deployment, or sophisticated artifact creation. The ability to effectively interact with external systems is why GLM-4.5 performs so strongly on industry benchmarks, where it has achieved a score of 63.2 across twelve industry-standard tests covering reasoning, coding, and agentic capabilities, ranking third globally among all proprietary and open-source models.

Under the hood, this performance is stabilized by high-level engineering. The model was trained in a multi-step process, starting with pre-training on a massive dataset of 23 trillion tokens. Z.ai also incorporated architectural optimizations such as Grouped-Query Attention, QK-norm to stabilize the attention logits, and a Multi-Token Prediction (MTP) layer that supports faster inference techniques like speculative decoding. These technical foundations ensure that the high benchmark scores translate into reliable, stable, and fast real-world performance, confirming the model’s status as a landmark Chinese AI release.
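As a rough illustration of one of these optimizations, here is a minimal NumPy sketch of Grouped-Query Attention, where several query heads share a single key/value head, shrinking the KV cache that must be kept in memory during inference. Head counts and dimensions are toy values, not GLM-4.5’s actual configuration:

```python
import numpy as np

def grouped_query_attention(q, k, v, n_groups):
    """GQA sketch: n_q_heads query heads share n_groups KV heads.
    q: (n_q_heads, seq, d);  k, v: (n_groups, seq, d)."""
    n_q_heads, seq, d = q.shape
    per_group = n_q_heads // n_groups
    out = np.empty_like(q)
    for h in range(n_q_heads):
        g = h // per_group                        # which shared KV head to use
        scores = q[h] @ k[g].T / np.sqrt(d)       # (seq, seq) attention logits
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)        # softmax over key positions
        out[h] = w @ v[g]
    return out

rng = np.random.default_rng(1)
o = grouped_query_attention(rng.normal(size=(8, 5, 16)),   # 8 query heads
                            rng.normal(size=(2, 5, 16)),   # only 2 KV heads
                            rng.normal(size=(2, 5, 16)),
                            n_groups=2)
print(o.shape)  # (8, 5, 16)
```

With 8 query heads but only 2 KV heads, the per-token KV cache is a quarter of what full multi-head attention would need, which is exactly the kind of saving that matters when serving long-context agents cheaply.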

Cost, Access, and Geopolitical Strategy

The reason GLM-4.5 truly captured the global spotlight and went viral is rooted in economics and clever engineering: its disruptive affordability and open-source strategy.

Z.ai is using its advanced efficiency to fundamentally undercut the market. The company is charging exceptionally low rates for API usage. For example, Z.ai plans to charge just 11 cents per million input tokens. This is already cheaper than competitors like DeepSeek R1, which charges 14 cents. The difference becomes even more pronounced with output tokens, where GLM-4.5 costs only 28 cents per million, compared to DeepSeek’s $2.19 per million. This massive cost saving creates an unfair competitive advantage, with GLM-4.5 offering 136 times lower operational costs than some premium models. This pricing could reshape large language model deployment expectations, especially in resource-limited environments.
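Using the per-million-token rates quoted above, a quick back-of-the-envelope comparison shows how the gap compounds on a realistic workload. The 10M-input / 2M-output volume is an arbitrary example, not a published figure:

```python
# Per-million-token API rates quoted in the article (USD)
GLM_45 = {"input": 0.11, "output": 0.28}
DEEPSEEK_R1 = {"input": 0.14, "output": 2.19}

def job_cost(rates, input_tokens, output_tokens):
    """Total cost of a workload at the given per-million-token rates."""
    return (rates["input"] * input_tokens
            + rates["output"] * output_tokens) / 1_000_000

# Hypothetical monthly workload: 10M input tokens, 2M output tokens
glm = job_cost(GLM_45, 10e6, 2e6)        # 1.10 + 0.56
dsk = job_cost(DEEPSEEK_R1, 10e6, 2e6)   # 1.40 + 4.38
print(f"GLM-4.5: ${glm:.2f} vs DeepSeek R1: ${dsk:.2f}")
```

Because output tokens dominate agentic workloads (long reasoning chains, generated code), the 0.28 versus 2.19 output rate is where most of the saving comes from.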

This affordability is paired with an aggressive open-source approach. Releasing the model weights under the MIT license means that anyone can self-host, customize, and deploy GLM-4.5 at scale for virtually nothing, avoiding subscription fees, usage limits, and vendor lock-in. This democratization of access levels the playing field, allowing solo entrepreneurs and small development agencies to access frontier AI capabilities that were previously reserved only for large corporations with massive budgets.

What is particularly compelling is that this market advantage was born directly out of necessity. The Chinese AI sector has faced significant restrictions, including US export controls designed to limit access to the highest-end Nvidia chips. Rather than simply accepting these limitations, Z.ai focused intensely on engineering a model that was hyper-efficient within those constraints. The CEO, Zhang Peng, pointed out that GLM-4.5 runs effectively on only eight Nvidia H20 chips, which were specifically developed to comply with US export rules, while some competing models require sixteen chips. This optimization significantly lowers the model’s operational footprint, making it incredibly cheap to run and serve at scale.

This shows that the trade restrictions inadvertently forced Z.ai to prioritize efficiency and novel architecture choices, resulting in a product that is perfectly positioned to disrupt the global market on price and accessibility. If Z.ai had easy access to the most powerful chips, it might have pursued a larger, less efficient dense architecture. The constraint of the H20 chips directly led to choices, like the optimized MoE implementation, that dramatically lower the operational footprint. This efficiency is the core reason GLM-4.5 is trending as China’s latest AI model.

Even though the US Commerce Department blacklisted the company in January 2025 over national security concerns, restricting American companies from working directly with Z.ai, the open-source strategy helps mitigate the impact. By attracting a massive international developer community with powerful, free tools, Z.ai builds ecosystem trust and global traction, ensuring its continued relevance despite geopolitical tensions. The combination of open-source access, extremely low API costs, and high agentic performance is why GLM-4.5 has become a trending conversation starter globally.

Z.ai Just Launched GLM-4.6 and GLM-4.6V

While GLM-4.5 was capturing headlines for its price and agentic prowess, Z.ai was already moving forward at incredible speed. The company is known for its “breakneck” pace of iteration. Just two months after the release of GLM-4.5, Z.ai launched two more advanced variations: the GLM-4.6 text model and the multimodal GLM-4.6V.


GLM-4.6: The Clear Performance Upgrade

GLM-4.6 is a direct and substantial upgrade to GLM-4.5, moving the model forward in several key areas and positioning it as a dedicated coding and agentic specialist. This is not an experimental variation; it is the next iteration of the flagship text model.

First, Z.ai slightly increased the total parameter count to 357 billion, two billion more than GLM-4.5. More importantly, the model’s working memory, or context window, has been significantly expanded. It increased from 128 thousand tokens to a massive 200 thousand tokens. This is vital for complex agentic tasks, which often require tracking long histories, maintaining context across multiple tool calls, or processing very large code repositories.

The most notable improvement is in its coding performance. GLM-4.6 has been explicitly tuned for code generation and debugging, and the performance gains are dramatic. On the LiveCodeBench v6, GLM-4.6 scores 82.8, which is a huge jump compared to GLM-4.5’s score of 63.3. It also consistently outperforms 4.5 across most other coding and reasoning benchmarks, including a higher score on SWE-Bench Verified (68.0 vs 64.2). On web browsing tasks, GLM-4.6 almost doubles GLM-4.5’s performance. This focus makes it incredibly powerful for developers, earning the model the nickname of a “Coding Monster”. Developers who have tested GLM-4.6 report noticeably better front-end output, suggesting its code generation is not only syntactically correct but also better aligned with human expectations for polished design.

Furthermore, GLM-4.6 is a superior agent. It shows clear improvement in reasoning performance and more accurate tool use during inference, allowing it to interact with external systems more smoothly. In real-world agent evaluations such as CC-Bench-V1.1, which simulates multi-turn development tasks, GLM-4.6 outperforms GLM-4.5, winning roughly half of the head-to-head cases. The speed of this iteration, moving from 4.5 to 4.6 in only two months, highlights Z.ai’s dedication to using innovation velocity as a primary competitive weapon against global rivals. This relentless pace suggests a highly integrated and efficient research and deployment team.

GLM-4.6V: The Vision Leap with Multimodal Agents

The “V” in GLM-4.6V stands for vision, marking Z.ai’s leap into multimodal AI. This series handles not just text, but also images, screenshots, and complex document pages.

The GLM-4.6V series is aimed squarely at advanced visual agentic workflows. It achieves state-of-the-art performance in visual understanding compared to models of similar parameter scales. It also features an impressive context window of up to 128 thousand tokens during training, which is crucial for deep comprehension of long, visually rich documents.

The biggest technological jump in GLM-4.6V is the introduction of native multimodal function calling. Typically, if you want an AI to analyze an image and then use a tool, the image must first be converted into a text description, which results in significant signal loss and ambiguity. GLM-4.6V avoids this critical failure point. It allows visual data—images, screenshots, or even whole PDF pages—to be passed directly as parameters to a tool.

This means the agent can visually comprehend results returned by tools, such as statistical charts, rendered web screenshots, or retrieved product images, and ingest those visual outputs back into its reasoning chain without lossy text conversion. This capability effectively bridges the gap between the model’s visual perception and its executable action, creating a unified foundation for complex, real-world multimodal agents. This integrated design directly addresses a key failure point of earlier multimodal systems.
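As a sketch of the difference, consider how a tool-call payload might look when visual data is passed natively versus flattened to text first. The message shape and the `analyze_chart` tool name below are hypothetical, not Z.ai’s actual API schema:

```python
import json

# Native multimodal call: the image travels as a structured parameter
# that the tool (and the model, when results come back) can inspect
# pixel-for-pixel.
native_call = {
    "name": "analyze_chart",
    "arguments": {
        "question": "What is the Q3 trend?",
        "image": {"type": "image_url", "url": "file://dashboard.png"},
    },
}

# Text-mediated call: the image is flattened to a caption first,
# discarding layout, colors, and exact values before the tool sees it.
lossy_call = {
    "name": "analyze_chart",
    "arguments": {
        "question": "What is the Q3 trend?",
        "image_description": "a bar chart that looks roughly increasing",
    },
}

print(json.dumps(native_call["arguments"]["image"]))
```

In the lossy variant, any number the caption omits is gone for good; in the native variant, the downstream tool and the model’s next reasoning step both still have the full image to work from.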

This enables extremely powerful use cases. For example, in Design2Code tasks, the model can convert a UI screenshot directly into executable frontend code. It also excels at long-document understanding by interpreting pages as images, processing the layout, tables, charts, and text jointly, which eliminates the need for cumbersome and error-prone preprocessing steps like Optical Character Recognition. This feature is particularly valuable for businesses dealing with large amounts of mixed-media documents, allowing for complex tasks like visual web searching and interleaved image-text generation.

Comparing GLM with DeepSeek, ChatGPT, and Gemini

The arrival of the GLM-4 series has reshaped the landscape, creating a competitive environment where open access and cost efficiency are as important as peak performance. We can compare these models against the established global leaders: DeepSeek, OpenAI’s ChatGPT (representing the GPT series), and Google Gemini.

Performance, Reasoning Power, and the Coding Battle

In terms of raw, high-stakes reasoning performance, the global proprietary models often maintain a slight edge. Google’s Gemini 3 Pro, for example, is highly regarded for complex analytical tasks and benchmark performance. Comparative data shows Gemini 3 Pro outperforming GLM-4.5 significantly in pure reasoning metrics like GPQA (91.9 percent versus 79.1 percent) and coding performance on SWE-Bench Verified (76.2 percent versus 64.2 percent). OpenAI’s ChatGPT, currently represented by models like GPT-5.1, remains the industry’s most polished all-around choice, boasting the largest user base and ecosystem.

However, the GLM models and DeepSeek are formidable challengers in specialized domains like agentic workflow and coding. GLM-4.6, with its dedicated tuning, is extremely competitive. Its coding scores are nearly on par with advanced proprietary models like Claude Sonnet 4 in places, though Claude Sonnet 4.5 still leads for the hardest coding problems. The latest DeepSeek V4 is also known for achieving state-of-the-art results in mathematical and coding tasks. The key difference here is the strategic tradeoff: Z.ai and DeepSeek sacrifice a tiny margin in peak benchmark performance to achieve monumental gains in efficiency and cost. They are not striving to be the absolute best model, but the best value or the best agent model for the vast majority of commercial applications.

Technology Stack, Cost Dynamics, and Openness

The architectural choices made by the Chinese leaders are defining the new market dynamics. Both GLM-4.5 and DeepSeek V4 rely heavily on the Mixture of Experts (MoE) architecture. DeepSeek V4 is currently the largest open MoE model, with an estimated total parameter count of approximately 1 trillion. Despite this enormous size, it only activates around 32 billion parameters per token, aligning perfectly with GLM-4.5’s strategy of leveraging large knowledge capacity while maintaining a manageable, affordable operational cost. This rise of open-source MoE giants fundamentally redefines what “frontier” AI means, shifting the focus to the most efficient active parameter count deployed under a permissive license. Traditional proprietary models often used dense, fully activated architectures, which are inherently more costly to run.

This efficiency directly translates into the defining characteristic of the new competition: the cost chasm. Proprietary models like Gemini 3 Pro and ChatGPT are expensive, often costing dollars per million tokens. For example, Gemini 3 Pro costs five times more than GLM-4.5 for input processing and 7.5 times more for output processing. DeepSeek V3 is already very affordable, but GLM-4.5 goes even further, pushing pricing expectations down to pennies.

Furthermore, the openness strategy is vital. Both GLM-4.5 and DeepSeek V4 are released under permissive open-source licenses, inviting broad use and experimentation. This is in sharp contrast to OpenAI and Google, whose models are closed source and proprietary. The commitment to open source democratizes the technology, allowing developers to fine-tune GLM-4.6 on their proprietary data, empowering local innovation and avoiding vendor lock-in.

Innovation Speed, Hype, and Current Limitations

Z.ai is using speed as its primary strategic tool. The company’s rapid iteration—releasing GLM-4.5 and then swiftly following up with the enhanced 4.6 and multimodal 4.6V within months—demonstrates an aggressive development cadence that is faster than most global competitors. This velocity is necessary for Z.ai to compensate for the incumbents’ decades-long advantage in market presence and ecosystem maturity.

While ChatGPT and Gemini maintain the largest user base and ecosystem maturity, the hype surrounding the GLM models is focused on economic disruption. It is generating significant excitement in developer communities because it is a “coding monster that is free”.

Z.ai’s primary limitations are rooted in ecosystem maturity and geopolitical reality. Although Z.ai cleverly engineered its models to run efficiently on export-compliant hardware, the company faces direct restrictions from US authorities. The ecosystem supporting GLM models outside of China is still weaker than the massive third-party integration infrastructure built around the OpenAI and Google APIs. This means that while the core model performance is world-class, developers will still find fewer established tools and integrations for GLM than for the Western proprietary platforms. The aggressive innovation speed is Z.ai’s way of ensuring that these models quickly become too powerful and too cost-effective to ignore, even with these limitations in place.

Frequently Asked Questions

What is the main focus of China’s latest AI model, GLM-4.5?

The primary focus of China’s new AI model GLM-4.5 is unifying agentic capabilities, complex reasoning, and coding into a single, highly efficient foundation model. It is designed specifically for intelligent agents that need to break down problems and use tools systematically.

Is GLM-4.5 truly open source and free for commercial use?

Yes, GLM-4.5 is released under the MIT open-source license. This license permits self-hosting, customization, and unrestricted commercial use, making it an extremely popular choice for businesses looking to avoid subscription costs and vendor lock-in.

How does GLM-4.5’s pricing compare to models like DeepSeek or ChatGPT?

GLM-4.5 is one of the most affordable frontier models available. Its API input token price is significantly lower than rivals like DeepSeek R1 and dramatically cheaper than proprietary models like Google Gemini 3 Pro, costing pennies per million tokens where others cost dollars.

What is the primary difference between GLM-4.5 and GLM-4.6?

GLM-4.6 is a direct upgrade focused on coding and context capacity. It features a significantly improved context window, expanded to 200 thousand tokens from 4.5’s 128 thousand. It also demonstrates superior coding performance on critical benchmarks.

What does “Agentic AI” mean in the context of Z.ai’s models?

In Z.ai’s models, Agentic AI refers to the model’s ability to operate in a “thinking mode”. This allows the model to decompose complex requests into smaller steps, select and use external tools, and present a structured output, mimicking human problem-solving processes.

What is GLM-4.6V’s key technological innovation?

The key innovation of GLM-4.6V is its native multimodal function calling. This allows the model to pass visual data like images or charts directly to tools, and ingest visual results back into its reasoning chain without losing important information through conversion to text.

How is Z.ai dealing with US export controls and restrictions?

Z.ai has engineered its models for extreme hardware efficiency. GLM-4.5 is optimized to run effectively on export-compliant Nvidia H20 chips, minimizing its hardware operational footprint and turning geopolitical constraints into an advantage in cost and efficiency.

Conclusion

The release of the GLM-4 series, spearheaded by the massively disruptive GLM-4.5, confirms that the pace of AI innovation coming out of China is incredibly high and strategically focused on accessibility and efficiency. Z.ai has managed to deliver powerful, frontier-level AI capabilities particularly in complex agentic workflows and coding at a fraction of the cost of legacy systems. The company’s successful use of the Mixture of Experts architecture and its strategic commitment to open-source licensing have democratized advanced AI, shifting the market focus from pure parameter count to performance-per-dollar. The rapid follow-up with GLM-4.6 and the multimodal GLM-4.6V shows that Z.ai is not resting on its achievements, but is actively pushing the boundaries of what open and efficient AI can achieve globally.

It is truly impressive to watch Z.ai, DeepSeek, and other rising stars challenge the established order with such speed and strategic intelligence. The entire AI community benefits when companies like these introduce such competitive pricing and powerful, open-source technology. The work being done by innovators at Z.ai, along with industry leaders such as OpenAI, Google, and DeepSeek, is driving remarkable progress, ensuring that sophisticated AI tools are not just for the elite few but are becoming accessible to everyone who wants to build the future. They are all doing a great job pushing technology forward.

Aditya Gupta