China's AI Leaders Shift Focus from Benchmarks to Business: The Great Pivot to Agents and World Models

A closed-door discussion among China's foremost large language model developers has surfaced a critical consensus: the industry's primary challenge is no longer building bigger models, but making them useful, reliable, and economically viable. Chaired by Yang Zhilin, founder of Moonshot AI, the roundtable included influential figures such as Zhang Peng and Luo Fuli, representing a cross-section of the nation's top AI talent. The conversation revealed a unified pivot toward three concrete frontiers: the engineering of robust AI agents capable of executing complex, multi-step tasks in real-world environments; the foundational research into "world models" that enable AI to develop a coherent, persistent understanding of physical and social dynamics; and the urgent, collective pursuit of sustainable business models beyond venture capital subsidies. This represents a maturation of the Chinese AI ecosystem, moving from a phase of technological demonstration to one of deep integration and value creation. The implications are significant: companies that fail to navigate this transition from lab-grade models to production-grade services face obsolescence, while those that master the integration of technical depth, product design, and commercial logic will define the next era of AI-powered productivity.

Technical Deep Dive

The pivot discussed is not merely philosophical; it demands concrete technical evolution across three axes: Agent Architecture, World Model Foundations, and Inference Economics.

Agent Architecture: Moving from a single LLM call to a persistent, tool-using agent requires a fundamental shift in system design. The core challenge is reliability in long-horizon tasks. Current approaches involve sophisticated orchestration frameworks that manage planning, tool execution, memory, and self-correction. Key architectural patterns include:
- ReAct (Reasoning + Acting): Interleaving reasoning traces with actionable steps.
- Reflexion: Equipping agents with a self-critique and memory loop to learn from past failures.
- Hierarchical Task Decomposition: Breaking down complex user requests into manageable sub-tasks executed by specialized sub-agents or tools.
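
To make the ReAct pattern above concrete, here is a minimal sketch of the reason-act-observe loop. It is an illustration under stated assumptions, not any framework's actual API: the `TOOLS` registry, the `scripted_policy` function (a stand-in for a real LLM call), and the `react_loop` name are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool registry; the tool names and behaviors are illustrative only.
TOOLS: Dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(float(x) for x in arg.split(","))),
    "upper": lambda arg: arg.upper(),
}

# Each history entry is (thought, action taken, observation returned by the tool).
Step = Tuple[str, str, str]


def scripted_policy(question: str, history: List[Step]) -> Tuple[str, str, str]:
    """Stand-in for an LLM call: returns (thought, action, argument)."""
    if not history:
        return ("I need to add the numbers first.", "add", "2,3")
    last_observation = history[-1][2]
    return ("I have the sum, so I can answer.", "finish", last_observation)


def react_loop(question: str, policy, max_steps: int = 5) -> Tuple[str, List[Step]]:
    """Interleave reasoning and tool calls until the policy finishes or the budget runs out."""
    history: List[Step] = []
    for _ in range(max_steps):
        thought, action, arg = policy(question, history)
        if action == "finish":
            return arg, history
        tool = TOOLS.get(action)
        # Unknown tools produce an observation instead of crashing, so the policy can recover.
        observation = tool(arg) if tool else f"unknown tool: {action}"
        history.append((thought, f"{action}({arg})", observation))
    return "gave up: step budget exhausted", history


answer, trace = react_loop("What is 2 + 3?", scripted_policy)
```

The step budget (`max_steps`) is the simplest guard against the long-horizon failure mode described above. A Reflexion-style agent would additionally feed a critique of the failed `trace` back into the next attempt.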

The open-source ecosystem is critical here. Projects like `LangChain` and `LlamaIndex` provide foundational frameworks for chaining LLM calls. However, for production-grade agents, more robust systems are emerging. Microsoft's `AutoGen` framework enables the creation of multi-agent conversations where different agents (e.g., a planner, a coder, a critic) collaborate. A notable Chinese-led project is `DB-GPT`, an open-source effort that integrates LLMs with databases and tools to create domain-specific agents, recently surpassing 20k stars on GitHub. Its evolution mirrors the industry's focus: moving from a simple Q&A interface to a full-featured agent platform with RAG, plugin support, and multi-agent orchestration.

World Model Exploration: This is the most ambitious technical frontier. A "world model" in the AI context refers to an internal representation that allows an AI to predict the outcomes of actions, understand object permanence, and reason about cause and effect—capabilities innate to humans but absent in today's LLMs. Researchers like Luo Fuli are exploring pathways that blend LLMs with other paradigms:
1. Neuro-Symbolic Integration: Combining neural networks (for pattern recognition) with symbolic AI (for logical reasoning and explicit knowledge representation).
2. Video Foundation Models: Training on massive video datasets, in the vein of video-capable models such as LLaVA-NeXT and VideoPoet, to instill intuitive physics and temporal understanding.
3. Embodied AI Simulation: Using platforms like NVIDIA's Isaac Sim or Meta's Habitat to train AI in simulated 3D environments, a crucial step toward physical world understanding.

The technical hurdle is creating a model that can update its internal state consistently based on new observations, a problem known as state estimation. Current LLMs are stateless by default; each prompt starts from scratch. Building a persistent, updatable world model is a prerequisite for agents that operate over extended periods.
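
As a rough illustration of what "updating internal state consistently" means, the sketch below uses a classical discrete Bayes filter. This is a textbook state-estimation technique chosen for brevity, not the approach of any researcher or lab named in this article; the door states and sensor probabilities are invented for the example.

```python
from typing import Dict


def bayes_update(belief: Dict[str, float], likelihood: Dict[str, float]) -> Dict[str, float]:
    """Fold one observation into a belief over discrete world states."""
    # Weight each state's prior probability by how well it explains the observation.
    unnormalized = {s: belief[s] * likelihood.get(s, 0.0) for s in belief}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation is impossible under the current belief")
    # Renormalize so the belief remains a probability distribution.
    return {s: p / total for s, p in unnormalized.items()}


# The belief persists across observations instead of starting from scratch each time.
belief = {"door_open": 0.5, "door_closed": 0.5}

# Hypothetical sensor: reports "open" 80% of the time when the door is open,
# and 10% of the time when it is closed.
saw_open = {"door_open": 0.8, "door_closed": 0.1}

for _ in range(2):
    belief = bayes_update(belief, saw_open)
```

The contrast with a stateless LLM call is that `belief` carries forward between observations. A learned world model would replace the hand-written likelihood table with a trained predictor, which is where the open research lies.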

Inference Economics & Optimization: The commercial imperative demands drastic cost reduction. This fuels innovation in:
- Mixture of Experts (MoE): Models like Moonshot AI's Kimi and DeepSeek's models employ MoE architectures, where only a subset of neural network "experts" is activated per token, reducing inference compute costs by 2-4x while maintaining model capacity.
- Quantization & Speculative Decoding: Techniques like GPTQ, AWQ, and speculative decoding (using a small, fast "draft" model to propose tokens verified by the larger model) are essential for deploying billion-parameter models on affordable hardware.
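
The draft-and-verify idea behind speculative decoding can be sketched in its simplest greedy form. The toy `target` and `draft` functions below are hypothetical stand-ins for real models, and a production system would verify all proposed positions in one batched forward pass of the large model rather than in a Python loop.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]


def speculative_step(target: NextToken, draft: NextToken,
                     prefix: List[int], k: int = 4) -> List[int]:
    """One round of greedy speculative decoding; returns the newly accepted tokens."""
    # 1. The cheap draft model proposes k tokens autoregressively.
    proposed: List[int] = []
    ctx = list(prefix)
    for _ in range(k):
        token = draft(ctx)
        proposed.append(token)
        ctx.append(token)

    # 2. The target model checks each proposed position in order.
    accepted: List[int] = []
    ctx = list(prefix)
    for token in proposed:
        expected = target(ctx)
        if expected != token:
            # First disagreement: keep the target's token and discard the rest of the draft.
            accepted.append(expected)
            return accepted
        accepted.append(token)
        ctx.append(token)

    # 3. Every draft token matched, so the target contributes one bonus token.
    accepted.append(target(ctx))
    return accepted


# Toy models over digits 0-9 (assumes a non-empty prefix):
# the target counts upward, and the draft agrees except right after a 3.
def target(ctx: List[int]) -> int:
    return (ctx[-1] + 1) % 10


def draft(ctx: List[int]) -> int:
    return 0 if ctx[-1] == 3 else (ctx[-1] + 1) % 10


new_tokens = speculative_step(target, draft, [0], k=4)
```

In this greedy variant the output is identical to what the target model would have produced alone, so quality is preserved. The speedup depends entirely on how often the draft agrees with the target, which is the alignment requirement noted in the table below.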

| Optimization Technique | Typical Latency Reduction | Typical Cost Reduction | Key Trade-off |
|---|---|---|---|
| 4-bit Quantization (GPTQ) | 20-30% | 60-75% | Minor accuracy loss on complex reasoning |
| Speculative Decoding | 2-3x (for suitable drafts) | ~60% | Requires well-aligned draft model |
| Mixture of Experts (Inference) | Similar to dense | 60-70% | Higher memory bandwidth usage |
| Model Distillation | 2-10x | 70-90% | Significant capability loss vs. original model |

Data Takeaway: No single optimization is a silver bullet. Production deployments will stack multiple techniques, such as quantized MoE models with speculative decoding, to achieve the inference cost of under $0.10 per million tokens required for mass-market agent applications.
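
A back-of-the-envelope calculation shows why stacking matters. The sketch below assumes, optimistically, that each technique's saving is independent and multiplies through. The $1.00 baseline is a hypothetical figure, and the 60% reductions are the low ends of the ranges in the table above.

```python
from typing import List


def stacked_cost(base_cost: float, reductions: List[float]) -> float:
    """Cost remaining after applying each fractional reduction in turn.

    Assumes the reductions are independent, which overstates real-world savings
    because the techniques partly target the same compute and memory costs.
    """
    remaining = base_cost
    for r in reductions:
        if not 0.0 <= r < 1.0:
            raise ValueError("each reduction must be a fraction in [0, 1)")
        remaining *= 1.0 - r
    return remaining


# Hypothetical baseline of $1.00 per million tokens; quantization, MoE, and
# speculative decoding each taken at a 60% cost reduction.
cost = stacked_cost(1.00, [0.60, 0.60, 0.60])
```

Under these assumptions the stacked cost comes to about $0.064 per million tokens, while any single 60% reduction leaves it at $0.40. Real savings will be smaller than the product suggests, so the result is best read as an upper bound on what stacking can deliver.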

Key Players & Case Studies

The roundtable participants represent distinct strands of the new pragmatic approach.

Yang Zhilin (Moonshot AI): His company's Kimi Chat is a case study in the agent-first pivot. Initially notable for its long context window (now exceeding 1 million tokens), Moonshot has aggressively marketed Kimi's ability to handle complex, multi-file tasks—a direct agent-like capability. Their strategy appears to be owning the "heavy-lifting" agent for knowledge workers, integrating deeply with documents, spreadsheets, and web search.

Luo Fuli & The Research Vanguard: Representing the academic and long-term research wing, her work underscores the investment in world models as the next paradigm. Although they trail commercially, institutions like Shanghai AI Laboratory and Tsinghua University are laying the groundwork for the next leap. Their focus is not on next quarter's revenue but on the foundational models that will power agents 5-10 years from now.

Zhang Peng & The Enterprise Integrators: Figures from companies like Zhipu AI or Baidu's AI Cloud represent the B2B engine of the pivot. Their playbook involves embedding LLM capabilities into existing enterprise software suites (ERP, CRM) and cloud services. The agent here is less a standalone chatbot and more an automated workflow within a business process. For them, the key is fine-tuning, secure deployment, and SLA guarantees.

| Company / Project | Primary Agent Focus | Key Product/Strategy | Commercial Stage |
|---|---|---|---|
| Moonshot AI (Kimi) | Long-context knowledge work agent | Super long context, file processing, web search | Consumer-facing, seeking premium subscriptions |
| Zhipu AI | Enterprise workflow automation | GLM series models, deep integration with enterprise software & cloud | B2B, API and solution sales |
| DeepSeek | Open-source & cost-effective agents | Open-sourcing powerful MoE models (DeepSeek-V2), low-cost API | Hybrid: open-source mindshare + low-cost API revenue |
| Alibaba Cloud | Industry-specific vertical agents | "Tongyi Qianwen" models tailored for finance, healthcare, logistics | B2B via cloud platform bundling |
| StepFun (阶跃星辰) | Social & creative agents | Focus on AI characters, emotional interaction, user-generated agents | Consumer-facing, virtual social platform |

Data Takeaway: The market is already stratifying. Moonshot and Stepfun target horizontal consumer/prosumer agents, while Zhipu and Alibaba dominate vertical enterprise integration. DeepSeek's open-source strategy uniquely pressures the cost structure of the entire industry.

Industry Impact & Market Dynamics

This collective pivot will trigger a brutal but necessary industry shakeout, reshaping competition, investment, and adoption.

From Capex to Opex Model: The initial phase was characterized by massive capital expenditure (Capex) on GPU clusters for training. The new phase is about operating expenditure (Opex)—reducing the cost per query to sustainable levels. This favors companies with:
1. Superior inference optimization engineering.
2. Access to affordable compute (e.g., via strategic partnerships with domestic chip makers like Cambricon or Biren).
3. High-margin use cases that can tolerate current costs while scaling down.

The Vertical Integration Imperative: The "pure-play" foundational model API company faces extreme pressure. Winners will be those who control the full stack: model, agent framework, and end-user application interface. We will see a wave of mergers and acquisitions as model companies acquire vertical SaaS companies to gain domain-specific data and distribution, and as application companies build or buy model teams to control their destiny.

Funding Winter for the Undifferentiated: Venture capital will flow away from teams promising "a better GPT-4" toward those with demonstrable paths to:
- Revenue per Agent Session: Clear metrics on how much value an agent creates per interaction (e.g., customer service cost savings, sales lead qualification).
- Agent Success Rate: Moving beyond token accuracy to measuring task completion fidelity in noisy, real-world settings.
- Vertical Domain Depth: Proprietary data pipelines and workflows for specific industries.
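
The first two metrics above can be computed from ordinary session logs. The sketch below is one plausible formulation, not an industry-standard definition. The `Session` fields and the dollar figures are hypothetical, and "revenue per agent session" is interpreted here as value created net of inference cost.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Session:
    task_completed: bool  # did the agent finish the user's task end to end?
    value_usd: float      # value attributed to the session (hypothetical attribution)
    cost_usd: float       # inference and tool cost of the session


def agent_success_rate(sessions: List[Session]) -> float:
    """Fraction of sessions in which the task was completed."""
    if not sessions:
        raise ValueError("no sessions to score")
    return sum(s.task_completed for s in sessions) / len(sessions)


def net_value_per_session(sessions: List[Session]) -> float:
    """Average value created per session after subtracting its cost."""
    if not sessions:
        raise ValueError("no sessions to score")
    return sum(s.value_usd - s.cost_usd for s in sessions) / len(sessions)


# Illustrative log: three completed sessions and one failure that still incurred cost.
log = [
    Session(True, 2.0, 0.5),
    Session(True, 2.0, 0.5),
    Session(False, 0.0, 0.5),
    Session(True, 2.0, 0.5),
]
success = agent_success_rate(log)
net_value = net_value_per_session(log)
```

Note that the failed session still counts against net value: it consumed inference budget without producing anything, which is why success rate and unit economics are tightly linked.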

| Market Segment | 2024 Estimated Size (China) | Growth Driver | Key Success Metric |
|---|---|---|---|
| Foundational Model APIs | $800M | Replacement of older ML models | Tokens consumed, API reliability |
| Enterprise AI Solutions (B2B Agents) | $2.5B | Workflow automation, data analysis | ROI, process efficiency gain (%) |
| Consumer AI Agents/Apps | $300M | Premium subscriptions, virtual services | DAU/MAU, subscription conversion |
| AI Agent Development Platforms | $200M | Democratization of agent creation | Number of deployed agents, platform fee |

Data Takeaway: The real money is in B2B enterprise solutions, not consumer chatbots. The foundational API market, while growing, will be commoditized and squeezed by open-source and cost optimization, making it a challenging standalone business. The platform play for building agents is a nascent but critical battleground.

Risks, Limitations & Open Questions

The pragmatic path is fraught with its own perils.

The Reliability Chasm: Today's LLMs, and by extension agents, are fundamentally stochastic. A 99% accuracy rate on a benchmark is meaningless if the 1% failure occurs in a critical business transaction or medical advice context. Achieving "five-nines" (99.999%) reliability required for mission-critical systems may be architecturally impossible with current autoregressive transformer-based agents. This could limit high-value applications to human-in-the-loop systems for the foreseeable future, capping the economic upside.
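
The gap becomes starker once errors compound across a multi-step task. The sketch below assumes each step fails independently and that the agent never recovers from a failed step. Both are simplifications, and the 50-step task length is an arbitrary example.

```python
def task_success(step_reliability: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent step failures."""
    return step_reliability ** steps


def required_step_reliability(target: float, steps: int) -> float:
    """Per-step reliability needed to hit a target end-to-end success rate."""
    return target ** (1.0 / steps)


# A 99%-reliable step repeated 50 times completes the whole task only about 60% of the time.
fifty_step = task_success(0.99, 50)

# Five-nines end to end over 50 steps needs roughly 99.99998% reliability per step.
needed = required_step_reliability(0.99999, 50)
```

Self-correction loops and human checkpoints break the independence assumption in the agent's favor, which is one reason human-in-the-loop designs remain the practical route to high end-to-end reliability.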

Data Flywheel Disadvantage: The shift to vertical, proprietary agents fragments the data pool. An agent fine-tuned on legal contracts generates data that is less useful for improving a healthcare diagnostic agent. This could slow the overall pace of general capability advancement in China compared to Western counterparts like OpenAI, which continues to aggregate diverse data through a unified, general-purpose interface like ChatGPT.

The World Model Mirage: The pursuit of world models is a high-risk, long-term bet. It may require architectural breakthroughs beyond the transformer. Significant capital and talent diverted to this exploratory research could weaken short-term competitiveness in applied agent deployment, creating a strategic vulnerability.

Regulatory & Sovereignty Tightrope: As agents become more autonomous and integrated into infrastructure, regulatory scrutiny will intensify. A single high-profile agent failure could trigger restrictive legislation. Furthermore, the push for using domestic hardware (for supply chain security) often means less efficient chips, putting Chinese companies at a persistent cost disadvantage against global peers using the latest NVIDIA GPUs.

AINews Verdict & Predictions

The roundtable consensus is correct and inevitable. The age of AI as a spectacle is over; the age of AI as a utility has begun. Our editorial judgment is that this pivot will succeed in creating substantial enterprise value but will also lead to significant consolidation within the Chinese AI sector within 18-24 months.

Specific Predictions:
1. By the end of 2025, at least two major Chinese LLM startups will merge or be acquired, not due to failure, but to combine model expertise with vertical distribution channels. The standalone model provider is an endangered species.
2. The dominant business model to emerge will be "AI Transformation as a Service"—a consulting-led, deeply integrated offering where companies like Zhipu or Alibaba Cloud not only provide the agent but also redesign the client's business process. Margins will be in the services, not the API calls.
3. Open-source models (like those from DeepSeek) will become the de facto base for 70% of new enterprise agent projects due to cost, customization, and data privacy concerns. Commercial model APIs will be relegated to niche, high-performance, or convenience-use cases.
4. World model research will see its first "GPT-3 moment", a convincing demo of physical reasoning from a Chinese lab, in 2026, but it will remain a research curiosity with limited commercial impact until the end of the decade.

What to Watch Next: Monitor the quarterly financials of listed entities like Baidu and Alibaba for breakout growth in their "Cloud & AI" segments, which is the leading indicator of enterprise adoption. Watch for the first major partnership between a top AI lab (e.g., Moonshot) and a traditional industry giant (e.g., a state-owned bank or manufacturer) to co-develop a vertically integrated agent. That will be the definitive signal that the pivot from the lab to the last mile is complete.
