Didi Autonomous Driving's Three-Pillar Strategy: AI, Hardware, and Scenes Define the Path to Scale

April 11, 2026, 18:34 AINews

Didi Autonomous Driving has set out a long-term strategy built around three interdependent capabilities: artificial intelligence, hardware systems, and a deep understanding of operating scenes. It signals a significant industry shift from isolated technology benchmarks to the integrated systems engineering needed to scale.

Didi Autonomous Driving has publicly articulated a strategic framework that positions it for the next phase of the autonomous vehicle industry. The company is moving beyond the pursuit of peak performance in individual modules like perception or planning. Instead, it is focusing on the synergistic integration of three core pillars: advanced AI models capable of complex reasoning, a purpose-built hardware stack for reliability and cost control, and a granular, operational mastery of specific driving domains or 'scenes.' This tripartite approach is fundamentally about building a service, not just a vehicle. The strategy leverages Didi's unique asset: a massive, real-time dataset from its core ride-hailing network, which provides an unparalleled training ground for AI systems to learn the nuanced, social, and often implicit rules of urban mobility. By concentrating initially on mastered scenes—such as airport routes, late-night operations, or dense urban corridors—the company aims to achieve operational reliability and positive unit economics in bounded environments before expanding. This reflects a broader industry maturation, where the challenge is no longer proving a car can drive itself, but proving it can do so safely, consistently, and profitably as part of a large-scale transportation network. The ultimate product is not an autonomous vehicle, but a new urban mobility system.

Technical Deep Dive

Didi's three-pillar strategy represents an engineering philosophy in which each component is designed to reinforce the others. The AI pillar is evolving from traditional modular pipelines (perception → prediction → planning) toward end-to-end neural architectures and, more critically, world models. A world model is an AI system's internal simulation of its environment, allowing it to predict future states and reason about the consequences of actions without having to experience them directly. For autonomy, this means moving from reactive responses to proactive, socially aware driving. Didi's research team has published work on multimodal perception and prediction models that ingest LiDAR, camera, and radar data to understand long-tail scenarios. A key technical repository is `DIDI-Research/Scene-Graph-Prediction`, a GitHub project that uses dynamic scene graphs for long-horizon trajectory forecasting of multiple agents and has drawn attention for its approach to encoding complex urban interactions.
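To make the world-model idea concrete, here is a minimal sketch. It is not Didi's method: a hand-written constant-velocity predictor stands in for a learned model, and the planner chooses an acceleration by rolling each candidate forward and scoring the imagined outcome. The names `State`, `predict`, `rollout_cost`, and `choose_action`, along with every threshold and weight, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    gap_m: float       # distance to the lead vehicle, in metres
    ego_speed: float   # our speed, in m/s
    lead_speed: float  # lead vehicle speed, in m/s


def predict(state: State, accel: float, dt: float = 0.5) -> State:
    """Toy stand-in for a learned world model.

    Assumes the lead vehicle holds its speed and the ego vehicle
    applies a constant acceleration for one time step.
    """
    ego_speed = max(0.0, state.ego_speed + accel * dt)
    gap = state.gap_m + (state.lead_speed - ego_speed) * dt
    return State(gap, ego_speed, state.lead_speed)


def rollout_cost(state: State, accel: float, steps: int = 6,
                 min_gap: float = 10.0, target_speed: float = 15.0) -> float:
    """Imagine `steps` future states and score them (lower is better)."""
    cost = 0.0
    for _ in range(steps):
        state = predict(state, accel)
        if state.gap_m < min_gap:
            # Heavy penalty for following too closely (illustrative weight).
            cost += 100.0 * (min_gap - state.gap_m)
        # Mild penalty for drifting away from the desired cruising speed.
        cost += abs(target_speed - state.ego_speed)
    return cost


def choose_action(state: State,
                  candidates: tuple = (-3.0, -1.0, 0.0, 1.0)) -> float:
    """Pick the acceleration whose imagined rollout has the lowest cost."""
    return min(candidates, key=lambda a: rollout_cost(state, a))


if __name__ == "__main__":
    closing_in = State(gap_m=12.0, ego_speed=15.0, lead_speed=10.0)
    print(choose_action(closing_in))
```

The point of the sketch is the structure: the planner never tries an action in the real world to learn that it is unsafe; it evaluates consequences inside the model first. A production system would replace `predict` with a learned, multi-agent model and the grid of candidates with a proper planner.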

The hardware pillar is about creating a deterministic physical platform for this intelligence. This involves deep vertical integration, from sensor selection and placement to the design of the central computing unit. Didi is developing its own DiDi Auto Brain computing platform, which likely employs a heterogeneous architecture combining high-performance GPUs for AI inference with specialized ASICs or FPGAs for sensor fusion and deterministic control tasks. The goal is to meet automotive functional-safety targets (up to ASIL D, the most stringent level defined in ISO 26262) while managing thermal, power, and cost constraints. Sensor suites are being optimized for specific operational design domains (ODDs); for example, a robotaxi destined for complex urban intersections may carry a higher-fidelity LiDAR than one operating on a simple geofenced highway route.

| Capability | Traditional Modular Approach | Didi's Integrated Pillar Approach |
|---|---|---|
| AI Core | Separate models for detection, tracking, prediction. Planning is rule-based or optimization-driven. | Movement towards end-to-end neural motion planning and world models for causal reasoning. |
| Data Utilization | Offline training on curated datasets. Simulation for edge cases. | Continuous online learning from live ride-hailing fleet, creating a real-world data flywheel. |
| Hardware Focus | Commodity sensors + powerful generic compute (NVIDIA Drive). | Custom sensor calibration/fusion + domain-specific compute (DiDi Auto Brain) for efficiency. |
| Validation Metric | Disengagement rates, miles per intervention. | Service-level metrics: uptime, passenger satisfaction, cost per ride, safety per billion miles. |

Data Takeaway: The table illustrates a paradigm shift from evaluating standalone technical competencies to optimizing for integrated system performance and commercial service metrics. Didi's strategy explicitly ties technical choices to business outcomes.
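The difference between the two validation styles in the last table row can be sketched in a few lines. This is an illustrative calculation only: the `FleetDay` record, its field names, and the sample numbers are hypothetical and do not describe any real fleet.

```python
from dataclasses import dataclass


@dataclass
class FleetDay:
    miles: float            # autonomous miles driven
    interventions: int      # human takeovers
    rides: int              # completed passenger rides
    operating_cost: float   # total cost for the day, any one currency
    hours_available: float  # hours vehicles were ready to serve
    hours_scheduled: float  # hours vehicles were supposed to be ready


def miles_per_intervention(d: FleetDay) -> float:
    """Classic technology metric; infinite if there were no takeovers."""
    return d.miles / d.interventions if d.interventions else float("inf")


def cost_per_ride(d: FleetDay) -> float:
    """Service metric: what one completed ride cost to deliver."""
    return d.operating_cost / d.rides


def uptime(d: FleetDay) -> float:
    """Service metric: share of scheduled hours actually available."""
    return d.hours_available / d.hours_scheduled


if __name__ == "__main__":
    day = FleetDay(miles=1200, interventions=3, rides=150,
                   operating_cost=900.0, hours_available=22, hours_scheduled=24)
    print(miles_per_intervention(day), cost_per_ride(day), round(uptime(day), 3))
```

A fleet can look strong on the first metric and still fail as a service if rides are too expensive or vehicles are often unavailable, which is why the integrated approach reports both kinds of numbers.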

Key Players & Case Studies

The autonomous driving landscape is bifurcating into two primary camps: the full-stack service integrators like Didi, Waymo, and Cruise, and the technology suppliers like Mobileye, NVIDIA, and Baidu Apollo (in its supplier mode). Didi's strategy places it firmly in the first camp, competing directly with Waymo's decade-long focus on building an integrated rider service and Cruise's push for dense urban deployment.

Waymo has demonstrated the gold standard in pure technology, with over 20 million autonomous miles driven. However, its scaling has been methodical and capital-intensive, relying heavily on a meticulously mapped and simulated world. Cruise aggressively pursued dense urban complexity in San Francisco but faced severe regulatory and operational setbacks, highlighting the perils of scaling technology faster than operational safety and public trust can be established. Didi's differentiating factor is its embedded market position. Unlike Waymo or Cruise, which must build a user base from scratch, Didi can theoretically plug autonomous vehicles into its existing app with hundreds of millions of users, managing demand and supply through its sophisticated routing and matching algorithms from day one.

In China, competitors include Baidu Apollo, which operates both a robotaxi service (Apollo Go) and a supplier business, and Pony.ai, which focuses on robotaxis and trucking. AutoX is notable for its aggressive stance on removing safety drivers. Didi's scene-based approach is a direct counter to the "one-model-fits-all" challenge. For instance, by first mastering the Beijing Capital International Airport route—a high-demand, relatively structured corridor—Didi can deliver a reliable service quickly, generate revenue, and build public familiarity, all while collecting dense data on a specific ODD to further refine its models.

| Company | Core Strategy | Key Asset | Primary Challenge |
|---|---|---|---|
| Didi Autonomous | AI+Hardware+Scene integration; service scaling via existing network. | Live ride-hailing data & user base. | Achieving cost-effective hardware and validating safety at scale. |
| Waymo | Technology-first, safety-centric, gradual geographic expansion. | Unmatched autonomous mileage & simulation suite. | High cost structure and slower geographic/service scaling. |
| Cruise (GM) | Aggressive urban scale, manufacturing advantage. | GM manufacturing partnership & capital. | Rebuilding regulatory and public trust after safety incidents. |
| Baidu Apollo | Dual model: robotaxi service + full-stack tech supplier. | AI expertise & broad China partnerships. | Potential conflict between serving competitors and running its own service. |

Data Takeaway: The competitive matrix shows that while pure technology leadership is valuable, the battleground is shifting to commercialization assets. Didi's embedded ecosystem and scene-focused pragmatism give it a distinct path, though it trails in cumulative autonomous mileage compared to the U.S. leaders.

Industry Impact & Market Dynamics

Didi's strategy accelerates several key industry trends. First, it validates the "operational design domain (ODD) first" or scene-based approach as the most viable path to early commercialization. This will pressure competitors to similarly narrow their initial service claims rather than promising universal autonomy. Second, it underscores the immense value of real-world data at scale. This creates a significant moat for companies with large fleets, potentially marginalizing startups that rely solely on simulation and small test fleets.

The business model is evolving from selling self-driving systems to selling transportation-as-a-service (TaaS), which makes unit economics paramount. Didi's ability to mix autonomous and human-driven vehicles within the same app allows it to optimize for cost and service quality dynamically. During peak demand in a mastered scene, autonomous vehicles can be dispatched; requests that cannot be matched to an autonomous vehicle, because they fall outside a mastered scene or none is available, go to human drivers. This hybrid model is likely to remain the dominant form for the rest of the decade.
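The hybrid routing rule described here can be sketched as a small decision function. This is a simplified illustration, not Didi's dispatch logic: the zone names, the `MASTERED_SCENES` table, and the `dispatch` function are all hypothetical, and a real matcher would also weigh price, wait time, and vehicle position.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Request:
    pickup_zone: str
    dropoff_zone: str


# Hypothetical mastered scenes: each maps to the zones the AV fleet may serve.
MASTERED_SCENES = {
    "airport_corridor": {"airport", "cbd"},
    "night_loop": {"cbd", "university"},
}


def scene_for(req: Request) -> Optional[str]:
    """Return the first scene that fully contains the trip, if any."""
    for name, zones in MASTERED_SCENES.items():
        if req.pickup_zone in zones and req.dropoff_zone in zones:
            return name
    return None


def dispatch(req: Request, idle_avs: dict) -> str:
    """Send the trip to an AV only if it lies inside a mastered scene
    and that scene has an idle vehicle; otherwise fall back to a human."""
    scene = scene_for(req)
    if scene is not None and idle_avs.get(scene, 0) > 0:
        return f"autonomous:{scene}"
    return "human"


if __name__ == "__main__":
    idle = {"airport_corridor": 2, "night_loop": 0}
    print(dispatch(Request("airport", "cbd"), idle))
    print(dispatch(Request("airport", "suburb"), idle))
```

Because the fallback is always a human driver, the rider's request is served either way; the autonomous fleet only has to be reliable inside the scenes it has been assigned.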

Market projections reflect this shift. While the total addressable market for autonomous driving technology is vast, the near-term revenue is concentrating on robotaxis and dedicated autonomous trucking lanes.

| Market Segment | 2025 Projected Size (USD) | 2030 Projected Size (USD) | CAGR (2025-2030) | Primary Driver |
|---|---|---|---|---|
| Robotaxi Services | $5-7 Billion | $80-120 Billion | ~74-77% | Regulatory approval in key cities & fleet scaling. |
| Autonomous Trucking (Hub-to-Hub) | $3-4 Billion | $50-70 Billion | ~76-77% | Driver shortage & logistics cost pressure. |
| Passenger Vehicle ADAS/AV Tech Sales | $40 Billion | $90 Billion | ~18% | Consumer adoption of L2+/L3 systems. |
| Data & Simulation Services | $2 Billion | $15 Billion | ~50% | Demand for synthetic training data and validation. |

Data Takeaway: The growth rates indicate that commercial *services* (robotaxis, trucking) will outpace technology *sales* to consumers, justifying Didi's service-centric strategy. The data services segment is a dark horse, where Didi could potentially monetize its expertise and tools.
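Growth rates like these are easy to sanity-check from the endpoint figures. The sketch below computes the compound annual growth rate implied by the 2025 and 2030 market sizes in the table, pairing the low 2025 estimate with the low 2030 estimate and the high with the high. The `SEGMENTS` layout and function names are illustrative choices, and single-point estimates are entered as a range whose two ends are equal.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.18 means 18% a year)."""
    return (end / start) ** (1 / years) - 1


# ((2025 low, 2025 high), (2030 low, 2030 high)) in USD billions,
# copied from the market table.
SEGMENTS = {
    "Robotaxi Services": ((5, 7), (80, 120)),
    "Autonomous Trucking (Hub-to-Hub)": ((3, 4), (50, 70)),
    "Passenger Vehicle ADAS/AV Tech Sales": ((40, 40), (90, 90)),
    "Data & Simulation Services": ((2, 2), (15, 15)),
}


def implied_range(segment: str, years: int = 5) -> tuple:
    """CAGR for the low-to-low and high-to-high endpoint pairings."""
    (low_2025, high_2025), (low_2030, high_2030) = SEGMENTS[segment]
    return cagr(low_2025, low_2030, years), cagr(high_2025, high_2030, years)


if __name__ == "__main__":
    for name in SEGMENTS:
        low, high = implied_range(name)
        print(f"{name}: {low:.1%} to {high:.1%}")
```

Other pairings, such as the low 2025 figure against the high 2030 figure, give wider bounds, so a quoted growth rate is only meaningful alongside the endpoints it was derived from.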

Risks, Limitations & Open Questions

Technical & Operational Risks: The world model approach, while promising, is largely unproven at the reliability levels required for safety-critical applications. "Black box" neural planning systems are difficult to certify and debug. Furthermore, mastering discrete scenes creates a patchwork autonomy problem: seamlessly transitioning between different mastered ODDs, or handling the edges between them, remains a formidable challenge. The hardware pillar carries significant R&D and supply chain risks; achieving cost targets while ensuring robustness is a persistent hurdle.

Regulatory & Social Risks: China's regulatory environment for autonomous vehicles, while supportive, is evolving and can change rapidly. Public acceptance cannot be assumed; a single high-profile incident in a Didi AV could damage trust not just in the autonomous unit, but in the core ride-hailing business. The ethical framework for decision-making in edge cases—and who is liable—is still being defined.

Business Model Risks: The capital intensity is staggering. While Didi has spun off the unit to raise external capital, the path to profitability is long. The hybrid human/AI fleet model, while pragmatic, creates internal tension: successful autonomy ultimately displaces human drivers, a core part of Didi's current ecosystem and social contract. Managing this transition is as much a political challenge as a business one.

Open Questions: Can Didi's AI models achieve a level of generalization that reduces the need for scene-by-scene engineering? Will the cost of its custom hardware stack fall fast enough to beat competitors using commoditized NVIDIA solutions? How will it navigate the data privacy and security concerns inherent in using real passenger trip data for model training?

AINews Verdict & Predictions

Didi Autonomous Driving's three-pillar strategy is the most coherent and commercially astute plan yet articulated by a major player in the robotaxi space. It correctly identifies that winning requires excellence not in one discipline, but in the orchestration of AI, hardware, and operational savvy. Its embedded position within a live mobility network is an advantage that pure-play AV companies cannot easily replicate.

AINews predicts:
1. By 2026, Didi will launch the first commercially viable, driver-out robotaxi service in at least two Chinese megacities, but it will be strictly limited to 3-5 pre-mastered high-demand corridors (e.g., airport to central business district). Profitability on these specific routes will be claimed, though overall unit economics will remain negative due to R&D overhead.
2. The "scene" will become the primary unit of competition. We will see a land grab as companies like Didi, Baidu, and Pony.ai publicly commit to mastering and deploying in specific, lucrative ODDs, moving the industry away from vague geographic expansion promises.
3. Didi's data advantage will crystallize into a platform play. By 2027, facing continued capital needs, Didi will begin licensing its scene-specific AI models or simulation tools to other OEMs or mobility operators in regions where it does not plan to compete directly, creating a new revenue stream.
4. The biggest stumbling block will not be AI, but hardware cost and reliability. The first pillar to show significant strain will be hardware, as supply chain issues and the relentless pressure to reduce sensor costs collide with safety requirements. Didi may be forced into a strategic partnership with a major automotive OEM or tier-1 supplier to harden its hardware stack.

The verdict is that Didi has mapped a credible path through the most difficult terrain in technology today. While execution risks are enormous, its strategy acknowledges the true complexity of the problem: building a sustainable business, not just a brilliant car. Watch for its next major milestone not in disengagement rates, but in the quarterly frequency and passenger ratings of fully driverless rides within a declared scene. That is the metric that will separate the contenders from the pretenders.
