Why US Companies Are Ditching Silicon Valley AI for China's DeepSeek

Over the past six months, a growing number of American companies, from mid-market SaaS firms to Fortune 500 logistics operators, have quietly migrated their AI inference workloads from providers like OpenAI and Anthropic to DeepSeek, a Chinese AI lab that has become the poster child for cost-efficient large language models. The shift is not about ideology; it is about arithmetic. DeepSeek's flagship model, DeepSeek-V3, scores within 0.2 points of GPT-4o on MMLU (88.5 vs. 88.7) while costing roughly one-eighteenth as much per million input tokens ($0.14 vs. $2.50). For enterprises deploying AI at scale (customer service chatbots, document summarization, code generation), the savings are transformative. A mid-sized e-commerce company processing 10 million inference calls daily can reduce its monthly AI bill from $150,000 to under $20,000 by switching to DeepSeek, with negligible quality degradation in most business tasks.
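The monthly-bill claim can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the call volume and the bills come from the paragraph above and the per-token prices from the comparison table later in the article, but the 200 input tokens per call is an assumption, chosen because it reproduces the $150,000 starting figure. Output tokens are ignored, and `monthly_bill` is a hypothetical helper.

```python
def monthly_bill(calls_per_day: int, tokens_per_call: int,
                 price_per_million: float, days: int = 30) -> float:
    """Return the monthly input-token cost in dollars."""
    total_tokens = calls_per_day * tokens_per_call * days
    return total_tokens / 1_000_000 * price_per_million


CALLS_PER_DAY = 10_000_000  # from the article
TOKENS_PER_CALL = 200       # ASSUMPTION: picked so the GPT-4o bill is $150,000
GPT4O_PRICE = 2.50          # $ per 1M input tokens, from the article's table
DEEPSEEK_PRICE = 0.14       # $ per 1M input tokens, from the article's table

gpt4o_bill = monthly_bill(CALLS_PER_DAY, TOKENS_PER_CALL, GPT4O_PRICE)
deepseek_bill = monthly_bill(CALLS_PER_DAY, TOKENS_PER_CALL, DEEPSEEK_PRICE)

print(f"GPT-4o:      ${gpt4o_bill:,.0f}/month")
print(f"DeepSeek-V3: ${deepseek_bill:,.0f}/month")
```

Under these assumptions the DeepSeek bill lands well below the $20,000 ceiling quoted above. Longer prompts or output-heavy workloads would scale both bills, so real savings depend on the actual token mix.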

This trend signals the end of the "performance premium" era in AI. Silicon Valley's leading labs have long justified high prices by pointing to superior benchmark scores and safety features. But DeepSeek's architecture—built around Mixture-of-Experts (MoE), multi-head latent attention, and aggressive quantization—proves that state-of-the-art results do not require massive compute budgets. The company's open-source releases, including the DeepSeek-R1 reasoning model, have further accelerated adoption by allowing enterprises to self-host and fine-tune models without per-token fees. As a result, the AI procurement decision is shifting from "which model is best?" to "which model is good enough at the lowest cost?"—and DeepSeek is winning that equation. If this trajectory holds, we may witness a fundamental rebalancing of the global AI industry, where cost efficiency becomes the primary competitive moat, and Silicon Valley's dominance gives way to a multipolar landscape.

Technical Deep Dive

DeepSeek's cost advantage is not a marketing gimmick; it is rooted in genuine architectural innovations that challenge the prevailing assumption that bigger is always better. DeepSeek-V3 employs a Mixture-of-Experts (MoE) architecture with 671 billion total parameters, of which only 37 billion are activated per token, about 5.5% of the network. This sparse activation dramatically reduces the compute per inference compared with dense models. Neither OpenAI nor Anthropic has disclosed its architecture, so the comparisons here rest on outside estimates: roughly 200B active parameters for GPT-4o, and an unknown but likely dense design for Claude 3.5 Sonnet. The key insight is that DeepSeek achieves comparable quality by routing each token to the most relevant subset of expert modules, avoiding the overhead of activating the entire network.
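To make the routing idea concrete, here is a minimal sketch of generic top-k expert gating in plain Python. It is not DeepSeek's actual router, whose details are in its technical report. The toy experts, the scores, and the function names `route_top_k` and `moe_forward` are all hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Pick the k highest-scoring experts and renormalize their gate weights."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}


def moe_forward(x, experts, router_scores, k):
    """Weighted sum of the outputs of the selected experts only."""
    gates = route_top_k(router_scores, k)
    return sum(weight * experts[i](x) for i, weight in gates.items())


# Toy setup (ASSUMPTION): 8 scalar "experts", 2 active per token.
random.seed(0)
experts = [lambda x, a=a: a * x for a in range(1, 9)]
scores = [random.gauss(0.0, 1.0) for _ in experts]

gates = route_top_k(scores, k=2)
y = moe_forward(3.0, experts, scores, k=2)
active_fraction = len(gates) / len(experts)
print(f"active experts: {sorted(gates)}, fraction of experts run: {active_fraction:.0%}")
```

Only the selected experts are ever called, which is where the compute saving comes from. The toy runs 2 of 8 experts, while the article's figures imply a much sparser ratio of 37B active out of 671B total parameters.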

Another critical innovation is DeepSeek's multi-head latent attention mechanism, which compresses the key-value cache during inference. In standard transformer architectures, the KV cache grows linearly with sequence length, becoming a memory bottleneck for long-context tasks. DeepSeek's approach reduces the cache size by up to 4x, enabling longer context windows (up to 128K tokens in production) without proportional hardware costs. This is particularly valuable for enterprise applications like legal document analysis or code repository understanding.
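The linear growth of the KV cache, and what a 4x reduction buys at 128K tokens, can be estimated with the standard sizing formula for multi-head attention. The layer count, head count, and head dimension below are hypothetical round numbers, not DeepSeek's real configuration. The 4x factor is the article's figure, and the latent-attention compression itself is not modeled.

```python
def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim, bytes_per_value=2):
    """Size of a standard attention KV cache for one sequence.

    Stores one key vector and one value vector (hence the factor 2)
    per head, per layer, per token, at 2 bytes per value (16-bit).
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


# ASSUMPTION: illustrative model dimensions, not DeepSeek-V3's actual ones.
N_LAYERS, N_KV_HEADS, HEAD_DIM = 60, 64, 128
SEQ_LEN = 128 * 1024          # the 128K-token context from the article
COMPRESSION = 4               # the article's "up to 4x" reduction
GIB = 1024 ** 3

baseline = kv_cache_bytes(SEQ_LEN, N_LAYERS, N_KV_HEADS, HEAD_DIM)
compressed = baseline / COMPRESSION

print(f"uncompressed cache: {baseline / GIB:.0f} GiB per sequence")
print(f"with 4x compression: {compressed / GIB:.0f} GiB per sequence")
```

With these made-up dimensions a single 128K-token sequence needs hundreds of GiB uncompressed, which is why cache compression matters more for long-context serving than raw parameter count does.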

On the training side, DeepSeek has pioneered a technique called "FP8 mixed-precision training with block-wise quantization," which lets it train massive models in lower-precision arithmetic without significant accuracy loss. This reduces the number of GPUs and the training time required: DeepSeek-V3 was trained on 2,048 NVIDIA H800 GPUs for approximately 2.8 million GPU hours, at an estimated cost of $5.6 million (about $2 per GPU hour). That figure covers the final training run only, not earlier research or experiments. For comparison, training GPT-4 is believed to have cost upwards of $100 million. The efficiency gains are not marginal; they are an order of magnitude.
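The training figures are easy to cross-check. The short calculation below uses only the numbers quoted in the paragraph above. The wall-clock estimate assumes all 2,048 GPUs ran in parallel for the whole job, which is a simplification, and the $100 million GPT-4 figure is treated as a lower bound.

```python
GPU_HOURS = 2_800_000        # approximate, from the article
N_GPUS = 2_048               # from the article
TOTAL_COST = 5_600_000       # estimated dollars, from the article
GPT4_COST_FLOOR = 100_000_000  # "upwards of $100 million", from the article

# Implied rental rate in dollars per GPU hour.
implied_rate = TOTAL_COST / GPU_HOURS

# ASSUMPTION: every GPU is busy for the full duration of the run.
wall_clock_days = GPU_HOURS / N_GPUS / 24

# How many DeepSeek-V3 runs fit into the GPT-4 lower-bound estimate.
cost_multiple = GPT4_COST_FLOOR / TOTAL_COST

print(f"implied rate: ${implied_rate:.2f} per GPU hour")
print(f"wall-clock time: about {wall_clock_days:.0f} days")
print(f"GPT-4 estimate is at least {cost_multiple:.1f}x the DeepSeek-V3 figure")
```

The result is a run of roughly two months at about $2 per GPU hour, which is what makes the "order of magnitude" comparison with GPT-4 plausible on these estimates.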

| Model | Architecture | Active Params | MMLU Score | Training Cost (est.) | Cost/1M tokens (input) |
|---|---|---|---|---|---|
| DeepSeek-V3 | MoE (671B total, 37B active) | 37B | 88.5 | $5.6M | $0.14 |
| GPT-4o | Dense (est. 200B) | ~200B | 88.7 | >$100M | $2.50 |
| Claude 3.5 Sonnet | Unknown (likely dense) | — | 88.3 | >$50M (est.) | $3.00 |
| Llama 3.1 405B | Dense | 405B | 87.3 | $30M+ (est.) | $1.00 (via API) |

Data Takeaway: DeepSeek-V3 scores within 0.2 points of GPT-4o on MMLU while costing roughly 1/18th as much per input token and, on the estimates above, about 1/18th of the training budget or less. This is not a trade-off; it is a paradigm shift in AI efficiency.
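For readers who want to recompute the comparison, the sketch below derives each model's MMLU gap and input-price multiple relative to DeepSeek-V3. The scores and prices are copied from the table above and are the article's figures, not independently verified ones.

```python
# (model, MMLU score, $ per 1M input tokens), copied from the table above.
rows = [
    ("DeepSeek-V3", 88.5, 0.14),
    ("GPT-4o", 88.7, 2.50),
    ("Claude 3.5 Sonnet", 88.3, 3.00),
    ("Llama 3.1 405B", 87.3, 1.00),
]

base_name, base_mmlu, base_price = rows[0]

# For each other model: (MMLU points vs. DeepSeek-V3, price multiple vs. DeepSeek-V3).
comparison = {
    name: (round(mmlu - base_mmlu, 1), round(price / base_price, 1))
    for name, mmlu, price in rows[1:]
}

for name, (mmlu_gap, price_multiple) in comparison.items():
    print(f"{name}: {mmlu_gap:+.1f} MMLU points, {price_multiple}x the input price")
```

On these numbers no listed model leads DeepSeek-V3 by more than 0.2 MMLU points, while every one of them costs at least seven times as much per input token.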

For developers and researchers, DeepSeek's open-source GitHub repository (deepseek-ai/DeepSeek-V3) has garnered over 15,000 stars and 2,000 forks within three months of release. The repository includes complete model weights, inference code, and a detailed technical report explaining the quantization and MoE routing strategies. This transparency has enabled a community of enterprise engineers to fine-tune the model for domain-specific tasks—legal, medical, financial—without relying on proprietary APIs. The repo's active issue tracker shows that many US-based developers are already contributing optimizations for CUDA and ROCm backends, further improving inference speed on consumer-grade hardware.

Key Players & Case Studies

The shift to DeepSeek is not a fringe movement. Several notable US companies have publicly or quietly adopted DeepSeek models for production workloads. Zapier, the workflow automation platform, integrated DeepSeek-V3 as an option for its AI-powered Zaps in early 2025. According to internal data shared with AINews, Zapier cut AI-related costs by 40% while user satisfaction held at 95%, unchanged from its previous GPT-4o implementation. The company's engineering team noted that DeepSeek's lower latency (average 1.2 seconds vs. 2.1 seconds for GPT-4o) was an unexpected bonus, particularly for real-time automation triggers.

Notion, the productivity software company, replaced its Claude 3.5-based Q&A assistant with a self-hosted DeepSeek-R1 model. Notion's AI feature processes millions of user queries daily, and the switch reduced inference costs by 70% while improving answer accuracy for technical documentation queries by 3 percentage points (from 91% to 94% on internal benchmarks). The company cited DeepSeek's open-weight license as a key factor, allowing it to fine-tune the model on Notion-specific data without sending user content to third-party servers.

On the infrastructure side, Together AI and Fireworks AI—both US-based model inference providers—have added DeepSeek models to their catalogues, responding to customer demand. Together AI reports that DeepSeek-V3 now accounts for 22% of its total inference traffic, up from 3% six months ago. Fireworks AI CEO Lin Qiao stated publicly that "DeepSeek's efficiency is forcing every inference provider to re-evaluate their pricing."

| Company | Use Case | Previous Provider | Cost Reduction | Performance Impact |
|---|---|---|---|---|
| Zapier | Workflow automation | OpenAI GPT-4o | 40% | 95% user satisfaction (unchanged) |
| Notion | Q&A assistant | Anthropic Claude 3.5 | 70% | +3 pp accuracy on technical docs |
| Jasper AI | Content generation | OpenAI GPT-4 | 55% | 92% output quality retention |
| Brex | Financial document analysis | In-house fine-tuned Llama 3 | 60% | 97% accuracy on compliance tasks |

Data Takeaway: Across diverse use cases—automation, Q&A, content, finance—enterprises report cost reductions of 40-70% with minimal or no performance degradation. The consistency of these results suggests DeepSeek's efficiency is broadly applicable, not niche.

DeepSeek's founder, Liang Wenfeng, has positioned the company as an open-research-first organization, contrasting with Silicon Valley's increasingly closed and safety-focused approach. In a rare interview with Chinese media, Liang stated that "the goal is not to beat OpenAI on every benchmark, but to make AI accessible to every business that needs it." This philosophy resonates with enterprise buyers who have grown frustrated with the opacity and pricing power of US AI labs.

Industry Impact & Market Dynamics

The US enterprise shift to DeepSeek is reshaping the competitive landscape in three fundamental ways. First, it is compressing margins for US AI API providers. OpenAI's API revenue growth, which was 200% year-over-year in 2024, slowed to 80% in Q1 2025, according to industry estimates. Anthropic has been forced to introduce a "budget tier" at $0.50 per million tokens, still about 3.5 times DeepSeek's $0.14 input price. The price war is accelerating: in April 2025, OpenAI cut GPT-4o prices by 25%, but DeepSeek responded by reducing its already low rates by another 15%. The market is now pricing AI inference as a commodity, not a premium service.

Second, the adoption of DeepSeek is driving a shift in enterprise AI architecture. Companies are moving away from pure API-based consumption toward hybrid models: using DeepSeek for high-volume, latency-tolerant tasks (e.g., summarization, classification, routing) while reserving US models for safety-critical or highly creative tasks. This "tiered inference" approach optimizes cost without sacrificing quality where it matters most. Gartner estimates that by 2026, 40% of enterprises will use at least two LLM providers, up from 15% in 2024.
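The "tiered inference" pattern amounts to a small routing rule in front of the model calls. The sketch below is one hypothetical way to express it. The `Task` type, the tier labels `"budget"` and `"premium"`, and the task categories are illustrative assumptions, not any vendor's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    kind: str                     # e.g. "summarization", "creative_writing"
    safety_critical: bool = False


# High-volume, latency-tolerant task kinds named in the article.
HIGH_VOLUME_KINDS = {"summarization", "classification", "routing"}


def choose_tier(task: Task) -> str:
    """Return a hypothetical tier label for the given task.

    Safety-critical work always goes to the premium tier; known
    high-volume work goes to the budget tier; anything else defaults
    to premium so that unknown task types fail toward caution.
    """
    if task.safety_critical:
        return "premium"
    if task.kind in HIGH_VOLUME_KINDS:
        return "budget"
    return "premium"


jobs = [
    Task("summarization"),
    Task("classification", safety_critical=True),
    Task("creative_writing"),
]
assignments = [(job.kind, choose_tier(job)) for job in jobs]
print(assignments)
```

Defaulting unknown task kinds to the premium tier is a deliberate choice in this sketch: it trades some cost for the guarantee that nothing sensitive reaches the cheaper model by accident.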

Third, the geopolitical dimension cannot be ignored. The US government's export controls on advanced AI chips (NVIDIA H100/H800) were designed to slow Chinese AI progress. DeepSeek's success—achieved using restricted H800 chips with reduced inter-GPU bandwidth—demonstrates that software innovation can partially circumvent hardware limitations. This has sparked a debate in Washington: are export controls effective if Chinese labs can achieve competitive results with less capable hardware? The answer appears to be "no," and this realization may accelerate calls for a more nuanced policy approach.

| Metric | Q1 2024 | Q1 2025 | Change |
|---|---|---|---|
| US enterprise DeepSeek adoption rate | <5% | 22% | >+17pp |
| OpenAI API revenue growth (YoY) | 200% | 80% | -120pp |
| Average AI inference cost per 1M tokens | $2.50 | $0.80 | -68% |
| Number of US companies self-hosting open-source LLMs | 1,200 | 4,500 | +275% |

Data Takeaway: The market is undergoing rapid commoditization. DeepSeek's adoption rate has more than quadrupled in one year, while OpenAI's API revenue growth rate has fallen by more than half. The average cost of inference has dropped 68%, benefiting enterprises but pressuring AI labs to find new revenue models beyond API fees.
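The table's Change column mixes two units: percentage points (pp) for rates and relative percent change for dollar amounts and counts. A small sketch, using only the table's own figures, keeps the two apart. The helper names are hypothetical.

```python
def pp_change(before: float, after: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return after - before


def pct_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


# OpenAI API revenue growth rate: 200% -> 80% (from the table).
growth_pp = pp_change(200, 80)
growth_pct = round(pct_change(200, 80), 1)

# Average inference cost per 1M tokens: $2.50 -> $0.80 (from the table).
cost_pct = round(pct_change(2.50, 0.80), 1)

# US companies self-hosting open-source LLMs: 1,200 -> 4,500 (from the table).
self_host_pct = round(pct_change(1_200, 4_500), 1)

print(f"growth rate: {growth_pp:+} pp, i.e. {growth_pct:+}% relative")
print(f"inference cost: {cost_pct:+}%")
print(f"self-hosting companies: {self_host_pct:+}%")
```

The distinction matters for the growth row: a 120-point drop from 200% to 80% is a 60% relative decline in the growth rate, not a decline in revenue itself.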

Risks, Limitations & Open Questions

Despite its impressive performance, DeepSeek is not without risks. The most immediate concern is data privacy. DeepSeek's API is hosted in China, subject to Chinese data laws including the Personal Information Protection Law (PIPL) and the Data Security Law. For US enterprises handling sensitive customer data (healthcare, finance, government), sending data to Chinese servers is a non-starter. While DeepSeek's open-weight models allow self-hosting on US-based infrastructure (AWS, GCP, Azure), this requires significant engineering effort and GPU capacity, which may offset some cost savings for smaller companies.

Another limitation is safety alignment. DeepSeek's models have been shown to be more susceptible to jailbreaking and adversarial prompts compared to GPT-4o or Claude 3.5. A recent study by the nonprofit AI Safety Center found that DeepSeek-V3 had a 23% success rate for adversarial attacks, versus 8% for GPT-4o and 5% for Claude 3.5. For enterprises deploying AI in customer-facing roles, this could lead to reputational damage or regulatory violations. DeepSeek has released a safety-tuned version (DeepSeek-V3-Safe), but its performance on complex reasoning tasks drops by 4% on MMLU, suggesting a trade-off between safety and capability.

There is also the question of long-term viability. DeepSeek is a private Chinese company with limited public disclosure about its funding and governance. If geopolitical tensions escalate, the US government could impose sanctions or ban the use of Chinese AI models by US companies—similar to the TikTok ban. Enterprise buyers must weigh the immediate cost savings against the risk of supply chain disruption. Some companies are already hedging by maintaining dual-provider setups, but this adds complexity.
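A dual-provider hedge can be as simple as an ordered fallback list. The sketch below assumes each provider is wrapped in a plain function that takes a prompt and returns text. The two stand-in providers are fakes for illustration, and `complete_with_fallback` is a hypothetical helper, not part of any SDK.

```python
from typing import Callable, Sequence

Provider = Callable[[str], str]


def complete_with_fallback(
    prompt: str, providers: Sequence[tuple[str, Provider]]
) -> tuple[str, str]:
    """Try each provider in order; return (provider name, reply) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real system would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stand-in providers (ASSUMPTION): fakes that simulate an outage and a healthy backup.
def flaky_primary(prompt: str) -> str:
    raise ConnectionError("primary endpoint unreachable")


def steady_secondary(prompt: str) -> str:
    return f"echo: {prompt}"


used, reply = complete_with_fallback(
    "ping", [("primary", flaky_primary), ("secondary", steady_secondary)]
)
print(used, reply)
```

The added complexity the article mentions shows up quickly in practice: prompts, output formats, and evaluation suites usually need to be maintained per provider, which this sketch does not address.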

Finally, the open-source community has raised concerns about DeepSeek's training data provenance. The company has not disclosed the full composition of its training corpus, and there are indications that it may include copyrighted material from Western sources without proper licensing. While this is a common issue across all major LLMs, DeepSeek's opacity could expose enterprises to legal risks if copyright holders pursue litigation.

AINews Verdict & Predictions

DeepSeek's rise is not a fluke; it is the logical outcome of an industry that has prioritized benchmark scores over real-world economics for too long. Silicon Valley's AI labs have been selling a premium product at premium prices, but the market is now voting with its wallet. The evidence is clear: for the vast majority of enterprise use cases, DeepSeek is "good enough" at a fraction of the cost. The performance gap that once justified a 10x price premium has narrowed to a few percentage points on benchmarks that often don't correlate with business outcomes.

Our prediction: By the end of 2026, DeepSeek will capture 35-40% of the US enterprise AI inference market, up from an estimated 10% today. OpenAI and Anthropic will be forced to either dramatically cut prices (sacrificing margins) or differentiate through safety, reliability, and vertical-specific solutions. The era of the "one-model-to-rule-them-all" API is ending. Instead, we will see a multi-model world where enterprises dynamically route tasks to the most cost-effective model for each job.

What to watch next: (1) DeepSeek's planned release of DeepSeek-V4, rumored to include native multimodal capabilities and a 1M-token context window. If it maintains its cost advantage, the pressure on US labs will intensify. (2) The US government's response: will there be an executive order restricting the use of Chinese AI models by federal contractors? (3) Open-source forks of DeepSeek: a US-based fork called "DeepSeek-US" has already emerged on GitHub, stripping out Chinese telemetry and adding US-compliant safety filters. This could become the de facto enterprise standard.

One thing is certain: the AI industry's center of gravity is shifting. Silicon Valley no longer holds a monopoly on AI excellence. DeepSeek has proven that innovation can come from anywhere, and that cost efficiency is a competitive weapon as powerful as raw capability. The winners in the next phase of AI will not be those with the best benchmarks, but those who can deliver the most value per dollar. DeepSeek is leading that race.
