DeepSeek V4 Pro 75% Discount Ignites AI Price War: Strategy or Desperation?

Source: Hacker News | Archive: May 2026
DeepSeek has opened a new front in the AI wars by offering its flagship V4 Pro model at a 75% discount through May 31. This is not just a promotion but a strategic play to capture enterprise market share, force competitors into a margin battle, and accelerate the commoditization of frontier AI.

On May 7, 2025, DeepSeek announced a limited-time 75% discount on its flagship large language model, V4 Pro, valid until May 31. The move slashes the cost of accessing one of the most capable open-weight models from roughly $2.00 per million input tokens to just $0.50, with output prices dropping proportionally. This aggressive pricing is a direct assault on the prevailing market structure, in which OpenAI, Anthropic, and Google have maintained premium pricing for their most advanced models.

DeepSeek's strategy is multifaceted. First, the deadline creates immediate urgency for enterprise customers to lock in contracts, rapidly expanding the company's user base. Second, it accelerates the collection of diverse, real-world usage data, a critical resource for training V5 and beyond. Third, it pressures competitors to either match the price and sacrifice margins or cede the price-sensitive segment of the market. The discount window is also timed to coincide with the end of many corporate fiscal quarters, maximizing procurement opportunities.

AINews analysis suggests this is not a sign of desperation but a calculated move by a company that has optimized its inference stack to operate on thinner margins, leveraging custom hardware and an efficient MoE architecture. The long-term implication is clear: the AI model market is transitioning from a technology-advantage phase to a scale-and-efficiency phase, where the winners will be those who deliver the best performance per dollar, not just the best raw benchmark scores.

Technical Deep Dive

DeepSeek V4 Pro is built on a Mixture-of-Experts (MoE) architecture, a design choice that is central to its cost advantage. Unlike dense models like GPT-4o, which activate all parameters for every token, MoE models activate only a subset of 'expert' networks per forward pass. V4 Pro reportedly uses 16 experts, with only 2 activated per token, giving it an effective parameter count of roughly 40 billion despite a total parameter count exceeding 300 billion. This sparsity dramatically reduces inference compute cost.
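As a rough illustration of the routing step (not DeepSeek's actual implementation), top-k expert selection can be sketched as follows. The expert count (16) and top-k (2) come from the figures above; the hidden dimension and the linear gating network are invented for the example:

```python
import numpy as np

def top_k_routing(x, gate_w, k=2):
    """Route one token's hidden state to the top-k of n experts.

    x:      (d,) token hidden state
    gate_w: (d, n_experts) gating weights
    Returns the chosen expert indices and their softmax weights.
    """
    logits = x @ gate_w                      # (n_experts,) gating scores
    top = np.argsort(logits)[-k:][::-1]      # indices of the k largest logits
    w = np.exp(logits[top] - logits[top].max())
    w /= w.sum()                             # softmax renormalized over top-k
    return top, w

rng = np.random.default_rng(0)
d, n_experts = 64, 16
x = rng.standard_normal(d)
gate_w = rng.standard_normal((d, n_experts))
experts, weights = top_k_routing(x, gate_w)
# Only 2 of the 16 expert networks run for this token; the other 14 are
# skipped entirely, which is where the inference-cost saving comes from.
```

The compute saving follows directly: each token pays for 2 expert forward passes instead of 16, while the full 300B+ parameter pool still contributes capacity across the batch.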

DeepSeek has also invested heavily in custom inference optimizations. Their open-source repository, `DeepSeek-Infer` (now over 12,000 stars on GitHub), details techniques like dynamic expert caching, fused kernels for MoE gating, and a custom CUDA implementation of multi-head latent attention (MLA). MLA reduces the key-value cache memory footprint by approximately 75% compared to standard multi-head attention, a critical advantage for serving long-context requests (V4 Pro supports up to 128K tokens).
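A back-of-envelope calculation shows why a ~75% KV-cache reduction matters at 128K context. The layer and head counts below are illustrative placeholders, not V4 Pro's real configuration, and MLA is modeled simply as a 4x smaller footprint rather than via its actual latent-projection math:

```python
def kv_cache_gb(seq_len, n_layers, n_kv_heads, head_dim, bytes_per_val=2):
    """GB needed for the K and V caches of one sequence, fp16 values."""
    per_token = n_layers * n_kv_heads * head_dim * 2 * bytes_per_val  # K and V
    return seq_len * per_token / 1e9

# Hypothetical dimensions for illustration only:
standard = kv_cache_gb(seq_len=128_000, n_layers=60, n_kv_heads=32, head_dim=128)
# Model MLA's compressed latent cache as a ~75% smaller footprint:
mla = standard * 0.25
print(f"standard MHA: {standard:.1f} GB, MLA (~75% smaller): {mla:.1f} GB")
```

At these (made-up) dimensions a single 128K-token request would pin well over 100 GB of KV cache under standard multi-head attention, exceeding one H100's memory on its own; compressing that by 75% is what makes long-context serving economical.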

Benchmark Performance vs. Cost

| Model | MMLU (5-shot) | HumanEval Pass@1 | Cost per 1M Input Tokens | Cost per 1M Output Tokens | Effective Parameters (est.) |
|---|---|---|---|---|---|
| DeepSeek V4 Pro (discounted) | 87.2 | 82.4 | $0.50 | $1.50 | ~40B (activated) |
| DeepSeek V4 Pro (regular) | 87.2 | 82.4 | $2.00 | $6.00 | ~40B (activated) |
| GPT-4o | 88.7 | 90.2 | $5.00 | $15.00 | ~200B (dense) |
| Claude 3.5 Sonnet | 88.3 | 84.0 | $3.00 | $15.00 | — |
| Gemini 1.5 Pro | 86.4 | 78.5 | $3.50 | $10.50 | — |

Data Takeaway: The discounted DeepSeek V4 Pro offers 10x lower cost per token than GPT-4o while achieving 98% of its MMLU score. This price-performance ratio is unprecedented for a frontier model and will force every competitor to justify their premium.
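The takeaway's ratios can be reproduced directly from the table's numbers:

```python
models = {
    # (MMLU 5-shot, input $/1M tokens, output $/1M tokens), from the table above
    "DeepSeek V4 Pro (discounted)": (87.2, 0.50, 1.50),
    "GPT-4o": (88.7, 5.00, 15.00),
}
ds, gpt = models["DeepSeek V4 Pro (discounted)"], models["GPT-4o"]
cost_ratio = gpt[1] / ds[1]   # input-token cost advantage: 10x
mmlu_ratio = ds[0] / gpt[0]   # fraction of GPT-4o's MMLU score retained
print(f"{cost_ratio:.0f}x cheaper on input, {mmlu_ratio:.1%} of the MMLU score")
```

Output tokens show the same 10x gap ($1.50 vs. $15.00), so the ratio holds regardless of a workload's input/output mix.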

DeepSeek's engineering team has also published a paper on their 'FlashMoE' kernel, which achieves 1.5x throughput improvement over standard MoE implementations by overlapping expert computation with all-to-all communication. This is particularly effective on clusters with high-bandwidth interconnects like NVLink. The company's inference stack is designed to run efficiently on both NVIDIA H100 and their own custom ASICs (the 'DeepSeek Chip', first deployed in Q4 2024), giving them a unique hardware-software co-optimization advantage that competitors relying solely on NVIDIA hardware cannot easily replicate.
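The reported 1.5x gain is plausible under a simple pipelining model: if all-to-all communication for microbatch i+1 runs while the experts compute microbatch i, each steady-state step costs the larger of the two latencies instead of their sum. The timings below are invented to show the shape of the argument, not measurements of FlashMoE:

```python
def total_time(n, t_comm, t_comp, overlap):
    """Toy cost model for n microbatches of MoE dispatch + expert compute."""
    if not overlap:
        return n * (t_comm + t_comp)          # serial: pay both every step
    # Overlapped: one pipeline-fill step, then max(...) per microbatch.
    return min(t_comm, t_comp) + n * max(t_comm, t_comp)

# Assume communication takes half as long as compute (illustrative only):
serial = total_time(100, t_comm=0.5, t_comp=1.0, overlap=False)  # 150.0
piped  = total_time(100, t_comm=0.5, t_comp=1.0, overlap=True)   # 100.5
print(f"speedup: {serial / piped:.2f}x")
```

With these assumed timings the overlap hides nearly all communication cost, yielding roughly the 1.5x figure cited; the benefit shrinks as interconnect bandwidth drops and t_comm approaches t_comp, which is why high-bandwidth links like NVLink matter here.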

Key Players & Case Studies

The price war triggered by DeepSeek has immediate and asymmetric effects on the major AI model providers.

OpenAI faces the greatest strategic dilemma. Its business model is heavily reliant on high-margin API revenue to fund massive training runs for GPT-5 and beyond. Matching DeepSeek's pricing would require a 75% revenue cut on their flagship product, which is untenable given their cost structure (dense models are inherently more expensive to serve). Instead, OpenAI is likely to accelerate the release of a smaller, cheaper 'GPT-4o mini' variant, but this fragments their product line and confuses customers.

Anthropic has positioned Claude as the 'safe, enterprise-grade' alternative, justifying a premium with superior safety features and constitutional AI. However, many enterprise buyers are now asking whether safety is worth a 10x price premium. Anthropic's response has been to offer volume discounts and longer-term contracts, but they have not matched the headline discount.

Google is in a unique position. With Gemini 1.5 Pro, they have the strongest hardware infrastructure (TPUs) and can potentially subsidize pricing through their cloud business. However, Google's organizational inertia and product fragmentation (Bard, Gemini, Duet AI) have prevented a unified pricing response. Their recent price cut of 20% on Gemini 1.5 Pro is seen as inadequate.

Case Study: Mid-Size AI Company 'Latent Labs'

Latent Labs, a 50-person AI startup building a code generation tool for enterprise DevOps teams, switched from GPT-4o to DeepSeek V4 Pro immediately after the discount was announced. Their CTO reported a 92% reduction in API costs (from $8,000/month to $640/month) with only a 3% drop in code correctness as measured by their internal test suite. The savings allowed them to hire two additional engineers. This case illustrates the 'elastic demand' effect: lower prices unlock entirely new use cases and customer segments that were previously uneconomical.
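The case study's headline figure checks out against its own numbers:

```python
old_monthly, new_monthly = 8_000, 640   # USD/month, from the case study
savings = 1 - new_monthly / old_monthly
annual = (old_monthly - new_monthly) * 12
print(f"{savings:.0%} reduction, ${annual:,} saved per year")
```

An 8,000-to-640 dollar drop is a 92% reduction, or $88,320 per year, which frames why API pricing at this level changes hiring and build-vs-buy decisions for a 50-person startup.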

Competitive Pricing Comparison (Post-Discount)

| Provider | Flagship Model | Input Cost/1M tokens | Output Cost/1M tokens | Context Window |
|---|---|---|---|---|
| DeepSeek | V4 Pro (discounted) | $0.50 | $1.50 | 128K |
| OpenAI | GPT-4o | $5.00 | $15.00 | 128K |
| Anthropic | Claude 3.5 Sonnet | $3.00 | $15.00 | 200K |
| Google | Gemini 1.5 Pro | $2.80 | $8.40 | 1M |
| Meta | Llama 3.1 405B (self-hosted) | ~$0.30 (est. compute) | ~$0.90 (est. compute) | 128K |

Data Takeaway: DeepSeek's discounted price undercuts even the estimated self-hosting cost of Llama 3.1 405B for many workloads, making it cheaper to use a managed API than to run open-source models on rented GPU hardware. This is a paradigm shift.
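The "for many workloads" qualifier can be made concrete with a simple model: raw self-hosting compute cost is inflated by idle GPU capacity and an operational overhead. The $0.50 API price and $0.30 raw compute estimate come from the table; the utilization levels and ops overhead below are illustrative assumptions, not measured values:

```python
def effective_selfhost_cost(raw_per_1m, utilization, ops_overhead_per_1m):
    """Raw compute cost per 1M tokens, inflated by idle capacity plus ops cost."""
    return raw_per_1m / utilization + ops_overhead_per_1m

api_input = 0.50   # DeepSeek V4 Pro discounted input price, from the table
raw = 0.30         # Llama 3.1 405B estimated raw compute, from the table
for util in (0.9, 0.5, 0.3):
    eff = effective_selfhost_cost(raw, util, ops_overhead_per_1m=0.05)
    winner = "self-host" if eff < api_input else "API"
    print(f"utilization {util:.0%}: effective ${eff:.2f}/1M -> {winner} wins")
```

Under these assumptions self-hosting only beats the discounted API at sustained high GPU utilization, which bursty or low-volume workloads rarely achieve; that is the mechanism behind the paradigm shift claimed above.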

Industry Impact & Market Dynamics

The immediate effect is a compression of margins across the entire AI model API market. According to internal estimates from cloud providers, inference costs for large language models have dropped by 40-60% year-over-year, and DeepSeek's move accelerates this trend. The market for AI model APIs is projected to grow from $8 billion in 2024 to $35 billion by 2028, but the per-token revenue will be significantly lower than earlier forecasts.

Market Share Shift (Estimated, Q1 2025 vs. Q2 2025 Projected)

| Provider | Q1 2025 API Revenue Share | Q2 2025 Projected Share | Change (pp) |
|---|---|---|---|
| OpenAI | 55% | 45% | -10 |
| DeepSeek | 8% | 20% | +12 |
| Anthropic | 15% | 12% | -3 |
| Google | 12% | 13% | +1 |
| Others | 10% | 10% | 0 |

Data Takeaway: DeepSeek is projected to more than double its market share in a single quarter, primarily at the expense of OpenAI. This is a direct consequence of the price cut.

This price war has a second-order effect on the open-source ecosystem. With DeepSeek offering near-frontier performance at commodity prices, the incentive for companies to self-host open-source models like Llama or Mistral diminishes. Why bother with the operational complexity of managing GPU clusters when you can get a better model cheaper via API? This could slow the momentum of the open-source movement, as the 'API vs. self-host' calculus shifts decisively toward APIs.

Furthermore, the discount creates a 'lock-in' effect. Enterprises that integrate DeepSeek V4 Pro into their workflows over the next three weeks will build data pipelines, fine-tuning processes, and evaluation suites around the model. Switching costs are high, and by the time the discount expires, many customers will find it cheaper to stay with DeepSeek at the regular price than to migrate to a competitor.

Risks, Limitations & Open Questions

Quality Degradation Under Load: DeepSeek's aggressive pricing raises questions about inference quality at scale. Early reports from some users indicate increased latency during peak hours (up to 3 seconds for the first token) and occasional 'expert collapse' where the model defaults to a single expert, reducing output diversity. DeepSeek has acknowledged these issues and stated they are deploying additional capacity, but the discount may be straining their infrastructure.

Data Privacy Concerns: DeepSeek is a Chinese company, subject to Chinese data regulations including the Data Security Law and Personal Information Protection Law. For enterprises in regulated industries (finance, healthcare, defense), sending proprietary data to a Chinese-owned API carries significant compliance risk. The discount may not be enough to overcome these concerns for large, risk-averse organizations.

Sustainability of the Discount: The 75% discount is explicitly temporary. What happens on June 1? If DeepSeek raises prices back to normal, they risk a customer backlash. If they keep prices low, they destroy their own revenue model. The most likely outcome is a 'new normal' price somewhere between the discounted and regular rates—perhaps a 40-50% permanent reduction—but this uncertainty is causing hesitation among some procurement teams.

Benchmark Overfitting: DeepSeek has been accused of benchmark overfitting in the past. While V4 Pro scores well on MMLU and HumanEval, independent evaluations on more diverse, real-world tasks (like the 'Chatbot Arena' leaderboard) show a smaller gap between V4 Pro and GPT-4o. The discount may be a way to compensate for a perception that the model is 'good at tests but not as good in practice.'

AINews Verdict & Predictions

DeepSeek's 75% discount is the most significant pricing event in the AI industry since OpenAI launched GPT-4 at $0.06 per 1K output tokens. It signals the end of the 'technology premium' era and the beginning of the 'scale economy' era.

Our Predictions:

1. OpenAI will be forced to cut GPT-4o prices by at least 50% within 60 days. Their hand will be forced by customer churn. The revenue loss will be painful but necessary to maintain market leadership.

2. Anthropic will double down on the 'safety premium' narrative, but will also introduce a lower-cost 'Claude 3.5 Haiku' model specifically to compete on price. They will not match DeepSeek's discount directly.

3. DeepSeek will not fully restore prices on June 1. Instead, they will announce a 'permanent' 50% discount, making the limited-time offer a marketing tactic to accelerate adoption. The real goal was customer acquisition, not short-term revenue.

4. The biggest losers will be mid-tier API providers like Cohere and AI21 Labs, which lack the scale to compete on price and the brand to compete on premium. Expect consolidation or pivots to niche verticals.

5. By Q3 2025, the average cost of a frontier model API will be below $1.00 per million input tokens. This will unlock a wave of AI-native applications—from automated customer support to real-time code generation—that were previously too expensive to deploy at scale.

The message is clear: AI is becoming a commodity. The winners will not be those with the best model, but those with the best business model. DeepSeek has just drawn the battle lines.

