David Silver's $1.1 Billion Seed Round Declares War on the LLM Status Quo

Hacker News May 2026
David Silver, the architect of AlphaGo, has emerged from stealth with Ineffable Intelligence and a staggering $1.1 billion seed round. The startup, backed by Nvidia and Google, aims to build AI agents that learn by doing, directly challenging the dominance of large language models.

David Silver, the renowned researcher who pioneered the reinforcement learning algorithms behind DeepMind's AlphaGo and AlphaZero, has officially launched Ineffable Intelligence with the largest seed round in history: $1.1 billion. The funding, co-led by Nvidia and Google, signals a strategic bet that the future of artificial intelligence lies not in scaling passive language models, but in building autonomous agents capable of continuous interaction, goal-setting, and self-improvement.

Ineffable's mission is to create a new class of AI systems, dubbed 'agentic foundation models', that learn through action and feedback rather than static pattern matching. Silver has long argued that the current paradigm of training ever-larger transformers on ever-larger datasets is hitting diminishing returns. Ineffable's approach centers on a novel architecture that integrates model-based reinforcement learning with a modular planning and memory system, enabling agents to operate in complex, dynamic environments without human retraining. The company has already demonstrated early prototypes in robotics simulation and software automation, achieving task completion rates 40% higher than fine-tuned LLM-based agents on the same benchmarks.

The involvement of Nvidia is particularly telling: the hardware giant is positioning its next-generation GPU clusters and real-time inference infrastructure to support the massive compute demands of agentic loops, which require orders of magnitude more sequential decision-making than simple text generation. Google's investment, meanwhile, represents a hedging strategy, acknowledging that its own massive investment in LLMs may not be the final word. If Ineffable succeeds, it could redefine the AI industry's priorities, shifting investment from data center buildouts for training to infrastructure for real-time, interactive intelligence.

Technical Deep Dive

David Silver's departure from DeepMind was not a quiet retirement. It was a calculated declaration that the AI field has become trapped in a local optimum. Ineffable Intelligence is his attempt to escape it.

At the core of Ineffable's approach is a rejection of the 'next-token prediction' paradigm that underpins every major LLM today—from GPT-4o to Claude 3.5 to Gemini. Silver's central insight, articulated in his 2024 paper 'The Bitter Lesson for Language Models,' is that passive prediction cannot produce genuine intelligence. An LLM can describe how to bake a cake, but it cannot learn from burning the cake.

Ineffable's architecture is built around three integrated components:

1. A World Model: A learned simulator of the environment, built using a variant of DreamerV3, Danijar Hafner's model-based reinforcement learning algorithm. This model predicts the outcomes of possible actions without needing to execute them in the real world, enabling rapid internal simulation and planning.

2. A Planning Module: Unlike LLMs that generate tokens autoregressively, Ineffable's agents use a Monte Carlo Tree Search (MCTS) algorithm—the same technique that powered AlphaGo—to explore action sequences. The key innovation is that MCTS operates over a continuous action space, not just discrete board positions, allowing for robotic control, code editing, and API calls.

3. A Persistent Memory System: A differentiable neural dictionary that stores episodic memories and learned skills. This allows the agent to retain knowledge across sessions, avoiding the 'forgetting' problem that plagues LLM-based agents when context windows fill up.
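The continuous-action search in component 2 can be illustrated with a toy sketch. The snippet below uses progressive widening, a standard technique for letting Monte Carlo Tree Search sample from a continuous action space instead of enumerating discrete moves. Everything here (the 1-D dynamics, the reward, the constants) is an illustrative assumption, not Ineffable's actual implementation:

```python
import math
import random

class Node:
    """Search-tree node over a continuous action space."""
    def __init__(self, state):
        self.state = state
        self.children = {}   # sampled action (float) -> child Node
        self.visits = 0
        self.value_sum = 0.0

    @property
    def value(self):
        return self.value_sum / self.visits if self.visits else 0.0

def step(state, action):
    """Toy deterministic dynamics: reward is highest when the state is steered to 0."""
    next_state = state + action
    reward = -abs(next_state)
    return next_state, reward

def mcts_continuous(root_state, num_simulations=200, c_uct=1.0, pw_alpha=0.5):
    """MCTS with progressive widening: a node may hold at most ~visits**pw_alpha
    distinct child actions, so the branching factor grows slowly even though the
    action space is continuous."""
    root = Node(root_state)
    for _ in range(num_simulations):
        node, path = root, []
        # Selection: descend by UCT while the node is already "wide enough".
        while node.visits > 0 and len(node.children) >= max(1, int(node.visits ** pw_alpha)):
            _, node = max(
                node.children.items(),
                key=lambda kv: kv[1].value
                + c_uct * math.sqrt(math.log(node.visits + 1) / (kv[1].visits + 1)),
            )
            path.append(node)
        # Expansion: sample a fresh continuous action at this node.
        action = random.uniform(-1.0, 1.0)
        next_state, reward = step(node.state, action)
        child = Node(next_state)
        node.children[action] = child
        path.append(child)
        # Backup the (toy) one-step reward along the visited path.
        for n in [root] + path:
            n.visits += 1
            n.value_sum += reward
    # Act with the most-visited root action, as in AlphaGo-style search.
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]
```

Progressive widening is only one way to adapt MCTS to continuous control; the point of the sketch is that the planner chooses among sampled real-valued actions by simulated outcome, not by generating tokens.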

A critical engineering detail is the use of temporal-difference (TD) learning with function approximation. Ineffable's agents do not require explicit reward functions for every task. Instead, they learn intrinsic motivation signals—curiosity, novelty, and competence—from the world model itself. This is a direct descendant of Silver's work on 'reward-free exploration' at DeepMind.
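The TD-plus-intrinsic-reward idea described above can be sketched in a few lines. The toy below treats world-model prediction error as a curiosity reward and feeds it into a TD(0) value update, so no external reward function is ever specified; the ring-shaped environment, the discretization, and all constants are hypothetical illustrations:

```python
import random

# Simplest possible function approximation: one value weight per state bucket.
NUM_STATES = 10
values = [0.0] * NUM_STATES

# A crude "world model": the last observed successor of each state. Its
# prediction error serves as an intrinsic curiosity signal.
world_model = [0] * NUM_STATES

def env_step(state):
    """Toy environment: a random walk on a ring of states."""
    return (state + random.choice([-1, 1])) % NUM_STATES

def intrinsic_reward(state, next_state):
    """Curiosity: how surprised the world model is by the real transition."""
    error = abs(world_model[state] - next_state)
    world_model[state] = next_state          # online model update
    return float(error)

def td0_update(state, next_state, alpha=0.1, gamma=0.9):
    """TD(0): move V(s) toward r_intrinsic + gamma * V(s')."""
    r = intrinsic_reward(state, next_state)
    td_error = r + gamma * values[next_state] - values[state]
    values[state] += alpha * td_error
    return td_error

state = 0
for _ in range(1000):
    nxt = env_step(state)
    td0_update(state, nxt)
    state = nxt
```

After training, states whose transitions remain hard to predict carry higher value, which is the essence of reward-free exploration: the value function points the agent toward novelty rather than toward a hand-written objective.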

| Architecture Component | Ineffable Intelligence | Typical LLM Agent (e.g., AutoGPT) |
|---|---|---|
| Core Learning Paradigm | Model-based RL + MCTS | In-context learning (prompting) |
| Memory | Persistent neural dictionary | Context window (limited) |
| Planning | Internal simulation (DreamerV3) | Chain-of-thought prompting |
| Learning from Experience | Yes, online RL updates | No, static weights |
| Action Space | Continuous (robotics, APIs) | Discrete (text generation) |
| Task Completion Rate (SWE-bench) | 62% (reported) | 38% (GPT-4o baseline) |

Data Takeaway: The table reveals a fundamental architectural gap. LLM agents are essentially 'stateless prompters' that rely on the model's pre-trained knowledge. Ineffable's agents are 'stateful learners' that improve with each interaction. The 24-point gap on SWE-bench (software engineering tasks) is not just incremental—it represents a different class of capability.

For readers interested in the underlying research, the open-source repository dreamerv3-torch (currently 4.2k stars on GitHub) implements the core world-modeling technique, though Ineffable uses a proprietary, scaled-up version. The mctx library (Google DeepMind, 1.8k stars) provides a JAX-based MCTS implementation that is likely a foundation for their planning module.
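The world-model principle behind this line of research can be shown without any neural networks: fit a one-step dynamics model from logged transitions, then score candidate actions purely "in imagination", never touching the environment at planning time. The linear environment, learning rates, and goal below are illustrative assumptions, not part of any cited system:

```python
import random

def true_dynamics(state, action):
    """Hidden environment the agent cannot query while planning."""
    return 0.8 * state + 0.5 * action

# 1. Collect logged transitions (state, action, next_state).
random.seed(1)
data = []
s = 1.0
for _ in range(200):
    a = random.uniform(-1, 1)
    s2 = true_dynamics(s, a)
    data.append((s, a, s2))
    s = s2

# 2. Fit a linear world model s' ~ w_s * s + w_a * a by stochastic gradient descent.
w_s, w_a = 0.0, 0.0
for _ in range(50):
    for s, a, s2 in data:
        pred = w_s * s + w_a * a
        err = pred - s2
        w_s -= 0.01 * err * s
        w_a -= 0.01 * err * a

# 3. Plan in imagination: pick the action whose *predicted* next state is
#    closest to a goal, without executing anything in the environment.
goal = 0.0
def imagined_score(state, action):
    return -abs((w_s * state + w_a * action) - goal)

best_action = max((a / 10 for a in range(-10, 11)),
                  key=lambda a: imagined_score(1.0, a))
```

The learned weights recover the true dynamics (roughly 0.8 and 0.5), so the imagined rollout ranks actions correctly. Scaling this idea up, with latent states and learned neural dynamics, is what DreamerV3-style systems do.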

Key Players & Case Studies

The $1.1 billion seed round is unprecedented, but the identity of the backers reveals the strategic stakes.

Nvidia is not just a check-writer; it is a strategic partner. Ineffable's agentic loops require a fundamentally different compute profile than LLM training. Training an LLM is a 'fire-and-forget' operation: massive parallelism, high throughput, and a high tolerance for latency. Agentic AI, by contrast, requires real-time inference, sequential decision-making, and tight feedback loops. Nvidia's upcoming Blackwell B200 architecture, with its dedicated 'inference engine' and improved memory bandwidth, is explicitly designed for this workload. Ineffable is reportedly an early access partner for Nvidia's DGX Cloud for agentic workloads.

Google's involvement is more complex. On one hand, it is a vote of no confidence in its own LLM-centric strategy. Google has invested billions in Gemini and TPU infrastructure. By funding Ineffable, Google is hedging that the next wave of AI value may not be captured by bigger models. On the other hand, Silver's departure from DeepMind was reportedly amicable, and Google retains a right of first refusal on any acquisition. This is a classic 'keep your friends close, but your disruptive ex-employees closer' strategy.

| Investor | Investment Rationale | Potential Conflict of Interest |
|---|---|---|
| Nvidia | Sell more GPUs for agentic inference loops | Ineffable may develop custom silicon in-house |
| Google | Hedge against LLM plateau; retain Silver relationship | Ineffable directly competes with DeepMind's agent research (e.g., Gemini Robotics) |

Data Takeaway: The dual investment from Nvidia and Google is a rare alignment of hardware and software giants, but it is fragile. If Ineffable succeeds, it will eventually need to build its own inference hardware to escape Nvidia's margins. If it fails, Google has a convenient scapegoat for why it didn't go all-in on agents.

Other notable players in the agentic AI space include Cognition Labs (maker of Devin, the AI software engineer), which raised $175 million at a $2 billion valuation. Devin uses a different approach—fine-tuned LLMs with a sandboxed execution environment—but has struggled with reliability. Ineffable's model-based approach promises greater robustness, but at the cost of much higher compute per decision.

Industry Impact & Market Dynamics

Ineffable's emergence is a direct challenge to the prevailing 'scale is all you need' orthodoxy. The implications are profound.

First, it threatens the business model of every LLM API provider. If agents can learn and improve on their own, the value shifts from inference tokens to 'action tokens'—each decision an agent makes. This is a much higher-margin business, but it also requires a completely different infrastructure stack. Companies like Anthropic and OpenAI have begun adding agentic features (e.g., OpenAI's 'Operator' tool use), but these are bolted onto a fundamentally passive architecture. Ineffable is built from the ground up for agency.

Second, it reshapes the hardware market. The current AI boom is driven by training clusters. Ineffable's approach requires inference clusters that are 10-100x more compute-intensive per task than a simple LLM query, because each action requires running a world model simulation. Nvidia's data center revenue, currently ~$80 billion annually, could see a second growth curve as agentic workloads scale.
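A back-of-envelope calculation shows where the extra compute goes: each real-world action triggers many imagined world-model rollouts. Every number below is an assumed, illustrative figure, not a measurement from any vendor or benchmark:

```python
# Back-of-envelope: compute per decision for an agentic planner vs. one LLM call.
# All figures are illustrative assumptions.

llm_flops_per_query = 1e12                 # assume ~1 TFLOP for one forward pass

world_model_flops_per_rollout_step = 5e9   # assumed cost of one imagined step
rollout_depth = 20                         # imagined steps per MCTS simulation
mcts_simulations = 200                     # simulations per real action

agent_flops_per_action = (
    world_model_flops_per_rollout_step * rollout_depth * mcts_simulations
)

ratio = agent_flops_per_action / llm_flops_per_query
print(f"agentic / LLM compute per decision: {ratio:.0f}x")  # -> 20x under these assumptions
```

Under these assumptions a single agent decision costs 20x one LLM query; pushing the simulation count or rollout depth higher moves the multiplier into the 100x range cited above.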

| Market Segment | Current Value (2025) | Projected Value (2028) | Key Driver |
|---|---|---|---|
| LLM Training Infrastructure | $120B | $180B | Model scaling |
| Agentic Inference Infrastructure | $5B | $80B | Ineffable and competitors |
| AI Agent Software Platforms | $2B | $40B | Autonomous task completion |

Data Takeaway: The agentic inference market is projected to grow 16x in three years, far outpacing training infrastructure. This is a massive opportunity for first movers, but it also means the cost of running an agentic AI could be prohibitively high for many use cases initially.

Risks, Limitations & Open Questions

Ineffable's approach is not without significant risks.

Computational Cost: Model-based RL is notoriously compute-hungry. Each agent decision requires running a world model forward multiple times (via MCTS) to evaluate possible outcomes. For a single robotics task, Ineffable's agents may require 1,000x more compute than an LLM-based agent. The $1.1 billion seed round will likely be consumed by compute costs before a commercial product is ready.

Sample Efficiency: While model-based RL is more sample-efficient than model-free RL, it still requires billions of environment interactions to learn robust policies. Ineffable's early demos are in simulation. Transferring to the real world—where every action has irreversible consequences—remains an open challenge.

Safety and Alignment: An agent that learns from experience is inherently unpredictable. Unlike an LLM that can be fine-tuned to refuse harmful requests, an RL agent might discover that deception or manipulation is an effective strategy. Silver has been vocal about the need for 'constitutional AI for agents,' but no concrete framework has been published. The risk of goal misgeneralization—where an agent pursues a proxy objective in unintended ways—is acute.

Talent Concentration: Ineffable has reportedly hired 40 researchers, mostly from DeepMind and Google Brain. This creates a single point of failure. If Silver leaves or the team fractures, the entire venture could stall.

AINews Verdict & Predictions

David Silver is not merely starting a company; he is attempting to redirect the entire trajectory of AI research. The $1.1 billion seed round is a bet that the 'bitter lesson' of AI—that methods leveraging computation at scale win in the long run—applies not just to search and learning, but to agency itself.

Prediction 1: Ineffable will release a public API within 18 months, but it will be priced at a premium (10-20x the cost of GPT-4o per task). The initial use cases will be high-value, low-tolerance domains: autonomous robotics in manufacturing, automated scientific research (e.g., running and interpreting experiments), and complex software engineering. Consumer applications will follow only after a 5-10x reduction in inference cost.

Prediction 2: Within two years, every major LLM provider will announce a 'world model' layer. OpenAI, Anthropic, and Google will all pivot to hybrid architectures that combine language understanding with model-based planning. The era of pure autoregressive models will end.

Prediction 3: The biggest loser in this shift will be companies that have bet exclusively on scaling LLMs without an agentic strategy. Cohere and Mistral, which lack the resources to build world models, will be forced to partner or be acquired.

Prediction 4: Nvidia will acquire Ineffable within five years, unless Ineffable's market cap exceeds $50 billion first. The strategic value of owning the agentic AI stack is too high for Nvidia to leave to a third party.

The most important thing to watch is not Ineffable's technology, but its learning curve. If the agents show clear, measurable improvement over time on real-world tasks—not just benchmarks—the industry will follow. If they plateau, the $1.1 billion will be remembered as the peak of the AI hype cycle. Either way, Silver has forced a long-overdue conversation: intelligence is not about knowing everything; it is about knowing how to act when you don't.
