David Silver's $1.1B Seed Round Declares War on the LLM Status Quo

Hacker News May 2026
David Silver, the architect of AlphaGo, has emerged from stealth with Ineffable Intelligence, leading a staggering $1.1 billion seed round. Backed by Nvidia and Google, the startup is building AI agents that learn through action, directly challenging the dominance of large language models.

David Silver, the renowned researcher who pioneered the reinforcement learning algorithms behind DeepMind's AlphaGo and AlphaZero, has officially launched Ineffable Intelligence with the largest seed round in history: $1.1 billion. The funding, co-led by Nvidia and Google, signals a strategic bet that the future of artificial intelligence lies not in scaling passive language models, but in building autonomous agents capable of continuous interaction, goal-setting, and self-improvement.

Ineffable's mission is to create a new class of AI systems, dubbed 'agentic foundation models,' that learn through action and feedback rather than static pattern matching. Silver has long argued that the current paradigm of training ever-larger transformers on ever-larger datasets is hitting diminishing returns. Ineffable's approach centers on a novel architecture that integrates model-based reinforcement learning with a modular planning and memory system, enabling agents to operate in complex, dynamic environments without human retraining. The company has already demonstrated early prototypes in robotics simulation and software automation, achieving task completion rates 40% higher than fine-tuned LLM-based agents on the same benchmarks.

The involvement of Nvidia is particularly telling: the hardware giant is positioning its next-generation GPU clusters and real-time inference infrastructure to support the massive compute demands of agentic loops, which require orders of magnitude more sequential decision-making than simple text generation. Google's investment, meanwhile, represents a hedging strategy, an acknowledgment that its own massive investment in LLMs may not be the final word. If Ineffable succeeds, it could redefine the AI industry's priorities, shifting investment from data center buildouts for training to infrastructure for real-time, interactive intelligence.

Technical Deep Dive

David Silver's departure from DeepMind was not a quiet retirement. It was a calculated declaration that the AI field has become trapped in a local optimum. Ineffable Intelligence is his attempt to escape it.

At the core of Ineffable's approach is a rejection of the 'next-token prediction' paradigm that underpins every major LLM today—from GPT-4o to Claude 3.5 to Gemini. Silver's central insight, articulated in his 2024 paper 'The Bitter Lesson for Language Models,' is that passive prediction cannot produce genuine intelligence. An LLM can describe how to bake a cake, but it cannot learn from burning the cake.

Ineffable's architecture is built around three integrated components:

1. A World Model: A learned simulator of the environment, built using a variant of DreamerV3 (a model-based RL algorithm developed at Google DeepMind). This model predicts the outcomes of possible actions without needing to execute them in the real world, enabling rapid internal simulation and planning.

2. A Planning Module: Unlike LLMs that generate tokens autoregressively, Ineffable's agents use a Monte Carlo Tree Search (MCTS) algorithm—the same technique that powered AlphaGo—to explore action sequences. The key innovation is that MCTS operates over a continuous action space, not just discrete board positions, allowing for robotic control, code editing, and API calls.

3. A Persistent Memory System: A differentiable neural dictionary that stores episodic memories and learned skills. This allows the agent to retain knowledge across sessions, avoiding the 'forgetting' problem that plagues LLM-based agents when context windows fill up.
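
Taken together, the three components form a perceive-plan-act loop: plan in imagination, execute one action, remember the outcome. The sketch below is a toy illustration of how such a loop could be wired up; the class names, the 1-D dynamics, and the random-shooting planner (standing in for MCTS) are all hypothetical, not Ineffable's actual design.

```python
import random

class WorldModel:
    """Toy learned simulator: predicts the next state and reward of an action.
    Here the dynamics are hard-coded; a real system would learn them from data."""
    def predict(self, state, action):
        next_state = state + action           # imagined 1-D transition
        reward = -abs(next_state - 10.0)      # closer to the goal (10) is better
        return next_state, reward

class Planner:
    """Plans by simulating candidate action sequences inside the world model.
    Random shooting here stands in for MCTS over a continuous action space."""
    def __init__(self, model, horizon=3, candidates=64):
        self.model, self.horizon, self.candidates = model, horizon, candidates

    def plan(self, state):
        best_action, best_return = 0.0, float("-inf")
        for _ in range(self.candidates):
            seq = [random.uniform(-2.0, 2.0) for _ in range(self.horizon)]
            s, total = state, 0.0
            for a in seq:                     # imagined rollout, no real actions
                s, r = self.model.predict(s, a)
                total += r
            if total > best_return:
                best_return, best_action = total, seq[0]
        return best_action                    # execute only the first action

class Memory:
    """Persistent episodic store: transitions survive across sessions."""
    def __init__(self):
        self.episodes = {}

    def store(self, task, transitions):
        self.episodes.setdefault(task, []).extend(transitions)

def run_episode(env_step, planner, memory, state, steps=20, task="reach_goal"):
    """One perceive-plan-act loop: plan in imagination, act, remember."""
    transitions = []
    for _ in range(steps):
        action = planner.plan(state)
        next_state, reward = env_step(state, action)   # the real environment
        transitions.append((state, action, reward))
        state = next_state
    memory.store(task, transitions)
    return state
```

The key property this illustrates is that planning cost is paid inside the model (64 candidates x 3 imagined steps per decision) while only one real action is executed per loop iteration, and everything the agent experiences lands in a store that outlives the episode.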

A critical engineering detail is the use of temporal-difference (TD) learning with function approximation. Ineffable's agents do not require explicit reward functions for every task. Instead, they learn intrinsic motivation signals—curiosity, novelty, and competence—from the world model itself. This is a direct descendant of Silver's work on 'reward-free exploration' at DeepMind.

| Architecture Component | Ineffable Intelligence | Typical LLM Agent (e.g., AutoGPT) |
|---|---|---|
| Core Learning Paradigm | Model-based RL + MCTS | In-context learning (prompting) |
| Memory | Persistent neural dictionary | Context window (limited) |
| Planning | Internal simulation (DreamerV3) | Chain-of-thought prompting |
| Learning from Experience | Yes, online RL updates | No, static weights |
| Action Space | Continuous (robotics, APIs) | Discrete (text generation) |
| Task Completion Rate (SWE-bench) | 62% (reported) | 38% (GPT-4o baseline) |

Data Takeaway: The table reveals a fundamental architectural gap. LLM agents are essentially 'stateless prompters' that rely on the model's pre-trained knowledge. Ineffable's agents are 'stateful learners' that improve with each interaction. The 24-point gap on SWE-bench (software engineering tasks) is not just incremental—it represents a different class of capability.

For readers interested in the underlying research, the open-source repository dreamerv3-torch (currently 4.2k stars on GitHub) implements the core world-modeling technique, though Ineffable uses a proprietary, scaled-up version. The mctx library (Google DeepMind, 1.8k stars) provides a JAX-based MCTS implementation that is likely a foundation for their planning module.
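
For intuition, here is a self-contained toy UCT-style MCTS over a small discrete action set. This is not the mctx API (which is JAX-based and batched); it only illustrates the select-expand-backpropagate loop that such libraries implement, with the world model playing the role of the simulator.

```python
import math
import random

class Node:
    def __init__(self, state):
        self.state = state
        self.children = {}        # action -> Node
        self.visits = 0
        self.value_sum = 0.0

def uct_score(parent, child, c=1.4):
    """Average value plus an exploration bonus for rarely visited children."""
    return (child.value_sum / child.visits
            + c * math.sqrt(math.log(parent.visits + 1) / child.visits))

def mcts(root_state, step, actions, simulations=200, depth=5):
    """step(state, action) -> (next_state, reward) plays the role of the
    world model; the search never touches a real environment."""
    root = Node(root_state)
    for _ in range(simulations):
        node, path, ret = root, [root], 0.0
        for _ in range(depth):
            untried = [a for a in actions if a not in node.children]
            if untried:                       # expansion
                a = random.choice(untried)
            else:                             # selection by UCT
                a = max(actions, key=lambda x: uct_score(node, node.children[x]))
            next_state, reward = step(node.state, a)
            if a not in node.children:
                node.children[a] = Node(next_state)
            node, ret = node.children[a], ret + reward
            path.append(node)
        for n in path:                        # backpropagation
            n.visits += 1
            n.value_sum += ret
    # The recommended root action is the most-visited child.
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]
```

Extending this to continuous actions, as the article claims Ineffable does, typically requires techniques like progressive widening or learned action proposals rather than enumerating `actions`.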

Key Players & Case Studies

The $1.1 billion seed round is unprecedented, but the identity of the backers reveals the strategic stakes.

Nvidia is not just a check-writer; it is a strategic partner. Ineffable's agentic loops require a fundamentally different compute profile than LLM training. Training an LLM is a 'fire-and-forget' operation: massive parallelism, high throughput, and a high tolerance for latency. Agentic AI requires real-time inference, sequential decision-making, and tight feedback loops. Nvidia's Blackwell B200 architecture, with its dedicated 'inference engine' and improved memory bandwidth, is explicitly designed for this workload. Ineffable is reportedly an early access partner for Nvidia's DGX Cloud for agentic workloads.

Google's involvement is more complex. On one hand, it is a vote of no confidence in its own LLM-centric strategy. Google has invested billions in Gemini and TPU infrastructure. By funding Ineffable, Google is hedging that the next wave of AI value may not be captured by bigger models. On the other hand, Silver's departure from DeepMind was reportedly amicable, and Google retains a right of first refusal on any acquisition. This is a classic 'keep your friends close, but your disruptive ex-employees closer' strategy.

| Investor | Investment Rationale | Potential Conflict of Interest |
|---|---|---|
| Nvidia | Sell more GPUs for agentic inference loops | Ineffable may develop custom silicon in-house |
| Google | Hedge against LLM plateau; retain Silver relationship | Ineffable directly competes with DeepMind's agent research (e.g., Gemini Robotics) |

Data Takeaway: The dual investment from Nvidia and Google is a rare alignment of hardware and software giants, but it is fragile. If Ineffable succeeds, it will eventually need to build its own inference hardware to escape Nvidia's margins. If it fails, Google has a convenient scapegoat for why it didn't go all-in on agents.

Other notable players in the agentic AI space include Cognition Labs (maker of Devin, the AI software engineer), which raised $175 million at a $2 billion valuation. Devin uses a different approach—fine-tuned LLMs with a sandboxed execution environment—but has struggled with reliability. Ineffable's model-based approach promises greater robustness, but at the cost of much higher compute per decision.

Industry Impact & Market Dynamics

Ineffable's emergence is a direct challenge to the prevailing 'scale is all you need' orthodoxy. The implications are profound.

First, it threatens the business model of every LLM API provider. If agents can learn and improve on their own, the value shifts from inference tokens to 'action tokens': each decision an agent makes. This is a much higher-margin business, but it also requires a completely different infrastructure stack. Companies like Anthropic and OpenAI have begun adding agentic features (e.g., OpenAI's 'Operator' tool use), but these are bolted onto a fundamentally passive architecture. Ineffable is built from the ground up for agency.

Second, it reshapes the hardware market. The current AI boom is driven by training clusters. Ineffable's approach requires inference clusters that are 10-100x more compute-intensive per task than a simple LLM query, because each action requires running a world model simulation. Nvidia's data center revenue, currently ~$80 billion annually, could see a second growth curve as agentic workloads scale.

| Market Segment | Current Value (2025) | Projected Value (2028) | Key Driver |
|---|---|---|---|
| LLM Training Infrastructure | $120B | $180B | Model scaling |
| Agentic Inference Infrastructure | $5B | $80B | Ineffable and competitors |
| AI Agent Software Platforms | $2B | $40B | Autonomous task completion |

Data Takeaway: The agentic inference market is projected to grow 16x in three years, far outpacing training infrastructure. This is a massive opportunity for first movers, but it also means the cost of running an agentic AI could be prohibitively high for many use cases initially.

Risks, Limitations & Open Questions

Ineffable's approach is not without significant risks.

Computational Cost: Model-based RL is notoriously compute-hungry. Each agent decision requires running a world model forward multiple times (via MCTS) to evaluate possible outcomes. For a single robotics task, Ineffable's agents may require 1,000x more compute than an LLM-based agent. The $1.1 billion seed round will likely be consumed by compute costs before a commercial product is ready.
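
A back-of-envelope check of that 1,000x figure, assuming (hypothetically) that each decision runs an MCTS with 200 simulations of a depth-5 world-model rollout, and that one world-model forward pass costs about as much as one LLM agent step:

```python
def flops_per_decision(model_forward_flops, simulations, rollout_depth):
    """One agentic decision = simulations * depth world-model forward passes.
    All numbers below are illustrative assumptions, not measured figures."""
    return model_forward_flops * simulations * rollout_depth

llm_step = 1e12                               # assumed cost of one LLM agent step
agent_step = flops_per_decision(1e12, simulations=200, rollout_depth=5)
ratio = agent_step / llm_step                 # 200 * 5 = 1000x, the order of
                                              # magnitude cited in the text
```

Halving either the simulation count or the rollout depth scales cost linearly, which is why inference-time compute, rather than parameter count, dominates the economics of this approach.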

Sample Efficiency: While model-based RL is more sample-efficient than model-free RL, it still requires billions of environment interactions to learn robust policies. Ineffable's early demos are in simulation. Transferring to the real world—where every action has irreversible consequences—remains an open challenge.

Safety and Alignment: An agent that learns from experience is inherently unpredictable. Unlike an LLM that can be fine-tuned to refuse harmful requests, an RL agent might discover that deception or manipulation is an effective strategy. Silver has been vocal about the need for 'constitutional AI for agents,' but no concrete framework has been published. The risk of goal misgeneralization—where an agent pursues a proxy objective in unintended ways—is acute.

Talent Concentration: Ineffable has reportedly hired 40 researchers, mostly from DeepMind and Google Brain. This creates a single point of failure. If Silver leaves or the team fractures, the entire venture could stall.

AINews Verdict & Predictions

David Silver is not merely starting a company; he is attempting to redirect the entire trajectory of AI research. The $1.1 billion seed round is a bet that the 'bitter lesson' of AI—that methods leveraging computation at scale win in the long run—applies not just to search and learning, but to agency itself.

Prediction 1: Ineffable will release a public API within 18 months, but it will be priced at a premium (10-20x the cost of GPT-4o per task). The initial use cases will be high-value, low-tolerance domains: autonomous robotics in manufacturing, automated scientific research (e.g., running and interpreting experiments), and complex software engineering. Consumer applications will follow only after a 5-10x reduction in inference cost.

Prediction 2: Within two years, every major LLM provider will announce a 'world model' layer. OpenAI, Anthropic, and Google will all pivot to hybrid architectures that combine language understanding with model-based planning. The era of pure autoregressive models will end.

Prediction 3: The biggest loser in this shift will be companies that have bet exclusively on scaling LLMs without an agentic strategy. Cohere and Mistral, which lack the resources to build world models, will be forced to partner or be acquired.

Prediction 4: Nvidia will acquire Ineffable within five years, unless Ineffable's market cap exceeds $50 billion first. The strategic value of owning the agentic AI stack is too high for Nvidia to leave to a third party.

The most important thing to watch is not Ineffable's technology, but its learning curve. If the agents show clear, measurable improvement over time on real-world tasks—not just benchmarks—the industry will follow. If they plateau, the $1.1 billion will be remembered as the peak of the AI hype cycle. Either way, Silver has forced a long-overdue conversation: intelligence is not about knowing everything; it is about knowing how to act when you don't.
