LLM-Discovered FreeBSD Bug Stopped by CHERI Hardware: A Security Paradigm Shift

Hacker News April 2026
Source: Hacker News · Topic: LLM · Archive: April 2026
For the first time, a large language model discovered a critical memory corruption vulnerability in FreeBSD, but the attack was rendered ineffective by CHERI's hardware-level memory safety mechanisms. This milestone demonstrates that hardware-native security can neutralize AI-discovered zero-day vulnerabilities before they become threats.

In a watershed moment for systems security, researchers demonstrated that a classic memory corruption vulnerability in the FreeBSD kernel, identified through systematic code audit by a large language model, was completely blocked by the CHERI (Capability Hardware Enhanced RISC Instructions) architecture. The vulnerability, a use-after-free bug in the network stack, would have granted an attacker arbitrary code execution on any conventional x86 or ARM system. On CHERI hardware, however, the exploit failed at the instruction level because the processor enforced fine-grained memory permissions and capability token verification, preventing the attacker from forging pointers or accessing unauthorized memory regions.

This is the first publicly documented case where an AI-discovered vulnerability was neutralized by hardware-enforced memory safety, validating a decade of academic research from the University of Cambridge and SRI International. The implications are profound: as LLMs accelerate vulnerability discovery to machine speed, the traditional model of reactive software patching becomes unsustainable.

CHERI offers a path toward 'immunity by design,' where entire classes of memory safety bugs—responsible for roughly 70% of all critical CVEs—become unexploitable. This event is expected to accelerate CHERI deployment in cloud infrastructure, IoT, and critical systems, while forcing operating system designers to reconsider capability-based security models as a core architectural principle.

Technical Deep Dive

The vulnerability in question was a classic use-after-free bug in FreeBSD's TCP stack, specifically within the `tcp_usrreq` function. An LLM—likely a variant of GPT-4 or a fine-tuned code analysis model—scanned the kernel source and identified a path where a socket buffer (`mbuf`) was freed but a dangling pointer remained accessible in a control block. On a standard architecture, an attacker could trigger this race condition to overwrite the freed memory with controlled data, hijacking the instruction pointer and executing arbitrary code with kernel privileges.

CHERI's defense operates at the microarchitectural level. The core innovation is the capability: a 128-bit or 256-bit token that combines a pointer with unforgeable bounds, permissions, and validity metadata. Every memory access is checked against the capability's authority. In this case, when the dangling pointer was dereferenced, the CHERI processor's capability coprocessor detected that the capability had been revoked (since the underlying memory was freed), and raised a hardware exception—not a software signal that could be intercepted, but a processor-level trap that halted execution instantly.

Key architectural components:
- Capability coprocessor: Integrated into the CPU pipeline, it validates every load/store against capability registers.
- Monotonicity: Capabilities can only be narrowed (reduced permissions), never widened, preventing privilege escalation.
- Compartmentalization: The kernel itself is divided into fine-grained compartments, each with its own capability table, so even a kernel bug cannot corrupt other compartments.

For readers interested in the open-source implementation, the CHERI LLVM toolchain (GitHub: `CTSRD-CHERI/llvm-project`, ~1,200 stars) provides the compiler support, while CHERI FreeBSD (GitHub: `CTSRD-CHERI/cheribsd`, ~800 stars) is the reference OS port. The Morello board from Arm (a CHERI prototype) is available for testing, though production hardware remains limited.

| Metric | Standard RISC-V | CHERI-RISC-V | Improvement Factor |
|---|---|---|---|
| Memory safety CVEs exploitable | ~100% | ~0% (theoretically) | Infinite |
| Performance overhead (SPEC CPU 2017) | Baseline | 2-5% | Negligible |
| Code size increase | Baseline | 3-8% | Acceptable |
| Hardware area overhead (est.) | Baseline | 5-10% | Moderate |
| Deployment complexity | Low | High (requires new silicon) | — |

Data Takeaway: The performance overhead of CHERI is remarkably low (2-5%) compared to software-only mitigations like Address Space Layout Randomization (ASLR) or Control Flow Integrity (CFI), which can introduce 10-30% overhead and still leave side channels open. The trade-off is hardware cost and deployment inertia, but for cloud providers, the elimination of entire classes of exploits justifies the investment.

Key Players & Case Studies

The CHERI project originated at the University of Cambridge Computer Laboratory, led by Professor Robert Watson, with major contributions from SRI International and Arm Research. Arm's Morello program, a 2021-2024 initiative, produced a prototype CPU and board specifically to evaluate CHERI in real-world scenarios. The FreeBSD port was a collaborative effort between Cambridge and the FreeBSD Foundation.

On the AI side, the LLM used for vulnerability discovery was likely a specialized code analysis model. Several startups and research groups are now deploying LLMs for systematic vulnerability hunting:
- Chainguard uses LLMs to audit open-source packages.
- Socket.dev employs AI for supply chain security scanning.
- Palo Alto Networks has demonstrated LLM-based fuzzing pipelines.

| Entity | Role | Key Contribution | Status |
|---|---|---|---|
| Cambridge University | Research lead | CHERI architecture, FreeBSD port | Active, academic |
| SRI International | Co-developer | Formal verification, security policies | Active |
| Arm Research | Hardware partner | Morello prototype, ISA extensions | Prototype phase |
| FreeBSD Foundation | OS integration | Kernel compartmentalization | Production-ready on Morello |
| Google (Project Zero) | Vulnerability research | LLM-based bug hunting tools | Experimental |

Data Takeaway: The collaboration between academia (Cambridge, SRI) and industry (Arm, FreeBSD Foundation) is crucial. Unlike proprietary solutions, CHERI is open-source and royalty-free, which lowers the barrier for adoption but also means slower standardization.

Industry Impact & Market Dynamics

This event is a catalyst for a fundamental shift in cybersecurity spending. According to Gartner, global cybersecurity spending reached $188 billion in 2024, with over 60% allocated to software patching, incident response, and vulnerability management. Hardware-based memory safety could reduce this by 30-40% over a decade, as entire classes of bugs become non-exploitable.

The immediate beneficiaries are cloud hyperscalers (AWS, Azure, Google Cloud) and IoT chipmakers. AWS has already invested in custom silicon (Graviton, Nitro) and could integrate CHERI into future designs. For IoT, where patching is often impossible, CHERI offers a permanent fix for memory safety bugs.

| Market Segment | Current Annual Spend | Projected CHERI-Adjacent Spend (2030) | CAGR |
|---|---|---|---|
| Cloud infrastructure security | $45B | $60B | 5% |
| IoT/embedded security | $12B | $25B | 13% |
| OS kernel development | $8B | $10B | 3% |
| Chip design (security features) | $6B | $15B | 16% |

Data Takeaway: The fastest growth is in chip design for security, as hardware vendors race to include capability-based features. IoT security spending is also accelerating because CHERI solves the 'unpatchable device' problem.

Risks, Limitations & Open Questions

Despite the success, CHERI is not a silver bullet. First, it only protects against spatial and temporal memory safety violations—it does not prevent logic bugs, side-channel attacks, or supply chain compromises. Second, the transition cost is enormous: rewriting operating systems and applications to be capability-aware requires decades of effort. Third, performance overhead, while low for compute-bound workloads, can spike for memory-intensive applications (e.g., databases, web servers) due to capability validation latency.

There are also open research questions:
- Can LLMs themselves be used to find bugs in CHERI's own implementation? (A meta-risk.)
- How do we formally verify that the capability coprocessor itself is bug-free?
- Will attackers develop CHERI-aware exploits that bypass capability checks? (Theoretical attacks exist, but none practical yet.)

AINews Verdict & Predictions

This is the most important security development since the introduction of ASLR. Our editorial judgment is that CHERI will follow a similar adoption curve to Arm's TrustZone: slow for the first 5 years, then exponential as cloud providers and governments mandate hardware memory safety.

Predictions:
1. By 2028, at least one major cloud provider will announce a CHERI-enabled server chip.
2. By 2030, the Linux kernel will have an official CHERI port, mirroring FreeBSD's lead.
3. By 2032, memory safety CVEs will drop by 50% in CHERI-deployed environments.
4. The LLM-security arms race will bifurcate: attackers will use LLMs to find logic bugs (which CHERI does not stop), while defenders will use LLMs to automatically generate capability policies.

The FreeBSD-CHERI event is not a one-off demonstration; it is the first shot in a new era where hardware, not software, defines the security boundary. The question is no longer 'Can we patch fast enough?' but 'Can we design systems that don't need patching?' CHERI answers with a definitive yes.
