The Critical Missing Layer: Why AI Agents Need Security Execution Frameworks to Survive

The AI industry's intense focus on building smarter agents has produced a dangerous oversight: powerful 'minds' operating without physical constraints. A new class of security execution frameworks is emerging to address this fundamental weakness, turning unpredictable code execution into a trustworthy process.

The rapid advancement of AI agent frameworks has exposed a critical architectural gap. While significant resources have been poured into improving planning capabilities, tool orchestration, and multi-agent collaboration, the industry has largely neglected the security and control mechanisms needed for production deployment. Agents can now formulate sophisticated plans but execute them in environments with inadequate safeguards, creating unacceptable risks for enterprise adoption.

This oversight is now being addressed through what's emerging as the 'security execution layer'—a structured environment that sits between an agent's decision-making core and the tools it invokes. Projects like Castor are pioneering this space by implementing sandbox isolation, resource monitoring, granular permission controls, and comprehensive audit trails. Their approach transforms the traditional model where agents directly call APIs or execute code into a mediated architecture where every action is constrained, monitored, and potentially reversible.

The significance extends beyond technical implementation. This layer represents the essential bridge between experimental AI systems and production-ready automation. Without it, agents remain confined to low-stakes demonstrations and controlled research environments. With it, they can safely operate in financial services, healthcare, enterprise IT, and personal computing—domains where errors have real consequences. The companies and open-source projects that successfully establish this execution layer will effectively control the operating system for autonomous AI, creating a strategic position with potentially greater long-term value than the application frameworks built on top of it.

Technical Deep Dive

The security execution layer represents a fundamental shift in AI agent architecture. Traditional frameworks like LangChain, AutoGPT, and CrewAI focus primarily on the cognitive stack: planning, memory, and tool selection. They treat tool execution as a simple function call, delegating safety to the underlying operating system or external API providers. This approach fails at scale because it provides no unified security model, no resource governance, and no transaction-level auditability.

Emerging solutions like Castor implement a three-tier architecture:
1. Policy Engine: A declarative system defining what actions are permitted, under what conditions, and with what resource limits. Policies are expressed in domain-specific languages (DSLs) or extended YAML/JSON schemas, allowing security teams to define constraints separate from agent logic.
2. Runtime Enforcer: This core component intercepts all tool invocation requests. It validates them against the policy engine, applies transformations if needed (like sanitizing inputs), and executes the action within a constrained environment. For code execution, this typically involves container-based sandboxes (Docker, gVisor) or WebAssembly (Wasm) runtimes that provide strong isolation.
3. Observability & Audit Layer: Every action—approved, denied, or modified—is logged with full context: which agent initiated it, what the inputs were, what policy was applied, and what the outcome was. This creates an immutable audit trail crucial for compliance and debugging.
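A minimal sketch of how these three tiers fit together in Python. The `Policy` and `RuntimeEnforcer` names, the glob-based argument matching, and the audit-record fields are illustrative assumptions, not Castor's actual API:

```python
import fnmatch
import time
from dataclasses import dataclass, field

@dataclass
class Policy:
    """Declarative rule from the policy engine: which tool it governs,
    which argument patterns are allowed, and a simple resource limit."""
    tool: str
    allow_args: dict = field(default_factory=dict)  # glob pattern per argument
    max_calls: int = 100

class RuntimeEnforcer:
    """Intercepts every tool invocation, validates it against policies,
    and appends an audit record whether the call is approved or denied."""

    def __init__(self, policies):
        self.policies = {p.tool: p for p in policies}
        self.call_counts = {}
        self.audit_log = []

    def invoke(self, agent_id, tool, func, **kwargs):
        policy = self.policies.get(tool)
        decision, reason = "denied", "no policy for tool"
        if policy is not None:
            if self.call_counts.get(tool, 0) >= policy.max_calls:
                reason = "call quota exceeded"
            elif all(fnmatch.fnmatch(str(kwargs.get(k, "")), pat)
                     for k, pat in policy.allow_args.items()):
                decision, reason = "approved", "matched policy"
            else:
                reason = "argument outside allowed patterns"
        # Every action is logged with full context, approved or not.
        self.audit_log.append({
            "ts": time.time(), "agent": agent_id, "tool": tool,
            "args": kwargs, "decision": decision, "reason": reason,
        })
        if decision == "denied":
            raise PermissionError(f"{tool}: {reason}")
        self.call_counts[tool] = self.call_counts.get(tool, 0) + 1
        return func(**kwargs)
```

In practice the enforcer would dispatch the approved call into a sandbox rather than invoking `func` directly; the point here is only the mediation pattern: nothing runs without a policy match and an audit entry.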

A key innovation is the move from blacklisting dangerous actions to whitelisting permitted ones. Instead of trying to anticipate every harmful API call (an impossible task), the system only allows explicitly approved operations. For example, a file system tool might be granted write access only to `/tmp/agent_workspace/` and read access to `/data/input/`, with quotas on total disk usage.
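A hedged sketch of such an allow-list guard, using the directory grants and quota from the example above. The `FileAccessGuard` class and its `check` method are invented for illustration; real enforcers would also have to handle symlinks and concurrent access:

```python
import os

class FileAccessGuard:
    """Allow-list guard for a hypothetical file-system tool: everything is
    denied by default, and only explicitly granted (mode, directory) pairs
    pass, with a byte quota on total writes."""

    def __init__(self, read_roots, write_roots, write_quota_bytes):
        self.read_roots = [os.path.abspath(r) for r in read_roots]
        self.write_roots = [os.path.abspath(r) for r in write_roots]
        self.write_quota = write_quota_bytes
        self.bytes_written = 0

    def _under(self, path, roots):
        # Normalize the path first so "../" tricks cannot escape a root.
        real = os.path.abspath(path)
        return any(real == r or real.startswith(r + os.sep) for r in roots)

    def check(self, mode, path, size=0):
        if mode == "read":
            return self._under(path, self.read_roots)
        if mode == "write":
            if not self._under(path, self.write_roots):
                return False
            if self.bytes_written + size > self.write_quota:
                return False  # quota on total disk usage
            self.bytes_written += size
            return True
        return False  # unknown operations are denied by default
```

Note the default-deny shape: an unrecognized operation like `delete` falls through to `False` without any rule having to anticipate it, which is exactly the whitelist-over-blacklist argument.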

On the open-source front, several projects are exploring adjacent territory. Hugging Face's `smolagents` library provides a lightweight, security-focused alternative to heavier frameworks, emphasizing deterministic execution and simpler control flows. Microsoft's `AutoGen` has introduced safety patterns through conversational validation, though it lacks the deep runtime enforcement of dedicated security layers. The `LangGraph` project from LangChain enables more controlled state machines for agent workflows, which can be combined with security checkpoints.

Performance overhead is a critical consideration. Early benchmarks from prototype systems show the security layer adds 50-200ms of latency per tool call, depending on the isolation mechanism. The table below compares isolation techniques:

| Isolation Method | Security Level | Startup Latency | Memory Overhead | Best For |
|---|---|---|---|---|
| Process Isolation | Low-Medium | 1-10ms | Low | Trusted environments, speed-critical tasks |
| Docker Container | High | 100-500ms | Medium-High | Full system calls, complex dependencies |
| gVisor | Very High | 50-150ms | Medium | Strong isolation with better performance than Docker |
| WebAssembly (Wasm) | Medium-High | 5-50ms | Very Low | Pure computation, limited system access |
| eBPF-based | Medium | <1ms | Minimal | Network/system call filtering on host |

Data Takeaway: The security-performance trade-off is stark. Docker provides gold-standard isolation but at significant latency cost, making it unsuitable for interactive agents. WebAssembly offers an intriguing middle ground for computational tasks but cannot handle all tool types. Hybrid approaches that dynamically select isolation based on risk level will likely dominate.
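One way such a hybrid selector might look, assuming each tool call carries a simple risk profile. The profile fields (`pure_compute`, `needs_syscalls`, `risk`) are illustrative, not a standard schema:

```python
def select_isolation(tool_profile):
    """Pick an isolation backend for a tool call based on its risk profile,
    mirroring the trade-offs in the comparison table above."""
    risk = tool_profile.get("risk", "high")  # default to the safest path
    if tool_profile.get("pure_compute") and not tool_profile.get("needs_syscalls"):
        return "wasm"      # very low overhead, no system access required
    if risk == "low":
        return "process"   # trusted, latency-critical tasks
    if risk == "medium":
        return "gvisor"    # strong isolation, moderate startup cost
    return "docker"        # full container isolation for high-risk calls
```

The interesting design choice is the default: an unprofiled tool falls through to the slowest, strongest backend, so misconfiguration degrades latency rather than security.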

Key Players & Case Studies

The security execution layer space is nascent but already attracting distinct approaches from different segments of the ecosystem.

Castor has emerged as a pure-play security layer startup. Founded by engineers with backgrounds in cybersecurity and distributed systems, their approach is explicitly infrastructure-first. Castor doesn't build agents but provides the 'rails' on which any agent framework can run safely. Their early design decisions reveal strategic thinking: they support multiple LLM backends (OpenAI, Anthropic, open-source models), integrate with existing identity and access management (IAM) systems, and offer both cloud-hosted and on-premise deployments. This enterprise-friendly approach suggests they're targeting regulated industries first.

Large cloud providers are taking notice. Amazon Web Services has integrated basic agent safety features into Amazon Bedrock's Agents, primarily through pre-execution validation prompts and post-execution verification. Microsoft Azure is extending its Copilot Studio with 'guardrails' that can block or modify agent actions based on content filters and compliance rules. However, these are currently bolt-on features rather than architectural foundations.

Open-source frameworks face the most immediate pressure to adapt. LangChain recently introduced the `RunnableWithMessageHistory` abstraction that can be extended with security checks, but it remains optional and developer-implemented. CrewAI, popular for multi-agent systems, has no built-in security model, leaving teams to implement their own—a dangerous proposition given the complexity of multi-agent interactions.

A revealing case study comes from Klaviyo, the marketing automation platform. Their engineering team attempted to build an internal AI agent for customer data analysis but halted development when they realized the security implications. "We had an agent that could query customer databases, generate segments, and even initiate email campaigns," explained a senior engineer who requested anonymity. "The planning worked beautifully in testing. Then we realized it had the equivalent of admin keys to our production systems with zero audit trail. One hallucination could email thousands of customers incorrectly." They're now evaluating security execution layers as a prerequisite for continuing.

| Solution Type | Representative | Primary Approach | Target Market | Key Limitation |
|---|---|---|---|---|
| Pure-Play Security Layer | Castor | Runtime enforcement, policy-first | Enterprise, regulated sectors | New technology, integration complexity |
| Cloud Platform Integrated | AWS Bedrock Agents, Azure Copilot | Pre/post-execution validation, content filters | Existing cloud customers | Limited to provider ecosystem, shallow isolation |
| Framework Extensions | LangChain (emerging) | Hooks, middleware patterns | Developer community | Optional, inconsistent implementation |
| DIY/In-House | Various enterprises | Custom scripts, manual reviews | Large tech companies | High maintenance, expertise-dependent |

Data Takeaway: The market is fragmenting along architectural philosophies. Pure-play solutions offer depth but face adoption hurdles. Cloud providers leverage their ecosystem but provide weaker security. This creates an opportunity for a solution that combines the depth of Castor with the deployment simplicity of cloud services.

Industry Impact & Market Dynamics

The emergence of security execution layers will reshape the AI agent landscape in profound ways, creating new winners and rendering previous approaches obsolete.

First, it will bifurcate the agent market. On one side will be 'toy' agents—creative writing assistants, brainstorming partners, and simple chatbots that operate in low-risk environments without security layers. On the other will be 'industrial' agents that handle sensitive data, control physical systems, or manage financial transactions, requiring certified security execution environments. The latter market will command premium pricing but face slower adoption due to compliance requirements.

Second, it changes the competitive moat. Previously, agent framework competition centered on which had the most tools, the best planning algorithms, or the slickest developer experience. Now, security and trust become primary differentiators. A framework with mediocre planning but excellent security controls will beat a brilliant but unsafe framework in enterprise procurement processes. This plays to the strengths of established infrastructure companies rather than AI research startups.

Third, it creates a new business model: Agent Security as a Service (ASaaS). Similar to how cloud security evolved, we'll see companies offering continuous monitoring, threat detection, and compliance reporting for AI agent operations. This could become a multi-billion dollar market segment by 2027, as every company using agents in production will need these guarantees.

Market projections support this thesis. While the overall AI agent market is predicted to reach $50 billion by 2030 (according to various analyst reports), the security and governance subset could capture 15-20% of that value—approximately $7.5-$10 billion—due to its essential nature and premium pricing.

| Market Segment | 2024 Est. Size | 2027 Projection | CAGR | Key Drivers |
|---|---|---|---|---|
| AI Agent Development Tools | $1.2B | $4.5B | 55% | Developer productivity, automation demand |
| AI Agent Security & Governance | $0.3B | $2.8B | 110% | Enterprise adoption, regulatory pressure |
| Agent-Based Automation Services | $0.8B | $6.2B | 98% | ROI on task automation |
| Total Addressable Market | $2.3B | $13.5B | 80% | Combined growth across segments |

Data Takeaway: The security segment is projected to grow twice as fast as the overall agent tools market, indicating its strategic importance. By 2027, security could represent over 20% of the total agent ecosystem value, making it the most lucrative niche for infrastructure-focused companies.

Funding patterns already reflect this shift. In the past six months, three startups focusing specifically on AI agent safety have raised seed rounds totaling over $40 million. Castor's recent $15 million Series A was led by cybersecurity-focused venture capital firms rather than traditional AI investors, signaling recognition of this as a security problem first, AI problem second.

Risks, Limitations & Open Questions

Despite its promise, the security execution layer approach faces significant challenges that could limit its effectiveness or create new vulnerabilities.

The Policy Definition Problem remains unsolved. Creating comprehensive, conflict-free security policies for AI agents is extraordinarily difficult. Unlike traditional software where actions are predetermined, agents can combine tools in novel ways, creating emergent behaviors that bypass policy intent. A policy might allow "read database" and "send email" separately but fail to prevent an agent from reading sensitive data and emailing it externally—a classic confused deputy problem. Current policy languages lack the expressiveness to capture such compound risks.
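A toy illustration of why compound risks need dataflow-aware enforcement. This identity-based taint tracker is a deliberately simplified sketch (a real system would have to propagate taint through derived values such as string concatenations), and every name in it is hypothetical:

```python
class TaintTracker:
    """Compound-risk sketch: values returned by sensitive tools are marked
    tainted, and exfiltration-capable tools refuse tainted inputs even
    though each tool is individually permitted by policy."""

    SENSITIVE_SOURCES = {"read_database"}      # taint producers (illustrative)
    EXFIL_SINKS = {"send_email", "http_post"}  # guarded taint consumers

    def __init__(self):
        self.tainted = set()  # object ids of tainted values

    def record_output(self, tool, value):
        # Called after each tool returns, to mark sensitive results.
        if tool in self.SENSITIVE_SOURCES:
            self.tainted.add(id(value))
        return value

    def check_call(self, tool, args):
        # Called before each tool runs; blocks sensitive-data exfiltration.
        if tool in self.EXFIL_SINKS and any(id(a) in self.tainted for a in args):
            raise PermissionError(f"{tool}: tainted input blocked")
```

Under this scheme, "read database" and "send email" each remain allowed in isolation, but the composition that defines the confused deputy problem is rejected at the enforcement layer.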

Performance degradation could make secured agents impractical for real-time applications. Adding 100-200ms per tool call might be acceptable for a background data processing agent but destroys user experience for an interactive assistant. Optimization efforts will constantly battle against security guarantees, potentially leading to corners being cut in production.

Adversarial attacks will evolve to target the security layer itself. Researchers have already demonstrated prompt injection attacks that can bypass pre-execution validation by hiding malicious intent within seemingly benign requests. More sophisticated attacks might attempt to exploit vulnerabilities in the sandbox environment or policy engine. The security layer becomes a high-value attack surface.

Regulatory uncertainty creates compliance risks. Different industries and jurisdictions will develop conflicting requirements for AI agent oversight. A solution compliant with EU's AI Act might not satisfy FDA requirements for healthcare or FINRA rules for financial services. Companies building these layers face the daunting task of creating configurable compliance frameworks adaptable to multiple regimes.

Several open questions will determine the trajectory:
1. Will standards emerge? Currently, each security layer uses proprietary policy formats and APIs, creating vendor lock-in. The industry needs something akin to Kubernetes' Container Runtime Interface (CRI) but for agent security.
2. How much responsibility shifts? If a secured agent causes damage despite policy controls, who is liable—the agent developer, the security layer provider, or the end-user organization? Legal precedents haven't been established.
3. Can open-source compete? Building and maintaining a robust security execution layer requires significant resources. Can open-source projects like `smolagents` keep pace with well-funded commercial offerings, or will this become another infrastructure domain dominated by proprietary solutions?

AINews Verdict & Predictions

The security execution layer represents the most important yet underappreciated development in AI agents today. While less glamorous than new model releases or clever prompting techniques, it addresses the fundamental barrier to widespread adoption: trust. Our analysis leads to several concrete predictions:

Prediction 1: By the end of 2025, no major enterprise will deploy AI agents without a dedicated security execution layer. Regulatory pressure, insurance requirements, and past incidents will make this non-negotiable. Companies attempting to bypass this requirement will face catastrophic failures that accelerate the trend.

Prediction 2: The first acquisition wave will hit in 2024-2025. Major cloud providers (AWS, Google Cloud, Microsoft Azure) will acquire or deeply partner with security layer startups to fill gaps in their agent offerings. Infrastructure companies like Palo Alto Networks or CrowdStrike might also enter through acquisition, viewing agent security as a natural extension of their existing platforms.

Prediction 3: A standards war will emerge by 2026. At least two competing standards for agent security policies will vie for dominance—one led by open-source communities and another by enterprise consortiums. The winner will determine whether agent security remains fragmented or becomes interoperable.

Prediction 4: Security layers will enable entirely new agent categories. Once reliable containment exists, we'll see agents deployed in previously unthinkable domains: autonomous financial trading, real-time medical diagnosis support, critical infrastructure management. The security layer doesn't just protect existing use cases—it unlocks transformative ones.

AINews Editorial Judgment: The companies and projects treating agent security as a first-class architectural concern, not an afterthought, will dominate the next phase of AI automation. Castor and similar initiatives are correctly identifying that the true bottleneck isn't making agents smarter—it's making them safer. Developers should immediately evaluate their agent projects against the security execution gap and begin integrating these layers now, before incidents force reactive measures. The organizations that master this integration will gain 12-18 month advantages over competitors still treating AI agents as experimental toys.

What to watch next: Monitor adoption patterns in financial services and healthcare—the canaries in the coal mine for security requirements. Watch for the first major security incident involving an unsecured AI agent, which will serve as an inflection point for the entire industry. And pay close attention to whether open-source projects can develop credible alternatives to commercial security layers, preventing vendor lock-in in this critical infrastructure domain.

Further Reading

- Nono.sh's kernel-level security model redefines AI agent security for critical infrastructure: The open-source project Nono.sh proposes a fundamental rethink of AI agent security. Rather than relying on fragile application-layer permissions, it implements a kernel-enforced zero-trust runtime model that treats every agent as inherently untrusted, a foundational shift that could unlock deep AI adoption in critical infrastructure.
- Why the single-sandbox security model fails for AI agents, and what comes next: The security model for protecting AI agents is undergoing a fundamental transformation. The industry-standard single-sandbox approach is breaking down under the strain of systems that use tools autonomously, and a new architecture built on fine-grained, tool-level isolation is emerging as a key foundation for safe AI.
- Nomos execution firewall: the missing layer for safely deploying AI agents: AI's rapid evolution from conversational chatbots to autonomous agents that execute complex tasks has exposed a dangerous security gap. The open-source project Nomos is pioneering a solution: an 'execution firewall' that intercepts, analyzes, and authorizes every proposed action before it runs.
- Defender's local prompt injection defense reshapes AI agent security architecture: A new open-source library called Defender is fundamentally changing the AI agent security landscape by providing local, real-time protection against prompt injection attacks. It removes the dependency on external security APIs, creating a portable security boundary that travels with the agent.
