Mirage Virtual File System Lets AI Agents Truly Manipulate Data

Hacker News May 2026
Strukto's Mirage provides a unified virtual file system that lets AI agents read and write across cloud storage, databases, and local files as if they were working with a local disk, eliminating scattered API calls. This infrastructure layer could turn agents from conversational tools into autonomous task executors.

AI agents have long suffered from a hidden bottleneck: while large language models rapidly improve in reasoning and planning, their ability to actually manipulate the digital world remains primitive. Most agents rely on brittle API chains or hard-coded paths, limiting their autonomy to sandboxed environments. Strukto's Mirage directly addresses this by constructing a unified virtual file system abstraction — a 'virtual drive' that transparently maps to S3 buckets, local SSDs, and even Notion databases — allowing agents to natively read and write data without rewriting logic for each storage service.

The timing of this product innovation is critical. As agents evolve from conversational assistants into persistent, task-executing entities, they require persistent memory, intermediate result writes, and cross-session state sharing. Mirage's POSIX-like interface meets these needs precisely. From a business perspective, Strukto positions Mirage as infrastructure rather than a consumer product — a wise move. If agents become the next operating system paradigm, the underlying file system becomes indispensable middleware. The frontier of competition has shifted from model capability to systems engineering: building a reliable, low-latency virtual layer that doesn't crash under high-concurrency agent operations. If successful, Mirage could become the invisible skeleton that lets agents truly own data, not just pass it through.

Technical Deep Dive

Mirage is more than another storage abstraction: it is a purpose-built virtual file system (VFS) designed for the I/O patterns of AI agents. Unlike general-purpose file system interfaces (e.g., FUSE, Plan 9's 9P protocol), Mirage must handle the high-frequency, small random reads and writes typical of agentic workflows. Think of an agent reading a knowledge base chunk, writing a partial result to a temporary buffer, then appending to a log file, all within a single turn.

Architecture and Design

At its core, Mirage implements a unified namespace that maps multiple backends (S3, GCS, local filesystems, SQL databases, key-value stores like Redis, and even SaaS APIs like Notion or Airtable) into a single hierarchical directory tree. Each backend is mounted as a subdirectory under a root, e.g., `/mnt/s3/`, `/mnt/notion/`, `/mnt/local/`. The mapping is transparent: agents use standard `open()`, `read()`, `write()`, `seek()`, and `close()` calls, and Mirage translates these into the appropriate API calls or SQL queries.
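The mount-point mapping described above can be sketched as a prefix lookup. This is a minimal illustration, not Strukto's implementation: `MountTable`, `InMemoryBackend`, and their methods are hypothetical names, and the in-memory backend merely stands in for S3, Notion, or a local disk.

```python
from typing import Dict, Tuple


class InMemoryBackend:
    """Toy backend standing in for S3, Notion, or a local disk (hypothetical)."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def read(self, key: str) -> bytes:
        return self._blobs[key]

    def write(self, key: str, data: bytes) -> None:
        self._blobs[key] = data


class MountTable:
    """Maps path prefixes such as /mnt/s3/ to backend objects."""

    def __init__(self) -> None:
        self._mounts: Dict[str, InMemoryBackend] = {}

    def mount(self, prefix: str, backend: InMemoryBackend) -> None:
        # Normalise so every stored prefix ends with exactly one slash.
        self._mounts[prefix.rstrip("/") + "/"] = backend

    def resolve(self, path: str) -> Tuple[InMemoryBackend, str]:
        # Longest matching prefix wins, so nested mounts behave predictably.
        for prefix in sorted(self._mounts, key=len, reverse=True):
            if path.startswith(prefix):
                return self._mounts[prefix], path[len(prefix):]
        raise FileNotFoundError(f"no backend mounted for {path}")

    def write(self, path: str, data: bytes) -> None:
        backend, key = self.resolve(path)
        backend.write(key, data)

    def read(self, path: str) -> bytes:
        backend, key = self.resolve(path)
        return backend.read(key)
```

The agent only ever sees paths; which backend serves a path is decided entirely by the mount table, which is what makes the mapping transparent to agent code.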

| Backend Type | Mount Point | Latency (p50) | Latency (p99) | Throughput (ops/sec) |
|---|---|---|---|---|
| Local SSD | `/mnt/local/` | 0.1 ms | 0.5 ms | 100,000 |
| S3 (same region) | `/mnt/s3/` | 5 ms | 20 ms | 5,000 |
| Notion API | `/mnt/notion/` | 150 ms | 500 ms | 200 |
| PostgreSQL | `/mnt/db/` | 2 ms | 10 ms | 10,000 |

Data Takeaway: The latency disparity across backends is stark. At p50, local SSD (0.1 ms) is 1,500x faster than the Notion API (150 ms), and the throughput gap is 500x (100,000 versus 200 ops/sec). Mirage must implement intelligent caching and prefetching to avoid bottlenecking agent reasoning loops on slow backends.
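One way to keep slow backends off the critical path is a read-through cache. The sketch below assumes least-recently-used eviction plus a time-to-live per entry; `ReadThroughCache`, its parameters, and the `fetch` callback are illustrative names rather than Mirage's API.

```python
import time
from collections import OrderedDict
from typing import Callable, Tuple


class ReadThroughCache:
    """LRU cache with a TTL in front of a slow fetch function (hypothetical)."""

    def __init__(
        self,
        fetch: Callable[[str], bytes],
        max_entries: int = 128,
        ttl_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._max = max_entries
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: "OrderedDict[str, Tuple[float, bytes]]" = OrderedDict()
        self.backend_calls = 0  # counts how often the slow path was taken

    def read(self, path: str) -> bytes:
        now = self._clock()
        hit = self._entries.get(path)
        if hit is not None and now - hit[0] < self._ttl:
            self._entries.move_to_end(path)  # mark as most recently used
            return hit[1]
        # Miss or expired entry: go to the backend and refresh the cache.
        self.backend_calls += 1
        data = self._fetch(path)
        self._entries[path] = (now, data)
        self._entries.move_to_end(path)
        while len(self._entries) > self._max:
            self._entries.popitem(last=False)  # evict least recently used
        return data
```

Passing the clock in as a parameter is a testing convenience: expiry can be exercised without sleeping.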

Key Engineering Challenges

1. Consistency Model: Mirage employs a weak consistency model by default, with optional strong consistency for specific paths. This is a deliberate trade-off: agents often tolerate eventual consistency for speed, but critical operations (e.g., writing a checkpoint) require immediate visibility. Mirage exposes a `sync()` call that flushes all pending writes to the backend.

2. Caching Layer: A multi-tier cache sits between the agent and the backends. Hot data is kept in an in-memory LRU cache (configurable size, default 1 GB), warm data on local SSD, and cold data fetched on demand. Cache invalidation is handled via TTLs and write-through for consistency-sensitive paths.

3. Concurrency Control: Agents may spawn multiple sub-agents or tools that concurrently access the same files. Mirage implements optimistic locking with version numbers. If a write conflict is detected, the later write is rejected and the agent must retry. This is simpler than distributed locks and aligns with agentic retry patterns.

4. POSIX Subset: Mirage does not implement the full POSIX spec. It omits hard links, symbolic links (except for internal use), and `chmod`/`chown`. The focus is on `open`, `read`, `write`, `seek`, `close`, `mkdir`, `rmdir`, `unlink`, and `rename`. This subset covers 95% of agent use cases while keeping the implementation lean.
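The optimistic locking scheme in item 3 can be sketched as a compare-and-set on a per-file version number. This is an illustration under stated assumptions: `VersionedStore`, `VersionConflict`, and `append_with_retry` are hypothetical names, and the store is a single-process dictionary rather than a real backend.

```python
import threading
from typing import Dict, Tuple


class VersionConflict(Exception):
    """Raised when a write is based on a stale version."""


class VersionedStore:
    """In-memory files with a version counter per path (hypothetical)."""

    def __init__(self) -> None:
        self._files: Dict[str, Tuple[int, bytes]] = {}
        self._lock = threading.Lock()  # makes check-and-set atomic in-process

    def read(self, path: str) -> Tuple[int, bytes]:
        # A missing file reads as version 0 with empty content.
        return self._files.get(path, (0, b""))

    def write(self, path: str, data: bytes, expected_version: int) -> int:
        with self._lock:
            current, _ = self._files.get(path, (0, b""))
            if current != expected_version:
                raise VersionConflict(
                    f"{path}: expected v{expected_version}, found v{current}"
                )
            self._files[path] = (current + 1, data)
            return current + 1


def append_with_retry(
    store: VersionedStore, path: str, line: bytes, max_attempts: int = 3
) -> int:
    """Re-read and retry when another writer got there first."""
    for _ in range(max_attempts):
        version, data = store.read(path)
        try:
            return store.write(path, data + line, expected_version=version)
        except VersionConflict:
            continue
    raise VersionConflict(f"{path}: gave up after {max_attempts} attempts")
```

No lock is held between the read and the write, which is the point of the optimistic approach: the losing writer pays with a retry instead of every writer paying for a distributed lock.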

Open-Source Reference

While Strukto has not open-sourced Mirage, the closest analogue is the [agentfs](https://github.com/agentfs/agentfs) project (1.2k stars), a proof-of-concept VFS for LLM agents that maps files to function calls. agentfs is simpler (it uses a JSON-based virtual directory) but lacks the backend diversity and performance optimizations of Mirage. Another relevant project is [fsspec](https://github.com/fsspec/filesystem_spec) (3.5k stars), a Python library for abstracting filesystems. fsspec does ship local caching layers, but it is a general-purpose library that is not designed for agentic workloads and has no built-in concurrency control.

Takeaway: Mirage's technical differentiation lies in its agent-specific design: weak consistency, optimistic locking, and a POSIX subset optimized for high-frequency small I/O. This is not a general-purpose VFS; it is a specialized layer for the agent runtime.
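The weak-consistency default with an explicit `sync()` could look roughly like a write-back buffer. In this sketch `WriteBackBuffer` is a hypothetical name and a plain dictionary stands in for the remote backend; only the `sync()` name comes from the description above.

```python
from typing import Dict


class WriteBackBuffer:
    """Buffers writes locally and pushes them to the backend on sync()."""

    def __init__(self, backend: Dict[str, bytes]) -> None:
        self._backend = backend  # stands in for a remote store
        self._pending: Dict[str, bytes] = {}

    def write(self, path: str, data: bytes) -> None:
        # Fast path: no backend round trip until sync() is called.
        self._pending[path] = data

    def read(self, path: str) -> bytes:
        # The writer sees its own pending data; readers of the backend do not.
        if path in self._pending:
            return self._pending[path]
        return self._backend[path]

    def sync(self) -> int:
        """Flush all pending writes and return how many were flushed."""
        flushed = len(self._pending)
        self._backend.update(self._pending)
        self._pending.clear()
        return flushed
```

An agent would call `sync()` right after writing a checkpoint, and skip it for scratch files where delayed visibility is acceptable.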

Key Players & Case Studies

Strukto: The Infrastructure Play

Strukto is a relatively new startup (founded 2024, raised $8M seed from unnamed investors) that previously focused on agent orchestration frameworks. Mirage is their pivot to infrastructure after observing that their customers spent 40% of engineering time on storage integration. The team includes former engineers from Google's FUSE team and AWS's S3 team.

Competing Solutions

| Solution | Type | Backend Support | Latency | Concurrency | Open Source |
|---|---|---|---|---|---|
| Mirage | Virtual FS | S3, GCS, local, DB, Notion, Airtable | Low (cached) | Optimistic locking | No |
| LangChain's BaseStore | Abstraction | S3, local, MongoDB, Redis | Medium | None (single-threaded) | Yes |
| AutoGPT's FileManager | Tool wrapper | Local only | Low | None | Yes |
| CrewAI's Storage | Tool wrapper | S3, local | Medium | None | Yes |

Data Takeaway: The existing solutions compared here are either narrow in backend support (AutoGPT, CrewAI) or lack concurrency control (LangChain). Of the four, Mirage is the only one that treats storage as a first-class infrastructure concern with locking and caching.

Case Study: Persistent Memory for Agents

Consider an agent that manages a user's email inbox. Without a unified file system, the agent must:
- Call Gmail API to fetch emails
- Store results in a local JSON file
- Call Notion API to update a task list
- Call S3 to save attachments

Each integration requires custom code, error handling, and rate limiting. With Mirage, the agent simply reads from `/mnt/gmail/inbox/`, writes to `/mnt/local/state.json`, appends to `/mnt/notion/tasks.md`, and copies files to `/mnt/s3/attachments/`. The agent's code becomes storage-agnostic.
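To make "storage-agnostic" concrete, here is a sketch of the inbox workflow written purely against file paths. It assumes the mounts are exposed as ordinary directories under some root, so a temporary directory can stand in for `/mnt/` during testing. The one-JSON-file-per-message layout with `subject` and `attachments` fields is invented for illustration, as is the `triage_inbox` function.

```python
import json
import shutil
from pathlib import Path


def triage_inbox(root: Path) -> int:
    """Process every message under <root>/gmail/inbox/ with file operations only."""
    inbox = root / "gmail" / "inbox"
    state_file = root / "local" / "state.json"
    tasks_file = root / "notion" / "tasks.md"
    attachments = root / "s3" / "attachments"

    # Make sure the destination directories exist on every mount.
    for directory in (state_file.parent, tasks_file.parent, attachments):
        directory.mkdir(parents=True, exist_ok=True)

    processed = 0
    for message in sorted(inbox.glob("*.json")):
        email = json.loads(message.read_text(encoding="utf-8"))
        # Appending a line to tasks.md replaces a dedicated task-list API call.
        with tasks_file.open("a", encoding="utf-8") as tasks:
            tasks.write(f"- [ ] Reply to: {email['subject']}\n")
        # Copying a file replaces a dedicated object-store upload.
        for name in email.get("attachments", []):
            shutil.copy(inbox / name, attachments / name)
        processed += 1

    state_file.write_text(json.dumps({"processed": processed}), encoding="utf-8")
    return processed
```

Nothing in the function names a storage service, so pointing `root` at a different set of mounts changes where the data lives without touching the agent logic.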

Takeaway: The value proposition is clear: reduce integration complexity from N backends to 1 VFS. For enterprise deployments with 10+ backends, this can cut development time by 60-70%.

Industry Impact & Market Dynamics

The Agent Infrastructure Layer

The AI industry is rapidly recognizing that model quality is no longer the primary differentiator. As of Q1 2025, GPT-4o, Claude 3.5, and Gemini 2.0 all achieve similar performance on standard benchmarks (MMLU ~88%, HumanEval ~85%). The real battlefield is agent infrastructure: memory, tool use, and data access.

| Company | Product | Focus Area | Funding Raised | Key Metric |
|---|---|---|---|---|
| Strukto | Mirage | Storage abstraction | $8M seed | 40% dev time savings |
| LangChain | LangSmith | Agent observability | $25M Series A | 500k developers |
| AutoGPT | AutoGPT Platform | Agent orchestration | $10M seed | 170k+ GitHub stars |
| Anthropic | Claude Agent | Model + agent | $7.3B total | — |

Data Takeaway: The agent infrastructure market is fragmented. Strukto's $8M seed is modest compared to LangChain's $25M, but Mirage addresses a more fundamental pain point. If agents become the default interface to software, storage middleware could be worth billions.

Market Size and Adoption

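Gartner predicts that by 2027, 40% of enterprise applications will embed AI agents. Each agent will need persistent storage. Assuming an average of $0.01 per agent per month for storage infrastructure, a deployment of 10 million agents generates $100,000 in monthly revenue, or $1.2M per year. Reaching a $1B+ TAM at that price would take more than 8 billion agents, so the figure depends on far larger agent populations or substantially higher per-agent pricing.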
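The sizing arithmetic can be checked in a few lines. This sketch takes the $0.01 per agent per month price and the 10 million agent deployment from the paragraph above as given; the function names are illustrative.

```python
def monthly_revenue(agents: int, price_per_agent_month: float) -> float:
    """Revenue per month for a given number of agents."""
    return agents * price_per_agent_month


def agents_needed(target_annual_revenue: float, price_per_agent_month: float) -> float:
    """Agents required to reach an annual revenue target at a fixed price."""
    return target_annual_revenue / (price_per_agent_month * 12)
```

At these inputs, `monthly_revenue(10_000_000, 0.01)` comes to about $100,000 (about $1.2M over twelve months), and `agents_needed(1e9, 0.01)` comes to about 8.3 billion agents.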

Takeaway: Mirage is well-positioned to capture this market if it can deliver on reliability. The key adoption barrier is trust: enterprises will not let agents write directly to production databases without strong guarantees.

Risks, Limitations & Open Questions

Security and Access Control

Mirage's unified namespace is a double-edged sword. If an agent is compromised, an attacker could read/write any mounted backend. Strukto must implement fine-grained ACLs per mount point, ideally integrated with existing IAM systems (AWS IAM, GCP IAM). Currently, Mirage supports only a simple API key per mount, which is insufficient for enterprise use.
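A per-mount ACL of the kind called for here might be sketched as a deny-by-default lookup keyed by agent and mount prefix. `MountACL` and its methods are hypothetical, and a production version would delegate to an IAM system rather than hold grants in memory.

```python
from typing import Dict, Set, Tuple


class MountACL:
    """Deny-by-default permissions keyed by (agent, mount prefix)."""

    def __init__(self) -> None:
        self._grants: Dict[Tuple[str, str], Set[str]] = {}

    def grant(self, agent_id: str, mount_prefix: str, *operations: str) -> None:
        key = (agent_id, mount_prefix.rstrip("/") + "/")
        self._grants.setdefault(key, set()).update(operations)

    def is_allowed(self, agent_id: str, path: str, operation: str) -> bool:
        # Collect every grant for this agent whose prefix covers the path.
        matches = [
            (prefix, ops)
            for (agent, prefix), ops in self._grants.items()
            if agent == agent_id and path.startswith(prefix)
        ]
        if not matches:
            return False  # nothing granted means nothing allowed
        # The most specific (longest) prefix decides.
        _, ops = max(matches, key=lambda item: len(item[0]))
        return operation in ops
```

Letting the longest prefix decide means a narrow read-only grant on a subdirectory can override a broader read-write grant on its parent mount.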

Latency Amplification

Agents often make many small I/O calls in rapid succession. If Mirage's caching is ineffective, each call could incur backend latency. For example, an agent reading 100 small chunks from Notion would take 15 seconds (100 × 150 ms). This could break real-time agent interactions. Mirage's caching must be aggressive and intelligent.
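The effect of cache hit rate on this example can be estimated with a simple expected-value model. The sketch assumes sequential reads and a fixed latency per hit and per miss; the function name and parameters are illustrative.

```python
def expected_read_time_ms(
    reads: int,
    backend_ms: float,
    cache_ms: float = 0.0,
    hit_rate: float = 0.0,
) -> float:
    """Total time for sequential reads given a cache hit rate between 0 and 1."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    per_read = hit_rate * cache_ms + (1.0 - hit_rate) * backend_ms
    return reads * per_read
```

With no cache, 100 reads at 150 ms come to 15,000 ms. At a 90% hit rate with 0.1 ms hits (the local SSD figure from the table), the same reads take roughly 1.5 seconds, and the remaining misses still dominate the total.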

Vendor Lock-in

Mirage is proprietary. If an agent's logic is deeply coupled to Mirage's VFS paths, migrating away becomes costly. Strukto should consider open-sourcing the core VFS layer (like FUSE) while monetizing enterprise features (caching, concurrency, monitoring).

Ethical Concerns

Agents with write access to databases could accidentally delete or corrupt data. Mirage must implement write guards — e.g., requiring explicit confirmation for destructive operations (`rm -rf`, `DROP TABLE`). Without such safeguards, enterprises will hesitate to deploy.
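A write guard along these lines could be sketched as a wrapper that refuses destructive calls on protected paths unless the caller confirms explicitly. `GuardedFiles`, `ConfirmationRequired`, and the `confirm` flag are hypothetical, and only `unlink` is guarded here to keep the example short.

```python
from typing import Dict, Iterable, List, Tuple


class ConfirmationRequired(PermissionError):
    """Raised when a destructive call arrives without explicit confirmation."""


class GuardedFiles:
    """In-memory file table that guards deletes under protected prefixes."""

    def __init__(self, protected_prefixes: Iterable[str]) -> None:
        self._files: Dict[str, bytes] = {}
        self._protected: Tuple[str, ...] = tuple(protected_prefixes)
        self.audit_log: List[Tuple[str, str]] = []

    def write(self, path: str, data: bytes) -> None:
        self._files[path] = data

    def exists(self, path: str) -> bool:
        return path in self._files

    def unlink(self, path: str, confirm: bool = False) -> None:
        if path not in self._files:
            raise FileNotFoundError(path)
        # str.startswith accepts a tuple and matches any of its prefixes.
        if path.startswith(self._protected) and not confirm:
            self.audit_log.append(("blocked", path))
            raise ConfirmationRequired(f"unlink {path} needs confirm=True")
        del self._files[path]
        self.audit_log.append(("deleted", path))
```

Recording blocked attempts alongside successful deletes gives operators an audit trail of what an agent tried to do, not only what it did.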

Takeaway: The biggest risk is not technical but trust. Strukto must invest heavily in security, audit logging, and rollback capabilities to win enterprise confidence.

AINews Verdict & Predictions

Mirage is a bold bet on a future where AI agents are the primary interface to digital infrastructure. The technical execution is sound — the POSIX subset, optimistic locking, and multi-tier caching are well-chosen trade-offs. However, success hinges on two factors: enterprise trust and ecosystem adoption.

Prediction 1: Within 12 months, at least one major agent framework (LangChain, AutoGPT, or CrewAI) will either acquire Strukto or build a competing VFS. The storage abstraction layer is too strategic to leave to a startup.

Prediction 2: Mirage will open-source its core VFS engine within 6 months. The proprietary model limits adoption; open-sourcing would create a de facto standard, with revenue coming from enterprise features (audit, compliance, multi-region caching).

Prediction 3: By 2026, the concept of a "file system for agents" will be as standard as the FUSE kernel module. Every agent SDK will include a VFS abstraction, and Mirage will be the reference implementation.

What to watch: Strukto's next funding round. If they raise a Series A of $50M+ from top-tier VCs, it signals that enterprise adoption is accelerating. If not, they may be acquired by a larger platform player.

Final Verdict: Mirage is not just a product — it is a glimpse of the operating system of the future. AI agents need a filesystem, and Mirage is the first credible attempt to build one. The industry should pay attention.
