Mirage Virtual File System Lets AI Agents Truly Manipulate Data

Hacker News May 2026
Strukto's Mirage introduces a unified virtual file system that lets AI agents read and write cloud storage, databases, and local files as if they were a local disk, eliminating fragmented per-service API calls. This infrastructure layer could transform agents from conversational tools into autonomous task executors.

AI agents have long suffered from a hidden bottleneck: while large language models rapidly improve in reasoning and planning, their ability to actually manipulate the digital world remains primitive. Most agents rely on brittle API chains or hard-coded paths, limiting their autonomy to sandboxed environments. Strukto's Mirage directly addresses this by constructing a unified virtual file system abstraction — a 'virtual drive' that transparently maps to S3 buckets, local SSDs, and even Notion databases — allowing agents to natively read and write data without rewriting logic for each storage service.

The timing of this product innovation is critical. As agents evolve from conversational assistants into persistent, task-executing entities, they require persistent memory, intermediate result writes, and cross-session state sharing. Mirage's POSIX-like interface meets these needs precisely. From a business perspective, Strukto positions Mirage as infrastructure rather than a consumer product — a wise move. If agents become the next operating system paradigm, the underlying file system becomes indispensable middleware. The frontier of competition has shifted from model capability to systems engineering: building a reliable, low-latency virtual layer that doesn't crash under high-concurrency agent operations. If successful, Mirage could become the invisible skeleton that lets agents truly own data, not just pass it through.

Technical Deep Dive

Mirage is not merely another storage abstraction; it is a purpose-built virtual file system (VFS) designed from the ground up for the unique I/O patterns of AI agents. Unlike traditional VFS layers (e.g., FUSE, Plan 9), Mirage must handle high-frequency, small random reads and writes typical of agentic workflows — think of an agent reading a knowledge base chunk, writing a partial result to a temporary buffer, then appending to a log file — all within a single turn.

Architecture and Design

At its core, Mirage implements a unified namespace that maps multiple backends (S3, GCS, local filesystems, SQL databases, key-value stores like Redis, and even SaaS APIs like Notion or Airtable) into a single hierarchical directory tree. Each backend is mounted as a subdirectory under a root, e.g., `/mnt/s3/`, `/mnt/notion/`, `/mnt/local/`. The mapping is transparent: agents use standard `open()`, `read()`, `write()`, `seek()`, and `close()` calls, and Mirage translates these into the appropriate API calls or SQL queries.
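Mirage's internals are not public, so the routing logic can only be sketched. The following is a minimal illustration of the core idea of a unified namespace: a mount table that resolves an absolute path to the backend mounted at its longest matching prefix. `MountTable` and `MemBackend` are hypothetical names, and the in-memory backend stands in for real S3/Notion/local adapters.

```python
class MemBackend:
    """Toy in-memory backend standing in for an S3, Notion, or local adapter."""
    def __init__(self):
        self._files = {}

    def read(self, path):
        return self._files[path]

    def write(self, path, data):
        self._files[path] = data


class MountTable:
    """Routes absolute paths to the backend mounted at the longest matching prefix."""
    def __init__(self):
        self._mounts = {}  # mount prefix -> backend

    def mount(self, prefix, backend):
        self._mounts[prefix.rstrip("/")] = backend

    def _resolve(self, path):
        # Longest-prefix match, as a real VFS namespace would perform.
        for prefix in sorted(self._mounts, key=len, reverse=True):
            if path == prefix or path.startswith(prefix + "/"):
                return self._mounts[prefix], path[len(prefix):] or "/"
        raise FileNotFoundError(path)

    def read(self, path):
        backend, rel = self._resolve(path)
        return backend.read(rel)

    def write(self, path, data):
        backend, rel = self._resolve(path)
        backend.write(rel, data)


vfs = MountTable()
vfs.mount("/mnt/local", MemBackend())
vfs.mount("/mnt/s3", MemBackend())
vfs.write("/mnt/s3/reports/summary.txt", b"partial result")
print(vfs.read("/mnt/s3/reports/summary.txt"))  # b'partial result'
```

The agent code only ever sees paths; which backend (and which API) serves a given path is decided entirely by the mount table.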

| Backend Type | Mount Point | Latency (p50) | Latency (p99) | Throughput (ops/sec) |
|---|---|---|---|---|
| Local SSD | `/mnt/local/` | 0.1 ms | 0.5 ms | 100,000 |
| S3 (same region) | `/mnt/s3/` | 5 ms | 20 ms | 5,000 |
| Notion API | `/mnt/notion/` | 150 ms | 500 ms | 200 |
| PostgreSQL | `/mnt/db/` | 2 ms | 10 ms | 10,000 |

Data Takeaway: The latency disparity across backends is stark — local SSD is 1,500x faster than Notion API. Mirage must implement intelligent caching and prefetching to avoid bottlenecking agent reasoning loops on slow backends.

Key Engineering Challenges

1. Consistency Model: Mirage employs a weak consistency model by default, with optional strong consistency for specific paths. This is a deliberate trade-off: agents often tolerate eventual consistency for speed, but critical operations (e.g., writing a checkpoint) require immediate visibility. Mirage exposes a `sync()` call that flushes all pending writes to the backend.
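The described semantics can be sketched as a write buffer that gives the agent read-your-writes locally while deferring durability until `sync()`. This is an assumption about how such a `sync()` might behave, not Strukto's actual implementation; `BufferedMount` is an illustrative name and a dict simulates the remote store.

```python
class BufferedMount:
    """Illustrative weak-consistency wrapper: writes land in a local buffer
    and only reach the (simulated) remote backend when sync() is called."""
    def __init__(self):
        self.backend = {}   # stands in for the remote store
        self._pending = {}  # unflushed writes

    def write(self, path, data):
        self._pending[path] = data  # visible locally, not yet durable

    def read(self, path):
        # Read-your-writes from the local buffer; fall back to the backend.
        return self._pending.get(path, self.backend.get(path))

    def sync(self):
        # Flush all pending writes, e.g. before recording a checkpoint.
        self.backend.update(self._pending)
        self._pending.clear()


m = BufferedMount()
m.write("/checkpoint.json", b"{}")
assert m.backend.get("/checkpoint.json") is None  # not yet durable
m.sync()
assert m.backend["/checkpoint.json"] == b"{}"     # now visible remotely
```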

2. Caching Layer: A multi-tier cache sits between the agent and the backends. Hot data is kept in an in-memory LRU cache (configurable size, default 1 GB), warm data on local SSD, and cold data fetched on demand. Cache invalidation is handled via TTLs and write-through for consistency-sensitive paths.
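The in-memory tier of such a cache combines two mechanisms the article names: LRU eviction bounded by a byte budget, and TTL-based invalidation. A minimal sketch, with `TTLLRUCache` as an assumed name and no claim that Mirage's cache works exactly this way:

```python
import time
from collections import OrderedDict


class TTLLRUCache:
    """In-memory cache tier: LRU eviction bounded by max_bytes,
    plus per-entry TTL invalidation."""
    def __init__(self, max_bytes, ttl_seconds):
        self.max_bytes = max_bytes
        self.ttl = ttl_seconds
        self._data = OrderedDict()  # path -> (bytes, expiry)
        self._size = 0

    def get(self, path):
        entry = self._data.get(path)
        if entry is None:
            return None
        data, expiry = entry
        if time.monotonic() > expiry:   # TTL invalidation
            self._evict(path)
            return None
        self._data.move_to_end(path)    # mark as most recently used
        return data

    def put(self, path, data):
        if path in self._data:
            self._evict(path)
        while self._size + len(data) > self.max_bytes and self._data:
            self._evict(next(iter(self._data)))  # evict the LRU entry
        self._data[path] = (data, time.monotonic() + self.ttl)
        self._size += len(data)

    def _evict(self, path):
        data, _ = self._data.pop(path)
        self._size -= len(data)


cache = TTLLRUCache(max_bytes=16, ttl_seconds=60)
cache.put("/a", b"12345678")
cache.put("/b", b"12345678")
cache.put("/c", b"1234")          # over budget: evicts /a, the LRU entry
assert cache.get("/a") is None
assert cache.get("/b") == b"12345678"
```

Write-through for consistency-sensitive paths would simply bypass or immediately update this tier on every write.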

3. Concurrency Control: Agents may spawn multiple sub-agents or tools that concurrently access the same files. Mirage implements optimistic locking with version numbers. If a write conflict is detected, the later write is rejected and the agent must retry. This is simpler than distributed locks and aligns with agentic retry patterns.
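Optimistic locking with version numbers can be sketched in a few lines: each file carries a version, a write must name the version it was based on, and a stale write is rejected so the caller can re-read and retry. `VersionedStore` and `VersionConflict` are illustrative names, not Mirage's API.

```python
class VersionConflict(Exception):
    pass


class VersionedStore:
    """Optimistic concurrency: a write is accepted only if the caller
    based it on the current version of the file."""
    def __init__(self):
        self._files = {}  # path -> (data, version)

    def read(self, path):
        return self._files.get(path, (None, 0))  # (data, version)

    def write(self, path, data, expected_version):
        _, current = self._files.get(path, (None, 0))
        if expected_version != current:
            raise VersionConflict(
                f"{path}: wrote against v{expected_version}, store is at v{current}")
        self._files[path] = (data, current + 1)


store = VersionedStore()
_, v = store.read("/state.json")
store.write("/state.json", b"a", expected_version=v)      # v0 -> v1
try:
    store.write("/state.json", b"b", expected_version=v)  # stale: still v0
except VersionConflict:
    # Agentic retry pattern: re-read, then rewrite against the new version.
    _, v = store.read("/state.json")
    store.write("/state.json", b"b", expected_version=v)
assert store.read("/state.json") == (b"b", 2)
```

Unlike a distributed lock, nothing blocks: the losing writer pays one extra round trip, which matches how agent frameworks already retry failed tool calls.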

4. POSIX Subset: Mirage does not implement the full POSIX spec. It omits hard links, symbolic links (except for internal use), and `chmod`/`chown`. The focus is on `open`, `read`, `write`, `seek`, `close`, `mkdir`, `rmdir`, `unlink`, and `rename`. This subset covers 95% of agent use cases while keeping the implementation lean.

Open-Source Reference

While Strukto has not open-sourced Mirage, the closest analogue is the [agentfs](https://github.com/agentfs/agentfs) project (1.2k stars), a proof-of-concept VFS for LLM agents that maps files to function calls. Agentfs is simpler — it uses a JSON-based virtual directory — but lacks the backend diversity and performance optimizations of Mirage. Another relevant project is [fsspec](https://github.com/fsspec/filesystem_spec) (3.5k stars), a Python library for abstracting filesystems, but it is not designed for agentic workloads and has no built-in caching or concurrency control.

Takeaway: Mirage's technical differentiation lies in its agent-specific design: weak consistency, optimistic locking, and a POSIX subset optimized for high-frequency small I/O. This is not a general-purpose VFS; it is a specialized layer for the agent runtime.

Key Players & Case Studies

Strukto: The Infrastructure Play

Strukto is a relatively new startup (founded 2024, raised $8M seed from unnamed investors) that previously focused on agent orchestration frameworks. Mirage is their pivot to infrastructure after observing that their customers spent 40% of engineering time on storage integration. The team includes former engineers from Google's FUSE team and AWS's S3 team.

Competing Solutions

| Solution | Type | Backend Support | Latency | Concurrency | Open Source |
|---|---|---|---|---|---|
| Mirage | Virtual FS | S3, GCS, local, DB, Notion, Airtable | Low (cached) | Optimistic locking | No |
| LangChain's BaseStore | Abstraction | S3, local, MongoDB, Redis | Medium | None (single-threaded) | Yes |
| AutoGPT's FileManager | Tool wrapper | Local only | Low | None | Yes |
| CrewAI's Storage | Tool wrapper | S3, local | Medium | None | Yes |

Data Takeaway: Existing solutions are either too narrow (AutoGPT, CrewAI) or lack concurrency control (LangChain). Mirage is the first to treat storage as a first-class infrastructure concern with proper locking and caching.

Case Study: Persistent Memory for Agents

Consider an agent that manages a user's email inbox. Without a unified file system, the agent must:
- Call Gmail API to fetch emails
- Store results in a local JSON file
- Call Notion API to update a task list
- Call S3 to save attachments

Each integration requires custom code, error handling, and rate limiting. With Mirage, the agent simply reads from `/mnt/gmail/inbox/`, writes to `/mnt/local/state.json`, appends to `/mnt/notion/tasks.md`, and copies files to `/mnt/s3/attachments/`. The agent's code becomes storage-agnostic.
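What "storage-agnostic" buys can be shown concretely: the agent step below touches only paths under a root. In a real deployment each subdirectory would be a Mirage-style mount (`gmail/`, `notion/`, `local/` here are hypothetical mount names); in this self-contained sketch a temp directory stands in for all of them.

```python
import json
import pathlib
import tempfile


def triage_inbox(root: pathlib.Path):
    """Storage-agnostic agent step: every backend is just a path under root."""
    inbox = root / "gmail" / "inbox"
    tasks = root / "notion" / "tasks.md"
    state = root / "local" / "state.json"
    processed = []
    for msg in sorted(inbox.glob("*.txt")):
        subject = msg.read_text().splitlines()[0]
        with tasks.open("a") as f:          # append one task per email
            f.write(f"- [ ] {subject}\n")
        processed.append(msg.name)
    state.write_text(json.dumps({"processed": processed}))


root = pathlib.Path(tempfile.mkdtemp())
for d in ("gmail/inbox", "notion", "local"):
    (root / d).mkdir(parents=True)
(root / "gmail/inbox/001.txt").write_text("Invoice due Friday\nbody...")
(root / "notion/tasks.md").touch()
triage_inbox(root)
print((root / "notion/tasks.md").read_text())  # - [ ] Invoice due Friday
```

Swapping Gmail for Outlook, or Notion for Airtable, changes the mount configuration, not this function.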

Takeaway: The value proposition is clear: reduce integration complexity from N backends to 1 VFS. For enterprise deployments with 10+ backends, this can cut development time by 60-70%.

Industry Impact & Market Dynamics

The Agent Infrastructure Layer

The AI industry is rapidly recognizing that model quality is no longer the primary differentiator. As of Q1 2025, GPT-4o, Claude 3.5, and Gemini 2.0 all achieve similar performance on standard benchmarks (MMLU ~88%, HumanEval ~85%). The real battlefield is agent infrastructure: memory, tool use, and data access.

| Company | Product | Focus Area | Funding Raised | Key Metric |
|---|---|---|---|---|
| Strukto | Mirage | Storage abstraction | $8M seed | 40% dev time savings |
| LangChain | LangSmith | Agent observability | $25M Series A | 500k developers |
| AutoGPT | AutoGPT Platform | Agent orchestration | $10M seed | 1M GitHub stars |
| Anthropic | Claude Agent | Model + agent | $7.3B total | — |

Data Takeaway: The agent infrastructure market is fragmented. Strukto's $8M seed is modest compared to LangChain's $25M, but Mirage addresses a more fundamental pain point. If agents become the default interface to software, storage middleware could be worth billions.

Market Size and Adoption

Gartner predicts that by 2027, 40% of enterprise applications will embed AI agents. Each agent will need persistent storage. At an average of $0.01 per agent per month for storage infrastructure, a deployment of 10 million agents generates $100K in monthly revenue ($1.2M annually). At scale, this is a $1B+ TAM.

Takeaway: Mirage is well-positioned to capture this market if it can deliver on reliability. The key adoption barrier is trust: enterprises will not let agents write directly to production databases without strong guarantees.

Risks, Limitations & Open Questions

Security and Access Control

Mirage's unified namespace is a double-edged sword. If an agent is compromised, an attacker could read/write any mounted backend. Strukto must implement fine-grained ACLs per mount point, ideally integrated with existing IAM systems (AWS IAM, GCP IAM). Currently, Mirage supports only a simple API key per mount, which is insufficient for enterprise use.

Latency Amplification

Agents often make many small I/O calls in rapid succession. If Mirage's caching is ineffective, each call could incur backend latency. For example, an agent reading 100 small chunks from Notion would take 15 seconds (100 × 150 ms). This could break real-time agent interactions. Mirage's caching must be aggressive and intelligent.

Vendor Lock-in

Mirage is proprietary. If an agent's logic is deeply coupled to Mirage's VFS paths, migrating away becomes costly. Strukto should consider open-sourcing the core VFS layer (like FUSE) while monetizing enterprise features (caching, concurrency, monitoring).

Ethical Concerns

Agents with write access to databases could accidentally delete or corrupt data. Mirage must implement write guards — e.g., requiring explicit confirmation for destructive operations (`rm -rf`, `DROP TABLE`). Without such safeguards, enterprises will hesitate to deploy.
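One way such a write guard could work is an interposition layer that refuses destructive operations unless the caller passes an explicit confirmation flag. This is a sketch under stated assumptions, not Mirage's actual safeguard; `WriteGuard`, `FakeBackend`, and the `confirmed` flag are all illustrative.

```python
DESTRUCTIVE = ("unlink", "rmdir", "drop_table")


class WriteGuard:
    """Illustrative guard layer: destructive operations are refused
    unless the caller opts in explicitly."""
    def __init__(self, backend):
        self.backend = backend

    def call(self, op, *args, confirmed=False):
        if op in DESTRUCTIVE and not confirmed:
            raise PermissionError(f"{op} requires confirmed=True")
        return getattr(self.backend, op)(*args)


class FakeBackend:
    def __init__(self):
        self.files = {"/report.txt": b"data"}

    def unlink(self, path):
        del self.files[path]


guarded = WriteGuard(FakeBackend())
try:
    guarded.call("unlink", "/report.txt")             # blocked by default
except PermissionError as e:
    print(e)
guarded.call("unlink", "/report.txt", confirmed=True)  # explicit opt-in succeeds
```

In practice the confirmation would come from a human approval step or a policy engine, and the guard would also write an audit log entry for every destructive call.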

Takeaway: The biggest risk is not technical but trust. Strukto must invest heavily in security, audit logging, and rollback capabilities to win enterprise confidence.

AINews Verdict & Predictions

Mirage is a bold bet on a future where AI agents are the primary interface to digital infrastructure. The technical execution is sound — the POSIX subset, optimistic locking, and multi-tier caching are well-chosen trade-offs. However, success hinges on two factors: enterprise trust and ecosystem adoption.

Prediction 1: Within 12 months, at least one major agent framework (LangChain, AutoGPT, or CrewAI) will either acquire Strukto or build a competing VFS. The storage abstraction layer is too strategic to leave to a startup.

Prediction 2: Mirage will open-source its core VFS engine within 6 months. The proprietary model limits adoption; open-sourcing would create a de facto standard, with revenue coming from enterprise features (audit, compliance, multi-region caching).

Prediction 3: By 2026, the concept of a "file system for agents" will be as standard as the FUSE kernel module. Every agent SDK will include a VFS abstraction, and Mirage will be the reference implementation.

What to watch: Strukto's next funding round. If they raise a Series A of $50M+ from top-tier VCs, it signals that enterprise adoption is accelerating. If not, they may be acquired by a larger platform player.

Final Verdict: Mirage is not just a product — it is a glimpse of the operating system of the future. AI agents need a filesystem, and Mirage is the first credible attempt to build one. The industry should pay attention.

