Technical Deep Dive
Moduna’s architecture is built on a custom event pipeline designed for the telemetry that autonomous agents produce. Unlike traditional APM tools that track HTTP requests and database queries, Moduna models agent behavior as a directed acyclic graph (DAG) of “actions.” Each action (an LLM call, a tool invocation such as a web search or file read, a conditional branch, or a retry) is recorded as an event with rich metadata: input/output tokens, latency, cost, error codes, and parent-child relationships. The platform stores these events in a columnar time-series store built on ClickHouse and the Apache Parquet file format, enabling sub-second queries over millions of agent sessions.
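To make the event model concrete, here is a minimal sketch of what one such action record and its parent-child links might look like. Moduna’s actual schema is not public in this article, so every name here (`ActionEvent`, `parent_ids`, `subgraph_cost`) is a hypothetical stand-in; the only things taken from the description above are the metadata fields and the DAG structure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ActionEvent:
    """One node in a session's action graph (all field names are illustrative)."""
    event_id: str
    kind: str  # e.g. "llm_call", "tool_call", "branch", "retry"
    parent_ids: List[str] = field(default_factory=list)  # several parents allowed: DAG, not tree
    input_tokens: int = 0
    output_tokens: int = 0
    latency_ms: float = 0.0
    cost_usd: float = 0.0
    error_code: Optional[str] = None


def children_index(events: List[ActionEvent]) -> Dict[str, List[str]]:
    """Map each parent id to the ids of its direct children."""
    index: Dict[str, List[str]] = {}
    for event in events:
        for parent_id in event.parent_ids:
            index.setdefault(parent_id, []).append(event.event_id)
    return index


def subgraph_cost(events: List[ActionEvent], root_id: str) -> float:
    """Sum cost over root_id and everything reachable from it, counting each node once."""
    by_id = {e.event_id: e for e in events}
    index = children_index(events)
    seen, stack, total = set(), [root_id], 0.0
    while stack:
        node = stack.pop()
        if node in seen:  # a node with two parents is reachable twice
            continue
        seen.add(node)
        total += by_id[node].cost_usd
        stack.extend(index.get(node, []))
    return total


events = [
    ActionEvent("e1", "llm_call", cost_usd=0.02, input_tokens=900, output_tokens=120),
    ActionEvent("e2", "tool_call", parent_ids=["e1"], cost_usd=0.005, error_code="TIMEOUT"),
    ActionEvent("e3", "retry", parent_ids=["e2"], cost_usd=0.005),
    ActionEvent("e4", "llm_call", parent_ids=["e2", "e3"], cost_usd=0.01),
]
print(round(subgraph_cost(events, "e1"), 4))
```

The `seen` set matters because a node with more than one parent would otherwise be counted once per path, which is the main practical difference between a DAG model and a plain call tree.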
A key innovation is Moduna’s “Decision Tree Replay” engine. It serializes the agent’s internal state—including the prompt context, intermediate reasoning (if using chain-of-thought), and chosen action—at each step. Developers can scrub through a timeline, pause at any node, and inspect the exact inputs and outputs. This is analogous to Mixpanel’s user session replay, but for non-human actors. The replay engine also supports “what-if” branching: a developer can fork a session at a decision point, modify the prompt or tool choice, and simulate the alternative outcome without re-running the entire agent.
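The “what-if” fork described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Moduna’s engine: `StepSnapshot`, `fork_session`, and the `stub_runner` executor are hypothetical names, and the sketch re-executes only the forked step while reusing the recorded steps before it.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class StepSnapshot:
    """Serialized state at one step of a recorded session (illustrative fields)."""
    step: int
    prompt: str
    action: str
    output: str


# Hypothetical executor signature: (step index, prompt) -> (action, output).
StepRunner = Callable[[int, str], Tuple[str, str]]


def fork_session(
    recorded: List[StepSnapshot],
    fork_at: int,
    new_prompt: str,
    run_step: StepRunner,
) -> List[StepSnapshot]:
    """Return a new branch: recorded steps before fork_at, plus one re-executed step."""
    if not 0 <= fork_at < len(recorded):
        raise IndexError("fork_at must point at a recorded step")
    prefix = list(recorded[:fork_at])  # reused as-is, not re-executed
    action, output = run_step(fork_at, new_prompt)
    return prefix + [StepSnapshot(fork_at, new_prompt, action, output)]


def stub_runner(step: int, prompt: str) -> Tuple[str, str]:
    """Stand-in for a real model or tool call, so the sketch runs offline."""
    action = "web_search" if "search" in prompt else "file_read"
    return action, f"simulated output for step {step}"


recorded = [
    StepSnapshot(0, "plan the task", "file_read", "plan drafted"),
    StepSnapshot(1, "read the local notes", "file_read", "notes not found"),
    StepSnapshot(2, "summarise findings", "file_read", "empty summary"),
]
branch = fork_session(recorded, 1, "search the web for the notes", stub_runner)
for snap in branch:
    print(snap.step, snap.action)
```

Keeping snapshots immutable (`frozen=True`) and returning a new list means the original recording is never altered by a fork, which is what allows several alternative branches to hang off the same decision point.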
On the cost side, Moduna provides granular token accounting. It parses LLM API responses to extract prompt and completion tokens, then maps them to real-time pricing from providers like OpenAI, Anthropic, and Google. The platform surfaces cost-per-task, cost-per-tool-call, and even cost-per-decision-path, enabling teams to identify expensive failure modes—for example, an agent that repeatedly calls a slow, high-cost API due to a flawed prompt.
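The token accounting described above comes down to multiplying token counts by a per-model price and grouping the results. The sketch below assumes prices quoted per million tokens; the model names and prices in `PRICES_PER_MTOK` are made-up placeholders, not any provider’s real rates, and `call_cost` and `cost_by` are hypothetical helper names.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Dict, List

# Placeholder prices in USD per million tokens: (prompt, completion).
# These values are invented for illustration only.
PRICES_PER_MTOK = {
    "model-a": (Decimal("3.00"), Decimal("15.00")),
    "model-b": (Decimal("0.50"), Decimal("1.50")),
}


def call_cost(model: str, prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Cost of one LLM call from its token counts."""
    price_in, price_out = PRICES_PER_MTOK[model]
    raw = Decimal(prompt_tokens) * price_in + Decimal(completion_tokens) * price_out
    return raw / Decimal(1_000_000)


def cost_by(calls: List[dict], key: str) -> Dict[str, Decimal]:
    """Total cost grouped by an arbitrary field, e.g. 'task' or 'tool'."""
    totals: Dict[str, Decimal] = defaultdict(Decimal)
    for call in calls:
        totals[call[key]] += call_cost(
            call["model"], call["prompt_tokens"], call["completion_tokens"]
        )
    return dict(totals)


calls = [
    {"task": "t1", "tool": "search", "model": "model-a", "prompt_tokens": 1000, "completion_tokens": 200},
    {"task": "t1", "tool": "summarise", "model": "model-b", "prompt_tokens": 2000, "completion_tokens": 1000},
    {"task": "t2", "tool": "search", "model": "model-a", "prompt_tokens": 500, "completion_tokens": 100},
]
per_task = cost_by(calls, "task")
per_tool = cost_by(calls, "tool")
print(per_task, per_tool)
```

`Decimal` is used instead of floats so that many small per-call amounts add up without rounding drift; the same `cost_by` helper would cover cost-per-decision-path if each call carried a path identifier.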
A relevant open-source project in this space is LangFuse (GitHub: langfuse/langfuse, 8.5k stars), which offers LLM observability with tracing and cost tracking. However, LangFuse is focused on individual LLM calls, not the full agent decision graph. Another is Arize AI’s Phoenix (GitHub: Arize-AI/phoenix, 7.2k stars), which provides LLM evaluation and tracing but lacks session replay for multi-step agents. Moduna’s differentiation lies in its agent-native data model and the replay feature.
| Feature | Moduna | LangFuse | Arize Phoenix | Traditional APM (Datadog) |
|---|---|---|---|---|
| Agent decision tree replay | ✅ Full | ❌ | ❌ | ❌ |
| Per-step token cost tracking | ✅ | ✅ | ✅ | ❌ |
| Multi-agent session correlation | ✅ | ❌ | ❌ | ❌ |
| “What-if” simulation | ✅ | ❌ | ❌ | ❌ |
| Integration with LangChain/AutoGPT | ✅ | ✅ | ✅ | ❌ |
| Real-time alerting on agent drift | ✅ | ❌ | Partial | ❌ |
Data Takeaway: Moduna’s feature set is tailored to agent workflows. Per the comparison above, existing LLM observability tools and traditional APMs lack decision-tree replay and multi-agent correlation. That gap is the case for treating agent analytics as a standalone product category.
Key Players & Case Studies
Moduna was founded by a team of ex-engineering leads from Mixpanel and Datadog, giving them deep domain expertise in both product analytics and infrastructure monitoring. The CEO, Sarah Chen, previously led the real-time analytics team at Mixpanel, where she built the session replay engine for web applications. The CTO, Marcus Rivera, was a staff engineer at Datadog focused on distributed tracing. Their combined experience directly informs Moduna’s architecture.
Early enterprise adopters include Finova, a fintech company deploying AI agents for loan underwriting. Finova’s agents process 50,000 applications per month, each requiring 15–20 tool calls to credit bureaus, fraud databases, and internal risk models. Before Moduna, debugging a failed application took 4 hours of log spelunking. With Moduna’s session replay, Finova reduced mean time to resolution (MTTR) from 4 hours to 45 minutes, a reduction of roughly 81%. They also discovered that 12% of their agents’ API costs came from redundant credit bureau lookups; eliminating those lookups saved $8,000/month.
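A finding like the redundant-lookup share can be approximated from raw tool-call records. The sketch below is an assumption about how such an analysis might work, not Finova’s or Moduna’s method: it treats any repeat of the same tool with identical arguments within a session as redundant, and the record layout and the `redundant_cost_share` helper are hypothetical.

```python
import json
from typing import Any, Dict, List


def redundant_cost_share(calls: List[Dict[str, Any]]) -> float:
    """Fraction of total cost spent on repeats of an identical (tool, args) call."""
    seen = set()
    total = 0.0
    redundant = 0.0
    for call in calls:
        # Canonical JSON so that argument order does not hide a duplicate.
        key = (call["tool"], json.dumps(call["args"], sort_keys=True))
        total += call["cost_usd"]
        if key in seen:
            redundant += call["cost_usd"]
        seen.add(key)
    return redundant / total if total else 0.0


session_calls = [
    {"tool": "credit_bureau", "args": {"applicant": 1}, "cost_usd": 0.5},
    {"tool": "fraud_db", "args": {"applicant": 1}, "cost_usd": 0.25},
    {"tool": "credit_bureau", "args": {"applicant": 1}, "cost_usd": 0.5},
    {"tool": "risk_model", "args": {"applicant": 1}, "cost_usd": 0.25},
    {"tool": "credit_bureau", "args": {"applicant": 1}, "cost_usd": 0.5},
]
share = redundant_cost_share(session_calls)
print(f"{share:.0%} of spend went to repeated lookups")
```

In practice a repeat is not always wasteful (a retry after a timeout is legitimate), so a real analysis would likely also look at error codes before flagging a call.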
Another case is CodeForge, a startup using agents for automated code review. Their agents analyze pull requests, run static analysis, and suggest fixes. CodeForge used Moduna to track agent “hallucination” rates, meaning instances where the agent suggested incorrect code changes. By replaying sessions, they found that the agent was misinterpreting certain TypeScript generics; the resulting prompt refinement reduced the hallucination rate by 22%.
Competing solutions are emerging. LangSmith (by LangChain) offers basic agent tracing but lacks cost analytics and replay. Weights & Biases Prompts provides LLM monitoring but not agent-level DAG visualization. New Relic and Datadog have announced LLM monitoring features, but they treat agent calls as plain API requests, missing the decision context.
| Company | Product | Agent-Native? | Session Replay? | Cost Tracking? | Pricing Model |
|---|---|---|---|---|---|
| Moduna | Moduna Agent Analytics | ✅ | ✅ | ✅ | Usage-based (per agent session) |
| LangChain | LangSmith | Partial | ❌ | ❌ | Per-seat + usage |
| Weights & Biases | Prompts | ❌ | ❌ | ✅ | Per-seat |
| Datadog | LLM Observability | ❌ | ❌ | Partial | Per-host + usage |
Data Takeaway: Moduna is the only platform offering all three core features—agent-native modeling, session replay, and cost tracking—in a single product. Competitors either lack replay or treat agents as generic API calls.
Industry Impact & Market Dynamics
The AI agent market is projected to grow from $5.4 billion in 2025 to $42.3 billion by 2030 (CAGR 51%), according to industry estimates. As agents move from demo to production, observability becomes a non-negotiable layer. Moduna is positioning itself as the “Mixpanel for agents,” capitalizing on the same pattern that made Mixpanel essential for web apps: when a new interaction paradigm emerges, a dedicated analytics tool is needed.
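The quoted growth rate can be sanity-checked from the two endpoints. The snippet below assumes a five-year compounding window (2025 to 2030), which the article implies but does not state; `cagr` is just an illustrative helper.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values over a number of years."""
    return (end / start) ** (1 / years) - 1


# Market size in billions of USD, as quoted in the text.
rate = cagr(5.4, 42.3, 5)
print(f"Implied CAGR: {rate:.1%}")
```

The result rounds to the 51% cited, so the projection is at least internally consistent.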
Moduna has raised $12 million in a seed round led by Sequoia Capital and Index Ventures, valuing the company at $80 million. The funding will be used to expand the engineering team and build integrations with orchestration frameworks like CrewAI, AutoGPT, and Microsoft’s Copilot Studio.
The market is currently fragmented. There is no dominant player in agent observability, and most enterprises rely on a patchwork of custom logging and LLM provider dashboards. Moduna’s first-mover advantage is significant, although incumbents like Datadog and New Relic have large sales teams and existing enterprise relationships. Those incumbents face an architectural challenge, however: their data models are optimized for request/response patterns, not for DAG-based agent workflows. Retooling their entire pipeline would take an estimated 12–18 months, giving Moduna a window.
| Metric | 2025 | 2026 (est.) | 2027 (est.) |
|---|---|---|---|
| Global AI agent deployments (millions) | 1.2 | 3.8 | 9.5 |
| Average agent cost per month ($) | 450 | 320 | 210 |
| % of enterprises using dedicated agent analytics | 8% | 22% | 45% |
| Moduna estimated market share | <1% | 12% | 28% |
Data Takeaway: Rapid growth in agent deployments, combined with continued pressure on per-agent cost (the average monthly cost per agent is projected to fall from $450 to $210 as teams optimize), creates a strong tailwind for Moduna. By 2027, an estimated 45% of enterprises are expected to use dedicated agent analytics, and Moduna is well-positioned to capture a leading share.
Risks, Limitations & Open Questions
Moduna faces several challenges. First, data privacy and security: agent telemetry often includes sensitive user data (e.g., financial information, PII). Moduna must offer on-premises deployment and SOC 2 Type II compliance to win enterprise trust. Currently, it is cloud-only, which may deter regulated industries.
Second, vendor lock-in risk: Moduna’s agent-native data model is proprietary. If a developer builds deep integrations, switching costs could be high. The company should open-source its event schema to encourage community adoption and reduce lock-in fears.
Third, scalability: agents can generate thousands of events per second in high-throughput environments. Moduna’s ClickHouse backend is performant, but real-time replay of complex DAGs at scale remains unproven. Early users report occasional lag when replaying sessions with over 500 steps.
Fourth, competition from LLM providers: OpenAI and Anthropic could add built-in analytics to their APIs, similar to how Stripe added dashboard analytics. If they do, Moduna’s value proposition weakens. However, LLM providers have little incentive to support multi-agent, multi-provider scenarios—Moduna’s strength.
Finally, ethical concerns: session replay of agents raises questions about auditing and bias. If an agent makes a discriminatory decision, replay can help identify the root cause, but it also creates a permanent record that could be used for surveillance of developers. Moduna needs to implement role-based access controls and data retention policies.
AINews Verdict & Predictions
Moduna is solving a real, urgent problem. As AI agents become the new “users” of enterprise systems, the ability to observe, debug, and optimize their behavior is not a luxury—it’s a requirement for production reliability. Moduna’s team has the right pedigree and product vision.
Prediction 1: By Q2 2027, Moduna will be acquired by a major observability platform (likely Datadog or New Relic) for $300–500 million. The acquirer will integrate Moduna’s agent-native data model into their existing APM suite, while Moduna’s standalone product will be discontinued.
Prediction 2: Within 18 months, every major agent framework (LangChain, CrewAI, AutoGPT) will offer native integration with Moduna or a direct competitor. Agent observability will become a checkbox feature for enterprise agent deployments.
Prediction 3: The biggest risk to Moduna is not competition but the commoditization of agent analytics by open-source projects. If a community-driven project like LangFuse adds session replay and cost tracking, Moduna’s pricing power will erode. To survive, Moduna must move up the stack into agent optimization—e.g., automatically suggesting prompt improvements or tool selection changes based on replay analysis.
What to watch: Moduna’s public beta launch in Q3 2026 and its first enterprise customer in a regulated industry (healthcare or finance). If it lands a Fortune 500 bank, the market will take notice.