Microsoft's pg_durable: Why In-Database Workflows Are the Next Infrastructure Shift

GitHub June 2026
Microsoft has open-sourced pg_durable, a PostgreSQL extension that embeds durable workflow execution directly into the database. By storing workflow state inside PostgreSQL itself and leveraging database transactions for reliability, it eliminates the need for separate message queues or state stores — a move that could reshape how developers build reliable background jobs and event-driven systems.

On June 8, 2026, Microsoft released pg_durable, a PostgreSQL extension that brings durable execution — the ability to run code that survives crashes and restarts — directly into the database. The project, which rocketed to 1,504 GitHub stars on its first day, is a response to the growing complexity of managing state across separate queues, databases, and orchestrators in modern distributed systems.

pg_durable works by storing workflow state as rows in PostgreSQL tables, using database transactions to atomically update both application data and workflow progress. This means a workflow step that inserts a row and schedules the next step either fully commits or fully rolls back — no inconsistency, no lost messages, no manual reconciliation. The extension provides a PostgreSQL-compatible SQL interface for defining workflows, with built-in retry, timeout, and idempotency handling.
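The all-or-nothing behavior described above can be illustrated with a minimal sketch. It uses Python's built-in `sqlite3` as a stand-in for PostgreSQL, and the table names (`orders`, `workflow_state`) and the `run_step` helper are hypothetical, not part of pg_durable's actual interface.

```python
import sqlite3

def make_db():
    # sqlite3 stands in for PostgreSQL; the schema is an assumption for illustration.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL);
        CREATE TABLE workflow_state (
            workflow_id TEXT PRIMARY KEY,
            next_step INTEGER NOT NULL
        );
        INSERT INTO workflow_state VALUES ('wf-1', 1);
    """)
    return conn

def run_step(conn, workflow_id, item, fail=False):
    """Insert application data and advance the workflow in one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("INSERT INTO orders (item) VALUES (?)", (item,))
            if fail:
                raise RuntimeError("simulated failure mid-step")
            conn.execute(
                "UPDATE workflow_state SET next_step = next_step + 1 "
                "WHERE workflow_id = ?",
                (workflow_id,),
            )
        return True
    except RuntimeError:
        return False

def snapshot(conn, workflow_id):
    """Return (number of order rows, next step) for inspection."""
    orders = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    step = conn.execute(
        "SELECT next_step FROM workflow_state WHERE workflow_id = ?",
        (workflow_id,),
    ).fetchone()[0]
    return orders, step

conn = make_db()
print(run_step(conn, "wf-1", "widget", fail=True), snapshot(conn, "wf-1"))
print(run_step(conn, "wf-1", "widget"), snapshot(conn, "wf-1"))
```

In the failing call, the inserted order row disappears together with the workflow update that never happened, so the two tables cannot drift apart.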

The significance is twofold. First, it drastically simplifies the infrastructure stack: instead of running RabbitMQ, Redis, or a dedicated workflow engine like Temporal alongside PostgreSQL, developers can rely on a single database for both data and orchestration. Second, it leverages PostgreSQL's mature replication, backup, and transaction mechanisms, inheriting decades of battle-tested reliability.

Early adopters report that pg_durable reduces operational overhead by 40-60% for stateful background job pipelines, though it introduces trade-offs around database load and horizontal scaling. For teams already committed to PostgreSQL — which powers over 60% of new application deployments according to the 2025 Stack Overflow survey — pg_durable offers a compelling path to simpler, more reliable distributed systems.

Technical Deep Dive

pg_durable's architecture is deceptively simple: it extends PostgreSQL's executor to treat workflow definitions as first-class database objects. When a developer defines a workflow using the extension's SQL functions, pg_durable creates internal tables that track workflow instances, their current step, input/output payloads, and retry state. Each step execution is wrapped in a PostgreSQL transaction, so if the database crashes mid-step, the uncommitted step is rolled back and the workflow resumes from the last committed checkpoint.
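To make the checkpoint idea concrete, here is a small sketch of a workflow that fails partway through and later resumes from its last committed step. It uses `sqlite3` in place of PostgreSQL. The `workflow_instances` schema, the `STEPS` list, and `run_workflow` are assumptions for illustration only and do not reflect pg_durable's real internal tables.

```python
import sqlite3

STEPS = ["reserve", "charge", "ship"]  # hypothetical step names

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE workflow_instances (
            id TEXT PRIMARY KEY,
            current_step INTEGER NOT NULL DEFAULT 0,
            payload TEXT NOT NULL DEFAULT '',
            attempts INTEGER NOT NULL DEFAULT 0
        );
    """)
    return conn

def run_workflow(conn, wf_id, crash_at=None):
    """Run the remaining steps; return the final payload, or None on a crash."""
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO workflow_instances (id) VALUES (?)", (wf_id,)
        )
    while True:
        step, payload = conn.execute(
            "SELECT current_step, payload FROM workflow_instances WHERE id = ?",
            (wf_id,),
        ).fetchone()
        if step >= len(STEPS):
            return payload
        try:
            with conn:  # step output and step counter commit together
                conn.execute(
                    "UPDATE workflow_instances "
                    "SET payload = ?, current_step = ? WHERE id = ?",
                    (payload + STEPS[step] + ";", step + 1, wf_id),
                )
                if crash_at == step:
                    raise RuntimeError("simulated crash mid-step")
        except RuntimeError:
            with conn:  # record the failed attempt in its own transaction
                conn.execute(
                    "UPDATE workflow_instances SET attempts = attempts + 1 "
                    "WHERE id = ?",
                    (wf_id,),
                )
            return None

conn = make_db()
print(run_workflow(conn, "wf-1", crash_at=1))  # stops during the second step
print(run_workflow(conn, "wf-1"))              # resumes from the checkpoint
```

The second call does not repeat the first step, because its result was already committed before the simulated crash.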

Under the hood, pg_durable uses `pg_background`-style background workers to execute workflow steps asynchronously without blocking the calling session. It implements a polling-based scheduler that checks for ready workflow steps every 100 ms by default, configurable via the `pg_durable.poll_interval` parameter. For time-based triggers, it integrates with the separately installed `pg_cron` extension or uses its own internal timer table.
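A polling scheduler of this kind can be sketched in a few lines. The example below is an assumption-laden illustration, not pg_durable's code: the `ready_steps` table, `poll_once`, and `run_scheduler` are hypothetical names, `sqlite3` stands in for PostgreSQL, and the clock and sleep functions are injected so the loop can be exercised without real waiting.

```python
import sqlite3

POLL_INTERVAL_S = 0.1  # mirrors the 100 ms default described above

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ready_steps ("
        " id INTEGER PRIMARY KEY,"
        " workflow_id TEXT NOT NULL,"
        " run_at REAL NOT NULL,"
        " done INTEGER NOT NULL DEFAULT 0)"
    )
    conn.commit()
    return conn

def poll_once(conn, now, handler):
    """Run every pending step whose run_at has passed; return how many ran."""
    rows = conn.execute(
        "SELECT id, workflow_id FROM ready_steps "
        "WHERE done = 0 AND run_at <= ? ORDER BY run_at, id",
        (now,),
    ).fetchall()
    for step_id, workflow_id in rows:
        with conn:  # the step is marked done only if the handler succeeds
            handler(workflow_id)
            conn.execute("UPDATE ready_steps SET done = 1 WHERE id = ?", (step_id,))
    return len(rows)

def run_scheduler(conn, handler, clock, sleep, max_polls):
    """Poll a fixed number of times, sleeping POLL_INTERVAL_S between polls."""
    total = 0
    for _ in range(max_polls):
        total += poll_once(conn, clock(), handler)
        sleep(POLL_INTERVAL_S)
    return total
```

In production use, `clock` and `sleep` would be `time.time` and `time.sleep`. A shorter interval lowers step latency at the cost of more queries against the database, which is the trade-off the configurable interval exposes.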

A key engineering decision is the use of PostgreSQL advisory locks for concurrency control. When multiple workers (PostgreSQL backends) attempt to execute the same workflow step, only one acquires the lock; the others skip or retry. This prevents duplicate execution without requiring external coordination.
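The skip-if-locked pattern can be sketched without a database. In PostgreSQL the non-blocking acquire would typically be `pg_try_advisory_lock(key)`, released with `pg_advisory_unlock(key)`. The `AdvisoryLocks` class below is an in-process stand-in for those calls, and the `worker` function and step key format are hypothetical.

```python
import threading

class AdvisoryLocks:
    """In-process stand-in for pg_try_advisory_lock / pg_advisory_unlock."""

    def __init__(self):
        self._guard = threading.Lock()
        self._held = set()

    def try_lock(self, key):
        # Non-blocking: returns False immediately if another holder exists.
        with self._guard:
            if key in self._held:
                return False
            self._held.add(key)
            return True

    def unlock(self, key):
        with self._guard:
            self._held.discard(key)

def worker(locks, completed, executions, step_key):
    """Try to run one workflow step; skip if another worker holds the lock."""
    if not locks.try_lock(step_key):
        return "skipped"
    try:
        # Re-check under the lock: an earlier winner may have finished already.
        if step_key in completed:
            return "already done"
        executions.append(step_key)
        completed.add(step_key)
        return "executed"
    finally:
        locks.unlock(step_key)

def run_contending_workers(worker_count, step_key="wf-1:step-2"):
    locks = AdvisoryLocks()
    completed, executions, results = set(), [], []
    threads = [
        threading.Thread(
            target=lambda: results.append(
                worker(locks, completed, executions, step_key)
            )
        )
        for _ in range(worker_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, executions

results, executions = run_contending_workers(8)
print(len(executions), sorted(set(results)))
```

Note that the lock alone is not sufficient: a worker that acquires the lock after the winner has released it must still see that the step is complete, which is why the completion check happens while the lock is held.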

Performance characteristics:

| Metric | pg_durable (single node) | Temporal (default config) | RabbitMQ + worker |
|---|---|---|---|
| Max throughput (workflows/sec) | 2,400 | 8,500 | 12,000 |
| P99 latency (ms) | 45 | 12 | 8 |
| State consistency guarantee | Strong (ACID) | Strong (via DB) | At-least-once |
| Infrastructure footprint | 1 PostgreSQL instance | 3+ services (DB, queue, worker) | 3+ services (queue, worker, DB) |
| Recovery time after crash | <1s (WAL replay) | 5-30s (service restart) | 10-60s (queue rebalance) |

Data Takeaway: pg_durable trades raw throughput and latency for dramatically simpler infrastructure and stronger consistency guarantees. For workflows under 2,000 executions per second — which covers the vast majority of business applications — the performance is more than adequate, and the operational savings are significant.

The extension is available on GitHub at `microsoft/pg_durable` and has already accumulated 1,504 stars. The repository includes a comprehensive test suite with over 200 integration tests, and Microsoft has committed to monthly releases with PostgreSQL 16 and 17 support.

Key Players & Case Studies

Microsoft's entry into the durable execution space is notable because it directly competes with established players while leveraging its deep PostgreSQL expertise. The Azure Database for PostgreSQL team, which maintains the `pg_durable` repository, has been a major contributor to PostgreSQL's core development, including work on logical replication and parallel query execution.

Competitive landscape:

| Solution | Company | Approach | PostgreSQL integration | Pricing model |
|---|---|---|---|---|
| pg_durable | Microsoft | In-DB extension | Native | Free, open source |
| Temporal | Temporal Technologies | External orchestrator | Via SDK | Open source + cloud ($0.10/workflow) |
| AWS Step Functions | Amazon | Managed service | Via SDK | $0.025/state transition |
| DBOS | DBOS Inc. | Database OS | Native (Postgres fork) | Open source + enterprise |
| Camunda 8 | Camunda | External engine | Via SDK | Open source + cloud ($0.05/workflow) |

Data Takeaway: pg_durable is the only solution that offers native PostgreSQL integration at zero additional cost. While Temporal and Step Functions provide higher throughput and richer features (human-in-the-loop, saga patterns), pg_durable wins on simplicity for teams already running PostgreSQL.

Early case studies are emerging. A fintech startup processing loan applications reported reducing its infrastructure from 7 services (PostgreSQL, Redis, RabbitMQ, 4 worker types) to a single PostgreSQL cluster with pg_durable. Its deployment time dropped from 45 minutes to 8 minutes, and its monthly infrastructure bill fell by 62%. An e-commerce company uses pg_durable for order fulfillment: when an order is placed, a workflow orchestrates inventory deduction, payment capture, shipping label generation, and customer notification, with each step's database changes and workflow progress committed in the same transaction.

Notable researchers have weighed in. Peter Bailis, a former Stanford professor, noted that "pg_durable validates the thesis that databases should be the foundation for reliable computation. It's a pragmatic step toward the database operating system vision." The DBOS project, which takes a more radical approach by running application code inside the database kernel, has seen renewed interest since pg_durable's release.

Industry Impact & Market Dynamics

The durable execution market is projected to grow from $1.2 billion in 2025 to $4.8 billion by 2030, according to industry estimates. pg_durable's entry threatens to commoditize a significant portion of this market — specifically the segment focused on simple, stateful background jobs.

Market segmentation:

| Segment | 2025 market share | pg_durable addressable | Key incumbents |
|---|---|---|---|
| Simple background jobs | 35% | Yes | Sidekiq, Celery, Bull |
| Complex workflow orchestration | 40% | Partial | Temporal, Camunda, Airflow |
| Event-driven microservices | 25% | Partial | AWS Step Functions, Azure Durable Functions |

Data Takeaway: pg_durable directly threatens the $420 million simple background jobs segment, where its simplicity and zero-cost model are most compelling. For complex orchestration, it will need more features (human tasks, saga compensation) to compete.
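The $420 million figure follows from the numbers already given: 35% of the $1.2 billion 2025 market. The short calculation below reproduces it, and also derives the annual growth rate implied by the 2025 and 2030 projections. The function names are illustrative.

```python
TOTAL_2025_MUSD = 1200   # $1.2 billion, in millions of dollars
TOTAL_2030_MUSD = 4800   # $4.8 billion, in millions of dollars

SHARES_PCT = {
    "Simple background jobs": 35,
    "Complex workflow orchestration": 40,
    "Event-driven microservices": 25,
}

def segment_sizes(total_musd, shares_pct):
    """Split a market total into segments using integer percentages."""
    return {name: total_musd * pct // 100 for name, pct in shares_pct.items()}

def implied_cagr(start, end, years):
    """Compound annual growth rate implied by a start and end value."""
    return (end / start) ** (1 / years) - 1

sizes = segment_sizes(TOTAL_2025_MUSD, SHARES_PCT)
print(sizes["Simple background jobs"])  # 420 (million dollars)
print(round(implied_cagr(TOTAL_2025_MUSD, TOTAL_2030_MUSD, 5) * 100, 1))
```

Quadrupling over five years works out to roughly 32% growth per year.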

The adoption curve will likely follow PostgreSQL's existing ecosystem. With over 60% of new applications choosing PostgreSQL as their primary database, the addressable market is enormous. However, migration from existing solutions will be slow — enterprises with significant investment in Temporal or Step Functions workflows are unlikely to rewrite them.

Microsoft's strategy appears to be twofold: (1) increase PostgreSQL's stickiness on Azure, where pg_durable will likely become a managed feature, and (2) establish a foothold in the open-source durable execution space before competitors like Google (which has its own internal durable execution system) or AWS (which could build a similar extension for Aurora) respond.

Risks, Limitations & Open Questions

pg_durable is not a panacea. Its most significant limitation is scalability. Because workflow state lives inside PostgreSQL, the database becomes a single point of failure and a bottleneck. For workloads exceeding 2,000 workflows per second, users must shard across multiple PostgreSQL instances — which pg_durable does not yet support natively.
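Since sharding is left to the application, one common approach is to route each workflow to a PostgreSQL instance by hashing its ID. The sketch below is a generic illustration of that idea, not something pg_durable provides. The 2,000 workflows-per-second per-node budget comes from the figure above, while the function names are hypothetical.

```python
import hashlib
import math

PER_NODE_BUDGET = 2000  # workflows/sec per instance, taken from the text above

def shards_needed(target_per_sec, per_node=PER_NODE_BUDGET):
    """Minimum number of instances to stay within the per-node budget."""
    return max(1, math.ceil(target_per_sec / per_node))

def shard_for(workflow_id, shard_count):
    """Map a workflow ID to a shard index in a stable, deterministic way."""
    digest = hashlib.sha256(workflow_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count

count = shards_needed(5000)
print(count)  # 3
print(shard_for("order-12345", count))
```

Plain modulo routing remaps most workflows when the shard count changes, so a real deployment would likely need consistent hashing or a lookup table to keep in-flight workflows on their original instance.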

Specific risks:

1. Database bloat: Each workflow instance creates multiple rows that persist until the workflow completes. Long-running workflows (hours or days) can accumulate significant storage. pg_durable includes a garbage collector, but it runs only every 5 minutes by default.

2. Lock contention: Advisory locks, while lightweight, can become a bottleneck under high concurrency. Early benchmarks show lock contention becoming significant beyond 50 concurrent workers.

3. No built-in monitoring: Unlike Temporal's web UI or AWS Step Functions' CloudWatch integration, pg_durable provides only raw SQL tables for monitoring. Teams must build their own dashboards.

4. Limited workflow patterns: pg_durable supports sequential and parallel steps, but lacks built-in support for saga compensation patterns, human-in-the-loop approvals, or dynamic workflow generation.

5. Vendor lock-in concern: While pg_durable is open source, its deep integration with PostgreSQL internals means migrating to another database would require a complete rewrite of workflow logic.
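Items 1 and 3 above both come down to plain SQL against the state tables. As a rough sketch of what a team might build, the example below counts workflows by status for a dashboard and deletes completed rows older than a retention window. `sqlite3` stands in for PostgreSQL, and the `workflow_instances` schema, status values, and retention parameter are all assumptions, not pg_durable's actual tables or settings.

```python
import sqlite3

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE workflow_instances ("
        " id TEXT PRIMARY KEY,"
        " status TEXT NOT NULL,"   # 'running' or 'completed' in this sketch
        " finished_at REAL)"       # NULL while the workflow is running
    )
    conn.commit()
    return conn

def status_counts(conn):
    """Dashboard-style summary: number of workflows per status."""
    rows = conn.execute(
        "SELECT status, COUNT(*) FROM workflow_instances GROUP BY status"
    ).fetchall()
    return dict(rows)

def collect_garbage(conn, now, retention_s):
    """Delete completed workflows that finished before the retention cutoff."""
    with conn:
        cur = conn.execute(
            "DELETE FROM workflow_instances "
            "WHERE status = 'completed' AND finished_at <= ?",
            (now - retention_s,),
        )
    return cur.rowcount

conn = make_db()
with conn:
    conn.executemany(
        "INSERT INTO workflow_instances VALUES (?, ?, ?)",
        [("a", "completed", 10.0), ("b", "completed", 900.0), ("c", "running", None)],
    )
print(status_counts(conn))
print(collect_garbage(conn, now=1000.0, retention_s=300.0))
```

Running workflows are never touched by the cleanup, which is why long-running instances remain the main source of bloat regardless of how often the collector runs.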

Open questions remain about long-term maintenance. Microsoft has a mixed track record with open-source projects — while .NET and VS Code thrive, projects like ChakraCore and Dapr have seen reduced investment. The community will watch closely to see if pg_durable receives sustained engineering resources.

AINews Verdict & Predictions

pg_durable is a significant contribution that addresses a real pain point: the complexity of managing state across multiple systems. Its decision to embed workflows directly into PostgreSQL is both its greatest strength and its most limiting constraint.

Our predictions:

1. Within 12 months, pg_durable will become the default choice for new PostgreSQL-based applications that need simple background jobs. The combination of zero additional infrastructure, ACID guarantees, and familiar SQL interface is too compelling for teams starting greenfield projects.

2. Microsoft will release pg_durable as a managed Azure service within 6 months, likely as part of Azure Database for PostgreSQL Flexible Server. This will include auto-scaling, monitoring dashboards, and integration with Azure Functions for custom step handlers.

3. The project will fork. The community will inevitably create a fork that adds horizontal scaling support, possibly using PostgreSQL's built-in partitioning or Citus-style sharding. This fork may become the de facto standard for high-throughput use cases.

4. Temporal and Camunda will respond by offering deeper PostgreSQL integration, possibly by making their own database-backed state stores the default configuration. The era of requiring a separate queue for every workflow is ending.

5. By 2028, 'database-native orchestration' will be a standard feature in all major relational databases — expect similar extensions from MySQL (Oracle), SQL Server (Microsoft), and perhaps even SQLite for embedded use cases.

The bottom line: pg_durable is not a Temporal killer, but it doesn't need to be. It's a pragmatic tool for the 80% of use cases that don't need Temporal's complexity. For those cases, it's a game-changer.
