Longhorn Manager's Microservice Architecture Redefines Kubernetes Storage at Scale

GitHub March 2026
⭐ 203
Source: GitHub | Archive: March 2026
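As the core control plane of Longhorn, a CNCF incubating project, Longhorn Manager shows unprecedented scalability in orchestrating persistent storage for Kubernetes. It treats each storage volume as an independent microservice, giving stateful applications a greatly simplified operating model.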

Longhorn Manager represents a fundamental rethinking of how persistent block storage should be integrated into Kubernetes environments. Unlike monolithic storage systems that manage pools of capacity, Longhorn Manager instantiates a dedicated controller and replica instances for every single volume, creating a true microservices architecture for storage. This design, built entirely on Kubernetes Custom Resource Definitions (CRDs) and operators, provides granular lifecycle management, high availability through synchronous replication, and enterprise features like incremental snapshots and cross-cluster backup.

The system's significance lies in its operational simplicity. By leveraging Kubernetes as its substrate, it eliminates the need for separate storage administration skills, allowing platform teams to manage petabytes of storage using familiar kubectl commands or its intuitive web UI. This dramatically lowers the barrier to deploying reliable storage for stateful applications like databases (PostgreSQL, MySQL), message queues (Kafka), and custom applications.

However, this architectural elegance comes with inherent trade-offs. The user-space implementation and per-volume overhead can introduce latency and throughput limitations compared to kernel-level drivers or high-performance commercial arrays. The project's growth, evidenced by its steady GitHub star accumulation and adoption in production by companies like SUSE (which offers Rancher Prime with Longhorn) and numerous mid-market enterprises, underscores a clear market need: a 'good enough,' Kubernetes-native storage solution that prioritizes operational simplicity and declarative management over peak performance. As Kubernetes becomes the default runtime for both stateless and stateful services, Longhorn Manager's approach of embedding storage logic directly into the orchestration layer is gaining substantial traction.

Technical Deep Dive

At its core, Longhorn Manager is a collection of Kubernetes controllers that reconcile the state of custom resources, primarily the `Volume` and `Node` CRDs. When a user creates a PersistentVolumeClaim (PVC), the Longhorn CSI driver calls the manager, which then orchestrates the creation of a volume microservice. This microservice comprises a controller, called the engine in Longhorn's own terminology, which exposes the frontend iSCSI block device and handles I/O, and a set of replicas that store the actual data and are distributed across worker nodes. In current releases the engine and replicas run as processes inside per-node instance-manager pods rather than as one pod each.
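The PVC that starts this flow is an ordinary Kubernetes object. The sketch below builds one as a Python dict and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The claim name `pg-data`, the 20 GiB size, and the StorageClass name `longhorn` are assumptions for illustration; check the StorageClass name in your own cluster before using it.

```python
import json


def longhorn_pvc(name: str, size_gi: int, storage_class: str = "longhorn") -> dict:
    """Build a PersistentVolumeClaim manifest as a plain dict.

    `storage_class` is assumed to name a StorageClass backed by the
    Longhorn CSI driver; the default value here is illustrative.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": f"{size_gi}Gi"}},
        },
    }


if __name__ == "__main__":
    # Pipe this output to `kubectl apply -f -` to create the claim.
    print(json.dumps(longhorn_pvc("pg-data", 20), indent=2))
```

Nothing in the manifest is Longhorn-specific except the StorageClass reference, which is the point of the design: the storage backend is selected declaratively and the rest is standard Kubernetes.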

The replication protocol is a key innovation. It uses a log-structured, copy-on-write approach for all writes. When a write request arrives at the controller, it is assigned a sequence number and forwarded to all replicas. Each replica writes the data to its local disk (typically a mounted block device or partition) and acknowledges only after the write is persisted. This synchronous replication ensures strong consistency and forms the basis for crash-consistent snapshots. A snapshot is merely a marker in the write log; subsequent writes go to new segments, enabling space-efficient, incremental snapshots without performance-degrading copy operations.
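The write path described above can be illustrated with a toy in-memory model. This is a simplified sketch of the behavior as the article describes it, not Longhorn's actual engine code: the class names, the log layout, and the policy of dropping a replica that fails to acknowledge are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy replica: an append-only log of (seq, offset, data) records."""
    name: str
    healthy: bool = True
    log: list = field(default_factory=list)

    def persist(self, seq: int, offset: int, data: bytes) -> bool:
        """Append the record and acknowledge, or refuse if unhealthy."""
        if not self.healthy:
            return False
        self.log.append((seq, offset, data))
        return True


class Controller:
    """Toy controller: sequences writes and fans them out synchronously."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        self.next_seq = 1
        self.snapshots = {}  # snapshot name -> last sequence number it includes

    def write(self, offset: int, data: bytes) -> int:
        seq = self.next_seq
        # The write completes only after every remaining replica has persisted it.
        acked = [r for r in self.replicas if r.persist(seq, offset, data)]
        if not acked:
            raise IOError("no replica persisted the write")
        self.replicas = acked  # assumption: replicas that fail to ack are dropped
        self.next_seq += 1
        return seq

    def snapshot(self, name: str) -> None:
        # A snapshot is only a marker; no data is copied.
        self.snapshots[name] = self.next_seq - 1

    def read(self, offset: int, snapshot: str = None):
        limit = self.snapshots[snapshot] if snapshot else self.next_seq - 1
        value = None
        for seq, off, data in self.replicas[0].log:
            if off == offset and seq <= limit:
                value = data  # later records overwrite earlier ones
        return value
```

Reading with a snapshot name replays the log only up to that marker, which is why taking the snapshot itself costs nothing beyond recording a sequence number.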

The `longhorn/longhorn-manager` GitHub repository, one component of the broader Longhorn project coordinated through `longhorn/longhorn`, contains the control plane logic. Recent commits show a focus on stability at scale, improved disaster recovery workflows, and integration with broader Kubernetes ecosystem tools like Velero for backups. The architecture is built around failure handling: the system is designed to detect failed replica instances, automatically rebuild data on healthy nodes, and promote a new controller instance if the active one fails.
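The detect-and-rebuild behavior follows the usual Kubernetes reconcile pattern: compare the desired replica count with the observed state and schedule replacements. The function below is a minimal sketch of that pattern only; the state names, the node names, and the rule of never reusing a node that already held a replica are illustrative assumptions, not Longhorn's scheduler.

```python
LIVE_STATES = {"healthy", "rebuilding"}


def reconcile(desired: int, replicas: dict, schedulable_nodes: list) -> dict:
    """Return the replica placement after one reconcile pass.

    `replicas` maps node name -> state ("healthy", "rebuilding", "failed").
    Failed replicas are discarded, and replacements are placed on
    schedulable nodes that have not hosted a replica of this volume.
    """
    live = {node: state for node, state in replicas.items() if state in LIVE_STATES}
    candidates = [n for n in schedulable_nodes if n not in replicas]
    for node in candidates:
        if len(live) >= desired:
            break
        live[node] = "rebuilding"  # stands in for copying data from a healthy peer
    return live


if __name__ == "__main__":
    observed = {"node-a": "healthy", "node-b": "failed", "node-c": "healthy"}
    print(reconcile(3, observed, ["node-a", "node-c", "node-d"]))
```

Running the function again on its own output changes nothing, which is the property a reconcile loop needs: it can be retried safely after any interruption.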

Performance characteristics are well-documented. Longhorn operates optimally in environments with low-latency networks (e.g., intra-data center) and direct-attached storage or fast cloud volumes on worker nodes. Its throughput is bounded by network replication overhead and user-space processing.
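Why write latency depends on the replicas can be seen in a back-of-the-envelope model: with synchronous replication, a write finishes only when the slowest replica has acknowledged it. The function and the numbers below are illustrative assumptions, not measurements of Longhorn.

```python
def write_latency_ms(replica_rtts_ms, disk_persist_ms, engine_overhead_ms=0.5):
    """Estimate the latency of one synchronous write.

    Replicas are written in parallel, so the write waits for the slowest
    round trip plus disk persist, on top of a fixed processing overhead.
    All inputs are hypothetical values in milliseconds.
    """
    if not replica_rtts_ms:
        raise ValueError("at least one replica is required")
    slowest = max(rtt + disk_persist_ms for rtt in replica_rtts_ms)
    return engine_overhead_ms + slowest


if __name__ == "__main__":
    same_rack = write_latency_ms([0.2, 0.2, 0.3], disk_persist_ms=1.0)
    one_remote = write_latency_ms([0.2, 0.3, 1.0], disk_persist_ms=1.0)
    print(f"same rack: {same_rack:.2f} ms, one remote replica: {one_remote:.2f} ms")
```

In this model the replica count matters less than the placement of the slowest replica, which is consistent with the recommendation to keep replicas on a low-latency network.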

| Storage Solution | Architecture | Consistency Model | Snapshot Efficiency | Typical Read Latency (4k random) | Typical Write Latency (4k random) |
|---|---|---|---|---|---|
| Longhorn | Microservice-per-volume, User-space | Strong (sync replication) | High (incremental, CoW) | 2-5 ms | 3-8 ms (depends on replica count) |
| Ceph RBD | Monolithic Cluster, Kernel | Strong/Eventual | Medium (depends on pool) | 1-3 ms | 1-4 ms |
| OpenEBS (cStor) | Containerized, User-space | Strong | High (incremental) | 3-7 ms | 4-10 ms |
| AWS EBS | Cloud-managed, Kernel | Strong | High | 0.5-2 ms | 1-3 ms |

Data Takeaway: The table reveals Longhorn's primary trade-off: it sacrifices some raw latency (due to user-space and network hops) for vastly superior operational simplicity and Kubernetes-native integration compared to Ceph. Its performance is competitive with other container-native solutions like OpenEBS, positioning it in the 'easy-to-manage' tier rather than the 'maximum-performance' tier.

Key Players & Case Studies

The development of Longhorn was initiated by Sheng Liang and the team at Rancher Labs (acquired by SUSE in 2020). Their vision was to solve the persistent storage problem for the Rancher Kubernetes platform's users. The project joined the Cloud Native Computing Foundation (CNCF) as a Sandbox project in 2019 and moved to incubating status in 2021, signaling its growing maturity and community adoption. SUSE now offers Longhorn as a core component of its Rancher Prime subscription, providing enterprise support and hardened builds.

A notable case study is a mid-sized fintech company migrating its on-premise MySQL and Redis instances to a hybrid-cloud Kubernetes platform. They evaluated Rook with Ceph but found the operational complexity and resource requirements prohibitive for their small platform team. By deploying Longhorn, they were able to provide developers with self-service, durable volumes via standard PVCs, achieving recovery point objectives (RPO) of zero for critical databases through three-way replication. The built-in backup to S3-compatible object storage satisfied their disaster recovery requirements without additional tooling.

Competition in this space is fierce. Red Hat OpenShift Data Foundation (based on Ceph and NooBaa) targets the full-stack, enterprise OpenShift platform. VMware Tanzu Kubernetes Grid Integrated Edition offers vSphere storage integration. Portworx, now part of Pure Storage, focuses on data services (encryption, backups, multi-cloud mobility) for large enterprises, but at a higher cost and complexity.

| Product/Project | Primary Backer | Licensing Model | Key Differentiator | Ideal Use Case |
|---|---|---|---|---|
| Longhorn | CNCF Community / SUSE | Open Source (Apache 2.0) | Extreme Kubernetes-native simplicity, per-volume microservice | Kubernetes teams needing simple, reliable storage for standard stateful apps |
| Portworx (Pure Storage) | Pure Storage | Commercial (with free tier) | Advanced data services, multi-cloud data mobility | Enterprise Kubernetes with strict security, compliance, and DR needs |
| Rook (Ceph) | CNCF / Red Hat | Open Source (Apache 2.0) | Mature, feature-rich storage platform at scale | Large deployments where operators can manage Ceph's complexity |
| Google GKE Persistent Disk CSI | Google Cloud | Managed Service | Deep GCP integration, high performance | GKE-exclusive workloads requiring top-tier cloud performance |

Data Takeaway: The competitive landscape shows clear segmentation. Longhorn occupies the 'developer-friendly' and 'platform team-empowering' quadrant, winning through ease of use rather than raw feature breadth. Its open-source model and CNCF affiliation give it a significant advantage in community-driven environments over commercial alternatives like Portworx.

Industry Impact & Market Dynamics

Longhorn Manager is a catalyst for the 'container-attached storage' (CAS) market segment. CAS refers to storage architectures where the control and data paths are scaled per workload or container, aligning with microservices principles. This paradigm shift is accelerating Kubernetes adoption for stateful workloads, a market projected to grow from $1.3 billion in 2023 to over $5.8 billion by 2028, according to industry analysis.
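The growth rate implied by that projection is easy to check. Taking the two figures quoted above at face value, the compound annual growth rate works out to roughly 35% per year; the snippet only restates the article's numbers and does not verify them.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end / start) ** (1 / years) - 1


# Figures as quoted in the text: $1.3 billion (2023) to $5.8 billion (2028).
rate = cagr(1.3, 5.8, 2028 - 2023)

if __name__ == "__main__":
    print(f"Implied CAGR: {rate:.1%}")
```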

The impact is most profound in small to medium enterprise (SME) and platform engineering teams. These groups often lack dedicated storage administrators. Longhorn democratizes access to resilient storage by encapsulating expertise into software. This lowers the total cost of ownership and accelerates development cycles, as teams no longer need to file tickets with a separate storage team to provision volumes.

Funding and commercial activity around the ecosystem are increasing. While Longhorn itself is open source, SUSE's commercial backing provides a stable downstream. Furthermore, several managed Kubernetes service providers are beginning to offer Longhorn as a built-in or easily installable storage option, recognizing its appeal for users who find cloud providers' native block storage too expensive or insufficiently integrated for complex, multi-tenant clusters.

| Adoption Metric | 2022 Estimate | 2024 Estimate | Growth Driver |
|---|---|---|---|
| Production Clusters Using Longhorn | ~15,000 | ~45,000 | Rise of in-house platform teams, SME Kubernetes adoption |
| Annual PVCs Orchestrated (Est.) | ~50 Million | ~200 Million | Growth of CI/CD, ephemeral environments, data pipelines |
| Contributor Companies (GitHub) | ~12 | ~25 | CNCF incubation attracting corporate contributors |

Data Takeaway: The estimated growth in PVC orchestration is staggering, highlighting Longhorn's role in enabling the 'data-intensive' side of cloud-native development. It's becoming the default storage choice for organizations that prioritize Kubernetes consistency over infrastructure specialization.

Risks, Limitations & Open Questions

Performance ceilings remain the most cited limitation. For I/O-intensive workloads like high-transaction-rate databases or large-scale data processing (e.g., Apache Spark), the overhead of user-space TCP/IP stack traversal and synchronous network replication can become a bottleneck. Longhorn is generally not recommended for latency-sensitive applications requiring sub-millisecond response times.

Operational complexity shifts rather than disappears. While volume management is simple, underlying infrastructure requirements become critical. Longhorn's performance and reliability are directly tied to the network (low latency, high throughput) and the performance of the underlying block device on each node (local NVMe, cloud SSD). Managing these node-level resources at scale—ensuring they are not over-provisioned, monitoring disk health—introduces a new layer of infrastructure concern.
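The kind of node-level check this implies can be sketched as a simple admission test: before placing a new replica on a disk, confirm that the nominal sizes already scheduled stay within an over-provisioning budget and that real free space stays above a floor. The parameter names, default percentages, and formula below are hypothetical and only loosely modeled on the idea; they are not Longhorn's actual scheduling logic.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    capacity_gib: float
    scheduled_gib: float  # sum of nominal sizes of replicas already placed
    used_gib: float       # space actually consumed on the disk


def can_schedule(disk: Disk, replica_gib: float,
                 overprovision_pct: float = 100,
                 min_available_pct: float = 25) -> bool:
    """Return True if a replica of `replica_gib` may be placed on `disk`.

    Both thresholds are illustrative assumptions expressed as percentages
    of the disk's raw capacity.
    """
    budget_gib = disk.capacity_gib * overprovision_pct / 100
    fits_budget = disk.scheduled_gib + replica_gib <= budget_gib
    free_pct = 100 * (disk.capacity_gib - disk.used_gib) / disk.capacity_gib
    has_headroom = free_pct > min_available_pct
    return fits_budget and has_headroom


if __name__ == "__main__":
    disk = Disk(capacity_gib=1000, scheduled_gib=800, used_gib=300)
    print(can_schedule(disk, 100))  # within budget and plenty of free space
```

Because volumes are thin-provisioned, the two checks can disagree: a disk may be within its nominal budget while nearly full in practice, which is why both scheduled and used space need monitoring.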

Security in a multi-tenant environment is an open question. While Longhorn supports volume encryption at rest, the fine-grained access control and quota management across numerous teams sharing a large cluster are less mature than in traditional enterprise storage arrays or commercial CAS solutions.

The project's future development pace is also a consideration. As a CNCF incubation project, it relies on a mix of community and corporate contributions. Competing priorities could slow the development of advanced features like stretched clusters for true metro-area high availability or deeper integration with Kubernetes security contexts.

AINews Verdict & Predictions

Longhorn Manager is not just a storage component; it is a strategic enabler for the maturation of Kubernetes as a universal application platform. Its genius lies in its constraint-accepting design: it does not try to beat high-end storage at its own game but instead redefines the game around Kubernetes operational models.

Our predictions are as follows:

1. Prediction 1: Longhorn will become the default storage choice for 70% of new on-premise and hybrid Kubernetes deployments in SMEs by 2026. Its combination of CNCF pedigree, straightforward installation, and 'good enough' performance will make it the path of least resistance, much like Nginx became for web serving.

2. Prediction 2: The major cloud providers will launch 'Longhorn-as-a-Service' managed offerings within the next 24 months. Recognizing the operational pull of its model, AWS, Azure, and GCP will offer integrated, managed Longhorn services on their Kubernetes engines (EKS, AKS, GKE), providing an alternative to their native disk services that feels more 'Kubernetes-native' to developers.

3. Prediction 3: The next major performance breakthrough will come from eBPF integration. We anticipate the Longhorn community will explore using eBPF to shortcut the network and I/O stack, moving data path operations closer to the kernel. This could reduce latency by 30-50%, closing the gap with kernel-based drivers without sacrificing the microservice architecture.

The final verdict: Longhorn Manager is a pivotal piece of infrastructure software that successfully translates the ethos of microservices and declarative management to the stubbornly stateful world of block storage. Its limitations are real but intentional, trading peak performance for radical operational simplicity. For the vast majority of Kubernetes workloads that are not pushing the boundaries of I/O physics, that is an excellent trade. Its continued evolution will be a primary indicator of Kubernetes' success in fully digesting the data center stack.

