etcd-operator joins Cozystack with a new v1alpha2 API
epower · 2026-06-29 · via Cloud Native Computing Foundation

Posted on June 29, 2026 by Andrey Kolkov and Andrei Kvapil, Ænix

CNCF projects highlighted in this post: Cozystack

The etcd-operator project, which develops an operator for deploying and maintaining etcd clusters on Kubernetes, has been donated to the Cozystack project. Alongside the donation, a from-scratch implementation of the operator has been published under a new API version — etcd-operator.cozystack.io/v1alpha2, superseding the previous etcd.aenix.io/v1alpha1. Instead of managing members through a StatefulSet, the new implementation directly drives etcd’s native Membership API (the MemberAdd, MemberPromote and MemberRemove operations), giving the operator full control over cluster membership. The new implementation was written by Timofei Larkin, one of the maintainers of the previous codebase, which is preserved in the v1alpha1 branch. The project is written in Go and distributed under the Apache 2.0 license.

The project was started by Ænix, which assembled an initiative group from the Kubernetes community to build it. After the base implementation was completed, an attempt was made to donate the project to the CNCF. Prompted by this initiative, the etcd project concluded that an official operator was needed and formed its own working group. After evaluating existing implementations, that group chose to develop a codebase from scratch, which is how etcd-io/etcd-operator came to be. Feature-wise, the official operator has not yet caught up with the aenix etcd-operator, which is already used in production by the community and by projects such as Cozystack and Kamaji. The project has therefore continued its own independent line of development (a comparison with the official operator is given at the end of this article).

The operator manages etcd clusters through two resources: EtcdCluster describes the desired state of a cluster (replica count, etcd version, storage parameters, TLS, authentication, etcd tuning), while EtcdMember is created by the operator itself for every cluster member and owns its Pod and PVC. Unlike typical solutions, the operator does not use a StatefulSet — each member’s Pod and PVC are reconciled independently, and cluster membership changes go through etcd’s Membership API: new members join as learners (MemberAdd) and are later promoted to voting members (MemberPromote), removal is performed with a graceful exit from quorum (MemberRemove), and pausing a cluster preserves member identity. The rationale behind this architecture is described in concepts.md.
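
The membership flow described above can be illustrated with a small in-memory model. This is only a sketch: `ToyCluster`, `scale_to` and the `member-N` naming are hypothetical, nothing here talks to a real etcd, and the operator's actual reconciliation logic is not shown in this article. The sketch assumes that every new member joins as a learner, is promoted before the next one is added, and that scaling down removes one member at a time.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """In-memory stand-in for an etcd member list; not a real etcd client."""
    voters: set = field(default_factory=set)
    learners: set = field(default_factory=set)
    next_index: int = 0  # hypothetical counter used to name new members

    def member_add_as_learner(self, name: str) -> None:
        # Models MemberAdd: the new member joins without voting rights.
        if name in self.voters or name in self.learners:
            raise ValueError(f"{name} is already a member")
        self.learners.add(name)

    def member_promote(self, name: str) -> None:
        # Models MemberPromote: a learner becomes a voting member.
        if name not in self.learners:
            raise ValueError(f"{name} is not a learner")
        self.learners.remove(name)
        self.voters.add(name)

    def member_remove(self, name: str) -> None:
        # Models MemberRemove: the member leaves the cluster.
        self.voters.discard(name)
        self.learners.discard(name)


def scale_to(cluster: ToyCluster, desired: int) -> list:
    """Change the voter count one member at a time and return the steps taken."""
    if desired < 1:
        # Pausing (spec.replicas: 0) keeps member identity, so it is not
        # modelled as a sequence of removals in this sketch.
        raise ValueError("this sketch only models scaling between live sizes")
    steps = []
    while len(cluster.voters) < desired:
        name = f"member-{cluster.next_index}"
        cluster.next_index += 1
        cluster.member_add_as_learner(name)
        steps.append(("MemberAdd", name))
        cluster.member_promote(name)
        steps.append(("MemberPromote", name))
    while len(cluster.voters) > desired:
        # Assumption: remove the most recently added member first.
        name = max(cluster.voters, key=lambda n: int(n.rsplit("-", 1)[1]))
        cluster.member_remove(name)
        steps.append(("MemberRemove", name))
    return steps


# Start from an already bootstrapped single-member cluster.
cluster = ToyCluster(voters={"member-0"}, next_index=1)
for step in scale_to(cluster, 3):
    print(step)
```

A real operator would additionally wait for each learner to catch up with the leader before promoting it; the model skips that wait to keep the ordering of calls visible.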

Key features

  • cluster bootstrap and scaling in both directions one member at a time: learner-mode joins, graceful removal with exit from quorum;
  • pausing a cluster without losing data (spec.replicas: 0) and resuming it with the same cluster and member identifiers;
  • data storage in PVCs (default) or in tmpfs — for data that can be reconstructed; memory-backed members are automatically recreated when their Pod is lost;
  • independent TLS configuration for client and peer connections: bring your own Secrets or let the operator issue and automatically renew certificates via cert-manager;
  • authentication with a single root user whose credentials are supplied through a Secret;
  • snapshots to S3 or a PVC via the EtcdSnapshot resource and cluster restore from a snapshot at initial bootstrap;
  • an automatically created PodDisruptionBudget that prevents drain operations from breaking quorum;
  • spec validation by the apiserver (CEL expressions in the CRD) without webhooks or a cert-manager dependency;
  • the /scale subresource, which makes kubectl scale and the VerticalPodAutoscaler work, a metrics port on 2381, pass-through affinity and topologySpreadConstraints;
  • the kubectl-etcd plugin for day-2 operations performed after the cluster is deployed.
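
To make the quorum protection concrete, the arithmetic behind it can be sketched as follows. The majority formula is standard for Raft-based systems such as etcd; the function names are hypothetical, and how the operator actually sizes its PodDisruptionBudget is an assumption here, not something the article specifies.

```python
def quorum(voters: int) -> int:
    """Smallest majority of voting members."""
    return voters // 2 + 1


def max_unavailable(voters: int) -> int:
    """How many voting members can be down while a majority remains."""
    return voters - quorum(voters)


def eviction_is_safe(voters: int, healthy: int) -> bool:
    """True if evicting one healthy voter still leaves a majority running."""
    return healthy - 1 >= quorum(voters)


for n in (1, 3, 5):
    print(f"voters={n} quorum={quorum(n)} max_unavailable={max_unavailable(n)}")
```

For a 3-member cluster this gives a quorum of 2 and one tolerated failure, so a drain that would take down a second member at the same time is the case a disruption budget needs to block.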

What changed compared to v1alpha1

Compared with the old etcd.aenix.io/v1alpha1 implementation, the following changes were made:

  • the API group changed from etcd.aenix.io to etcd-operator.cozystack.io;
  • separate per-member EtcdMember resources are used instead of a StatefulSet;
  • the free-form spec.options map was replaced with a typed set of parameters (quota-backend-bytes, auto-compaction mode and retention, snapshot-count) — the free-form map allowed passing flags that conflicted with the operator’s logic;
  • the EtcdBackup resource was renamed to EtcdSnapshot with its semantics preserved;
  • validation moved from a webhook to CEL rules in the CRD;
  • the cluster Service was switched to headless mode, which is required for stable per-member DNS names.
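
The move from a free-form map to a typed, closed set of parameters can be sketched as below. The class and field names are hypothetical and do not reproduce the actual CRD schema; the four etcd flags it renders are the ones named in the list above. The point of the sketch is that an unknown key is rejected at construction time instead of being passed through to etcd.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EtcdOptions:
    """Hypothetical typed counterpart of the old free-form options map."""
    quota_backend_bytes: Optional[int] = None
    auto_compaction_mode: Optional[str] = None       # "periodic" or "revision"
    auto_compaction_retention: Optional[str] = None  # e.g. "1h" or "1000"
    snapshot_count: Optional[int] = None

    def __post_init__(self):
        if self.auto_compaction_mode not in (None, "periodic", "revision"):
            raise ValueError("auto_compaction_mode must be 'periodic' or 'revision'")
        if self.quota_backend_bytes is not None and self.quota_backend_bytes < 0:
            raise ValueError("quota_backend_bytes must not be negative")
        if self.snapshot_count is not None and self.snapshot_count <= 0:
            raise ValueError("snapshot_count must be positive")

    def to_flags(self) -> list:
        """Render only the parameters that were set as etcd command-line flags."""
        flags = []
        if self.quota_backend_bytes is not None:
            flags.append(f"--quota-backend-bytes={self.quota_backend_bytes}")
        if self.auto_compaction_mode is not None:
            flags.append(f"--auto-compaction-mode={self.auto_compaction_mode}")
        if self.auto_compaction_retention is not None:
            flags.append(f"--auto-compaction-retention={self.auto_compaction_retention}")
        if self.snapshot_count is not None:
            flags.append(f"--snapshot-count={self.snapshot_count}")
        return flags


options = EtcdOptions(
    quota_backend_bytes=8 * 1024**3,
    auto_compaction_mode="periodic",
    auto_compaction_retention="1h",
    snapshot_count=10000,
)
print(options.to_flags())
```

With this shape, something like `EtcdOptions(initial_cluster="...")` fails with a `TypeError`, which mirrors how a closed schema keeps users from overriding flags the operator needs to control itself.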

Migration is performed in place with the etcd-migrate tool: a running cluster of the old operator is adopted without moving data, restarting Pods or losing quorum — only object ownership, labels and annotations are changed, after which the new operator takes over. Clients that reach the cluster by DNS name keep working unchanged. The procedure is described in migration.md.

Comparison with the official operator

The implementation covers most items of the roadmap of the official etcd-operator developed by the etcd project. Status by roadmap item:

  1. Create a new etcd cluster, e.g., a 3- or 5-member cluster of a specified etcd version — implemented.
  2. Understand health of a cluster — implemented.
  3. Enabling TLS communication, including cert renewal — implemented.
  4. Upgrade across patches or one minor version — partially implemented: spec.version applies only to newly created members.
  5. Scale in and out, e.g., 1 -> 3 -> 5 members and vice versa — implemented.
  6. Support customizing etcd options (via flags or env vars) — implemented, as a typed closed set of parameters.
  7. Recover a single failed cluster member (still have quorum) — partially implemented: members with a broken PVC are not replaced automatically yet.
  8. Recover from multiple failed cluster members (quorum loss) — not implemented, work is planned.
  9. Create on-demand backup of a cluster — implemented.
  10. Create periodic backup of a cluster — deliberately out of scope: recurring snapshots are expected to be driven by a standard CronJob.
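
Roadmap item 4 ("across patches or one minor version") can be read as a simple rule on version numbers. The sketch below encodes that reading; the function names are hypothetical, and it does not describe the operator's behavior, since spec.version currently applies only to newly created members. Treating downgrades and major-version jumps as unsupported is an assumption made for the example.

```python
def parse_version(version: str) -> tuple:
    """Turn 'v3.5.9' or '3.5.9' into (3, 5, 9)."""
    major, minor, patch = (int(part) for part in version.lstrip("v").split("."))
    return major, minor, patch


def is_supported_upgrade(current: str, target: str) -> bool:
    """True for a patch bump or a step of exactly one minor version."""
    cur = parse_version(current)
    tgt = parse_version(target)
    if tgt <= cur:
        return False  # same version or downgrade (assumed unsupported here)
    if tgt[0] != cur[0]:
        return False  # major-version jump (assumed unsupported here)
    return tgt[1] - cur[1] <= 1


print(is_supported_upgrade("3.5.9", "3.5.12"))  # patch bump within one minor
print(is_supported_upgrade("3.5.9", "3.6.0"))   # one minor version forward
print(is_supported_upgrade("3.4.0", "3.6.0"))   # skips a minor version
```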

Beyond that roadmap, v1alpha2 also ships capabilities the official plan does not enumerate, driven by the Cozystack and Kamaji multi-tenant use case:

  • scale to zero (pause/resume) preserving cluster and member identity;
  • memory-backed (tmpfs) storage with operator-driven member replacement;
  • apiserver-side CEL validation — no webhook, no certificate dependency;
  • an auto-emitted PodDisruptionBudget scoped to voting members;
  • the /scale subresource with a populated status.selector, so kubectl scale and a VerticalPodAutoscaler.targetRef work directly;
  • pass-through scheduling (affinity, topologySpreadConstraints) and merged additionalMetadata across every owned object;
  • an in-place migration tool from the legacy operator;
  • the kubectl-etcd plugin for day-2 operations.