How to Upgrade Ceph Clusters in Stretch Mode
Nawaz Dhandala · 2026-03-31 · via OneUptime Blog

Upgrade Considerations for Stretch Mode

Upgrading a Ceph cluster in stretch mode requires extra care because the monitors must keep quorum throughout the upgrade. Restarting monitors on both sites at the same time can break quorum, and restarting OSDs on both sites at the same time can make data unavailable; either can cause a cluster-wide outage.
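
The quorum requirement comes down to simple arithmetic: the monitors need a strict majority to stay up. A minimal sketch of that check follows, assuming the five-monitor layout used in this article (two per site plus the arbiter). `quorum_survives` is a hypothetical helper, and it models only the majority rule, not any additional site-level rules that stretch mode applies.

```shell
# quorum_survives TOTAL DOWN
# Exit 0 if a strict majority of TOTAL monitors is still up
# after DOWN of them are stopped, non-zero otherwise.
quorum_survives() {
  total="$1"
  down="$2"
  majority=$(( total / 2 + 1 ))
  [ $(( total - down )) -ge "$majority" ]
}

# Usage: with 5 monitors, restarting one at a time is safe.
#   quorum_survives 5 1 && echo "quorum holds"
```

With five monitors the majority is three, so restarting one monitor at a time leaves a comfortable margin, while losing three at once does not.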

Pre-Upgrade Checklist

Before starting the upgrade, verify the cluster is healthy:

ceph status
ceph health detail
ceph osd stat
ceph mon stat

Ensure all PGs are active+clean; in the output, the active+clean count should match the total PG count:

ceph pg stat | grep "active+clean"
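
A stricter, scriptable version of this check compares the total PG count with the active+clean count. This is a minimal sketch: `pgs_all_clean` is a hypothetical helper, and it assumes the plain-text `ceph pg stat` layout of `N pgs: M active+clean; ...`, which may differ between releases.

```shell
# Hypothetical helper: reads `ceph pg stat` text on stdin.
# Exits 0 only when the total PG count equals the count of PGs
# whose state is exactly "active+clean".
pgs_all_clean() {
  awk '
    {
      for (i = 2; i <= NF; i++) {
        token = $i
        sub(/[,;]$/, "", token)            # drop a trailing "," or ";"
        if (token == "pgs:")         total = $(i - 1) + 0
        if (token == "active+clean") clean = $(i - 1) + 0
      }
    }
    END { exit !(total > 0 && total == clean) }
  '
}

# Usage against a live cluster:
#   ceph pg stat | pgs_all_clean && echo "all PGs active+clean"
```

PGs in combined states such as active+clean+scrubbing are deliberately not counted as clean here, which makes the check conservative.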

Set maintenance flags so that restarting OSDs are not marked out (which would trigger rebalancing) and scrubbing does not compete with the upgrade:

ceph osd set noout
ceph osd set noscrub
ceph osd set nodeep-scrub
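
Since the same three flags are set here and cleared again at the end, a small wrapper can keep the two lists in sync. This is a sketch only: `upgrade_flags` and the `CEPH` override are hypothetical names introduced for illustration, and setting `CEPH=echo` gives a dry run that prints the commands instead of running them.

```shell
# Hypothetical wrapper; override CEPH (e.g. CEPH=echo) for a dry run.
CEPH="${CEPH:-ceph}"

# upgrade_flags set|unset
# Apply or clear the three maintenance flags in one call.
upgrade_flags() {
  action="$1"
  case "$action" in
    set|unset) ;;
    *) echo "usage: upgrade_flags set|unset" >&2; return 2 ;;
  esac
  for flag in noout noscrub nodeep-scrub; do
    $CEPH osd "$action" "$flag" || return 1
  done
}

# Usage:
#   upgrade_flags set      # before the upgrade
#   upgrade_flags unset    # after all daemons are upgraded
```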

Step 1 - Upgrade the Arbiter Monitor

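Start with the arbiter, since it hosts no OSDs and its restart has minimal impact on cluster operations. Note that cephadm enforces its own daemon ordering during staggered upgrades (managers before monitors), so if the command below is refused, run the manager upgrade from Step 4 first: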

ceph orch upgrade start --image quay.io/ceph/ceph:v18.2.0 --daemon-types mon --hosts mon-arbiter

Monitor the upgrade:

ceph orch upgrade status

Wait for the arbiter to rejoin quorum:

ceph mon stat
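
Waiting for a monitor to rejoin can also be scripted by looking for its name in the `quorum_names` array that `ceph quorum_status` returns. A minimal sketch, where `mon_in_quorum` is a hypothetical helper that parses the JSON with standard text tools rather than a JSON parser:

```shell
# Hypothetical helper: reads `ceph quorum_status` JSON on stdin and
# exits 0 if the named monitor is listed in "quorum_names".
mon_in_quorum() {
  mon="$1"
  tr -d ' \n\t' |
    sed -n 's/.*"quorum_names":\[\([^]]*\)].*/\1/p' |
    tr ',' '\n' |
    grep -qxF "\"$mon\""
}

# Usage: poll until the arbiter is back in quorum.
#   until ceph quorum_status | mon_in_quorum mon-arbiter; do sleep 10; done
```

Matching the whole quoted name avoids false positives when one monitor name is a prefix of another.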

Step 2 - Upgrade Site A Monitors

Upgrade the two monitors on site A one at a time:

ceph orch upgrade start --image quay.io/ceph/ceph:v18.2.0 --daemon-types mon --hosts mon-dc1a

Wait for quorum:

ceph quorum_status

Then upgrade the second monitor on site A:

ceph orch upgrade start --image quay.io/ceph/ceph:v18.2.0 --daemon-types mon --hosts mon-dc1b
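
Because each monitor gets its own `ceph orch upgrade start`, it is safest to confirm that the previous run has finished before issuing the next one. The sketch below assumes that `ceph orch upgrade status` prints JSON containing an `in_progress` field; check the output on your release before relying on it. `upgrade_in_progress` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `ceph orch upgrade status` JSON on stdin.
# Exits 0 while an upgrade is still running, non-zero otherwise.
# Assumes the JSON contains an "in_progress" boolean field.
upgrade_in_progress() {
  tr -d ' \n\t' | grep -q '"in_progress":true'
}

# Usage: block until the current upgrade run has finished.
#   while ceph orch upgrade status | upgrade_in_progress; do sleep 30; done
```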

Step 3 - Upgrade Site B Monitors

Repeat for site B monitors:

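# Upgrade the first monitor on site B
ceph orch upgrade start --image quay.io/ceph/ceph:v18.2.0 --daemon-types mon --hosts mon-dc2a
# Wait until mon-dc2a is back in quorum before continuing
ceph quorum_status
# Upgrade the second monitor on site B
ceph orch upgrade start --image quay.io/ceph/ceph:v18.2.0 --daemon-types mon --hosts mon-dc2b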
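
The monitor order from Steps 1 to 3 can be written down as a dry-run plan, which is handy for review before touching the cluster. This sketch only prints commands and runs nothing; `plan_mon_upgrades` is a hypothetical helper, and the host names and image tag are the ones used in this article.

```shell
# Image tag used throughout this article.
IMAGE="quay.io/ceph/ceph:v18.2.0"

# Print the monitor upgrade commands in order:
# arbiter first, then site A, then site B.
plan_mon_upgrades() {
  for host in mon-arbiter mon-dc1a mon-dc1b mon-dc2a mon-dc2b; do
    echo "ceph orch upgrade start --image $IMAGE --daemon-types mon --hosts $host"
  done
}

# Usage:
#   plan_mon_upgrades
```

Each printed command should still be followed by a quorum check before the next one is run.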

Step 4 - Upgrade Managers

Upgrade MGR daemons:

ceph orch upgrade start --image quay.io/ceph/ceph:v18.2.0 --daemon-types mgr

Step 5 - Upgrade OSDs by Site

Upgrade OSDs on site A first, then site B. This maintains data accessibility from at least one site:

# Upgrade site A OSDs
ceph orch upgrade start --image quay.io/ceph/ceph:v18.2.0 \
  --daemon-types osd --hosts host-dc1a,host-dc1b

# Wait for PGs to recover
watch ceph pg stat

# Upgrade site B OSDs
ceph orch upgrade start --image quay.io/ceph/ceph:v18.2.0 \
  --daemon-types osd --hosts host-dc2a,host-dc2b
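
Before moving from site A to site B, it is also worth confirming that every OSD is back up. A minimal sketch, assuming the plain-text `ceph osd stat` layout of `N osds: M up ..., K in ...`, which may vary by release; `osds_all_up` is a hypothetical helper.

```shell
# Hypothetical helper: reads `ceph osd stat` text on stdin.
# Exits 0 only when the number of "up" OSDs equals the total.
osds_all_up() {
  awk '
    {
      for (i = 2; i <= NF; i++) {
        token = $i
        sub(/[,;]$/, "", token)       # drop a trailing "," or ";"
        if (token == "osds:") total = $(i - 1) + 0
        if (token == "up")    up    = $(i - 1) + 0
      }
    }
    END { exit !(total > 0 && total == up) }
  '
}

# Usage: gate the site B upgrade on site A being fully back.
#   ceph osd stat | osds_all_up && echo "all OSDs up"
```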

Step 6 - Post-Upgrade Verification

After all daemons are upgraded, clear the maintenance flags:

ceph osd unset noout
ceph osd unset noscrub
ceph osd unset nodeep-scrub

Verify the cluster version and health:

ceph versions
ceph status
ceph orch ps | grep -v "running"
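
To confirm that every daemon ended up on the same release, the per-daemon report from `ceph versions` (plural) can be checked for a single entry in its `overall` section. This sketch assumes the pretty-printed JSON layout with one version string per line; `versions_converged` is a hypothetical helper.

```shell
# Hypothetical helper: reads pretty-printed `ceph versions` JSON on stdin.
# Exits 0 when the "overall" section lists exactly one version string.
# Assumes one "ceph version ..." entry per line.
versions_converged() {
  count=$(sed -n '/"overall"/,/}/p' | grep -c '"ceph version')
  [ "$count" -eq 1 ]
}

# Usage:
#   ceph versions | versions_converged && echo "all daemons on one version"
```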

Summary

Upgrading Ceph clusters in stretch mode requires a sequential per-site approach starting with the arbiter monitor. By upgrading monitors and OSDs site by site and waiting for quorum and PG health at each step, you maintain continuous availability throughout the upgrade process. Setting maintenance flags before starting prevents unnecessary rebalancing during the rolling upgrade.