
























Rook-Ceph operates differently from typical Kubernetes applications. OSDs communicate directly with block devices, CSI drivers mount filesystems using kernel modules, and monitors require stable hostnames and persistent storage. A node that appears healthy to Kubernetes may still cause Rook-Ceph failures if it lacks the right kernel version, modules, or disk configuration.

```mermaid
graph LR
    A[Node Verification] --> B[Kernel Check]
    A --> C[Disk Check]
    A --> D[Memory Check]
    A --> E[CPU Check]
    A --> F[Network Check]
    A --> G[Package Check]
    B --> H{All Pass?}
    C --> H
    D --> H
    E --> H
    F --> H
    G --> H
    H -- Yes --> I[Safe to Deploy]
    H -- No --> J[Fix Issues First]
```

Check that each storage node meets the minimum resource thresholds before deploying Rook-Ceph:

| Component | Minimum | Recommended |
|---|---|---|
| CPU (per OSD) | 1 core | 2 cores |
| RAM (per OSD) | 2 GB | 4 GB |
| RAM (per Mon) | 1 GB | 2 GB |
| RAM (per Mgr) | 512 MB | 1 GB |
| Disk (OSD) | 10 GB raw | 100+ GB raw |
| Network | 1 Gbps | 10 Gbps |
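
The minimums in the table can be turned into a quick per-node budget. The following is a minimal sketch, not part of Rook: the function names `min_ram_mib`, `ki_to_mib`, and `ram_verdict` are hypothetical, the table's "GB" figures are assumed to mean 1024 MiB each, and only memory quantities with a `Ki` suffix are handled.

```bash
#!/bin/bash
# Hypothetical helpers; the per-daemon minimums come from the table above,
# read as 2048 MiB per OSD, 1024 MiB per Mon, and 512 MiB per Mgr.

# min_ram_mib OSDS MONS MGRS -> minimum RAM (MiB) for the daemons on one node
min_ram_mib() {
  echo $(( $1 * 2048 + $2 * 1024 + $3 * 512 ))
}

# ki_to_mib QUANTITY -> converts a quantity such as "16393304Ki" to whole MiB
# Assumption: the value uses the Ki suffix; other units are not handled.
ki_to_mib() {
  echo $(( ${1%Ki} / 1024 ))
}

# ram_verdict ALLOCATABLE_KI OSDS MONS MGRS -> prints "ok" or "insufficient"
ram_verdict() {
  local have need
  have=$(ki_to_mib "$1")
  need=$(min_ram_mib "$2" "$3" "$4")
  if [ "$have" -ge "$need" ]; then
    echo "ok"
  else
    echo "insufficient"
  fi
}
```

For a node expected to host three OSDs, one monitor, and one manager, `min_ram_mib 3 1 1` gives the budget to compare against the node's allocatable memory. The result covers only the Ceph daemons, so leave headroom for the kubelet and other workloads.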
Check available memory on a node:

```bash
kubectl get node node1 -o jsonpath='{.status.capacity.memory}'
```

List allocatable CPU and memory across all nodes:

```bash
kubectl get nodes -o custom-columns=\
NAME:.metadata.name,\
CPU:.status.allocatable.cpu,\
MEMORY:.status.allocatable.memory
```

Ceph kernel support improves with newer kernels. Check the kernel version on each node:

```bash
kubectl get nodes -o custom-columns=\
NAME:.metadata.name,\
KERNEL:.status.nodeInfo.kernelVersion
```

You can also run this directly on a node:
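
```bash
uname -r
```

Rook-Ceph recommends kernel 4.17 or later. CephFS quotas are enforced only on kernel 4.17+, so requested PVC sizes are not enforced on older kernels. The RBD fast-diff image feature is supported by the kernel RBD client only from kernel 5.3 onward.
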
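
Comparing kernel version strings by eye is error-prone, because 4.9 sorts after 4.17 as text but is the older kernel. The sketch below compares the major and minor numbers numerically. The function name `kernel_verdict` is hypothetical, and the default threshold of 4.17 is taken from the recommendation above.

```bash
#!/bin/bash
# kernel_verdict VERSION [MIN_MAJOR MIN_MINOR]
# Prints "ok" when VERSION (for example "5.15.0-91-generic") is at least
# MIN_MAJOR.MIN_MINOR (default 4.17), otherwise "too old".
# Hypothetical helper; it assumes VERSION starts with "<major>.<minor>".
kernel_verdict() {
  local version=$1 min_major=${2:-4} min_minor=${3:-17}
  local major rest minor
  major=${version%%.*}       # text before the first dot
  rest=${version#*.}         # text after the first dot
  minor=${rest%%[!0-9]*}     # leading digits of the remainder
  minor=${minor:-0}
  if (( major > min_major || (major == min_major && minor >= min_minor) )); then
    echo "ok"
  else
    echo "too old"
  fi
}
```

On a node, `kernel_verdict "$(uname -r)"` reports whether the running kernel meets the default threshold. Passing two extra arguments checks a different minimum.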
Check that the required kernel modules are loadable on each node. The easiest way is to run a DaemonSet that probes the modules:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: rook-prereq-check
  namespace: default
spec:
  selector:
    matchLabels:
      app: rook-prereq-check
  template:
    metadata:
      labels:
        app: rook-prereq-check
    spec:
      hostPID: true
      containers:
        - name: checker
          image: busybox
          env:
            # hostname inside the pod returns the pod name, so pass the node name in.
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          command:
            - /bin/sh
            - -c
            - |
              echo "Node: ${NODE_NAME}"
              # lsmod lists only modules that are already loaded on the host kernel.
              lsmod | grep -q '^rbd ' && echo "rbd: OK" || echo "rbd: NOT LOADED"
              lsmod | grep -q '^ceph ' && echo "ceph: OK" || echo "ceph: NOT LOADED"
              sleep 3600
          securityContext:
            privileged: true
      tolerations:
        - operator: Exists
```

Apply and read the logs:

```bash
kubectl apply -f prereq-check.yaml
kubectl logs -l app=rook-prereq-check --prefix=true
```

Verify that each node has clean block devices available for Rook-Ceph. Run this command on each storage node:

```bash
lsblk --output NAME,SIZE,TYPE,FSTYPE,MOUNTPOINT,LABEL
```

Devices that are unpartitioned, have no filesystem, and are not mounted are eligible:

```text
NAME      SIZE TYPE FSTYPE MOUNTPOINT LABEL
sda       100G disk
sdb       100G disk
nvme0n1   500G disk
```

Confirm no LVM signatures remain on the target devices:

```bash
# Replace /dev/sdb with each target device
sudo pvs /dev/sdb 2>/dev/null && echo "Has LVM" || echo "Clean"
```

Ceph monitors use hostnames for quorum. Each node's hostname must be resolvable from all other nodes:

```bash
# On each node, verify self-resolution
hostname -f
nslookup "$(hostname -f)"
# Test cross-node resolution
nslookup node2.example.com
```

If DNS is not available, add entries to /etc/hosts on all nodes:

```bash
echo "192.168.1.11 node1.example.com node1" | sudo tee -a /etc/hosts
echo "192.168.1.12 node2.example.com node2" | sudo tee -a /etc/hosts
echo "192.168.1.13 node3.example.com node3" | sudo tee -a /etc/hosts
```

The Rook CSI driver (ceph-csi) runs on every node that needs to mount Ceph volumes. Check that the CSI requirements are met:

```bash
# Check if iscsiadm is available (needed for some configurations)
which iscsiadm
# Check if cryptsetup is available (needed for encrypted volumes)
which cryptsetup
# Check if multipath is properly configured
cat /etc/multipath.conf 2>/dev/null || echo "No multipath config found"
```

Verify the namespace allows privileged pods, which Rook-Ceph requires:

```bash
kubectl get namespace rook-ceph -o yaml | grep pod-security
```

If Pod Security Admission is enforced, check the labels:

```bash
kubectl get namespace rook-ceph --show-labels
```

The namespace should have `pod-security.kubernetes.io/enforce=privileged`.

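
The label check can also be scripted rather than read by eye. This is a minimal sketch: the function name `psa_verdict` is hypothetical, and treating an absent label as "unset" (rather than pass or fail) is an assumption, because the effective level then depends on the cluster-wide admission defaults.

```bash
#!/bin/bash
# psa_verdict LEVEL
# LEVEL is the value of the pod-security.kubernetes.io/enforce label.
# Hypothetical helper: classifies the level for a Rook-Ceph namespace.
psa_verdict() {
  case "$1" in
    privileged)          echo "ok" ;;       # privileged pods are admitted
    baseline|restricted) echo "blocked" ;;  # privileged pods would be rejected
    "")                  echo "unset" ;;    # no label; cluster defaults apply
    *)                   echo "unknown" ;;  # unrecognized value
  esac
}

# Example usage (requires cluster access, so it is left commented out).
# Dots inside the label key are escaped for kubectl's jsonpath:
# level=$(kubectl get namespace rook-ceph \
#   -o jsonpath='{.metadata.labels.pod-security\.kubernetes\.io/enforce}')
# psa_verdict "$level"
```

A result of "blocked" means the Rook-Ceph pods that need privileged access would be rejected at admission until the label is changed.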
Use this script to check all requirements from your local machine using kubectl:

```bash
#!/bin/bash
echo "=== Rook-Ceph Node Requirements Check ==="
echo ""
echo "--- Kubernetes Version ---"
kubectl version 2>/dev/null | head -2
echo ""
echo "--- Node Status ---"
kubectl get nodes -o wide
echo ""
echo "--- Node Resources ---"
# Quote each column spec so the shell does not interpret the filter expression.
# READY shows the status of the Ready condition (True, False, or Unknown).
kubectl get nodes -o custom-columns=\
'NAME:.metadata.name,'\
'READY:.status.conditions[?(@.type=="Ready")].status,'\
'CPU:.status.allocatable.cpu,'\
'MEMORY:.status.allocatable.memory,'\
'KERNEL:.status.nodeInfo.kernelVersion'
echo ""
echo "--- Storage Capacity Check ---"
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.capacity.ephemeral-storage}{"\n"}{end}'
echo "=== Check complete ==="
```

Verifying node requirements before deploying Rook-Ceph prevents the most common class of deployment failures. The key checks are: sufficient CPU and RAM per daemon type, kernel version 4.17 or later with rbd and ceph modules loadable, clean block devices with no filesystem signatures or LVM metadata, reliable hostname resolution between nodes, and a namespace configured to allow privileged pods. Running a pre-deployment DaemonSet to automate these checks across all nodes catches problems before they surface as cryptic Ceph health warnings.