Cannot remove disk for VM when not all Proxmox nodes are online
invalid@exam · 2026-06-01 · via Proxmox Support Forum

I'm currently running a 4-node Proxmox cluster with Ceph as storage. Only 3 of these nodes are running; 1 is offline, which is intentional for my purposes.

[Screenshot: 1740400524751.png]

I have a VM (104) which is currently stored on an erasure pool. The erasure pool is only stored on the online nodes. The proxmox4 node is not participating in this CEPH pool.

However when I try to remove the disk for VM 104 (or in this case, the whole VM), proxmox shows an error that it can't acquire cfs lock:

Code:

trying to acquire cfs lock 'storage-DeveErasurePool_P1' ...
trying to acquire cfs lock 'storage-DeveErasurePool_P1' ...
trying to acquire cfs lock 'storage-DeveErasurePool_P1' ...
trying to acquire cfs lock 'storage-DeveErasurePool_P1' ...
trying to acquire cfs lock 'storage-DeveErasurePool_P1' ...
trying to acquire cfs lock 'storage-DeveErasurePool_P1' ...
trying to acquire cfs lock 'storage-DeveErasurePool_P1' ...
trying to acquire cfs lock 'storage-DeveErasurePool_P1' ...
trying to acquire cfs lock 'storage-DeveErasurePool_P1' ...
Could not remove disk 'DeveErasurePool_P1:vm-104-disk-0', check manually: cfs-lock 'storage-DeveErasurePool_P1' error: got lock request timeout
purging VM 104 from related configurations..
TASK OK

Only when I turn on proxmox4 am I able to remove this disk.

Hi,

What does `pvecm status` output when only 3 nodes are online? Could you also please provide the output of the `pveceph status` command?

The weird thing is that after booting up the offline Proxmox host and turning it off again, I can still remove disks.

I've had this problem multiple times in the past, though, and the only way I could solve it was to start the offline Proxmox host.

Sadly, it's a bit hard to reproduce.

Anyway, I've run the commands you requested:

Code:

root@proxmox1:~# pvecm status
Cluster information
-------------------
Name:             DeveCluster
Config Version:   8
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Mon Feb 24 16:31:47 2025
Quorum provider:  corosync_votequorum
Nodes:            3
Node ID:          0x00000001
Ring ID:          1.16dbb
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2 
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
0x00000001          1 10.88.20.10 (local)
0x00000003          1 10.88.20.12
0x00000004          1 10.88.20.13
root@proxmox1:~# pveceph status
  cluster:
    id:     3d4ac63a-0d65-479a-a467-cf90a222c285
    health: HEALTH_WARN
            6 osds down
            1 host (6 osds) down
            Reduced data availability: 128 pgs stale
 
  services:
    mon: 3 daemons, quorum proxmox1,proxmoxdevenologynew,proxmox3 (age 23m)
    mgr: proxmox1(active, since 24m), standbys: proxmoxdevenologynew, proxmox3
    mds: 1/1 daemons up, 1 standby
    osd: 9 osds: 3 up (since 3m), 9 in (since 2h)
 
  data:
    volumes: 1/1 healthy
    pools:   8 pools, 321 pgs
    objects: 270.40k objects, 1.0 TiB
    usage:   2.1 TiB used, 3.1 TiB / 5.2 TiB avail
    pgs:     193 active+clean
             128 stale+active+clean
 
  io:
    client:   6.4 MiB/s rd, 18 MiB/s wr, 238 op/s rd, 78 op/s wr

I configured proxmox4 to have no votes in the quorum, since it's mostly offline.
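
As a side note on the numbers above: `Quorum: 2` with `Expected votes: 3` matches the usual majority rule, floor(expected votes / 2) + 1. A minimal shell sketch of that arithmetic (the function name `quorum_for` is made up for illustration):

```shell
# quorum_for: votes needed for a majority, given the expected vote total.
# Shell integer division already rounds down, so this is floor(n / 2) + 1.
quorum_for() {
    echo $(( $1 / 2 + 1 ))
}

quorum_for 3   # three voting nodes, as in the pvecm output above
quorum_for 4   # hypothetical: if proxmox4 also carried a vote
```

With three voting nodes online the cluster needs two votes, so it stays quorate whether or not proxmox4 is running, which fits the `Quorate: Yes` line.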

@Moayad, shall I see how things go and come back to this thread when I run into the same issue again?

@Moayad, I just ran into this same error again:

[Screenshot: 1751626416062.png]

Here's the pveceph status:

Code:

root@proxmox1:~# pveceph status
  cluster:
    id:     3d4ac63a-0d65-479a-a467-cf90a222c285
    health: HEALTH_WARN
            3 osds down
            1 host (6 osds) down
            Reduced data availability: 128 pgs inactive
            128 pgs not deep-scrubbed in time
            128 pgs not scrubbed in time
 
  services:
    mon: 3 daemons, quorum proxmox1,proxmoxdevenologynew,proxmox3 (age 3w)
    mgr: proxmox1(active, since 3w), standbys: proxmoxdevenologynew, proxmox3
    mds: 1/1 daemons up, 1 standby
    osd: 9 osds: 3 up (since 3w), 6 in (since 7w)
 
  data:
    volumes: 1/1 healthy
    pools:   8 pools, 321 pgs
    objects: 197.38k objects, 753 GiB
    usage:   1.5 TiB used, 993 GiB / 2.5 TiB avail
    pgs:     39.875% pgs unknown
             193 active+clean
             128 unknown
 
  io:
    client:   341 B/s rd, 138 KiB/s wr, 0 op/s rd, 21 op/s wr

@Moayad, this issue still exists. I always have to boot up every machine in the Proxmox cluster before I can remove a VM or disk.

I actually used AI to do some research and found the underlying issue:

After investigation, this is not a cfs-lock or quorum problem (the cluster is quorate and pmxcfs is healthy throughout). The real chain is:

The VM being removed lives on a healthy, fully-online RBD pool. However, Proxmox's disk-removal / VM-purge path enumerates all configured RBD storages (e.g. via rbd ls) to locate any disks belonging to the VMID, regardless of which pool the VM actually uses.

My cluster has additional RBD pools that are intentionally backed by a single host which is powered off as part of normal operation (the pool is meant to vanish when that host is off). With that host down, every PG in those pools goes stale+active+clean, because both replicas were on OSDs on the offline host. Any client op against the pool, including rbd ls, then blocks in librados indefinitely (no I/O timeout).
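
To see which pools the stuck PGs belong to, the pool ID can be read off the PG ID, since a PG ID has the form `<pool-id>.<hex>`. A rough sketch, assuming the listing comes from `ceph pg dump_stuck stale` (the helper name `pools_of_pgs` is hypothetical, and in my second status output the PGs show as `unknown`, so a different listing may be needed there):

```shell
# pools_of_pgs: read a PG listing on stdin (one PG per line, PG ID in the
# first column) and print the distinct pool IDs, lowest first.
# Header lines and anything that does not look like a PG ID are ignored.
pools_of_pgs() {
    awk '$1 ~ /^[0-9]+\.[0-9a-f]+$/ { sub(/\..*/, "", $1); print $1 }' | sort -un
}

# Usage on a cluster node (not run here):
#   ceph pg dump_stuck stale 2>/dev/null | pools_of_pgs
#   ceph osd pool ls detail    # maps the pool IDs back to pool names
```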

The enumeration step therefore hangs on the unavailable pool. The qmdestroy worker sits in select() waiting on the child rbd process (confirmed: child stuck in futex_wait against the offline pool), while holding the per-VM config lock. Subsequent removal attempts then fail with can't lock file '/var/lock/...' because the first operation never releases its lock. Powering the offline host back on un-stales the PGs, the blocked rbd ls returns, and all queued removals drain immediately, which is why the workaround has always been "turn all nodes on."
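
A hung enumeration like this can be spotted by looking for `rbd` processes that have been alive far longer than a listing should take. A minimal sketch, assuming a procps `ps` that supports the `etimes` and `wchan` columns (the helper name `long_running_rbd` and the 60-second threshold are arbitrary choices for illustration):

```shell
# long_running_rbd: read the output of
#   ps -eo pid,ppid,etimes,wchan:32,args
# on stdin and print rbd processes older than the given number of seconds
# (default 60). Column 3 is the elapsed time in seconds, column 5 is the
# first word of the command line.
long_running_rbd() {
    awk -v min="${1:-60}" 'NR > 1 && $3 >= min && $5 ~ /(^|\/)rbd$/'
}

# Usage on a node where a removal task is stuck (not run here):
#   ps -eo pid,ppid,etimes,wchan:32,args | long_running_rbd 60
```

The PPID column of any match points at the worker that is waiting on it, and the WCHAN column shows where the child is sleeping.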

Reproduction (with an RBD pool whose backing host is offline):

```
timeout 10 rbd -p <offline_pool> ls # exits 124 (blocks until timeout)
```

Meanwhile, removing a VM stored on a different, healthy pool still hangs, because the purge enumerates the offline pool too.
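
The same check can be looped over several pools to see which ones respond. A sketch under a few assumptions: `probe` and `probe_pools` are hypothetical helper names, the 10-second limit is arbitrary, and `timeout` is the GNU coreutils one, which exits with status 124 when the limit is hit.

```shell
# probe: run a command with a hard time limit and classify the result as
# ok (exit 0), timeout (exit 124 from timeout) or error (anything else).
probe() {
    local limit="$1"
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    case $? in
        0)   echo ok ;;
        124) echo timeout ;;
        *)   echo error ;;
    esac
}

# probe_pools: run `rbd ls` against each named pool, 10 seconds at most per pool.
probe_pools() {
    local pool
    for pool in "$@"; do
        printf '%s\t%s\n' "$pool" "$(probe 10 rbd -p "$pool" ls)"
    done
}

# Usage on a cluster node (not run here):
#   probe_pools <healthy_pool> <offline_pool>
```

A pool that reports `timeout` here is one that would also stall anything else that lists its images without a time limit.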

Suggested fix: during VM/disk removal, RBD storage enumeration should do one of the following: (a) honor a bounded rados_osd_op_timeout / client op timeout, so that an unavailable pool fails fast instead of blocking forever; (b) skip storages that are disabled or restricted away from the node; or (c) scan only the storage(s) actually referenced by the VM config rather than all RBD storages. Any of these would prevent one deliberately offline pool from blocking the removal of unrelated VMs.

Operator workaround: restrict each single-host pool's storage definition to its own node (`nodes <hostname>` in storage.cfg) so that other nodes don't enumerate it, or set `disable 1` while the host is off.
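
The workaround can also be applied from the CLI with `pvesm set` instead of editing storage.cfg by hand. A sketch, where the storage ID `SingleHostPool` and the helper names are placeholders; it defaults to a dry run that only prints the commands, so set `DRY_RUN=0` to actually apply them.

```shell
# run: print the command when DRY_RUN is 1 (the default), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# restrict_storage: limit a storage definition to a single node.
restrict_storage() {
    run pvesm set "$1" --nodes "$2"
}

# set_storage_disabled: pass 1 to disable a storage, 0 to enable it again.
set_storage_disabled() {
    run pvesm set "$1" --disable "$2"
}

restrict_storage SingleHostPool proxmox4   # placeholder storage ID
set_storage_disabled SingleHostPool 1      # before powering the host off
set_storage_disabled SingleHostPool 0      # once the host is back
```

Restricting the storage to its own node is the less fiddly of the two, since it does not have to be toggled every time the host goes up or down.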

*End of AI research*

I honestly think option (c) might be the best. To me it makes sense to scan only the storages the VM actually references, so that storages which are currently offline and unrelated to the VM being handled are never touched.