Proxmox Support Forum

Broadcom BCM57504 (100G) bnxt_en TX timeout and NIC reset on Proxmox 8.1.5 — while BCM57414 (25G) works fine on same host
invalid@exam · 2026-05-29 · via Proxmox Support Forum

Hello everyone,

I'm experiencing a recurring network failure on a Proxmox VE host using a Broadcom NetXtreme-E 100G NIC (BCM57504).
The NIC stops transmitting, triggers a TX timeout, and then fails to reset correctly. After that, the interface is disabled and vmbr0 loses connectivity until the host is rebooted.
Interestingly, the same host also has another Broadcom NIC (BCM57414 dual 10G/25G) using the same driver and kernel, and it works perfectly without any issues. The problem only happens on the 100G card.

Environment:
- Proxmox VE: 8.1.5
- Kernel: 6.5.13-3-pve
- Hardware: Datacom DM-SV01 server
- Bridge: vmbr0
- Workload: multiple VMs with moderate to high network traffic

Problematic NIC (100G):
- Model: Broadcom BCM57504 NetXtreme-E 100Gb
- PCI ID: 14e4:1751
- Interface: enp33s0np0
- Driver: bnxt_en
- Firmware: 226.0.145.1 / pkg 226.1.107.1

lspci -nn | grep -i ethernet

Code:

21:00.0 Ethernet controller [0200]: Broadcom Inc. BCM57504 NetXtreme-E [14e4:1751] (rev 11)

ethtool -i enp33s0np0

Code:

driver: bnxt_en
version: 6.5.13-3-pve
firmware-version: 226.0.145.1/pkg 226.1.107.1


Stable NIC on the same host (no issues):

  • Model: Broadcom BCM57414 NetXtreme-E dual 10G/25G
  • PCI IDs: 14e4:16d7
  • Interfaces: enp65s0f0np0 / enp65s0f1np1
  • Driver: bnxt_en
  • Firmware: 214.4.91.1 / pkg 216.0.333.11

lspci -nn | grep -i ethernet

Code:

41:00.0 Ethernet controller [0200]: Broadcom BCM57414 NetXtreme-E [14e4:16d7] (rev 01)
41:00.1 Ethernet controller [0200]: Broadcom BCM57414 NetXtreme-E [14e4:16d7] (rev 01)

ethtool -i enp65s0f0np0

Code:

driver: bnxt_en
version: 6.5.13-3-pve
firmware-version: 214.4.91.1/pkg 216.0.333.11
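
Since both cards use the same driver and kernel, the firmware level is the most visible difference between them. A minimal sketch for pulling that field out of `ethtool -i` output so the two cards can be compared side by side; the helper name `fw_version` is hypothetical, and the interface names in the usage comment are the ones from this post:

```bash
# Print the firmware-version field from `ethtool -i` output read on stdin.
fw_version() {
    awk -F': ' '$1 == "firmware-version" { print $2 }'
}

# Usage on a live host:
#   ethtool -i enp33s0np0   | fw_version
#   ethtool -i enp65s0f0np0 | fw_version
```

This only reports what the driver exposes; it does not say whether a given firmware level is known-good.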

Problem description

After some time under load, the kernel reports:

Code:

NETDEV WATCHDOG: enp33s0np0 (bnxt_en): transmit queue 0 timed out
bnxt_en: TX timeout detected, starting reset task!
hwrm_ring_free failed
hwrm_ring_alloc failed
bnxt_init_nic err
nic open fail
vmbr0: port enp33s0np0 entered disabled state

Once this happens:

  • Network connectivity is lost
  • The interface does not recover automatically
  • Only a full reboot restores the NIC
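
To check after the fact whether a host went through this sequence, the kernel log can be scanned for the messages quoted above. This is a rough sketch, assuming the message text matches what is shown in this post (it may differ between kernel versions); the function names are hypothetical:

```bash
# Count bnxt_en TX-timeout resets in kernel log text read on stdin.
count_tx_timeouts() {
    grep -c 'TX timeout detected, starting reset task' || true
}

# Succeed (exit 0) if the log shows that the reset itself failed.
reset_failed() {
    grep -q -e 'hwrm_ring_alloc failed' -e 'nic open fail'
}

# Usage on a live host:
#   journalctl -k -b | count_tx_timeouts
#   journalctl -k -b | reset_failed && echo "NIC did not recover from reset"
```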

What I already verified

  • Cable and switch ports are OK
  • Happens multiple times, not a one-time event
  • No SR-IOV enabled
  • Using standard Linux bridge (vmbr0)
  • No PCIe errors in dmesg besides the bnxt_en errors
  • Only the BCM57504 (100G NIC) is affected

Questions to the community

  1. Has anyone experienced similar issues with Broadcom BCM57504 or other NetXtreme-E cards on Proxmox 8?
  2. Is this a known bug with kernel 6.5.x and the bnxt_en driver?
  3. Would upgrading to kernel 6.8 help?
  4. Is there a recommended firmware version for this NIC on Proxmox?
  5. Are there any known driver module parameters or offload settings that improve stability?

Any feedback, patches, or workarounds would be greatly appreciated.

Thanks in advance for your help.

Regards,
Walisson

Attachments:

  • complete_log.txt (34.4 KB)
  • pve-version.txt (1.5 KB)

We've got two clusters of 3 servers each. Every server has a dual-port BCM57504 100 GbE NIC, which is used for the Ceph network with full-mesh routing (no switches).
I think we started with PVE 8.3 and are now up to 9.1.4, and we have never had any problems.

We had a full kernel panic (eventually) on PVE 9.1.x on kernel 7.0.0-3-pve due to a wedged driver on a similar card, the BCM57508.

Topology: each node has two dual-port 100G cards and two LACP bonds, one for front-of-house networking and one for the backend (Ceph, migrations, etc.).

Failure mode: basically you can see bnxt_en panic each CPU until it gets all the way around, and then the computer hard-locks.

Code:

May 05 13:49:23 proxmox7 kernel: bnxt_en 0000:11:00.0 enhe0p0: NETDEV WATCHDOG: CPU: 35: transmit queue 16 timed out 5081 ms
May 05 13:49:23 proxmox7 kernel: bnxt_en 0000:11:00.0 enhe0p0: TX timeout detected, starting reset task!
May 05 13:49:23 proxmox7 kernel: bnxt_en 0000:11:00.0 enhe0p0: [0.0]: tx{fw_ring: 1025 prod: 44f0 cons: 44f0}
May 05 13:49:23 proxmox7 kernel: bnxt_en 0000:11:00.0 enhe0p0: [0]: rx{fw_ring: 2 prod: 75b} rx_agg{fw_ring: 3 agg_prod: e204 sw_agg_prod: 204}
May 05 13:49:23 proxmox7 kernel: bnxt_en 0000:11:00.0 enhe0p0: [0]: cp{fw_ring: 0 raw_cons: 1344945}
May 05 13:49:23 proxmox7 kernel: bnxt_en 0000:11:00.0 enhe0p0: [0.0]: cp{fw_ring: 49 raw_cons: 17be4c0}
May 05 13:49:23 proxmox7 kernel: bnxt_en 0000:11:00.0 enhe0p0: [0.1]: cp{fw_ring: 33 raw_cons: a9199d}
May 05 13:49:23 proxmox7 kernel: bnxt_en 0000:11:00.0 enhe0p0: [1.0]: tx{fw_ring: 1026 prod: 6caa cons: 6caa}
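
To see which CPUs and queues are affected and in what order, the watchdog lines can be reduced to just those fields. A small sketch, assuming the newer message format shown above (with `CPU:` and a duration in ms); the older format from the first post will not match. The function name is hypothetical:

```bash
# Extract CPU, queue and stall time from NETDEV WATCHDOG lines on stdin.
watchdog_events() {
    sed -n 's/.*NETDEV WATCHDOG: CPU: \([0-9][0-9]*\): transmit queue \([0-9][0-9]*\) timed out \([0-9][0-9]*\) ms.*/cpu=\1 queue=\2 ms=\3/p'
}

# Usage on a live host (previous boot, kernel messages only):
#   journalctl -k -b -1 | watchdog_events
```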

info:

Code:

from dmesg:

[    2.323354] bnxt_en 0000:11:00.0 eth0: Broadcom BCM57508 NetXtreme-E 10Gb/25Gb/50Gb/100Gb/200Gb Ethernet found at mem 70490010000, node addr <mac addr>

ethtool -i enhe0p0:
driver: bnxt_en
version: 6.17.13-12-pve
firmware-version: 233.0.152.6/pkg 233.1.135.7
expansion-rom-version:
bus-info: 0000:11:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no

On kernel 6.17 under Proxmox 9.1.x (and previously Proxmox 9.0.x under its various 6.x kernels) we've had no issues. We did blacklist the bnxt_re module after the issue, but that probably won't be enough to stop it happening again. While googling around I did find a suggestion that kernel 7 has some bnxt_en jank.
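
For anyone wanting to do the same, one common way to keep bnxt_re from loading is a modprobe.d fragment. This is a sketch and not necessarily how it was done on these hosts: the function name and the target file name are hypothetical, while `blacklist` and `install` are standard modprobe.d directives.

```bash
# Emit a modprobe.d fragment that keeps the RDMA driver bnxt_re from loading.
# "blacklist" stops alias-based autoloading; "install ... /bin/false" also
# makes an explicit `modprobe bnxt_re` fail.
bnxt_re_denylist() {
    printf '%s\n' \
        'blacklist bnxt_re' \
        'install bnxt_re /bin/false'
}

# Usage (as root; file name is an assumption):
#   bnxt_re_denylist > /etc/modprobe.d/deny-bnxt_re.conf
#   update-initramfs -u -k all
```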


Quoting the post above: "In googling around i did find suggestion that kernel 7 has some bnxt_en jank."

Could you post your full network configuration? I've seen several reports now and usually they were using some BCM cards with bonds and VLANs, but I couldn't yet tie them together...

Code:

cat /etc/network/interfaces

/etc/network/interfaces contents:

Basic gist: two LACP bonds, each containing two ports. Each NIC contributes one port to each of the two bonds (so a NIC dying entirely should be tolerated).

Of those two bonds:

po7 / bond0 is for use by VMs via individual subnets in SDN, which are built atop vmbr0.

The other, bond1 / po17, is for Ceph and any back-of-house Proxmox comms, including migration, etc.

IPs mildly redacted.

Let me know if you have any other questions.

Code:

auto lo
iface lo inet loopback

auto enhe0p0
iface enhe0p0 inet manual
        mtu 9000
#Port 1B - Onboard

auto enhe0p1
iface enhe0p1 inet manual
        mtu 9000
#Port 2B - Onboard

auto enhe1p0
iface enhe1p0 inet manual
        mtu 9000
#Port 1A - PCIe

auto enhe1p1
iface enhe1p1 inet manual
        mtu 9000
#Port 2A - PCIe

iface enipmi0p0 inet manual
#unused hw bond with IPMI

auto bond0
iface bond0 inet manual
        bond-slaves enhe0p0 enhe1p0
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer3+4
        mtu 9000
#bond for usernets on po7

auto bond1
iface bond1 inet manual
        bond-slaves enhe0p1 enhe1p1
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer3+4
        mtu 9000
#bond for admin/storage on po17

auto bond1.1549
iface bond1.1549 inet manual
        mtu 9000
#storage ip interface base bond slice (ceph)

auto bond1.1508
iface bond1.1508 inet manual
        mtu 9000
#admin ip interface base bond slice (admin/migration)

auto vmbr0
iface vmbr0 inet manual
        bridge-ports bond0
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094
        mtu 9000
        bridge-disable-mac-learning 1
#base switch for usernets (used as basis for SDN individual vlans listed in pve)

auto vmbr1v1508
iface vmbr1v1508 inet static
        address 10.y.y.y/23
        gateway 10.y.y.254
        bridge-ports bond1.1508
        bridge-stp off
        bridge-fd 0
#admin interface ( migration, pve login, etc)

auto vmbr1v1549
iface vmbr1v1549 inet static
        address 10.x.x.x/24
        bridge-ports bond1.1549
        bridge-stp off
        bridge-fd 0
        mtu 9000
#storage interface (ceph)

Also: yes, all four 100G NICs are the same Broadcom model (two are part of a Supermicro "onboard" AOM card, and the other two are on a standard Broadcom PCIe card). All four are running the same firmware listed in the first post.
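
When comparing reports from several hosts, it can help to boil a config like the one above down to just the bond membership. A minimal sketch that reads an interfaces file on stdin and assumes the `iface` / `bond-slaves` layout shown here; the function name is hypothetical:

```bash
# List each bond and its member ports from an /etc/network/interfaces file.
list_bond_slaves() {
    awk '
        $1 == "iface"       { cur = $2 }          # remember current stanza
        $1 == "bond-slaves" {
            printf "%s:", cur
            for (i = 2; i <= NF; i++) printf " %s", $i
            printf "\n"
        }
    '
}

# Usage:
#   list_bond_slaves < /etc/network/interfaces
```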


Greetings.

We got the same issue on an old Supermicro board with an Intel X550 2-port 10GBASE-T NIC. The two ports are bonded with LACP.

Error: "bnxt_en <pci:id> <nic-name> NETDEV WATCHDOG: CPU: <cpu-id>: transmit queue <queue-id> timeout <time> ms"

Tip: Reloading the NIC driver can sometimes restore the network connection without a reboot:

Bash:

ifdown -a
modprobe -r bnxt_en
modprobe bnxt_en
ifup -a
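
The same four steps can be wrapped so they are previewed first and abort on the first failure. This is only a sketch: the function name and the `DRY_RUN` variable are hypothetical, and because `ifdown -a` takes the network down it should be run from a local or IPMI console rather than over SSH.

```bash
# Reload a NIC driver. With DRY_RUN=1 the commands are printed, not executed.
# Stops at the first command that fails, so a failed unload is not followed
# by a blind reload.
reload_nic_driver() {
    drv=${1:-bnxt_en}
    for cmd in "ifdown -a" "modprobe -r $drv" "modprobe $drv" "ifup -a"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "$cmd"
        else
            $cmd || { echo "failed: $cmd" >&2; return 1; }
        fi
    done
}

# Usage:
#   DRY_RUN=1 reload_nic_driver bnxt_en   # preview
#   reload_nic_driver bnxt_en             # run as root
```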

We are currently testing whether this still happens with only a single NIC port (without a bond).

Update: the same issue occurs without a bond configured (single port).


Is there something interesting in the journal (journalctl -b)? You could also try to disable offloading as that has caused some issues in the past:
post-up ethtool -K <nic> rx-vlan-offload off
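
With several bonded ports, that line has to be repeated for each physical NIC. A small sketch that prints one such line per interface, for pasting under the matching `iface` stanza; the function name is hypothetical and the interface names in the usage comment are taken from the config posted earlier in this thread:

```bash
# Print a post-up line that disables RX VLAN offload for each given NIC.
vlan_offload_off_lines() {
    for nic in "$@"; do
        printf 'post-up ethtool -K %s rx-vlan-offload off\n' "$nic"
    done
}

# Usage:
#   vlan_offload_off_lines enhe0p0 enhe0p1 enhe1p0 enhe1p1
# Current state can be checked with:
#   ethtool -k enhe0p0 | grep rx-vlan-offload
```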

It happened to us again on the "onboard" Supermicro AOC version of the BCM57508; the PCIe version of the NIC in the same host is still apparently unaffected.

We were at ~14 days of uptime. I think it was the machine with the most uptime that went down. I rebooted most of the nodes but left one alone to see if I can get another node to do this, because today's downtime was on the same node in the cluster that I posted about above.

This time it happened on kernel 6.17.13-13. Until today we thought 6.17 was bulletproof, so this is news to me.

Let me know if there's any additional diagnostic info I can provide.

I did switch the NIC to a manually configured link speed rather than autodetect, since it's unclear whether the flapping is causing bnxt_en to tweak out or the other way around.

-Dave


Items we changed today to try to see if this will stop happening:

Configured the 100G Ethernet ports with Reed-Solomon FEC in the BIOS and disabled autonegotiation.

Using niccli, changed the following NVM variables on all the cards to disable as many unused features as possible, and also to inherit the default values of the PCIe variant of the card as much as possible, in case something wasn't being disabled. We're still on v233 of the firmware since, so far, this has only happened on a single machine out of 5 supposedly identical machines. Keep in mind I'm only listing one example of each variable, not all 4 to 8 scopes per node:

Code:

# disabled/constrained:
niccli -i 1 nvm -setoption support_rdma -scope 0 -value 0x00;    # disable RDMA support
niccli -i 1 nvm -setoption bw_limit -scope 0 -value 0x64    # limits bandwidth used for ...some feature I've forgotten, not general traffic (64 was the default on the PCIe NIC)
niccli -i 1 nvm -setoption spdm_mctp_pcievdm -value 0      # disable signing comms to BMC/BIOS/etc.
niccli -i 4 nvm -setoption mctp_discovery -value 0      # disable BIOS/BMC discovery (though IPMI still seems to discover it)
niccli -i 1 nvm -setoption enable_wol_on_acpi_pattern -scope 0 -value 0  # disable wake-on-LAN pattern
niccli -i 1 nvm -setoption magic_packet_wol -scope 0 -value 0    # disable wake-on-LAN magic packet

# enabled (to match PCIe variant of BCM57508):
niccli -i 1 nvm -setoption enable_sriov -value 1    # enable SR-IOV to match PCIe card default
niccli -i 1 nvm -setoption enable_adapter_error_recovery -value 1  # enable adapter error recovery to match PCIe card default
niccli -i 1 nvm -setoption enable_crash_dump_to_host_ddr -value 1  # enable crash dump to host to match PCIe card default

The host was powered off for two minutes and then powered back on to ensure the NVM defaults moved from the set to the active state.

Also upped the fan speed on the chassis that was having the crash issues (the default was set to "optimal", not "heavy IO" like the other nodes in the cluster).

At this point I guess we just run low-criticality VMs on that node for a few weeks and see if anything changes with regard to reliability.

Reminder that bnxt_re is also still denied at the modprobe level.

If anyone else has any suggestions for useful bits to twiddle, etc., let me know.

Thanks,
-Dave
