IT Notes - ha

Stefano Marinelli · 2018-09-16 · via IT Notes - ha

Servers, both physical and virtual, sometimes become unresponsive, and you may be unable to connect to them, particularly when they are overloaded. In such cases, a watchdog can be a solution.

A watchdog device, assisted by a watchdog application, monitors the server to ensure it is active and healthy. Every 30 seconds (this interval can be adjusted), the daemon checks whether everything is functioning correctly. If it is, nothing happens; if not, the watchdog device performs a configured action. In my case, I usually have the device execute a hard reboot of the server to restore its reliability.

Proxmox allows the installation and configuration of a watchdog device, enabling you to specify what actions to take when problems arise.

The easiest way to enable it is as follows: on the Proxmox server, navigate to /etc/pve/qemu-server/ (if no cluster has been configured) and edit the VM's config file, which is named after the VM ID (for example, 100.conf).

Add a watchdog device by appending this line to the VM definition:

watchdog: model=i6300esb,action=reset

This instructs Proxmox to perform a hard reset of the VM if it becomes unresponsive.

Then shut the VM down completely and start it again. This step is necessary because the watchdog device is only created at the next "start" of the VM; a simple reboot from inside the guest will not suffice.
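The same change can also be made from the Proxmox command line with the qm tool instead of editing the file by hand. The following is a minimal sketch, assuming a hypothetical VM ID of 100; it only prints the commands so they can be reviewed before being run on the Proxmox host.

```shell
#!/bin/sh
# Hypothetical VM ID (assumption); replace with the ID of your own VM.
VMID="${VMID:-100}"

# Print the qm commands that add the watchdog device and then do a full
# stop/start cycle, so the device is actually created.
watchdog_cmds() {
    echo "qm set $1 --watchdog model=i6300esb,action=reset"
    echo "qm shutdown $1"
    echo "qm start $1"
}

watchdog_cmds "$VMID"
```

Once the output looks right, the three commands can be run one by one as root on the Proxmox host.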

The next step is to install and configure the watchdog daemon inside the VM. Be cautious, as some GNU/Linux distributions (e.g., Ubuntu) may blacklist the watchdog kernel module. To find out, check /etc/modprobe.d/blacklist-watchdog.conf (if it exists). In my situation, I removed i6300esb from the blacklist and added it to /etc/modules so that it would load at boot.
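The blacklist check can be sketched as a small shell helper run inside the VM. The function name find_blacklist is made up for illustration, and the follow-up commands in the comments assume a Debian-style system that reads /etc/modules at boot.

```shell
#!/bin/sh
# List the modprobe config files that blacklist the i6300esb module.
# The directory argument defaults to /etc/modprobe.d.
find_blacklist() {
    dir="${1:-/etc/modprobe.d}"
    grep -l -E '^[[:space:]]*blacklist[[:space:]]+i6300esb' "$dir"/*.conf 2>/dev/null
}

find_blacklist || echo "i6300esb is not blacklisted"

# After removing or commenting out the blacklist entry (run as root):
#   modprobe i6300esb               # load the module now
#   echo i6300esb >> /etc/modules   # load it at every boot
#   ls -l /dev/watchdog             # the device node should now exist
```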

After installing the daemon, configure it as desired.
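As an illustration of what such a configuration might look like, the sketch below prints a minimal watchdog.conf. It assumes the standard Linux watchdog daemon (the "watchdog" package on Debian and Ubuntu); the interval and load threshold are example values, not recommendations from the original setup.

```shell
#!/bin/sh
# Assumption: Debian/Ubuntu guest, daemon installed with
#   apt install watchdog
# Print a minimal configuration for the watchdog daemon.
watchdog_conf() {
    cat <<'EOF'
# Device node created by the i6300esb module
watchdog-device = /dev/watchdog
# Seconds between two writes to the device (example value)
interval = 10
# Reboot if the 1-minute load average exceeds this (example value)
max-load-1 = 24
EOF
}

watchdog_conf
```

The output can be merged into /etc/watchdog.conf, after which the service needs to be enabled and restarted. The interval should stay well below the timeout of the watchdog device, otherwise a healthy VM would be reset.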

To test the entire setup, you can intentionally crash the kernel by executing the following command as root inside the VM:

echo c > /proc/sysrq-trigger

After waiting for a few seconds, the VM should automatically restart.
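One caveat for this test: if the kernel.panic sysctl is non-zero, the kernel reboots itself after a panic, so a restart would not prove that the watchdog fired. A minimal check, to be run before the test, might look like this (the function name describe_panic and the PANIC_FILE override are made up for illustration):

```shell
#!/bin/sh
# Path of the kernel.panic setting; can be overridden for experimenting.
panic_file="${PANIC_FILE:-/proc/sys/kernel/panic}"

# Explain what a given kernel.panic value means for the watchdog test.
describe_panic() {
    if [ "$1" -eq 0 ]; then
        echo "kernel.panic=0: the kernel stays halted after a panic, so a restart comes from the watchdog"
    else
        echo "kernel.panic=$1: the kernel reboots on its own after a panic, so the test is inconclusive"
    fi
}

if [ -r "$panic_file" ]; then
    describe_panic "$(cat "$panic_file")"
fi
```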