Moving a Proxmox host with a SAS HBA as PCI passthrough for zfs + TrueNAS
Christian Hollinger · 2023-10-26

Introduction

I do not recommend virtualizing TrueNAS any longer, even with a device passthrough.

Please see my 2025 update for an alternative.

My home server is a Proxmox cluster. Recently, the SSD in one of the hosts indicated it needed replacing.

I run TrueNAS SCALE on that host and pass all my hard drives through to it via an LSI HBA, so that zfs has access to the raw hardware. That makes the migration to a new SSD a bit trickier. For added difficulty, this process assumes the SSD is an NVMe drive on a system with a single M.2 slot.

This article just outlines the steps, since the process is not super obvious. Please do not blindly copy-paste commands, otherwise you will lose your data. You have been warned.

Names used:

  • sleipnir.lan: Node that stays online.
  • bigiron.lan: Node where we’ll replace a drive.

Note: If you’ve been here before, you might know that I normally don’t write “tutorial” style articles, but I found this case obscure and interesting enough to make an exception. No, I’m not procrastinating on part 2 of bridgefour, you are. I added docker support!

Prepare the server

Start by taking snapshots of your VMs, make sure you have backups, and send them to your standby/backup server.

assets/backup

Since the TrueNAS VM doesn’t store any data itself, ensure its replica has a copy of your zfs snapshots, too:

assets/replication

Next, turn off all existing VMs on the host. If any of the VMs runs critical services, migrate them onto a different host (if they’re not HA in the first place).

assets/shutdown_vm.png

Lastly, prepare the network for the loss of a node in the cluster, e.g. if that node runs a DNS resolver or any other network software that is not redundant.

Back up the zfs root of the host

The default zfs pool in Proxmox is called rpool. Create a recursive snapshot.

zfs snapshot -r rpool@new-root-drive

Confirm it exists:

root @ bigiron ~ zfs list -t snapshot | grep new-root-drive

rpool@new-root-drive 0B - 104K -

rpool/ROOT@new-root-drive 0B - 96K -

rpool/ROOT/pve-1@new-root-drive 5.12M - 9.73G -

# ...

This process is almost instant, since it only tracks metadata and doesn’t actually copy any of your data.

Next, send it to a suitable backup location. This can be an external hard drive or your backup server.

zfs send -Rv rpool@new-root-drive | ssh sleipnir.lan zfs recv rpool/bigiron-host-backup

Alternatively, on the same server on a different drive:

zfs send -Rv rpool@new-root-drive | zfs recv vms-bigiron/bigiron-host-backup

-R sends a replication stream (the dataset plus all of its descendants, snapshots, and properties) and -v gives useful output:

total estimated size is 153G

TIME SENT SNAPSHOT rpool/data@new-root-drive

18:09:57 8.27K rpool/data@new-root-drive

#...

If you get an instant result here, though, maybe consider being skeptical about how your computer actually copied (presumably) hundreds of GiB within seconds. Without -v, zfs will not tell you that it only copied the 104K root dataset.

Conceptually, you could also pipe this to a file, but I haven’t tried that myself.

Confirm it exists:

root @ bigiron ~ zfs list | grep vms-bigiron/bigiron-host-backup

vms-bigiron/bigiron-host-backup/ROOT/pve-1 9.52G 1.23T 9.51G /

# ...

Validate the backup by zfs importing it and poking around in it.
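One way to make that validation less hand-wavy is to compare dataset names on both sides. The following is a minimal sketch, not part of the original procedure: missing_in_backup is a hypothetical helper that takes two files of dataset names (for example produced by zfs list -H -o name -r for the source and for the backup) and prints every source dataset that has no counterpart under the backup prefix. The file paths in the usage comment are assumptions.

```shell
# Hypothetical helper (illustration only): print every source dataset that
# has no counterpart under the backup prefix.
#   $1 = source prefix, e.g. rpool
#   $2 = backup prefix, e.g. vms-bigiron/bigiron-host-backup
#   $3 = file with source dataset names, one per line
#   $4 = file with backup dataset names, one per line
missing_in_backup() {
    src_prefix=$1
    dst_prefix=$2
    src_list=$3
    dst_list=$4
    while IFS= read -r name; do
        # Strip the source prefix: rpool/ROOT/pve-1 becomes /ROOT/pve-1.
        suffix=${name#"$src_prefix"}
        expected="${dst_prefix}${suffix}"
        # -x matches whole lines, -F treats the name as a literal string.
        grep -qxF -- "$expected" "$dst_list" || printf '%s\n' "$name"
    done < "$src_list"
}

# Example usage on the host (assumes both pools are imported):
#   zfs list -H -o name -r rpool > /tmp/src.txt
#   zfs list -H -o name -r vms-bigiron/bigiron-host-backup > /tmp/dst.txt
#   missing_in_backup rpool vms-bigiron/bigiron-host-backup /tmp/src.txt /tmp/dst.txt
```

No output means every dataset name has a match. It says nothing about the contents, so it complements poking around in the backup rather than replacing it.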

Prepare bootable USB drives

Flash a new Proxmox ISO onto a bootable USB drive (balenaEtcher is a good tool).

Prepare the cluster

Note: This process will remove the old node and re-add it, since the documentation warns against doing it differently. On paper, you should be able to skip this and restore the entire rpool later. That didn’t work too well for me. See the notes.

On the remaining node, remove the old node. Make sure to set the expected votes to 1 first. You won’t need this if the remaining nodes still form a quorum on their own.

pvecm expected 1

pvecm delnode bigiron

pvecm status

You might want to clean up /etc/pve/priv/authorized_keys.

Make sure /etc/pve/nodes/ doesn’t contain the node you just removed.

If you don’t do this, you will get into funny state territory:

cluster-state

Note how the node shows up as unavailable, but the server is actually online?

Your cluster doesn’t have consensus as to what’s happening here: the state you get by polling bigiron.lan directly clearly differs from the UI state presented by, presumably, the other node(s) in the cluster. Also, every node has a quorum vote of 1, so good luck without an actual quorum (i.e., an odd number of nodes).
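The arithmetic behind the quorum remarks can be sketched in a few lines. A majority-vote cluster needs strictly more than half of the expected votes, i.e. floor(expected / 2) + 1. has_quorum below is a hypothetical helper for illustration, not a Proxmox command, and it assumes one vote per node as in this setup.

```shell
# Hypothetical helper (illustration only): succeed if the votes that are
# present form a strict majority of the expected votes.
#   $1 = votes currently present
#   $2 = expected votes (one per node in this setup)
has_quorum() {
    present=$1
    expected=$2
    # Integer division: 2 nodes need 2 votes, 3 nodes need 2, 4 nodes need 3.
    needed=$(( expected / 2 + 1 ))
    [ "$present" -ge "$needed" ]
}

# Example usage:
#   has_quorum 1 2 && echo "quorate" || echo "no quorum"
```

With two nodes and one of them offline, 1 vote is less than the 2 required, which is why the expected vote count has to be lowered to 1 before removing the node. It also shows why an even node count buys nothing: four nodes tolerate one failure, the same as three.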

Prepare the new SSD

Shut down the server (the cluster will now be degraded if you missed out on the previous step), connect an external screen and other I/O, switch out the SSDs, boot from USB, and install a new Proxmox environment. Make sure to format the drive as zfs.

assets/install

The system will reboot once it’s done. Poke around on the web UI to make sure everything looks fine.

Restore the backup

At this point, you have the new SSD installed, and the snapshot of the old SSD available either on the network or a different drive.

First, import the backup pool if you used an internal drive as backup target.

zpool import vms-bigiron -f

zpool import boot-pool -f

Confirm. Do the names match? Do the sizes match?

root@bigiron:~# zpool list

NAME SIZE ALLOC FREE CKPOINT EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT

boot-pool 63G 6.92G 56.1G - - 8% 10% 1.00x ONLINE -

rpool 464G 1.18G 463G - - 0% 0% 1.00x ONLINE -

vms-bigiron 1.81T 540G 1.29T - - 2% 29% 1.00x ONLINE -

Also check zfs list -t snapshot.

Next, we’ll selectively grab the data dataset from the backup and write it to the new rpool.

nohup sh -c "zfs send -Rv vms-bigiron/bigiron-host-backup/data@new-root-drive | zfs recv -Fuv rpool/data" &

Beware of the -F flag:

If the -F flag is specified when this stream is received, snapshots and file systems that do not exist on the sending side are destroyed.

-v is what you think it is and -u just doesn’t mount the result.

grep "total estimated" nohup.out should give you something close to total estimated size is 153G from before. Mine was 135GB.

At this point, your brand new SSD should have a pool called rpool that contains all your old data and VMs. Do the usual validation. I also copied my nohup.out file.

Now, and this is critical, confirm that no two mountpoints overlap:

For me, this happened:

root@bigiron:~# zfs list | grep -E "/$"

rpool/ROOT/pve-1 1.32G 336G 1.32G /

vms-bigiron/bigiron-host-backup/ROOT/pve-1 9.52G 1.23T 9.51G /

Note the two identical mount points. I have no idea why this is allowed in zfs.

This caused:

root@bigiron:~# systemctl list-units --failed

UNIT LOAD ACTIVE SUB DESCRIPTION

chrony.service loaded failed failed chrony, an NTP client/server

lxcfs.service loaded failed failed FUSE filesystem for LXC

modprobe@drm.service loaded failed failed Load Kernel Module drm

And a lot of erratic behavior.

Fix this by:

zfs set mountpoint=/vms-bigiron/bigiron-host-backup/ROOT/pve-1 vms-bigiron/bigiron-host-backup/ROOT/pve-1

And reboot.

Next, confirm everything’s happy via

root@bigiron:~# systemctl list-units --failed

UNIT LOAD ACTIVE SUB DESCRIPTION

0 loaded units listed
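The mountpoint overlap check from this section can also be scripted instead of eyeballed. The following is a minimal sketch under the assumption that the input comes from zfs list -H -o name,mountpoint, which prints tab-separated columns without a header. duplicate_mountpoints is a hypothetical helper name.

```shell
# Hypothetical helper (illustration only): read "name<TAB>mountpoint" lines
# on stdin and print every mountpoint claimed by more than one dataset.
duplicate_mountpoints() {
    awk -F '\t' '
        # Datasets without a real mountpoint cannot collide.
        $2 == "none" || $2 == "legacy" || $2 == "-" { next }
        { count[$2]++; names[$2] = names[$2] " " $1 }
        END {
            for (mp in count)
                if (count[mp] > 1) print mp ":" names[mp]
        }
    ' | sort
}

# Example usage on the host:
#   zfs list -H -o name,mountpoint | duplicate_mountpoints
```

For the situation above, this would print a single line starting with "/:" followed by the two pve-1 datasets. Empty output means no two datasets share a mountpoint.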

Confirm the HBA is set up correctly

Since we only restored the data dataset, we need to do this again. ssh into the new-old machine. If you’re brave and restored the entire rpool in the previous step, this will all be set.

Make sure to blacklist the HBA on the host. See the Wiki for details.

/etc/default/grub

Should contain the line:

GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt"

For AMD. For Intel:

GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"

Run:

update-grub

This will tell the kernel to use the CPU’s Input-Output Memory Management Unit (IOMMU). iommu=pt enables this only for passthrough devices.

/etc/kernel/cmdline

Should exist and contain only

root=ZFS=rpool/ROOT/pve-1 boot=zfs quiet amd_iommu=on iommu=pt

For AMD. For Intel:

root=ZFS=rpool/ROOT/pve-1 boot=zfs quiet intel_iommu=on iommu=pt

Run:

proxmox-boot-tool refresh

For the record - I stole this from the reddit post from the acknowledgments section - I actually doubt this does anything.

/etc/modules

Should exist and contain only

# /etc/modules: kernel modules to load at boot time.

#

# This file contains the names of kernel modules that should be loaded

# at boot time, one per line. Lines beginning with "#" are ignored.

vfio

vfio_iommu_type1

vfio_pci

vfio_virqfd

On both Intel and AMD.
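Since the file expects one module name per line, a quick check catches modules that ended up on a shared line. This is a sketch only: missing_modules is a hypothetical helper, and the module list is simply the four names from this article.

```shell
# Hypothetical helper (illustration only): print each required module that
# does not appear on a line of its own in the given modules file.
#   $1 = path to the modules file, e.g. /etc/modules
missing_modules() {
    modules_file=$1
    for mod in vfio vfio_iommu_type1 vfio_pci vfio_virqfd; do
        # -x matches whole lines only, so "vfio vfio_pci" does not count.
        grep -qxF -- "$mod" "$modules_file" || printf '%s\n' "$mod"
    done
}

# Example usage on the host:
#   missing_modules /etc/modules
```

Empty output means all four names are present, each on its own line.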

Run:

update-initramfs -u -k all

This loads the required kernel modules for PCI passthrough.

Now reboot the box.
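After the reboot, it is worth confirming that the flags actually made it onto the running kernel’s command line, since only one of the two files above is used depending on the bootloader. A minimal sketch, assuming the command line is read from /proc/cmdline; iommu_flags_ok is a hypothetical helper name.

```shell
# Hypothetical helper (illustration only): succeed if a kernel command line
# contains a vendor IOMMU flag (AMD or Intel) and iommu=pt.
#   $1 = the kernel command line as a single string
iommu_flags_ok() {
    vendor_flag=no
    passthrough=no
    # Intentionally unquoted: split the command line into words.
    for word in $1; do
        case $word in
            amd_iommu=on|intel_iommu=on) vendor_flag=yes ;;
            iommu=pt) passthrough=yes ;;
        esac
    done
    [ "$vendor_flag" = yes ] && [ "$passthrough" = yes ]
}

# Example usage on the host:
#   iommu_flags_ok "$(cat /proc/cmdline)" && echo "flags present" || echo "flags missing"
```

This only checks that the flags were passed to the kernel, not that the IOMMU is actually active or that the HBA ended up in its own group.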

Fix the cluster

If you run a single-node cluster, skip this. Or go on the internet and buy a couple of servers. I suggest the latter.

Since we removed the new-old node from the cluster earlier, all we need to do is re-join it by running the following on bigiron.lan:

pvecm add sleipnir.lan

Next, go to your backups in the Proxmox UI and restore the VMs. Again, if you restored the entire rpool, they should just show up.

backup-restore

Lastly, run an update.

echo 'deb http://download.proxmox.com/debian/pve bullseye pve-no-subscription' >> /etc/apt/sources.list

apt-get update

apt-get upgrade

Validate the system

Ensure all disks exist. The Proxmox UI should not list the disks controlled by the HBA!

disks

Ensure all zfs pools exist. In my case, I replaced a 240G SSD with a 500G SSD, so that’s easy:

new-drives

Ensure all VMs exist.

Ensure your TrueNAS instance has the proper passthrough settings:

assets/vm-settings

Start the VMs and ensure they do what you want them to do.

Especially for TrueNAS, check that the pool(s) exist and that you can access S.M.A.R.T. attributes as a proxy for the passthrough working as intended.

truenas disks

You’re done! (& Notes)

Congratulations! Your comically over-engineered server is back online. Isn’t zfs neat?

On a side note - if you think you can just restore rpool wholesale (i.e., not just /data), this might break the cluster. The Proxmox docs warn against this. I tried that and couldn’t get it to work. In no particular order:

  • No ntp service would work, with odd errors I attribute to selinux or AppArmor (randomly no access to the pid file)
  • [quorum] crit: quorum_initialize failed: 2
  • CAN'T OPEN SYMLINK (/etc/cron.d/vzdump)
  • Permission denied - invalid PVE ticket

I do encourage you to try, though. Theoretically, I don’t really see a reason why this shouldn’t work. I believe my main problem was having overlapping / mountpoints. Since Proxmox nodes should be cattle, not pets, quickly restoring a couple of VMs wasn’t the end of the world.

Acknowledgements

This thread was the main source of inspiration for when I originally set up the card.

This article does something similar, but not quite the same (it doesn’t use PCI passthrough or m.2 drives). It was useful to confirm that what I was planning to do here was feasible.

And of course, the official docs.