Until recently my server ran Ubuntu (ending up on 21.10), with my services running as Docker containers. I ran the server on ZFS root & boot, which meant I could switch between different operating systems relatively easily. Particularly nice was the ability to switch back to a previous known-good version in case of panic. I used this very occasionally, but it’s worthwhile insurance.
The services running on my system, as docker-compose stacks, included:
- Mail (mailu)
- Internal cloud file shares (oscillating between owncloud and nextcloud)
- Home automation (home assistant)
- NFS
- Samba
- FTP server
- Jitsi
- Database server
- 3 WordPress sites
- Piwigo gallery
I was generally happy with this for a few years. But the main shortcomings were:
- I had to develop my own backup server solutions
- The mailu container system was very fragile, requiring extreme orchestral manoeuvres in the dark, for no particular reason, whenever it needed rebuilding
- Owncloud was likewise fragile. Things like Samba access to external storage required me to build my own docker image.
- The home assistant stack was very large
- Let’s Encrypt certificate management involved a fragile pas-de-deux between my pfSense firewall and my FTP server container, whose only job was to provide writable access to my filesystems in a form that pfSense could understand.
I had watched the emergence of hypervisors and drooled over the promised capabilities and ease of use they offered.
So I transitioned to a hypervisor. In fact, I did it several times. I knew I wanted to run TrueNAS Scale as my NFS/Samba server (because I’d used FreeNAS way back when), but I did not know whether it was good enough as a hypervisor.
A transition in my case needed everything to keep running with minimal downtime. I bought a Fujitsu machine with 32GB RAM and an 8-core 4th-generation Intel i7 for £150 on eBay, and in each of the transitions I installed the target hypervisor on this machine and transferred all the services to it, before hosing the main server and transferring them all back. How easy it was to “transfer” depended on the hypervisor. My default mode of operation involved lots of “zfs send | ssh <other machine> zfs recv” operations. I moved the entire set of docker-compose stacks into a virtual machine under each hypervisor, and brought up each stack in turn. For web services, I kept an nginx proxy pointing at the working version by manually editing its config files.
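Roughly, moving one dataset tree across looked like this; the pool, dataset and host names here are placeholders rather than my real ones:

```bash
# Recursive snapshot, then stream the whole dataset tree to the interim machine.
# “tank/services” and “interim” are placeholder names; -u stops the received
# datasets being mounted on arrival.
zfs snapshot -r tank/services@migrate1
zfs send -R tank/services@migrate1 | ssh interim zfs recv -u tank/services
```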
XCP-ng was my first attempt at a hypervisor. It has local roots (I live in Cambridge, UK), which biased me, and it promised the most efficient virtualization. But it really doesn’t play nicely with a ZFS root, and it has minimal tools for looking after the data it hosts. I really don’t know whether its promise of efficiency is met, but as we’ll see below, efficiency isn’t everything.
TrueNAS Scale running bare-metal was my next attempt. It more or less does what I wanted, except that its support for virtual machines is limited. As long as you want to do what it supports, you are fine. But if you want to start passing hardware through to a virtualized machine, it gets a lot harder.
So I ended up with Proxmox. I like the following features it offers:
- Can create a cluster and move virtual machines between nodes with little downtime
- A Proxmox host can be used as a backup destination, and honours systemctl suspend. The Proxmox GUI also supports wake-on-LAN for a node, which is a nice touch. So backing up can be power-efficient, combined with cron jobs to wake and sleep the backup target (see the cron sketch after this list).
- It has support for backups (non-incremental) and replication (incremental). It also supports snapshots, although it hides the underlying ZFS from you; ZFS is sometimes used under the hood.
- Allows pass-through of whole disks and PCI devices relatively easily (see the pass-through sketch after this list).
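The power-efficient backup arrangement needs nothing more exotic than a cron entry on each side. A minimal sketch, assuming something like the Debian wakeonlan package on the main server; the MAC address and times are placeholders:

```
# /etc/cron.d/wake-backup on the main server:
# wake the backup node shortly before the backup window (placeholder MAC and times)
50 1 * * * root /usr/bin/wakeonlan aa:bb:cc:dd:ee:ff

# /etc/cron.d/suspend-backup on the backup server:
# go back to sleep once the window is over
30 4 * * * root /usr/bin/systemctl suspend
```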
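As for the pass-through, from the command line it looks roughly like this; the VM ID, disk ID and PCI address below are placeholders for my own values, and the same can be done from the GUI:

```bash
# Hand a whole physical disk to VM 100 as an additional SCSI device
qm set 100 --scsi1 /dev/disk/by-id/ata-EXAMPLE_DISK_SERIAL

# Pass an NVMe drive through as a raw PCI device (needs IOMMU enabled on the host)
qm set 100 --hostpci0 0000:03:00.0
```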
The downside is that there is a definite performance hit. It’s not the 1-3 percent quoted for containers; it’s more like 25% for whole disks passed through, based on nothing more scientific than sticking a finger in the wind. But, given that the virtualized TrueNAS can support my NFS and Samba clients at 1Gbps, I can live with the loss of potential throughput.
The resulting Proxmox configuration includes:
- My main server, including
  - 16-core Xeon server, about 5 years old, on a Supermicro motherboard
  - 64 GB ECC RAM
  - A 2TB Samsung EVO SSD as boot disk and main VM store
  - 4 x 8TB WD drives, which are passed through as whole block devices to the TrueNAS VM
  - 2 x 1TB NVMe drives, which are passed through as PCI devices to the TrueNAS VM
  - 5 virtual machines and one container running services
  - TrueNAS Scale running as one of those VMs, providing Samba and NFS shares and managing an incremental backup of those shares to the backup server
  - Proxmox providing backups of those services to the local TrueNAS Scale instance and to the one running on the backup server
- My backup server
  - Fujitsu retired mid-range server
  - 8-core Intel i7, about 8 years old
  - 32 GB RAM
  - 1TB Samsung EVO as boot disk and VM store
  - 1 x 8TB WD drive, passed through as a whole block device to TrueNAS
  - TrueNAS Scale acting as an incremental backup target for the main server TrueNAS instance, and as a backup target for the main server Proxmox instance
  - Proxmox cluster member, acting as a replication target for the main server Proxmox instance
So, I ended up with a two-node Proxmox cluster, one node of which is awake only during backup periods and provides a backup and replication target for the other. I had unwound my overly-Byzantine docker-compose stacks into virtual machines, with updates managed by each application and with individual backup and disaster recovery. I had a potential migration target in case I needed to hose my main server, and a whole-server disaster recovery capability.
Switching services between the backup and main servers is now easy. Switching which TrueNAS Scale instance hosts my data takes some work: I have to bring everything down, perform an incremental zfs copy to the target, bring up the TrueNAS shares on the backup server, adjust the local DNS server to point to the new location of the file server, and adjust the NFS mount paths, which embed the name of the pool providing the storage. <rant> Groan, thank you to TrueNAS for actively preventing me from hiding the pool names. Given that one TrueNAS server has fast and slow pools (NVMe and hard disks) and the other does not, any NFS clients need to know this and be updated when moving to the backup server. This is a ridiculous state of affairs, and one TrueNAS should have fixed years ago. But there you are. </rant>
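For the record, the storage hand-over boils down to something like this; pool, dataset and host names are placeholders, and “@last-sync” stands for whatever the most recent replicated snapshot happens to be:

```bash
# Quiesce clients, snapshot, then send everything since the last common snapshot.
zfs snapshot -r tank/shares@handover
zfs send -R -I @last-sync tank/shares@handover | ssh backup-nas zfs recv -F backup/shares

# NFS clients then need their mounts updating, because the export path embeds the pool name:
#   fileserver:/mnt/tank/shares  ->  fileserver:/mnt/backup/shares
```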