The rest of the lab is a few ephemeral instances on Google, with dual A100s that spin up when I need to train things.
I put Ubuntu on the old beast, and never touch it. If the power goes out, it automatically comes on and Docker launches all the services when it comes up.
About the only thing that needs watching is the tiny SDR radio plugged into it, which I use for pure random numbers and for talking to it with a handheld radio from the other house. Sometimes I have to unplug it and then plug it back in to get it back into service. No amount of finagling seems to fix it from software.
What this skips, though, is the complexity of services like Nextcloud (stuck in maintenance mode again?), Immich (needs a compose file edit?), Minecraft worlds (Dad! my client is on another version again!), and (dmn) AlbyHub (needs re-login and closed its channel).
But to be fair, this is really getting quite minimal these days. I didn't really realize it, but I too have a mostly hands-off home-lab... Ok, then it's not really a lab anymore, it's more "stable home-infra" ;)
1. I don't like surprise breakages. I am not prepared to fix a service my family uses midday on a Tuesday when I am working because it auto-updated. I'd like to make sure I have dedicated time and a plan if something is going to go wrong.
2. My family HATES when things change. I try to run LTS versions of things, but annoyingly, some software like Nextcloud doesn't have an LTS version. One of the things my family likes the most is that the stuff I host isn't constantly changing like commercial products. Having Google Photos change or Netflix get a new interface randomly is very, very frustrating for them.
Since my homelab is completely internal, I avoid quickly doing updates (unless it is a critical security issue), and definitely avoid doing major version upgrades unless there is good value in it.
It's been normal for me for the past 3 years thanks to using NixOS for all server infrastructure.
It doesn’t change.
Many people keep swapping gear in so they can learn BGP on Cisco edge gear or run clusters on salvaged IB.
OP is not that person.
Indeed. And if you never test your recovery then you don't actually have a workable backup.
* Docker Compose files and various folders for containers live on an NFS share
* SQLite and other databases run off a local SATA SSD for speed and reliability
* Cronjob tarballs the critical stuff nightly and throws it on another NFS share to get ingested into Backblaze B2.
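The nightly tarball step above could look roughly like this. It is a minimal sketch, not the commenter's actual script: the function name `make_backup`, the archive naming scheme, and every path are assumptions for illustration, and the upload to Backblaze B2 is left out.

```python
import tarfile
from datetime import datetime
from pathlib import Path


def make_backup(sources, dest_dir, now=None):
    """Write one gzip-compressed tarball containing every path in `sources`.

    Returns the path of the archive that was written.
    """
    now = now or datetime.now()
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # One archive per day (hypothetical naming); a rerun on the same day overwrites it.
    archive = dest_dir / f"backup-{now:%Y-%m-%d}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        for src in map(Path, sources):
            # Store each source under its own top-level name in the archive.
            tar.add(src, arcname=src.name)
    return archive
```

A cron entry would then call something like `make_backup(["/srv/compose"], "/mnt/backup")`, with both paths swapped for real ones. Listing the archive contents afterwards is a cheap first step toward actually testing recovery.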
Now I just get to kick back and actually experiment with new things instead of babysitting a convoluted Proxmox upgrade or shunt onto a new container standard.
Does it run rootless? Not atm (blame FreshRSS, my sole holdout). Is it super secure? Probably not, but I’m not doing anything goofy like mounting the Unix socket into a container at the very least, and the server credentials don’t work anywhere else should it get popped. The blast radius is contained, and that’s more important to me than Enterprise-grade security for my homelab (a la Wazuh, another backlog project TBD).
Building/tinkering/playing around is fun, but once you are actually self-hosting services you rely on, it needs to "just work" or you will eventually burn out or lose interest. Especially when you take on more users than just yourself. The day my wife cancelled her Audible subscription because Audiobookshelf was just as good (IMHO better) was a good day, but that only happens because it is stable/reliable.
I have been running Proxmox for 3 years and it has been rock solid
- Docker VM: Lots of containers with docker compose, a few examples are Plex, AdGuard Home, *Arr stack...
- K3s VM: Mostly to learn and keep up with Kubernetes; my own apps run in there
- PostgreSQL VM: database for anything that needs one
Currently trying to simplify: moving the database to a docker container and testing if docker and k3s can coexist on the same system. At that point I might ditch Proxmox and move to NixOS. The only things I might miss are the option to create VMs to test random things, and VM snapshots, which make backups really simple.
I still upgrade mine manually every Friday with an ansible playbook; most of the time nothing breaks, but if it does I know I have time to fix it.
Recently one had their first baby, so they migrated from Fedora to RHEL, just to spend less time on upgrades. :D I thought that was cute. Like RHEL is so stable, even a first time parent can use it.
Don’t super care about updates. If it isn’t too ancient and not internet facing then it’s probably ok
I use a nearly identical alias for docker pull to keep my containers updated. To ensure everything stays running smoothly, I've built a lightweight watchdog (a mix of bash scripting and Uptime Kuma/Beszel) that monitors my services and containers and restarts them if they crash. This way, I rarely need to intervene manually.
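For anyone curious what the restart half of such a watchdog might look like, here is a minimal sketch in Python rather than bash. It is not the commenter's script: the function names are made up, and it only covers "check each container, restart the ones that are down", not the Uptime Kuma/Beszel side. It shells out to `docker inspect` and `docker restart`.

```python
import subprocess


def docker_is_running(name):
    """Return True if `docker inspect` reports the container as running."""
    result = subprocess.run(
        ["docker", "inspect", "-f", "{{.State.Running}}", name],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0 and result.stdout.strip() == "true"


def docker_restart(name):
    """Ask Docker to restart the container; raises if the command fails."""
    subprocess.run(["docker", "restart", name], check=True)


def watchdog_pass(names, is_running=docker_is_running, restart=docker_restart):
    """Check every container once and restart those that are down.

    Returns the list of names that were restarted.
    """
    restarted = []
    for name in names:
        if not is_running(name):
            restart(name)
            restarted.append(name)
    return restarted
```

The check and restart functions are passed in as parameters so the loop can be tried out with fakes before pointing it at real containers. Run from cron or a systemd timer, one pass every minute or so would probably be enough.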
For critical services (DNS, VPN, git, web search, crawler and mail, etc.), I add an extra layer of redundancy by running them on multiple servers across different locations. If one server fails, the others seamlessly take over. I also use DNS round-robin as a simple but effective way to handle load balancing and failover; no HAProxy, K8s, expensive IP takeover (ARP spoofing) or BGP anycast and VRRP/CARP, Proxmox or fancy orchestration tools required. If a node goes down, another watchdog script temporarily removes it from DNS, and traffic shifts to the remaining servers. Most often the services are self-healing. The best part? My deployment and monitoring are fully self-scripted (no Terraform, Ansible or BundleWrap). Moving services to a new server is as easy as running some scripts over SSH. Everything sets itself up automatically. Currently I run my services on 2 Pis, 2 stratum 1 servers (from centerclick), and 8 VPSs that cost me around $40/month. It's a great example of how a little automation and redundancy can go a long way in keeping things cheap and reliable without unnecessary complexity.
I invest around 1-2h/month to maintain and (mainly) adjust my setup. Before, I had multiple Proxmox instances and a backup server that cost me around $250/month, and I was spending 1-2h/week just to keep everything running. The difference is night and day.
However, I've personally had bad experiences with consumer hardware like the Raspberry Pi and hardware failures. Most of the time, I didn't feel motivated to replace the hardware and set up all the services again (even if I had a backup). As a UniFi alternative I can recommend GL.iNet; they build modern hardware for OpenWrt with some additions, and the hardware has enough power to run Wi-Fi 7, AdGuard and Tailscale or ZeroTier. (Before, I ran Protectli Vaults with a virtual pfSense, Tailscale and AdGuard on Proxmox, plus extra OpenWrt access points.) I can recommend the Protectli hardware over a Raspberry Pi, especially if you want to run a single server/hardware homelab.
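The "remove a dead node from DNS" step can be sketched like this. It is only an illustration under assumptions: the health check is a plain TCP connect, the function names are invented, and pushing the resulting record set to a DNS server or provider is left out because the comment doesn't say which one is used.

```python
import socket


def tcp_alive(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def records_to_publish(nodes, port, check=tcp_alive):
    """Pick which node IPs should stay in the round-robin A record set.

    If every node fails the check, keep the full set: an empty record set
    would guarantee an outage, and the checker itself may be at fault.
    """
    healthy = [ip for ip in nodes if check(ip, port)]
    return healthy or list(nodes)
```

Failing open when everything looks down is a design choice of this sketch, not something the comment describes. A short TTL on the records matters too, since resolvers keep handing out a removed address until their cached copy expires.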
Thanks for the inspiration; it's always refreshing to see others embracing simplicity!
After I set it up and stopped fiddling with it, it has just run flawlessly for the last 6 months.
Yeah, right until the moment it bricks after an update.
Edit: zero minutes old already downvoted.
I don’t use Docker; I’d rather create my own packages. And if a project is too trigger-happy about requiring new dependency versions, I drop it.
1. How often do I have to touch it during the next ten years?
2. How many of the times that I have to touch it are because I decided to do so?
3. How much pain is it to fix and understand if I had my mind erased?
This often works out in favour of dead simple solutions.
Longer term goal is a sleek plug-and-play box anyone can connect to their ISP modem with minimal technical knowledge.
I'm currently running it on an Aoostar WTR Max NAS with my AT&T connection. Got another NUC connected to a Spectrum modem. My goal is to be able to flip back and forth between the two with a backup bundle within minutes.
Considering breaking up the router and app server functionality so they can be run separately. Another idea is to use a custom 3D printed case with a Framework laptop motherboard and battery, switch, and wifi AP to make a true all-in-one box. I currently need an external switch, backup battery, and wifi access point.
Once the system feels mature, next steps would be things like federated tailnets with friends and family for things like distributed backups, compute/GPU, CDN, social networking, etc. Hoping that decentralized model training is cracked by someone at some point.
From a coding perspective I'm hoping to modularize everything (since it's NixOS) and add thorough testing and hardening. It's already relatively modularized considering it's built on Nix flakes.
Technology has come a long way. But I think that in tech we should be careful not to fall prey to monkey see, monkey do.
We should not be deploying technology in our homes to "mimic our employers".
Remember they are miserable for a reason.