• 4 Posts
  • 27 Comments
Joined 1 year ago
cake
Cake day: July 1st, 2023

help-circle



  • How do you upload a snapshot?

    Basically, as you said. Mount the data somewhere and back up its contents.

    I back up snapshots rather than current data, because I don’t want to stop the running containers that read and write from that data. I’d rather avoid the situation where the container is writing data while it’s being backed up. The back up happens shortly after the daily snapshot is made so the difference between current and snapshot data is small.


  • As others have said, with an incremental filesystem level mechanism, the backup process won’t be too taxing for the CPU. I have ZFS set up which makes this easy and I make hourly snapshots using sanoid which also get sent to another mirrored pair of connected drives using syncoid. Then, once a day, I upload encrypted daily snapshots to a bucket in the cloud using restic. Sounds complicated, but actually sanoid/syncoid and restic do all the heavy lifting. All I did is automate their schedules using systemd timers and some scripts to backup the right directories.




  • My configuration and deployment is managed entirely via an Ansible playbook repository. In case of absolute disaster, I just have to redeploy the playbook. I do run all my stuff on top of mirrored drives so a single failure isn’t disastrous if I replace the drive quickly enough.

    For when that’s not enough, the data itself is backed up hourly (via ZFS snapshots) to a spare pair of drives and nightly to S3 buckets in the cloud (via restic). Everything automated with systemd timers and some scripts. The configuration for these backups is part of the playbooks of course. I test the backups every 6 months by trying to reproduce all the services in a test VM. This has identified issues with my restoration procedure (mostly due to potential UID mismatches).

    And yes, I have once been forced to reinstall from scratch and I managed to do that rather quickly through a combination of playbooks and well tested backups.





  • Correct. And getting the right configuration is pretty easy. Debian has good defaults. The only changes I make are configuring it to send emails to me when updates are installed. These emails will also then tell you if you need to reboot in subject line which is very convenient. As I said I also blacklist kernel updates on the server that uses ZFS as recompiling the modules causes inconsistencies between kernel and user space until a reboot. If you set up emails, you will also know when these updates are ready to be installed because you’ll be notified that they’re being held van.

    So yea, I strongly recommend unattended-upgrades with email configured.

    Edit: you can also make it reboot itself if you want to. Might be worth it on devices that don’t run anything very important and that can handle downtime.


  • A few simple rules make it quite simple for me:

    • Firstly, I do not run anything critical myself. I cannot guarantee that I will have time to resolve issues as they come up. Therefore, I tolerate a moderate risk of a borked update.
    • All servers run the same be OS. Therefore, I don’t have to resolve different issues for different machines. There is then the risk that one update will take them all out, but see my first point.
    • That OS is stable, in my case Debian so updates are rare and generally safe to apply without much thought.
    • Run as little as possible on bare metal and avoid third party repos or downloading individual binaries unless absolutely necessary. Complex services should run in containers and update by updating the container image.
    • Run unattended-upgrades on all of them. I deploy the configuration via Ansible. Since they all run the same OS, I only need to figure out the right configuration once and then it’s just a matter of using Ansible to deploy it everywhere. I do blacklist kernel updates on my main server, because it has ZFS through DKMS on it so it’s too risky to blindly apply.
    • Have postfix set up so that unattended-upgrades can email me when a reboot is required. I reboot only when I know I’ll have some time to fix anything that breaks. For the blacklisted packages I will get an email that they’ve been held back so I know that I need to update manually.

    This has been working great for me for the past several months.

    For containers, I rely on Podman auto update and systemd. Actually my own script that imitates its behavior because I had issues with Podman pulling images which were not new, but which nevertheless triggered restarts of the containers. However, I lock the major version number manually and check and update major versions manually. Major version updates stung me too much in the past when I’d update them after a long break.


  • I expose my services to the web via my own VPS proxy :) I simply run only very few of them, use 2FA when supported, keep them up to date, run each service as rootless podman, and have a very verbose logcheck set up in case the container environment gets compromised, and allow only ports 80 and 443, and, very importantly, truly sensitive data (documents and such) is encrypted at rest so that even if my services are compromised that data remains secure.

    For ssh, I have set up a separate raspberry pi as a wireguard server into my home network. Therefore, for any ssh management I first connect via this wireguard connection.



  • Thanks for your reply! One thing I’m struggling with networkd is hysteresis. That is, toggling the interface down and then back up does not do what I expect it to. That is, setting the interface down does not clear up the configuration, and setting the interface up does not reconfigure the interface. I have to run reconfigure for that. I was hoping that the declarative approach of networkd would make it easy to predict interface state and configuration.

    This does make sense because configuration is not the same as operational state. However, what would the equivalent of ifdown (set interface down and remove configuration) and ifup (set interface up and reconfigure) be using networkd and networkctl? This kind of feature would be useful for me to test config changes, debug networking issues, disconnect part of the network while I’m making some changes, etc.