Moving House
Added 2023-03-04 21:40:53 +0000 UTC
Hey everyone, hope you’ve been well. This February has been a little hectic for me, mostly due to a bout of illness that interrupted the regular flow of my life for a couple of weeks. The last week of the month also saw me become incredibly busy due to other factors outside my control, so I had to postpone writing up this latest post for you guys. Still, enough excuses! I managed to do a few important things recently and, while I’m mostly putting a pin in other things I’ve promised to talk about like Matrix or the story list, I think it’s worth talking about them.
The main thrust of what I’ll be talking about is the fact that I migrated the server. To the cloud. Which is sort of a meaningless marketing thing, really, since the site has been running on a Virtual Private Server since its inception. An actual proper server requires either thousands of dollars to buy hardware outright or several times what I pay now to rent. So, then, what do I mean by cloud? To make a boring story short, it’s just a slightly newer way that VPS are deployed to customers. Many servers and their operating systems are virtualized, that is to say real hardware apportions part of its resources to an installed environment that otherwise mostly works as normal. It used to be that THP ran on an old container-like thing called OpenVZ, which had a few limitations for both the host machine and its users that made it fall out of fashion in favor of various KVM implementations among hosting services. The tendency has been towards giving customers more flexibility and options to manage and specify what goes on in their virtual systems while reducing complexity and overhead for the hosts.
So then, again, the cloud. In the hosting space this is just an abstraction for a bunch of virtualization technologies and their management tools. Something like the FLOSS cloud-init has become a standard way to deploy new instances of your virtualized (usually Linux) operating system while passing on a bunch of configuration options so that the server is set up and customized for its purpose. These “cloud” VPS can be set up quickly and spun down and deleted easily, according to the needs of the project and without the need for intervention from the hosting service beyond providing an interface (like fleio) to do so. These same standards allow for easy exporting of the whole system image, that is to say the virtualized system as it exists, so it can be transferred elsewhere if needed. Most hosts additionally allow easy allocation of more or fewer resources without needing to intervene manually. They’ll simply bill you more or less for what you want to use. There are a lot of other things that may be involved depending on the exact field but the general premise is greater control and ease of management.
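To give a rough idea of what "passing on a bunch of configuration options" looks like, here is a minimal sketch that builds a cloud-init user-data document in Python. It is not the configuration used for THP: the hostname, package names, user name and SSH key are all made-up placeholders, and the function name `build_user_data` is purely illustrative. It relies on the fact that JSON is accepted by YAML parsers, so the standard library is enough to produce a valid `#cloud-config` file.

```python
import json


def build_user_data(hostname, packages, admin_user, ssh_public_key):
    """Return a cloud-init user-data string (hypothetical example values)."""
    config = {
        "hostname": hostname,
        # Refresh the package index before installing anything.
        "package_update": True,
        "packages": list(packages),
        "users": [
            {
                "name": admin_user,
                "groups": "sudo",
                "shell": "/bin/bash",
                "ssh_authorized_keys": [ssh_public_key],
            }
        ],
    }
    # cloud-init identifies this format by the "#cloud-config" first line.
    # JSON is valid YAML, so json.dumps produces a parseable body.
    return "#cloud-config\n" + json.dumps(config, indent=2) + "\n"


# Placeholder values only; nothing here reflects the real server.
user_data = build_user_data(
    hostname="example-host",
    packages=["nginx", "php-fpm"],
    admin_user="admin",
    ssh_public_key="ssh-ed25519 AAAA...placeholder admin@example",
)
```

The resulting text is what you would paste into (or upload through) the provider's "user data" field when creating a new instance.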
That brings us to the following question: why now? THP was working alright. Honestly, it would have been fine to leave the site running in its former Legacy KVM instance for the indefinite future. Below I’ll list a few reasons in no real order of importance:
- The legacy hosting service was raising its prices slightly. It would have been easily covered by the margin I have on the patreon goals. That said, being able to easily add more resources as needed to the site would be nice.
- The version of the operating system, Debian Linux, was nearing its next versioned release. That would require me to update it in a few months and potentially deal with a few teething issues that happen with every upgrade. For a long time I’ve also been annoyed at some of the ways the distro does things and had been thinking about changing to something else anyways.
- There was some software, like the mailing software or web server, that I had set up but would have liked to reconfigure from scratch and/or with newer versions so that I could do more things with them.
- I already had experience with this cloud service thanks to setting up the matrix server and another personal thing I run.
- The idea of “future-proofing” or at least making it easier to just migrate to other providers by a simple saving and restoring of an image done by me was attractive.
- I needed something to distract myself from my ill health a little.
The “how” was a bit more difficult. Out of curiosity I got a quote for how much it’d cost to have this transition managed by the hosting company and I got a three-figure response. I had planned to do it myself but that solidified my commitment. So I loaded up my own cloud-formatted image and spun up my own server. I had learned my lessons from my earlier attempts for other things and had my own basic setup done in only a few minutes. I also had prepared some other things ahead of time, such as making recent backups of configuration files from the old server for various programs and making a list of tasks I would have to do before the migration was fully complete, ordered by importance.
I did what I could, installing programs and setting up things that wouldn’t require the disruption of the old still-running server, but most of that was quickly done. So I rolled out a redirect on the old server so that visitors were sent to a page stating that maintenance work was being done and that the site would be back online later. This was because I would have to stop new events happening on the old server so as to really transfer all the needed data. Invariably, a message about work being done on a webserver still gets people asking whether or not the server is down because apparently reading a concise message in plain language is tricky.
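For illustration, the maintenance behaviour can be sketched with Python's standard library. This is a stand-in, not the actual setup (which was a redirect configured on the existing web server): every request gets the same plain-text notice with a 503 status, so nothing new can be written while data is copied. The names `maintenance_response`, `MaintenanceHandler` and `make_server`, the message text and the one-hour `Retry-After` value are all assumptions.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

MAINTENANCE_BODY = b"Maintenance work is being done. The site will be back online later.\n"


def maintenance_response(retry_after_seconds=3600):
    """Return (status, headers, body) for the maintenance notice."""
    headers = {
        "Content-Type": "text/plain; charset=utf-8",
        "Content-Length": str(len(MAINTENANCE_BODY)),
        # Hint to clients and crawlers that the outage is temporary.
        "Retry-After": str(retry_after_seconds),
    }
    return 503, headers, MAINTENANCE_BODY


class MaintenanceHandler(BaseHTTPRequestHandler):
    """Answer every GET, HEAD and POST with the same 503 notice."""

    def _respond(self, include_body=True):
        status, headers, body = maintenance_response()
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
        if include_body:
            self.wfile.write(body)

    def do_GET(self):
        self._respond()

    def do_HEAD(self):
        self._respond(include_body=False)

    def do_POST(self):
        # Read and discard the submitted body so the post is not stored.
        length = int(self.headers.get("Content-Length") or 0)
        self.rfile.read(length)
        self._respond()


def make_server(port=8080):
    """Create (but do not start) a server bound to localhost."""
    return HTTPServer(("127.0.0.1", port), MaintenanceHandler)
```

Calling `make_server().serve_forever()` would start it. Using 503 rather than 200 matters mostly for crawlers and monitoring tools, which then treat the outage as temporary.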
After double- and triple-checking that all the data had been transferred, I began to set up the webserver and its database backend first. As I was using newer software on the cloud server, I couldn’t just copy and paste my old configs and ended up tweaking quite a few things according to my liking and our needs. I did run into a few issues that required more intervention but things mostly worked as planned. I could then do the next important thing, which was to transfer the DNS record to the new server IP address and update the corresponding SSL certificates. These days, because of security, getting the SSL part wrong can lock you out of the server as it is assumed that there will be a valid address for most requests so it’s important to get that part done smoothly. Luckily Let’s Encrypt is pretty lenient about do-overs and I already knew exactly what I had to do.
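One way to confirm that the DNS change has taken effect before requesting certificates is to check what the hostname currently resolves to. The sketch below is an assumption about how one might do that, not a record of what was actually run. The function names are illustrative, and the hostname and addresses in the usage example are reserved documentation values.

```python
import ipaddress
import socket


def resolved_addresses(hostname, resolver=socket.getaddrinfo):
    """Return the set of IP addresses the hostname resolves to."""
    infos = resolver(hostname, None)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the textual address. Strip any IPv6 zone suffix.
    return {ipaddress.ip_address(info[4][0].split("%")[0]) for info in infos}


def dns_points_to(hostname, expected_ip, resolver=socket.getaddrinfo):
    """True if expected_ip is among the hostname's resolved addresses."""
    return ipaddress.ip_address(expected_ip) in resolved_addresses(hostname, resolver)


# Usage with a fake resolver, so the example runs without network access.
def fake_resolver(hostname, port):
    return [(socket.AF_INET, socket.SOCK_STREAM, 6, "", ("203.0.113.10", 0))]


new_server_ready = dns_points_to("example.org", "203.0.113.10", resolver=fake_resolver)
old_server_still_listed = dns_points_to("example.org", "198.51.100.7", resolver=fake_resolver)
```

Against the real resolver, a result can lag behind the actual change because of caching, so a `False` shortly after updating the record is not necessarily a problem.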
Just like that, in a space of ten minutes from setting up the certificates, traffic to THP started to be received by the new server. With the maintenance page also set on the new server, I got to the next bits of thankless tasks. That is to say, starting to secure things further. This is when firewall rules are hardened, making suspicious or malformed requests get blocked and when access to the machine can only be done in a few specified ways and on a few specified ports. While, yeah, you can do that from the very beginning, I prefer to always err on the side of caution and be relatively permissive at first so that I don’t lock myself out accidentally (this has happened to every system admin on the planet at least once, even if they deny it.)
I could go on and on about the minutiae but let’s just say that I crossed out most of the 30ish items on my checklist quickly enough. Other, non-critical things like extra scripts that I run I left for me to do at my own leisure. I brought the site back up and things worked more or less fine. Until they didn’t. I realized at some point that posting was broken; it had worked fine in my testing environment. I took things down again and ran through logs and found nothing too informative. There were a few warnings about deprecated things that would be removed in future versions and the like, which I expected since I was using newer versions of things like PHP, but no smoking gun.
I ended up debugging the posting process step by step, creating a private instance of the site which only I could access on the cloud server. I eventually got my answer, finding that the reason was something a little esoteric. To overly simplify, there was an issue with how encrypted IPv6 addresses were being stored and, because of the way that some internal functionality in PHP had changed, the values that were created were (always? potentially?) longer than the length of the relevant column in the SQL database. Easy enough fix: alter the column properties and done. The site could be brought back up.
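The general shape of this kind of bug can be shown with a toy example. The real encryption scheme and column size are not stated here, so the sketch substitutes plain base64 encoding and a hypothetical 32-character column; only the pattern (longer input, longer stored value, value no longer fits) is meant to carry over.

```python
import base64

# Hypothetical column width; the real schema is not given in the post.
COLUMN_LENGTH = 32


def stored_value(ip_text):
    """Stand-in for the real transformation: output grows with input length."""
    return base64.b64encode(ip_text.encode("ascii")).decode("ascii")


def fits_column(ip_text, column_length=COLUMN_LENGTH):
    """True if the transformed address fits in the column."""
    return len(stored_value(ip_text)) <= column_length


samples = [
    "192.168.1.10",                             # typical local IPv4
    "255.255.255.255",                          # longest possible IPv4 text
    "2001:db8::1",                              # short IPv6
    "2001:0db8:85a3:0000:0000:8a2e:0370:7334",  # fully written-out IPv6
]
results = {ip: fits_column(ip) for ip in samples}
```

In this toy version every IPv4 address fits, a short IPv6 address fits, and only a long IPv6 address overflows. That is consistent with a failure that never shows up on an IPv4-only test machine and may only affect some IPv6 users.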
This should illustrate how any sort of server migration, even if well-planned, can have its hiccups. The fact that it was an IPv6-specific problem also was something that I could not have foreseen on my local test machine, as my own local addresses will always be IPv4 and they will never be transformed into a long-enough value to trigger the error. Should I need to migrate again in the future, the fact that I can just clone the state of the system and export it to any cloud service should ensure that I don’t have to do all that manual work again either.
As for the other warnings and issues with PHP I encountered: they were non-fatal and did not seem to affect the normal working of the site in most cases. So I decided to fix them carefully in my test environment first. They’ve all been sorted as far as I could tell and more work has been done along the lines of the cleanup I mentioned last time around (because you can’t just do one thing and be done with it). I will be rolling out those changes with the story list stuff sometime soon™.
On the whole, however, I would say it was a smooth transition. The tools have gotten better and my own experience has made most of the common pitfalls entirely avoidable. I’ve already set up a few extra things that weren’t on the older server to my liking and it should make my life as a sysadmin a little easier to boot. Ha, ha, geddit?
While there’s more I could talk about, I’ve said a lot this time around, so I’ll end on that high note. I don’t want to jinx it but I think I’ll have cool things to share within a week or two. Until next time, take it easy!
Comments
The work of a sysadmin is never done. Thank you for being the one to do it, because holy hell has the little bit of amateur hour stuff I've done been confusing enough.
Benjamin Oist
2023-03-05 00:18:41 +0000 UTC