Effortless fallback server Dec 2015

All hardware breaks, occasionally. In a server, it’s usually the storage which eventually fails, be it SD cards, eMMC, an SSD, or a rotating disk (remember those?).

The new server setup at JeeLabs relies completely on a single Odroid XU4 in a CloudShell.

What if it breaks, gets hacked, stolen, or otherwise ceases to work?

Here’s the backup plan: we set up a second server, with fairly modest hardware, and we keep it around for when things turn sour. In this case, an Odroid U3 (no longer produced):

[Photo: the Odroid U3]

A smaller setup would also have sufficed. Even a Pi Zero (with a WiFi dongle, for example).

The reason we can use something a lot less powerful here is that we’re going to enable only the static site serving part. No Redmine, no Ruby on Rails, no MySQL. Just files & images.

As before, we can use Nginx, which is more than up to this task. In fact, we can re-use the exact same configuration files, assuming the different sites were all set up via separate config files in /etc/nginx/sites-available/. All we need to do is add this minimal “noredmine” config, which reports the (temporary) absence of that particular site:

server {
  server_name jeelabs.net www.jeelabs.net;
  access_log /var/log/nginx/noredmine.access.log;
  return 503;
}

The rest stays the same. We just need to get hold of an up-to-date copy of all the websites. Here is the crontab for user www-data on this backup server:

$ crontab -u www-data -l
# m h  dom mon dow   command

0 1 * * 1 rsync -a --delete xudroid::www html1 && touch html1
0 1 * * 2 rsync -a --delete xudroid::www html2 && touch html2
0 1 * * 3 rsync -a --delete xudroid::www html3 && touch html3
0 1 * * 4 rsync -a --delete xudroid::www html4 && touch html4
0 1 * * 5 rsync -a --delete xudroid::www html5 && touch html5
0 1 * * 6 rsync -a --delete xudroid::www html6 && touch html6
0 1 * * 7 rsync -a --delete xudroid::www html7 && touch html7

As with the main server backup, we’ll grab a copy of all the static sites, and manage them as a rotating set of seven daily backups, one per weekday. That way it becomes very easy to make the server return the site contents from any of the recent days (just in case a static site was damaged, or compromised).
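As a rough illustration of how such a rotating set could be used, here is a minimal Python sketch. It assumes the directories are named html1 to html7 as in the crontab, and that the `touch` after each successful sync leaves the directory’s modification time at the moment of that sync. The function names are made up for this example; they are not part of the actual setup.

```python
import datetime
import os
import tempfile


def snapshot_for(day: datetime.date) -> str:
    """Directory the crontab would sync into on the given day.

    cron's day-of-week 1..7 (Mon..Sun) lines up with date.isoweekday().
    """
    return f"html{day.isoweekday()}"


def snapshots_newest_first(base: str) -> list[str]:
    """Existing html1..html7 directories under base, most recently touched first."""
    names = [f"html{n}" for n in range(1, 8)]
    present = [n for n in names if os.path.isdir(os.path.join(base, n))]
    return sorted(
        present,
        key=lambda n: os.path.getmtime(os.path.join(base, n)),
        reverse=True,
    )


if __name__ == "__main__":
    # Demo in a throw-away directory, with hypothetical timestamps.
    with tempfile.TemporaryDirectory() as base:
        for n, mtime in [(1, 1000), (2, 3000), (3, 2000)]:
            path = os.path.join(base, f"html{n}")
            os.mkdir(path)
            os.utime(path, (mtime, mtime))
        print(snapshots_newest_first(base))
    print(snapshot_for(datetime.date(2015, 12, 7)))  # a Monday
```

Pointing Nginx at an older copy would then just be a matter of changing the document root (or a symlink) to one of the names this returns.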

Note that this is using rsync in a different way: the main server has been set up to serve its static files via an rsync daemon, which is indicated by the subtle double colon in the path. Note also that the main server allows reading these files, not writing, so it can’t be messed with. There is no security issue, since all these files are for public serving anyway. By using the rsync server protocol instead of the usual rsync-over-ssh, we avoid encryption overhead. The difference is quite noticeable: these updates take only about 3 seconds.
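To make the double-colon distinction concrete, here is a small Python sketch which builds the two kinds of rsync command lines. The host `xudroid` and module `www` come from the crontab; the remote path `/var/www` in the ssh variant is purely an assumed example, and the helper names are invented for illustration.

```python
import shlex


def daemon_pull(host: str, module: str, dest: str) -> list[str]:
    """rsync daemon protocol: 'host::module' (double colon), no ssh involved."""
    return ["rsync", "-a", "--delete", f"{host}::{module}", dest]


def ssh_pull(host: str, path: str, dest: str) -> list[str]:
    """Remote-shell transport: 'host:path' (single colon), normally run over ssh."""
    return ["rsync", "-a", "--delete", f"{host}:{path}", dest]


if __name__ == "__main__":
    # The first line matches the crontab entries; the second uses an assumed path.
    print(shlex.join(daemon_pull("xudroid", "www", "html1")))
    print(shlex.join(ssh_pull("xudroid", "/var/www", "html1")))
```

Either list can be handed to `subprocess.run(cmd, check=True)` as-is; only the first form talks to the rsync daemon on the main server.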

The original plan was to shut down this backup server most of the time. Just wake it up to pull the latest static sites over once a day and then power it down again to save energy. But it turns out that the U3 only draws 250 mA at 5V - that’s 1.25W of power, i.e. a mere 11 kWh/year, less than one sunny day’s worth of solar power production at JeeLabs!
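For those who like to check the numbers, here is the arithmetic as a few lines of Python. The 5V supply voltage is the assumption which makes 250 mA come out as 1.25W; the year is taken as 365 days.

```python
SUPPLY_VOLTS = 5.0          # assumed: the U3 runs off a 5V supply
CURRENT_AMPS = 0.250        # the 250 mA draw quoted above
HOURS_PER_YEAR = 24 * 365   # 8760 hours, ignoring leap years

watts = SUPPLY_VOLTS * CURRENT_AMPS
kwh_per_year = watts * HOURS_PER_YEAR / 1000

print(f"{watts:.2f} W, {kwh_per_year:.2f} kWh/year")  # 1.25 W, 10.95 kWh/year
```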

So the plan has been adjusted: let’s keep this server on permanently, syncing its copies once in a while, and standing by for immediate switchover if the main server fails. Flipping that switch is extremely simple: we just change the central FritzBox internet router to forward port 80 (and perhaps also 443 for HTTPS) to this server instead of the main one.

While in backup mode, all requests to JeeLabs.net will report 503 Service Unavailable, which is nice because it also lets spiders / search engines know that the issue is temporary.

Meanwhile, the main server can be taken off-line, fixed, wiped, replaced - whatever.

So there are now three pieces involved for a permanent server setup: the XU4, the U3, and the router. The router includes an Ethernet switch, which is most convenient, but we still need three power plugs to keep it all running. Let’s take that one step further and power everything off the single 12V adapter which came with the FritzBox. We can do this by adding a 12-to-5V step-down regulator - here is the concoction which ties it all together:

[Photo: the 12V-to-5V power setup mounted on the U3]

On the right: incoming 12V and pass-through to the router. On the left: 5V for the XU4 and the whole thing mounted on the U3. There are two power lines to the U3 because it also includes a LiPo-powered UPS, as you can see here:

[Photo: the U3 with its LiPo-powered UPS]

This UPS approach adds to the cost - and it’s not even useful for its task as an internet server. What good is a UPS-powered server after all, when the router and FTTH modem aren’t?

Well… the reason for this is that our “fallback web server” can also act as an excellent server for home monitoring and automation: it’s always on, it’s not (normally) connected to the outside world, its CPU is idle, it’s a nice Linux box, and look at the nice unused USB ports!

For monitoring & control, keeping such a system alive across power outages is very useful: it can continue to log data from all the nodes on our local WSN, as well as control them. As a nice bonus, we can now also accurately log the AC mains power outages themselves.

And if you want to make this home server accessible from outside, simply add an extra port redirection in the router. There are numerous other ways to extend this configuration, for example if you need a local file, print, or A/V server which is not accessible from outside.

So there you have it: a dual-system server, eh… “farm” (well, they do have 12 cores total!), with redundancy to keep the main static web servers going and a nice central home server.

The only moving parts are the two fans, and they’re off most of the time.

Weblog © Jean-Claude Wippler. Generated by Hugo.