Why doesn't this dream setup exist? Dec 2015

The beauty of an all-static website is that it’s all data: the site is fully defined by a set of files on disk. By copying those files, or restoring them to some previous state, we can instantly change what the website serves. No databases to launch, and no application servers.

You would think that static websites are limited-purpose. But what is a weblog, or even a discussion forum, other than a set of pages which occasionally changes during the day?

The key design issue is not really the outcome, i.e. the resulting website and its appearance, but the input feeds and the trigger conditions for change. If a website is changed by just one person, such as a personal weblog, then all you need is a static website generator. There are lots of them - over 120 are listed on the StaticGen website!

The mechanism is always similar: start up a local app which watches for file changes in a specific source directory and regenerates the fully static website in its output directory, often with a simple self-refreshing mechanism so you can look at the result as you edit a file.
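For a rough idea of what such an app does, independent of any specific generator, here is a little Python sketch. The `rebuild_site` call is a stand-in for whatever generator is actually in use - Pelican’s command line is shown purely as an example:

```python
import os
import subprocess
import time

SOURCE = "content"   # where the source files live
OUTPUT = "output"    # where the generated HTML ends up

def snapshot(path):
    """Collect (filename, mtime) pairs for everything under `path`."""
    state = {}
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            state[full] = os.path.getmtime(full)
    return state

def rebuild_site():
    # Stand-in for the real generator, Pelican in this example.
    subprocess.run(["pelican", SOURCE, "-o", OUTPUT], check=True)

if __name__ == "__main__":
    last = snapshot(SOURCE)
    while True:
        time.sleep(1)             # crude polling, a real tool would use inotify
        current = snapshot(SOURCE)
        if current != last:       # something was added, changed, or removed
            rebuild_site()
            last = current
```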

Here is a nice diagram describing how this works with the Pelican generator, for example:

[diagram: Pelican flow]

It’s just content: text files using some simple markup convention like Markdown or reStructuredText, a bunch of HTML templates, CSS, and maybe also some JavaScript.
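To make that concrete, here is a minimal sketch of the rendering step in Python, assuming the `markdown` package and an inline template string (a real generator would of course use proper template files):

```python
import markdown   # pip install markdown

TEMPLATE = """<!doctype html>
<html><head><title>{title}</title></head>
<body>{body}</body></html>"""

def render_page(title, source_text):
    """Turn one Markdown source file into a complete HTML page."""
    body = markdown.markdown(source_text)
    return TEMPLATE.format(title=title, body=body)

print(render_page("Hello", "# Hello\n\nThis page is *just* a text file."))
```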

There are some very nice and fast static website generators, such as Hugo (written in Go). You keep the “source” on your laptop, update and regenerate the site as needed, and then copy the output over to where the actual (static!) web server lives.
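With Hugo, that update cycle boils down to two steps: regenerate, then copy. Here is one hedged way to script it, assuming Hugo’s default `public/` output directory and rsync as the copy mechanism (the host and destination path are placeholders):

```python
import subprocess

# Regenerate the site; Hugo writes its output into ./public by default.
subprocess.run(["hugo"], check=True)

# Copy the static output to wherever the actual web server lives.
# Host and destination path are made up for illustration.
subprocess.run(["rsync", "-az", "--delete",
                "public/", "user@webhost:/var/www/site/"], check=True)
```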

But what about wikis, forums, and other such collaborative sites?

The difference here is that more people will be generating and updating content. We can no longer work off a local copy on some laptop. But does that mean we’ll need a LAMP stack?

Not necessarily. The benefits of a fully static site are worth it - so let’s look for alternatives. With static files, maintaining a fallback server is easy, as you’ve seen in the previous article about the new JeeLabs setup. A static site can be placed in maintenance mode, disallowing changes, but maintaining read and search access as before. It’s easy to keep history, and revert to it. It’s also very easy to set up a new server quickly, since all you need is Nginx.

There are ways to do this without falling back to a full dynamic stack. Some wikis work entirely off static files, using very simple fetch and store requests via the web server to add or edit pages. But they usually keep only the original source files on disk, not the rendered HTML, letting either the web server or the web client generate the actual HTML as needed. Other wikis work like this:

[diagram: web server serving static HTML, with a separate backend app handling edits and page regeneration]

The entire website is available as HTML pages, and that’s all the web server needs for people browsing the site. For editing, the web server talks to a separate “backend” app, which can hand it the source form of a page (e.g. Markdown) and which accepts new versions. The backend stores these (perhaps with history), and it also generates the updated HTML pages.
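To give an idea of how small such a backend can be, here is a sketch using only Python’s standard library - the URL layout, directory names, and the `render` stand-in are all made up for illustration (and there’s no input sanitising or history keeping here):

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

SOURCE_DIR = "source"   # Markdown originals, kept by the backend
HTML_DIR = "html"       # generated pages, served directly by the web server

def render(markdown_text):
    # Stand-in for a real Markdown-to-HTML renderer.
    return "<html><body><pre>" + markdown_text + "</pre></body></html>"

class Backend(BaseHTTPRequestHandler):
    def do_GET(self):
        # Hand out the source form of a page, e.g. GET /some-page
        path = os.path.join(SOURCE_DIR, self.path.strip("/") + ".md")
        if not os.path.isfile(path):
            self.send_error(404)
            return
        with open(path, "rb") as f:
            data = f.read()
        self.send_response(200)
        self.send_header("Content-Type", "text/markdown")
        self.end_headers()
        self.wfile.write(data)

    def do_PUT(self):
        # Accept a new version of a page, store it, and regenerate its HTML.
        name = self.path.strip("/")
        length = int(self.headers.get("Content-Length", 0))
        text = self.rfile.read(length).decode("utf-8")
        os.makedirs(SOURCE_DIR, exist_ok=True)
        os.makedirs(HTML_DIR, exist_ok=True)
        with open(os.path.join(SOURCE_DIR, name + ".md"), "w") as f:
            f.write(text)                   # history could be kept here
        with open(os.path.join(HTML_DIR, name + ".html"), "w") as f:
            f.write(render(text))           # refresh the static page
        self.send_response(204)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8081), Backend).serve_forever()
```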

This scheme is extremely useful, because it’ll continue to work in read-only mode with just the web server and the generated HTML files - such as on a fallback server. It can even be used in parallel with the rest of the system, for replicated high-performance sites.

There are a few details which need attention.

First of all, some changes affect a large number of pages (or even all of them, if a sidebar changes). It would be expensive to fully re-generate the site on each such change if only a few pages are actually visited between changes. One solution is to generate pages in a lazy fashion: don’t generate all the HTML right away - just invalidate all the affected pages, i.e. remove them from the pool of saved static pages. Then, whenever the web server hits a missing page (that 404 we all dread), it instead asks the backend to quickly generate that specific page for it. Meanwhile, at a low priority, the backend generates the remaining missing pages, so that all of them will exist again soon. This avoids wasted effort on a frequently-changing site.
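Sketched in Python, the invalidation side of this is trivial, and the “fill it back in” side is an on-demand handler plus a low-priority worker. All names below are illustrative; with Nginx, forwarding missing pages to the backend could be done with a `try_files ... @backend` fallback:

```python
import os
import queue
import threading

HTML_DIR = "html"
os.makedirs(HTML_DIR, exist_ok=True)

pending = queue.Queue()   # pages waiting to be regenerated lazily

def regenerate(name):
    """Stand-in for the real source -> HTML rendering step."""
    with open(os.path.join(HTML_DIR, name + ".html"), "w") as f:
        f.write("<html><body>freshly generated: " + name + "</body></html>")

def invalidate(pages):
    """A change came in: drop the affected HTML files and queue them."""
    for name in pages:
        path = os.path.join(HTML_DIR, name + ".html")
        if os.path.exists(path):
            os.remove(path)   # the web server will now 404 on this page
        pending.put(name)

def on_missing_page(name):
    """Called when the web server hits a 404: produce that page right away."""
    regenerate(name)

def background_filler():
    """Low-priority worker: quietly regenerate everything that was invalidated."""
    while True:
        name = pending.get()
        if not os.path.exists(os.path.join(HTML_DIR, name + ".html")):
            regenerate(name)

threading.Thread(target=background_filler, daemon=True).start()
```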

The second issue is that we’ll want a far more immediate update experience. When looking at a page, we’d like it to be updated right away when someone changes it, not just when we refresh it in the browser. This can be done with a WebSocket and some extra machinery:

[diagram: static web server plus a WebSocket / MQTT path between web clients and the backend]

The web server does even less now - it just serves static pages, images, and other “assets”. But to support dynamic behaviour, the server also lets each web client use a WebSocket for dealing with asynchronous events. Through this (bi-directional!) WebSocket, the backend can tell each interested web client about changes, such as a specific page being modified.

While it would be possible to let the backend deal with WebSockets itself, a far more flexible setup is to use MQTT for this - which is completely generic and scales really well.
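As a sketch of the notification path on the backend side, assuming the paho-mqtt client library (its 1.x API) and a made-up topic layout - the browser end would subscribe to the same topics, which Mosquitto can serve over MQTT-via-WebSockets:

```python
import paho.mqtt.client as mqtt   # pip install paho-mqtt (1.x API assumed)

client = mqtt.Client()
client.connect("localhost", 1883)   # Mosquitto on its default port
client.loop_start()                 # handle network traffic in the background

def page_changed(name):
    """The backend just regenerated `name`: tell every interested web client."""
    # The topic layout is illustrative only, e.g. "pages/changed/home".
    client.publish("pages/changed/" + name, payload=name, qos=0, retain=False)

page_changed("home")
```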

What we have now is a setup which can be as dynamic as the application needs, while still supporting a “degraded” read-only backup service using just a web server plus static files.

In normal operation, both servers shown above - the Nginx web server and the Mosquitto MQTT broker - stay running at all times. But the static generator/updater backend can be taken down and replaced at will - for maintenance, upgrading, development, whatever.

And if you think this is just to create a nice wiki or forum, think again: this very same architecture can be used for almost anything. Home monitoring & automation, anyone?

The key is support for a totally read-only configuration, serving nothing but static files. If all else fails, you can put these files on a USB stick, set up a static server (anything will do!), and be back with a functional-but-frozen setup in no time. It doesn’t have to be a public site - the same would work for a home system (but again, with all its functions frozen).
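“Anything will do” is nearly literal here: even Python’s standard library can serve a directory of static files in a couple of lines (or simply via `python -m http.server`), which is plenty for a frozen fallback:

```python
# Serve a frozen copy of the site (the current directory) on port 8000.
from http.server import HTTPServer, SimpleHTTPRequestHandler

HTTPServer(("0.0.0.0", 8000), SimpleHTTPRequestHandler).serve_forever()
```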

Now let’s put all this in perspective: of course the objective is not to treat static data as the ultimate setup. It’s just to establish a baseline: if the lowest common denominator of a site is its frozen state, then we have taken the choice of software out of the equation. We can now design and implement our fancy systems using whatever tools we like. And if we later change our mind, we can tear it down, replace some part(s) of it, and bring it back up. Tinkering on a “live site” becomes as risk-free as working with a “dead” (i.e. static) one. Presenting a demo and exploring different designs is trivial - just bring a copy of the files.

Much of the flexibility comes from the fact that Nginx and Mosquitto remain app-agnostic. They do their thing, but are not part of the development cycle, other than some minor configuration. And if you’re familiar with reactive programming, you’ll recognise the cycle in this design.

It’s hard to overstate the generality and modularity of such an approach. Multiple backends, each generating different parts of a website - or even a set of sites using virtual hosting. The internet-facing server continues to operate regardless of backends quitting or failing.

The. Essence. Is. Static. Files. - Why doesn’t everyone build systems this way?

(back to jeelabs.net: wouldn’t it be grand if there were a wiki + forum using this approach?)
