Architecture: it's all about data! Jan 2016
Code vs. data… there are many ways to go into this oh so important dichotomy.
Here is a famous quote from a famous book, now over four decades old (as you
can tell from its somewhat different terminology, i.e. “flowcharts” and “tables”):
Show me your flowcharts, and conceal your tables, and I shall continue to be mystified; show me your tables and I won’t usually need your flowcharts: they’ll be obvious. – Fred Brooks, “The Mythical Man-Month”
Data really tends to be the most important aspect of a long-term design process, because:
- code matters while our program is executing, data is what stays around when it is not
- code is what we invent and produce to deal with a task, data is what comes in as facts
- code evolves as we better understand a task, data needs to be saved and kept intact
Very often, software development is like a constant shuffle: we write code, we run it, it collects, generates, and transforms some data, we save that data to disk, we stop the code and replace it with an improved version, and then the whole process starts over again. We’re continuously alternating between run mode (with the code frozen) and stop mode (with the data frozen).
There are clearly exceptions to this view of the software development process: when we store HTML pages as data, they really are part of the software design, in the same way that our code is.
But the model breaks down with JET, which needs to be running 24 hours a day, 7 days a week. As far as the hub is concerned, there is no stop mode. We don’t want to lose incoming data.
This means the design of the central data structures and formats must be frozen from day one. Of course we’ll need to be able to add, change, and remove any data flowing through the system, but its shape and semantics should be fixed, as far as the logic and code in the hub is concerned.
This is not as hard as it may seem. The hub is a switchboard. There is very little data which it needs to understand. If it can collect data, pass it around, and save it, it won’t care what that data is. And that’s where MQTT’s “pub/sub” and Bolt’s “key/value” concepts make things easy:
- there are topics (a plain text string, with slashes and some minor conventions)
- these topics determine the routing of incoming and outgoing messages
- and there are values (message “payloads” in MQTT terminology)
- for MQTT, the mechanism is called publish-subscribe, or “pub/sub” in short
- for Bolt, the topic is the (hierarchical) key under which a value is stored on disk
- the values can be anything and can often be treated as an opaque collection of bytes
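To make the topic/value idea concrete, here is a minimal in-memory sketch. It is not the JET hub, and it uses neither a real MQTT broker nor Bolt: `TinyHub`, `topic_matches`, and the `sensor/...` topic names are made up for illustration. The only thing borrowed from MQTT is its wildcard convention (`+` for one level, `#` for all remaining levels), in simplified form.

```python
def topic_matches(pattern: str, topic: str) -> bool:
    """Return True if an MQTT-style subscription pattern matches a topic.

    '+' matches exactly one level, '#' matches all remaining levels.
    Simplified: special cases such as '$'-prefixed topics are ignored.
    """
    p_parts = pattern.split("/")
    t_parts = topic.split("/")
    for i, p in enumerate(p_parts):
        if p == "#":
            return True
        if i >= len(t_parts):
            return False
        if p != "+" and p != t_parts[i]:
            return False
    return len(p_parts) == len(t_parts)


class TinyHub:
    """In-memory stand-in for a hub: routes by topic, stores opaque bytes."""

    def __init__(self):
        self.subscribers = []  # (pattern, callback) pairs: the pub/sub side
        self.store = {}        # topic -> last payload: the key/value side

    def subscribe(self, pattern, callback):
        self.subscribers.append((pattern, callback))

    def publish(self, topic: str, payload: bytes):
        # The payload is saved and forwarded as-is, never inspected.
        self.store[topic] = payload
        for pattern, callback in self.subscribers:
            if topic_matches(pattern, topic):
                callback(topic, payload)


hub = TinyHub()
received = []
hub.subscribe("sensor/+/temp", lambda t, p: received.append((t, p)))
hub.publish("sensor/kitchen/temp", b"\x01\x9a")
hub.publish("sensor/kitchen/humidity", b"55")
```

Only the first message reaches the subscriber, but both end up in `hub.store` under their topic. The hub code never needs to know what the bytes mean.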
The only exceptions are the messages which control the behaviour and operation of the hub itself. These need to be specified early on, and frozen - hopefully in such a way that all further changes can remain 100% backwards-compatible. Again, this is not necessarily a very hard requirement to meet: if we start off with a truly minimal viable set of special hub messages, then every subsequent change can be about adding new message conventions for the hub.
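One way to picture "only ever adding" is a dispatch table that refuses to redefine an existing entry and quietly skips anything it does not know. This is a sketch under assumptions, not the actual hub design: the `HubControl` class and the `hub/ping` and `hub/echo` topics are hypothetical.

```python
class HubControl:
    """Dispatch table for the few messages the hub itself must understand."""

    def __init__(self):
        self.handlers = {}  # topic -> handler function
        self.ignored = []   # topics seen but not understood

    def register(self, topic, handler):
        # Additive only: refusing to overwrite keeps old conventions frozen.
        if topic in self.handlers:
            raise ValueError(f"{topic!r} is already defined and frozen")
        self.handlers[topic] = handler

    def handle(self, topic, payload: bytes):
        handler = self.handlers.get(topic)
        if handler is None:
            # Unknown message: note it and move on, never fail.
            self.ignored.append(topic)
            return None
        return handler(payload)


control = HubControl()
# Day-one minimal set (hypothetical topic name).
control.register("hub/ping", lambda payload: b"pong")
# Added later, without touching anything above.
control.register("hub/echo", lambda payload: payload)
```

An older hub which receives a newer message type simply ignores it, so senders and hubs of different vintages can coexist.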
Adding message types, formats, rules, and semantics to a running system is far less intrusive than changing what is already in use. Even if the first hub can only pass messages as-is through MQTT and not save them in Bolt, quite a few features in JET can be tried and built already. As we figure out the best messaging design for saving data, we can start by implementing it in a separate JET Pack before messing with the hub. This can be done on our development machine, as a pack which includes Bolt and connects to the rest of the system like any other pack: over MQTT.
With respect to data formats, one more design decision will be imposed: the values / payloads which need to be processed and understood by the hub will use JSON formatting. It may end up getting used in lots of places, but that’s not a hard requirement as far as the hub is concerned.
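As a rough illustration of that split: JSON is only decoded for payloads addressed to the hub, and everything else passes through as raw bytes. The `hub/` topic prefix, the `decode_for_hub` helper, and the `interval` field are assumptions made for this sketch, not part of any JET specification.

```python
import json

HUB_PREFIX = "hub/"  # hypothetical convention for hub-directed topics


def decode_for_hub(topic: str, payload: bytes):
    """Parse JSON only for hub topics; pass all other payloads through.

    A malformed hub payload raises ValueError (json.JSONDecodeError and
    UnicodeDecodeError are both subclasses of it).
    """
    if not topic.startswith(HUB_PREFIX):
        return payload  # opaque bytes, not the hub's business
    return json.loads(payload.decode("utf-8"))


config = decode_for_hub("hub/config", b'{"interval": 60}')
reading = decode_for_hub("sensor/kitchen/temp", b"\xff\x00")
```

Here `config` is a regular dictionary the hub can act on, while `reading` is returned untouched, even though its bytes are not valid JSON or even valid UTF-8.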
Messaging is the heart of JET (i.e. data) - not logic or processing (code)!
What about sensors, actuators, and tying into the physical world? - Same story, really: we can implement it first as a separate pack, and then choose to move that functionality into the hub, if it works well, is super-robust, and if it simplifies the flow and structure of the entire setup.
What about the front-end then, the web server which lets us see what’s going on in our house, control appliances, and define automation rules? - Again: we can start with a separate pack.
You might recognise some concepts from an old project at JeeLabs, called JeeBus - and many of the design aspects of JET are indeed similar, even if based on different technology which didn’t even exist at the time. It’ll be interesting to see how this approach plays out this time around.
As an architecture, JET embraces decoupled development, because this will allow the city-like properties mentioned in the initial requirement specs. If JET is about evolving software over a long time span, then it has to be able to evolve from a tiny nucleus (the hub) right from the start.
In a nutshell: JET/Hub is the place where data thrives - JET Packs are where code thrives.