Yesterday’s post was about the three (3!) ways HouseMon manages data. Today I’ll describe the initial archive design in HouseMon. Just to reiterate its main properties:
- archives are redundant – they can be reconstructed from scratch, using the log files
- data in archives is aggregated, with one data point per hour, and optimised for access
- each hourly aggregation contains: a count, a sum, a minimum, and a maximum value
The first property implies that the design of this stuff is absolutely non-critical: if a better design comes up later, we can easily migrate HouseMon to it by implementing the new code and replaying all the log files, decoding and storing every reading in the new format.
The second decision was a bit harder to make, but I’ve chosen to save only hourly values for all data older than 48 hours. That means today and yesterday remain accessible from Redis with every single detail still intact, while weekly, monthly, and yearly graphs always resolve to hourly values. The simplification over progressive RRD-like aggregation is that there really are only two types of data, and that archived data access is instant for any time range: simple file seeking.
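To make that two-tier idea concrete, here’s a minimal sketch (in Go, purely illustrative) of how a lookup could pick its data source, assuming a hard 48-hour cutoff. The store names and the function are mine, not part of HouseMon:

```go
package main

import (
	"fmt"
	"time"
)

// storeFor picks the data source for a reading at time t, given the
// current time: recent data lives in Redis, older data in the archive.
func storeFor(t, now time.Time) string {
	if now.Sub(t) <= 48*time.Hour {
		return "redis" // raw readings, every detail intact
	}
	return "archive" // hourly count/sum/min/max slots
}

func main() {
	now := time.Now()
	fmt.Println(storeFor(now.Add(-3*time.Hour), now))   // redis
	fmt.Println(storeFor(now.Add(-200*time.Hour), now)) // archive
}
```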
Which brings me to the data format: I’m going to use Plain Old Binary Data Files for all archive data. No database, nothing – although a filesystem is just as much a database as anything, really (as “git” also illustrates).
Note that this doesn’t preclude also storing the data in a “real” database, but as far as HouseMon is concerned, that’s an optional export choice (once the code for it exists).
Here’s a sample “data structure” of the archive on disk:
archive/
    index.json
    p123/
        p123-1.dat
        p123-2.dat
    p124/
        p124-1.dat
The “index.json” file is a map from parameter names to a unique ID (IDs 1 and 2 are used in this example). The “p123” directory has the data for time slot 123. Slots are in multiples of 1024 hours, as each data file holds 1024 hourly slots. So “p123” starts at Unix time 123 x 1024 x 3600 = 453427200 (or “Tue May 15 00:00:00 UTC 1984”, if you want to know).
Each file has 1024 hourly data points and each data point uses 16 bytes, so each data file is 16,384 bytes long. The reason they have been split up is to make varying dataset collections easy to manage, and to allow optional compression. Conceptually, the archive is simply one array per parameter, indexed by the absolute hour since Jan 1st, 1970 at midnight UTC.
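To illustrate, here’s a little Go sketch of how an absolute hour maps to a directory, a file, and a byte offset, following the layout just described (the `locate` name and the base directory are made up for this example):

```go
package main

import (
	"fmt"
	"path/filepath"
)

const (
	hoursPerFile  = 1024 // hourly slots per data file
	bytesPerPoint = 16   // size of one aggregated slot on disk
)

// locate maps a parameter id and an absolute hour (hours since Jan 1st,
// 1970 UTC) to the data file and the byte offset of its slot.
func locate(base string, id, hour int) (string, int64) {
	slot := hour / hoursPerFile
	file := fmt.Sprintf("p%d-%d.dat", slot, id)
	offset := int64(hour%hoursPerFile) * bytesPerPoint
	return filepath.Join(base, fmt.Sprintf("p%d", slot), file), offset
}

func main() {
	// Unix time 1357516800 is Mon Jan 7 00:00:00 UTC 2013, i.e. hour 377088.
	path, off := locate("archive", 1, 1357516800/3600)
	fmt.Println(path, off) // archive/p368/p368-1.dat 4096
}
```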
The 16 bytes of each data point contain the following information, with each field stored in little-endian byte order:
- a 4-byte header
- the 32-bit sum of all measurement values in this hour
- the 32-bit minimum value of all measurements in this hour
- the 32-bit maximum value of all measurements in this hour
The header contains a 12-bit count of all measurement values (i.e. 0..4095). The other 20 bits are reserved for future use (such as a few extra bits for the sum, if we ever need ‘em).
The average value for an hour is of course easy to calculate from this: the sum divided by the count. Slots with a zero count represent missing values.
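Here’s a hedged Go sketch of decoding one 16-byte slot and computing the average. Note that the exact position of the 12-bit count inside the 4-byte header isn’t pinned down above, so putting it in the low bits is an assumption of this sketch:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// DataPoint is one decoded hourly slot.
type DataPoint struct {
	Count    uint32 // number of measurements in this hour (0..4095)
	Sum      int32  // sum of all measurements
	Min, Max int32  // smallest and largest measurement
}

// Unpack decodes one 16-byte slot.
func Unpack(b []byte) DataPoint {
	header := binary.LittleEndian.Uint32(b[0:4])
	return DataPoint{
		Count: header & 0xFFF, // low 12 bits; the other 20 are reserved
		Sum:   int32(binary.LittleEndian.Uint32(b[4:8])),
		Min:   int32(binary.LittleEndian.Uint32(b[8:12])),
		Max:   int32(binary.LittleEndian.Uint32(b[12:16])),
	}
}

// Avg returns the hourly average, or false if the slot is empty (count 0).
func (p DataPoint) Avg() (float64, bool) {
	if p.Count == 0 {
		return 0, false
	}
	return float64(p.Sum) / float64(p.Count), true
}

func main() {
	raw := make([]byte, 16)
	binary.LittleEndian.PutUint32(raw[0:4], 3)    // count = 3
	binary.LittleEndian.PutUint32(raw[4:8], 60)   // sum
	binary.LittleEndian.PutUint32(raw[8:12], 18)  // min
	binary.LittleEndian.PutUint32(raw[12:16], 22) // max
	p := Unpack(raw)
	avg, _ := p.Avg()
	fmt.Println(p, avg) // {3 60 18 22} 20
}
```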
Not only are these files trivial to create, read, modify, and write – it’d also be very easy to implement code which does this in any programming language. Simplicity is good!
Note that there are several trade-offs in this design. There can be no more than 4095 measurements per hour (still plenty for one measurement per second, for example), and measurements have to be 32-bit signed ints. But I expect that these choices will work out just fine, and as mentioned before: if the format isn’t good enough, we can quickly switch to a better one. I’d rather start out really simple (and fast) than come up with a sophisticated setup which adds too little value and too much complexity.
As far as access goes, you can see that no searching at all is needed to reach any time range for any parameter. Just open a few files, send the extracted data points to the browser (or maybe even entire files), and process the data there. If this becomes I/O bound, then per-file compression could be introduced – readings which are only a small integer or which hardly ever change will compress really well.
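As a rough illustration (again in Go, with made-up names and minimal error handling), extracting a range of hourly slots for one parameter boils down to seeking and reading fixed-size records:

```go
package main

import (
	"fmt"
	"io"
	"os"
)

const (
	hoursPerFile  = 1024 // hourly slots per data file
	bytesPerPoint = 16   // size of one aggregated slot on disk
)

// readRange collects the raw 16-byte slots for hours [from, to) of one
// parameter, crossing file boundaries as needed. Hours with no file (or
// beyond the end of one) stay zero-filled, i.e. count = 0 / missing.
func readRange(base string, id, from, to int) ([]byte, error) {
	out := make([]byte, 0, (to-from)*bytesPerPoint)
	for hour := from; hour < to; hour++ {
		slot := hour / hoursPerFile
		path := fmt.Sprintf("%s/p%d/p%d-%d.dat", base, slot, slot, id)
		buf := make([]byte, bytesPerPoint)
		if f, err := os.Open(path); err == nil {
			off := int64(hour%hoursPerFile) * bytesPerPoint
			if _, err := f.ReadAt(buf, off); err != nil && err != io.EOF {
				f.Close()
				return nil, err
			}
			f.Close()
		}
		out = append(out, buf...)
	}
	return out, nil
}

func main() {
	slots, err := readRange("archive", 1, 376832, 376835) // three hourly slots
	fmt.Println(len(slots), err)                          // 48 <nil>
}
```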
Time will tell. Premature optimisation is one of those pitfalls I’d rather avoid for now!