Computing stuff tied to the physical world

The logic of data logging

In Software on Mar 22, 2009 at 00:01

The point of several projects here is to monitor energy consumption, and probably also some weather data because it’s so easy to add. Where “monitoring” means logging as well as keeping measurement data in a database for quick access. But how do you make sure that all “readings” get logged, yet still retain the ability to adjust and extend the underlying database and software? It would be awful if each non-trivial change required conversion.

Here are a few basic design decisions I took for JeeMon:

  • Readings come in as text from the attached JeeNode, one line per reading.
  • All readings are time-stamped and appended to a logfile, in text format.
  • To keep things manageable, logfiles roll over to a new one on 0:00 GMT each day.
  • The first entry in a new logfile is the name and size of the previous one.
  • The database is treated as a cache: it can be reconstructed from the logs at any time.

And here’s a requirement I want to meet:

  • It must be possible to restart JeeMon at any time without losing readings.

The idea then is to “catch up” with the logs on each restart where possible, and to clear the cache db and reload it from scratch when the software changes are more substantial.

So what does it take to not lose any readings? Keeping in mind that JeeMon will run either on a little embedded Linux module right next to the central JeeNode, or on a personal Mac, Windows, or Linux computer. Well, first of all, this is why the central JeeNode has a backup battery and dataflash memory: it stays on at all times, and continues to receive and collect packets from all the other JeeNodes even when there is no JeeMon running. The duration of this autonomous operation will be fairly modest: a few hours or perhaps one day. Enough to handle brief power loss, to mess around with configurations, and to re-connect things occasionally.

This means there are now three places where data gets added and must be kept in sync with the rest:

Picture 1.png

(Four places if you also count the final destination: dynamically updating browser windows)

  • The dataflash memory gets a copy of each reading collected by the central JeeNode.
  • All readings must end up in the log files when JeeMon is running.
  • The cache database stores decoded values in an efficient internal format.
  • And lastly, the browser(s) present more or less detailed results, summaries, and derived info.

JeeMon will start up in logfile “catch up” mode: the current content of the dataflash memory will be checked against the last logfile entry. The first task is then to request old data from the JeeNode to bring the log files up to date. During this time, the JeeNode continues to add new incoming data to the dataflash.

Once all missed data has been transferred, the JeeNode switches to “real-time” mode, saving new readings to dataflash and sending them out to JeeMon at the same time. In real-time mode, the log files will track all readings as soon as they come in.

That’s just part of the story, though. Now JeeMon enters “db sync” mode. With information from the cache db, new entries and new log files are scanned and processed, with new readings added to the database. Once that is done, JeeMon switches to normal real-time operation.

All log files are kept online. Rebuilding a database from scratch is as simple as deleting the current one and restarting JeeMon. Since a full rebuild might take quite some time, the internal JeeMon webserver always starts up in a “please-wait” mode and switches to the real server after real-time operation has been enabled.

Part of the above logic has now been implemented. JeeMon resumes its logs, syncs / rebuilds its database as needed, and processes new readings while running. The firmware in the JeeNode to save and replay readings to and from dataflash is still work-in-progress, however.

So while the JeeHub / JeeNode needs to stay powered up, I can now restart JeeMon at will. Which is great, because that streamlines development.

  1. Hi Jean-Claude,

    I have just discovered your blog yesterday, I must say this is really amazing. I spent last couple of months working on similar idea. My sensor network however uses much simpler RF communication – I use Quasar AM 433MHz modules QAM-TX1 / QAM-RX2. I also use ATTiny13/45 for sensors and ATMega8/168 for receiver. Tinys are really dirt cheap, for what they are capable of doing.

    Anyway I really like your idea of JeeHub / JeeHub. As I am facing very similar kind of issues. I will take a look more closely on your implementation.

    Keep up good work!

    Cheers, Stan

Comments are closed.