Computing stuff tied to the physical world

Software evolution

In Software on Mar 17, 2011 at 00:01

With all the focus on sensors and embedded hardware, I’ve lost track a bit of the other side of the equation – monitoring all that incoming data, and (later on) using it to control devices. The receiving end – i.e. software running on a central PC (or embedded Linux box) – has not kept up with the rest of the Jee world.

One reason is that my setup for collecting sensor data around the house has been running quite smoothly, producing these graphs, updatable via a browser refresh:

See this, this, and this post for more details about how it was done. The JeeMon 2009 setup has been running almost non-stop on a low-power NAS, called the Bubba II (now replaced by newer models). There were only about half a dozen system restarts in these past two years, some due to power outages, some due to devices needing a reset, but that’s about it.

Trouble is… all development on this has stagnated. I did start on a JeeMon 2010 successor, but that has only been used for newer projects, such as the ookScope. In the end, I didn’t really want to disrupt my working data-collection setup and just kept the old one going. So now I’ve got two years’ worth of detailed logging and a local web site which is showing its age and is not very useful for meeting my new requirements – let alone being able to perform more meaningful stats and control devices for home automation.

Last summer a student project was started at the nearby Utrecht University to try and come up with a general infrastructure which would be able to do a lot more than what I had. The result is a new system called “JeeBus” (more details coming soon on this weblog).

While JeeBus does provide a fair set of interesting features, one of the issues that kept bothering me is that normally in software development you have to either run the code or make changes to it. The “conventional” operation mode is to stop the server, edit the code, and then restart it to pick up the changes. With a bit of luck, there may be the option to re-install certain parts – but usually this is limited to “drivers” and “extensions”.

I find this stifling. Having to restart an app just to try out a one-line change is much more disruptive during active development than such a simple stop/re-start would suggest, because each time you also have to get the process back to the state you were working on. And with dynamic scripting languages, it’s a bit silly to have to jump through such hoops – which really stem from the edit-compile-run cycle of statically-bound environments, decades ago.

So I’ve started scratching my itch, and implemented a small core “hub” which starts up functionally empty with just enough capability to accept remote procedure calls (RPCs) and to inject plugins into a (local or remote) running process. The last version of each plugin is saved and is automatically loaded again after a restart. The result is a JeeMon process which starts off as a blank slate and evolves into a full-fledged app – web server, GUI, hardware interface, background task, anything.
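To make the idea concrete, here is a minimal sketch of such a hub. It is written in Python purely for illustration and is not the actual JeeMon code; the `Hub` class, its `install` and `call` methods, and the one-file-per-plugin storage layout are all assumptions. A direct method call stands in for an incoming RPC.

```python
import tempfile
from pathlib import Path


class Hub:
    """Hypothetical hub: starts empty, grows by having plugins injected."""

    def __init__(self, store_dir):
        self.store = Path(store_dir)
        self.store.mkdir(parents=True, exist_ok=True)
        self.plugins = {}
        # After a restart, reload the last saved version of every plugin.
        for path in sorted(self.store.glob("*.py")):
            self._load(path.stem, path.read_text())

    def _load(self, name, source):
        namespace = {"hub": self}
        exec(compile(source, name, "exec"), namespace)
        self.plugins[name] = namespace

    def install(self, name, source):
        """Inject or replace a plugin, then save its source for next time."""
        self._load(name, source)  # only code that loads cleanly gets saved
        (self.store / f"{name}.py").write_text(source)

    def call(self, plugin, func, *args):
        """Stand-in for an RPC: the function is looked up at call time."""
        return self.plugins[plugin][func](*args)


store = tempfile.mkdtemp()
hub = Hub(store)  # blank slate
hub.install("greet", "def hello(who):\n    return 'hello ' + who\n")
hub.install("greet", "def hello(who):\n    return 'hi ' + who\n")  # live update
result_live = hub.call("greet", "hello", "jee")

restarted = Hub(store)  # simulates a restart: plugins come back from disk
result_restarted = restarted.call("greet", "hello", "jee")
```

Because `call` looks the function up on every invocation, the second `install` takes effect without the hub object ever being recreated.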

So far, development in a “live” process looks promising. There are fewer and fewer situations where I need to restart. I’ve set up a little tool to push all changed plugins to a remote hub, and that really completely changes the landscape of software development for me. No need to take down a real-time system anymore, which is what most of all this is about when it comes to physical computing and devices. Bugs generate stack traces, but the hub continues to function, so re-installing a fix usually solves the problem. And changing code in a working system has never been easier. This matters a lot, because I really really want to be able to “grow” a system over time.
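The “stack trace, but keep running” behaviour can be sketched roughly as follows. This is an illustrative Python fragment, not the hub's real dispatch code; the `handlers` table, the `dispatch` function, and the two decoder versions are made up for the example.

```python
import traceback

handlers = {}  # hypothetical registry: name -> callable


def dispatch(name, *args):
    """Run a handler; a failure is reported as a stack trace, not a crash."""
    try:
        return handlers[name](*args)
    except Exception:
        traceback.print_exc()  # the trace goes to stderr, the process lives on
        return None


def decode_v1(raw):
    return int(raw)  # bug: fails on a reading such as "21C"


def decode_v2(raw):
    return int(raw.rstrip("C"))  # the fix


handlers["decode"] = decode_v1
first = dispatch("decode", "21C")   # prints a traceback, returns None
handlers["decode"] = decode_v2      # "re-install" the fix in the live process
second = dispatch("decode", "21C")  # same process, now returns a value
```

The point is that the failure stays confined to the one call that hit the bug, so the fix can be pushed into the same process.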

Starting and stopping a process which is designed to run non-stop is odd. Let’s see if this new design will make it unnecessary in most cases – during active development as well as for tweaking a working setup at a later date.

  1. Lisp is a language and a runtime environment that allows you to do just what you want: patch code while it’s running, without having to stop it. Modify a function, compile it, and the next time the function is called, it’s the new version that is executed. Please note that I have never taken the time to teach myself Lisp, and what I wrote is about the sum of my knowledge about it :)

    • A Lisp environment such as SBCL is way ahead of most programming languages (scripting and otherwise) in many respects, and I’d love to use that. It has been ahead of its time for half a century or so. But it’s not portable enough and requires too much memory for use on tiny embedded Linux boards, which is a low-end option I’d like to keep open.

  2. I use Ruby, which does not have to be stopped. Ruby is particularly interesting since there is another project for programming Arduino in Ruby.

  3. Isn’t this really the webserver model, where there is a main (thin) service which picks up the message and then spawns a child process which is configured to handle the particular extension?

    You certainly don’t have to go round restarting them just because you’ve updated one of your perl/ruby/cgi pages.

    • Yes, very similar – and probably an important reason why PHP is so popular. But I’m doing it at a more fine-grained level, replacing functions on the fly – no re-spawning, and not necessarily just web pages. It also works for decoding serial ports without resetting the connection, for example.

  4. Not that I would recommend switching over to another programming environment (you should use what you can speak fluently :-). I just wanted to note that modifying and recompiling code while the software keeps on running is possible in most modern programming languages/environments.

    For example, this functionality is available in C# where you can modify the code, rebuild the program, and attach it again to the running process. We used this functionality in a large greenhouse environment where restarting the program was no option.

    • There are many trade-offs – C#, like Lisp, is not quite a good fit for the low-end embedded hardware I want to be able to support. I’m impressed that it can do on-the-fly modification – I know even C/C++ can do that, but the complexity of doing so appears to be fairly high.

      It’s unfortunate that language choice affects so much of a project. Everyone wants to use a different one, and the world (especially the OSS world) is hopelessly fragmented, with no end in sight. Oh well, best I can do is use what makes me productive, as you say, and aim for clear interface points – such as a RESTful web server design, and JSON as low-cost exchange format. It’s going to be a mix of technologies anyway, with C/C++ on the physical computing devices, and HTML / CSS / JavaScript for web interfaces.

  5. I noticed something like this in ESBs: the desire to be able to hot-plug software modules in and out of a system. The kind of interface required looks a lot like the comms you use “over the air”.

  6. I’m not sure what you are trying to accomplish, but to me it seems overly complex (I don’t, by any measure, mean to sound condescending, I just don’t fully understand the problem yet). Why are you building a webserver? And isn’t a GUI just a view on the data? Which, in its simplest form, is just a webpage? Which is stateless and thus always restartable? The hardware should just send signals to a server. The protocol could be relatively easy and not subject to too much change.

    • In order: 1) webserver – not really building, just enabling a ready-to-use embedded one, because low-end hubs will also need a web server, 2) GUI – not every GUI will be web-server based, 3) stateless – yes, that makes a huge difference, but where I want to go isn’t stateless alas, but real-time updates with server push.

      I’m all with you on pushing for simplicity. I’ve been doing webpage-based polling for some time now. In fact, that’s exactly how I view the info now: a DataTable in jQuery, which uses Ajax to refresh its data. It works, but 1 sec refresh loads down a little embedded Linux box too much for my tastes, so change-driven updates would be better. I just don’t see how to set up a spreadsheet-like dependency mechanism which flows from incoming hardware events to storage, aggregation, calculation, and formatting with the normal stateless web server/client approach.

      Of course things don’t have to be built that way, but this is where my interest lies…
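A spreadsheet-like dependency mechanism of the kind described in this reply could be sketched as below. This is a deliberately naive Python illustration, not anything from JeeMon: the `Cell` class, its `watchers` list (standing in for a server-push channel), and the temperature values are all assumptions, and no attempt is made to order updates in graphs where one cell is reachable along several paths.

```python
class Cell:
    """Hypothetical cell: a plain value, or a formula over other cells."""

    def __init__(self, value=None, formula=None, inputs=()):
        self.formula = formula
        self.inputs = list(inputs)
        self.dependents = []  # cells whose formulas read this one
        self.watchers = []    # callbacks, e.g. a push to connected browsers
        for cell in self.inputs:
            cell.dependents.append(self)
        self.value = value if formula is None else self._compute()

    def _compute(self):
        return self.formula(*(cell.value for cell in self.inputs))

    def set(self, value):
        """Entry point for an incoming hardware event."""
        self._update(value)

    def _update(self, value):
        if value == self.value:
            return  # nothing changed, so nothing is recomputed or pushed
        self.value = value
        for callback in self.watchers:
            callback(value)
        for cell in self.dependents:
            cell._update(cell._compute())


indoor = Cell(20.0)
outdoor = Cell(15.0)
delta = Cell(formula=lambda a, b: a - b, inputs=[indoor, outdoor])
label = Cell(formula=lambda d: f"{d:.1f} C warmer inside", inputs=[delta])

pushed = []
label.watchers.append(pushed.append)

outdoor.set(12.5)  # a new reading flows through delta to label
outdoor.set(12.5)  # a repeated reading causes no work and no push
```

Work happens only when a reading actually changes, which is the contrast with a fixed one-second poll.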

  7. I have my home automation system up and running. It’s a client-server thing, done from scratch in Java, communicating with physical computing nodes (programmed with the Arduino platform). I have a server application that takes care of everything (gathering physical data, processing, database archiving) bar the visual part. That part is done using another application, the client, that connects to the server for supervision, control and analysis. It works great!

    But the edit-compile-run cycle is also driving me mad. Although the server has a persistence mechanism (keeping track of elements’ status even when it stops ungracefully), being able to update the server while running would be great, and even the client could use this method (even if restarting the client is fairly quick and completely innocuous on the server).

    It’s been on my list of things to study, but alas, time has not let me dig deeper. So I’m eagerly looking for results here, and if I can contribute somehow, count me in!

    Have fun!

  8. My Google searches have come to very little fruition; can somebody shed some light on this subject, and give a few pointers on the web to where this “dynamic linking” is being done today? As far as I can tell, there’s the development side of it (like XCode’s Fix and Continue), and the runtime/deployed situation.

    In fact, the PLCs (Programmable Logic Controllers) we use in my job work just like this (I sometimes get to program them, being a do-it-all programmer for the industry); we can substitute an FC (a function) at runtime, and the PLC does not stop. It just receives the new version of the function, and changes its reference to the new one between cycles, stopping only on error.

    Thanks!

    • With dynamic scripting languages, everything is either interpreted or on-the-fly compiled, so a reload into running code becomes a lot simpler than with statically compiled languages. I do it all the time in Tcl with JeeMon – in effect any proc or method can be replaced at run time, and will start to be used once there is no longer a thread of execution using the previous version. One important detail is that you don’t just want to adjust code, but also adjust live data structures. This is feasible at least for data which acts like a cache, i.e. stuff which can be reconstructed with relative ease. And my impression so far is that with a database-backed design, lots of things can be implemented as caches.
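As a rough illustration of replacing code and treating derived data as a cache, here is a small sketch. It uses Python rather than Tcl, and the `procs` table, the `define` helper, and the sample readings are invented for the example; it only mirrors the idea, not how Tcl or JeeMon replace a proc internally.

```python
procs = {}  # hypothetical table of replaceable procedures
cache = {}  # derived data, cheap to rebuild from the raw readings

readings = {"kitchen": [200, 210, 220]}  # made-up raw values


def define(name, func):
    """Install or replace a procedure and drop data derived by the old code."""
    procs[name] = func
    cache.clear()


def average(room):
    if room not in cache:
        # The procedure is looked up by name on each use, so a
        # replacement is picked up by the next call.
        cache[room] = procs["summarise"](readings[room])
    return cache[room]


define("summarise", lambda values: sum(values) / len(values))
before = average("kitchen")  # computed by the first version, then cached

# New version: the raw values turn out to be tenths of a degree.
define("summarise", lambda values: sum(values) / len(values) / 10)
after = average("kitchen")   # cache was dropped, so the new code runs
```

Data that cannot be rebuilt this way, such as the raw `readings` here, would need an explicit migration step instead.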

  9. JCW: I haven’t yet had time to look through your software work, but where are you putting the “process” code? I mean, imagine you have an air intake system in your house, and the ventilation fans in each room need to be started when the appropriate conditions are met (incoming temperature > room temperature, for example).

    Where are you thinking of putting this code, in the hardware (JeeNodes) or in the software (JeeMon/JeeBus)?

    • Eventually, I’d like to support both (even a mix). My first goal though, will be to have this logic in the central software. It gets interesting once you go into generating sketches for the hardware nodes, and uploading them automatically.
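With the logic kept in the central software, the ventilation rule from the question might look something like this. It is a hypothetical Python sketch; the function names, room names, and temperatures are made up, and actually switching a fan over the radio link is left out.

```python
def fan_should_run(incoming_temp, room_temp):
    """Rule from the question: run the fan when incoming air is warmer."""
    return incoming_temp > room_temp


def evaluate(rooms, incoming_temp):
    """Return the desired fan state per room from the latest readings."""
    return {
        room: fan_should_run(incoming_temp, temp)
        for room, temp in rooms.items()
    }


states = evaluate({"study": 18.5, "attic": 23.0}, incoming_temp=21.0)
```

Keeping the rule as a plain function on the central side means it can be replaced in the running process like any other piece of code.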
