This is the third of a four-part series on designing big apps (“big” as in not embedded, not necessarily many lines of code – on the contrary, in fact).
Because I couldn’t fit it in three parts after all.
Let’s talk about the big picture, in terms of technologies. What goes where, and such.
A decade or so ago, this would have been a decent model of how web apps work, I think:
All the pages, styles, images, and some code are fetched via HTTP requests and rendered on the server, which sits between the persistent state (files and databases) and the network-facing end.
A RESTful design would then include a clean structure w.r.t. how that state on the server is accessed and altered. Then we throw in some behind-the-scenes logic with Ajax to make pages more responsive. This evolved into general-purpose client-side JavaScript libraries such as jQuery, which took much of the repetitiveness out of the developer’s workflow. Great stuff.
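(As a quick reminder of that style: a fragment like the one below – with a made-up URL and element id, just for illustration – was the typical jQuery-era pattern, fetching a server-rendered snippet and splicing it into the page.)

```js
// Classic jQuery-era Ajax: ask the server for a rendered HTML fragment
// and splice it into the current page ("/comments" and "#comments" are
// made-up names, purely illustrative).
$.ajax({
  url: '/comments',          // the server renders this fragment for us
  dataType: 'html',
  success: function (html) {
    $('#comments').html(html);   // update part of the page in place
  },
  error: function () {
    $('#comments').text('Could not load comments.');
  }
});
```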
The assumptions here are that servers are big, fast, and reliable, that clients vary in capabilities, versions, and performance, and that network connection stability needs to stay out of the “essence” of the application logic.
In a way, it works well. This is the context in which database-backed websites became all the rage: not just a few files and templating to tweak the website, but the essence of the application is stored in a database, with “widgets”, “blocks”, “groups”, and “panes” sewn together by an ever-more elaborate server-side framework – Ruby on Rails, Django, Drupal, etc. WordPress and Redmine too, which I gratefully rely on for the JeeLabs sites.
But there’s something odd going on: a few days ago, the WordPress server VM which runs this daily weblog here at JeeLabs crashed on an out-of-memory error. I used to reserve 512 MB of RAM for it, but had scaled that back to 256 MB before the summer break. So apparently 256 MB is not enough to present a couple of weblog pages and images, and to handle some text searches and a simple commenting system.
(We just passed 6000 comments the other day. It’s great to see y’all involved – thanks!)
Ok, so what’s a measly 512 MB, eh?
Well, to me it just doesn’t add up. Ok, so there are a few hundred MB of images by now. Total. The server is running off an SSD, so we should easily be able to serve those images with, say, 25 MB of RAM. But why does WP “need” hundreds of MB of RAM to serve a few dozen MB of daily weblog posts? Total.
It doesn’t add up. Self-inflicted overhead, for what is 95% a trivial task: serving the same texts and images over and over again to visitors of this website (and automated spiders).
The Redmine project setup is even weirder: currently consuming nearly 700 MB of RAM for what ought to be a trivial task, one which could probably be served entirely out of, say, 50 MB of RAM. More than a tenfold difference in resource consumption.
In the context of the home monitoring and automation I’d like to take a bit further, this sort of resource waste is no longer a luxury I can afford, since my aim is to run HouseMon on a very low-end Linux board (because of its low power consumption, and because it really doesn’t do much in terms of computation). Ok, so maybe we need a bit more than 640 KB, but hey… three orders of magnitude?
In this context, I think we can do better. Instead of a large server-side byte-shuffling engine, we can now do this – courtesy of modern browsers, Node.js, and AngularJS:
The server side has been reduced to its bare minimum: every “asset” that doesn’t change gets served as is (from files, a database, whatever). This includes HTML, CSS, JavaScript, images, plain text, and any other “document” type blob of information. Nginx is a great example of how a tiny web server based on async I/O and written in C can take care of any load we down-to-earth netizens are likely to encounter.
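To make this concrete, a minimal Nginx setup for such a split could look roughly like the following – the paths, port, and /primus/ location are assumptions for illustration, not my actual config:

```nginx
server {
  listen 80;

  # every asset that doesn't change, served as is from disk
  root /home/housemon/public;
  location / {
    try_files $uri $uri/ =404;
  }

  # everything dynamic goes to the real-time Node.js server
  location /primus/ {
    proxy_pass http://127.0.0.1:3000;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
  }
}
```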
Let me stress this point: there is no dynamic “templating” or “page generation” on the server side. This takes place in the browser – abstracted away by AngularJS and the DOM.
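For example, a little page along these lines (module and controller names made up) gets served as a plain static file, and the browser does all the rendering:

```html
<!-- Served verbatim by the web server; AngularJS expands it client-side. -->
<script src="angular.js"></script>   <!-- angular.js is a static asset too -->
<div ng-app="demo" ng-controller="Readings">
  <p>{{readings.length}} readings</p>
  <ul>
    <li ng-repeat="r in readings">{{r.name}}: {{r.value}}</li>
  </ul>
</div>
<script>
  angular.module('demo', []).controller('Readings', function ($scope) {
    // in the real app, this data would arrive over the live connection
    $scope.readings = [{ name: 'boiler', value: 54.3 }];
  });
</script>
```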
In parallel, there’s a “Real-Time Server” running, which handles the application logic and everything that needs to be dynamic: configuration! status! measurements! commands! I’m using Node.js, and I’ve started using the fascinating LevelDB database system for all data persistence. The clever bit of LevelDB is that it doesn’t just fetch and store data, it actually lets you keep flows running with changes streamed in and out of it (see the level-livefeed and multilevel projects).
So instead of merely copying data in and out of a persistent store, this becomes a way of staying up to date with all changes to that data. The fusion of a database and pubsub.
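Here’s a rough sketch of that pattern with levelup-style calls – note that I’m paraphrasing the level-livefeed API, so treat the details as assumptions:

```js
var level = require('level');              // LevelDB bindings for Node.js
var livefeed = require('level-livefeed');  // live change feed (API assumed)

var db = level('./housemon-data');

// store a new reading under a time-stamped key
db.put('reading~boiler~' + Date.now(), JSON.stringify({ value: 54.3 }),
  function (err) { if (err) throw err; });

// read back a key range, as a stream
db.createReadStream({ start: 'reading~boiler~', end: 'reading~boiler~\xff' })
  .on('data', function (item) {
    console.log('stored:', item.key, item.value);
  });

// and the interesting bit: stay subscribed to every future change,
// which is what turns the database into a pubsub hub as well
livefeed(db).on('data', function (op) {
  console.log('live change:', op.type, op.key);
});
```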
On the client side, AngularJS takes care of the bi-directional flow between state (model) and display (view), and Primus acts as a generic connection library for getting all this activity across the net – with connection keep-alive and automatic reconnect. Lastly, I’m thinking of incorporating the q-connection library for managing all asynchronicity. It’s fascinating to see network round-trip delays being woven into the fabric of an (async / event-driven) JavaScript application. With promises, client/server apps are starting to feel like single-system applications again. It all looks quite, ehm… promising :)
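A minimal Primus sketch, with a made-up payload, shows how symmetrical the two ends become:

```js
// --- server side (Node.js) ---
var http = require('http');
var Primus = require('primus');

var server = http.createServer();
var primus = new Primus(server, { transformer: 'websockets' });

primus.on('connection', function (spark) {
  spark.write({ boiler: 54.3 });   // push current state to each new browser
  spark.on('data', function (msg) {
    console.log('from browser:', msg);
  });
});

server.listen(3000);

// --- client side (browser, after loading the primus.js client library) ---
// var primus = Primus.connect('http://localhost:3000');
// primus.on('data', function (data) {
//   console.log('update from server:', data);
// });
```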
Lots of bits that need to work together. So far, I’m really delighted by the flexibility this offers, and the fact that the server side is getting simpler: just the autonomous data acquisition, a scalable database, a responsive WebSocket server to make the clients (i.e. browsers) come alive, and everything else served as static data. State is confined to the essence of the app. The rest doesn’t change (other than during development of course), needs no big backup solution, and the whole kaboodle should easily fit on a Raspberry Pi – maybe something even leaner than that, one day.
The last instalment tomorrow will be about how HouseMon is structured internally.
PS. I’m not saying this is the only way to go. Just glad to have found a working concoction.
Your latest web client/server mechanics appear very cool. I am certainly no expert. I am attempting to use an Arduino plus the TI CC3000 WiFi chip (via Adafruit’s CC3000 offering).
The client-side Arduino library provided by Adafruit can send simple HTTP commands. I’d like a model in which I go straight from the Arduino client to the server (versus having a middle ‘big brother client’ to which the Arduino sends simple requests, and which knows how to best communicate – e.g. via the newer APIs and protocols you are exploring).
My domain expertise is not web server-side programming, so I was hoping you might recommend a client/server protocol + API (e.g. should I stick with REST and PHP, or is my lack of web server programming knowledge limiting my choices too much?) that would be “best” given the constraints of the environment I note above.
Thank you for your great posts!
For a client-side setup such as Arduino + Ethernet, I’d stick to REST calls and JSON-formatted data. See the way Pachube, eh Cosm, eh Xively have set things up to receive commands from low-end embedded devices. If you want to build (parts of) the server side yourself, then yeah – use what’s available to you. With REST + JSON as the convention over the wire, you can always replace either side with something else later.
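And if you do roll (part of) the server side yourself, it can stay very small. Here’s a hypothetical Node.js endpoint that accepts JSON readings POSTed by an Arduino – the URL and port are made up:

```js
var http = require('http');

// the Arduino POSTs JSON readings to http://<server>:8080/readings
http.createServer(function (req, res) {
  if (req.method === 'POST' && req.url === '/readings') {
    var body = '';
    req.on('data', function (chunk) { body += chunk; });
    req.on('end', function () {
      try {
        var reading = JSON.parse(body);
        console.log('got reading:', reading);
        res.writeHead(200, { 'Content-Type': 'application/json' });
        res.end(JSON.stringify({ ok: true }));
      } catch (e) {
        res.writeHead(400);
        res.end('bad JSON');
      }
    });
  } else {
    res.writeHead(404);
    res.end();
  }
}).listen(8080);
```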
Another option would be to use MQTT as the over-the-wire protocol. There’s an Arduino library for that. It’s not really a “middle big brother client”, but you’ll need to run the MQTT server somewhere and have your PHP (or other) server subscribe to it (both can run on the same machine).
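Subscribing on the server side takes only a few lines. Sketched here in Node.js with the mqtt package and a made-up topic – a PHP subscriber would look similar with a suitable library:

```js
var mqtt = require('mqtt');

// connect to a broker on the same machine; the topic names are made up
var client = mqtt.connect('mqtt://localhost:1883');

client.on('connect', function () {
  client.subscribe('sensors/+/reading');   // '+' matches any sensor name
});

client.on('message', function (topic, message) {
  console.log(topic, '->', message.toString());
});
```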
I’m keeping all options open for HouseMon – projects such as Mosca can make it talk to just about any messaging standard out there, and supporting a simple REST protocol should be easy.
THANKS! After running servers for my children’s software game company, my lesson learned is that I REALLY do NOT want to run a server – I ultimately want to build product, not worry about server uptime. The expense and stress of running it myself (getting enough bandwidth at a reasonable speed/cost, plus cooling, equipment, and backup equipment because the main unit can AND WILL fail… blah blah… gnashing of teeth, etc.), plus the sheer amount of time dedicated to doing so, is so not worth it for me.
So now I am looking at Xively, Parse.com, Heroku, AWS (have used before and liked)…
The current “bummer” with Xively and Parse is that they assume Arduino’s Ethernet client – which relies on the W5100 chip, which is of course different from the CC3000 I am using. Given my pathetic knowledge of web programming, I am bumbling through the choices. I have successfully sent data to a PHP service on my Mac’s MAMP web server. I’m going to see about moving it to AWS, AND I’d like to see if I can cruft up the REST commands needed by Xively to get back their “datastreams”. I like Xively – it seems to provide a nice abstraction. But I am concerned that it is a startup and thus does not have the street creds AWS has: years of ups and downs, many customers, backed by a large company that has hired and built “best of class” cloud services (I rank Amazon.com second only to Google in the knowledge and skills to run the kind of efficient and effective consumer-facing cloud services I would depend on).
Plus, using a lower-level abstraction than Xively means it would be easier to find a skilled person at a reasonable rate to maintain and improve the server-side code, especially in the case of PHP (with JSON-structured data spewing back and forth).
Tradeoffs. Tradeoffs. I realize your scenario is different from mine. I also admire your exploration and knowledge.
Thank you (again) for your reply.
Comparing WordPress to a “modern” web application seems a little unfair to me: when heavyweight applications like WordPress, Typo3, and so on were created, client-side processing was still in its infancy. I’m planning to add WiFi connectivity to a project of mine and don’t want to rely on services like Xively. After reading through this series I realised that I only need to implement a very simple web server on the microcontroller and can then do all the heavy lifting in the browser. So thank you very much for that.