Computing stuff tied to the physical world

No wait, it’s a language runtime

Another approach is to create a complete language environment on the µC. These are usually interpreted, although JIT-like designs and even small compilers are also possible.

With a high-level language running on the µC, everything moves up a notch (or two) in terms of expressibility and language power. No need to stick to C or even C++.

This is the premise behind languages such as “Embedded Lua” (eLua) and “Embedded JavaScript” (espruino). Scripting on embedded hardware, who wouldn’t want that?

The price to pay is that these languages require a fairly hefty µC, with 256 KB or more flash and 64 KB or more RAM. One reason may be that these languages are heavily reliant on strings and byte arrays, and hence a memory allocator / garbage collector of some sort.

Even with a hefty 512 KB flash and 128 KB RAM, the complexity of applications which can be run in these environments can be quite limited (perhaps up to a thousand lines of code).

But “hefty” is not a good companion of “very low-cost” and “ultra low-power” …

Another approach is to implement a byte-code interpreter, and pick a somewhat lower-level language as target. More stack oriented, more emphasis on numbers and fixed-size objects, and possibly also no automatic memory management.
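To make the idea a bit more concrete, here is a minimal sketch of such a byte-code interpreter. It is written in Python purely for readability; on a real µC this would be a small C loop. The opcode names and numbering are made up for this illustration and do not come from any existing runtime. Note the traits mentioned above: a stack machine, integers only, and a stack that is allocated once with a fixed size, so no allocator or garbage collector is needed.

```python
# Hypothetical opcodes, invented for this sketch (not from any real runtime).
PUSH, ADD, MUL, DUP, HALT = range(5)

def run(code, stack_size=16):
    """Run a sequence of small ints as byte code on a fixed-size integer stack."""
    stack = [0] * stack_size   # allocated once up front: no allocator, no GC
    sp = 0                     # index of the next free stack slot
    pc = 0                     # index of the next instruction
    while True:
        op = code[pc]
        pc += 1
        if op == PUSH:         # the operand is the byte following the opcode
            stack[sp] = code[pc]
            pc += 1
            sp += 1
        elif op == ADD:        # replace the top two items by their sum
            sp -= 1
            stack[sp - 1] = stack[sp - 1] + stack[sp]
        elif op == MUL:        # replace the top two items by their product
            sp -= 1
            stack[sp - 1] = stack[sp - 1] * stack[sp]
        elif op == DUP:        # duplicate the top item
            stack[sp] = stack[sp - 1]
            sp += 1
        elif op == HALT:       # stop and return the live part of the stack
            return stack[:sp]
        else:
            raise ValueError("unknown opcode: %d" % op)
```

Running `run(bytes([PUSH, 1, PUSH, 2, ADD, PUSH, 3, MUL, HALT]))` leaves a single 9 on the stack. The whole "program" is nine bytes, which hints at why this style of runtime fits in so little memory.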

Such runtimes work in much smaller environments. Bitlash, ArduinoBASIC, and even interpreted C all fit on an Arduino: 32 KB flash, 1 KB EEPROM, and 2 KB RAM!

Note that to be generally useful, the software must also offer a way to enter and edit source code for these languages, however crude. Only then will it be possible to develop stand-alone on the µC itself and jettison the link to some “big” computer. With source code and an editor at hand, you get a very high level of “tinkerability”, since the code can be inspected and tweaked long after the original design was created.

With the “host vs target” split of cross-compilation, on the other hand, keeping the source and all the tools around and working can be a huge challenge over longer periods of time. Who hasn’t built an embedded project, used it for a while, and then lost the ability to update the code later?

BASIC

Perhaps this explains the popularity of BASIC: a simple language, easily extended to access some hardware, yet small enough to fit into a µC including a simple editor.

Lots of BASIC interpreters have been written over the years, and some of those have been ported to today’s µCs, which are after all much more capable than the ones from the 1970s. A good example is the MaxiMite and its associated MMBasic. It’s not that small, though:

MMBasic requires a CPU with 32 bit integers and pointers supported by an ANSI or C89 C compiler.

In its minimal version MMBasic typically compiles to about 94K of flash and requires just over 5K of RAM plus a minimum of 4K for the stack (9K total). More memory is required to store programs, variables, etc so the interpreter will require at least 16K of RAM to run a small BASIC program. Generally 32K RAM or more is preferred as that will allow for larger programs and arrays.

FORTH

And then there’s this odd-looking language from the early 1970s called Forth:

    0 value ii        0 value jj
    0 value KeyAddr   0 value KeyLen
    create SArray   256 allot   \ state array of 256 bytes
    : KeyArray      KeyLen mod   KeyAddr ;

    : get_byte      + c@ ;
    : set_byte      + c! ;
    : as_byte       255 and ;
    : reset_ij      0 TO ii   0 TO jj ;
    : i_update      1 +   as_byte TO ii ;
    : j_update      ii SArray get_byte +   as_byte TO jj ;
    : swap_s_ij
        jj SArray get_byte
           ii SArray get_byte  jj SArray set_byte
        ii SArray set_byte
    ;

    : rc4_init ( KeyAddr KeyLen -- )
        256 min TO KeyLen   TO KeyAddr
        256 0 DO   i i SArray set_byte   LOOP
        reset_ij
        BEGIN
            ii KeyArray get_byte   jj +  j_update
            swap_s_ij
            ii 255 < WHILE
            ii i_update
        REPEAT
        reset_ij
    ;
    : rc4_byte
        ii i_update   jj j_update
        swap_s_ij
        ii SArray get_byte
        jj SArray get_byte +
            as_byte SArray get_byte  xor
    ;

(that’s a full implementation of the RC4 encryption algorithm, by the way)
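For readers who find the Forth hard to follow, here is the same algorithm as a Python sketch, for comparison only. The function names `rc4_init` and `rc4_bytes` are chosen here to echo the Forth words; the structure is my own reading of the listing above, so treat it as a rough translation and not as a reference implementation. Like the Forth version, it uses at most the first 256 bytes of the key, and it assumes the key is not empty.

```python
def rc4_init(key):
    """Key scheduling: return the 256-byte state array (SArray in the Forth code)."""
    key = key[:256]                      # same limit as "256 min TO KeyLen"
    s = list(range(256))                 # S[i] = i, like the DO ... LOOP
    j = 0
    for i in range(256):
        # j_update: add the key byte and S[i] to j, keep the low 8 bits
        j = (j + s[i] + key[i % len(key)]) & 255
        s[i], s[j] = s[j], s[i]          # swap_s_ij
    return s

def rc4_bytes(key, data):
    """XOR each byte of data with the RC4 keystream (rc4_byte, applied in a loop)."""
    s = rc4_init(key)
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) & 255                # i_update
        j = (j + s[i]) & 255             # j_update
        s[i], s[j] = s[j], s[i]          # swap_s_ij
        out.append(byte ^ s[(s[i] + s[j]) & 255])
    return bytes(out)
```

As a sanity check, `rc4_bytes(b"Key", b"Plaintext").hex()` gives `bbf316e8d940af0ad3`, the commonly published RC4 test vector. Since the cipher is a plain XOR with the keystream, running the result through `rc4_bytes` again with the same key returns the original text.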

The way to get to grips with this is to realise that Forth uses Reverse Polish Notation (RPN): operands come first, and each operator acts on the values already on the stack. Thus “1 2 + 3 *” means “(1 + 2) * 3”. RPN is also known as postfix notation – and the PostScript “printing language” is essentially a (heavily adapted) variant of Forth.
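A tiny RPN evaluator shows how little machinery this takes. The sketch below is in Python and only handles integers and three operators; the function name `rpn` and the `OPS` table are invented for this example. Numbers are pushed onto a stack, and each operator pops its two arguments and pushes the result.

```python
import operator

# Only three operators, to keep the sketch short.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def rpn(text):
    """Evaluate a space-separated RPN expression and return the final stack."""
    stack = []
    for token in text.split():
        if token in OPS:
            b = stack.pop()              # the top of the stack is the right operand
            a = stack.pop()
            stack.append(OPS[token](a, b))
        else:
            stack.append(int(token))     # anything else must be an integer
    return stack
```

With this, `rpn("1 2 + 3 *")` returns `[9]`, while `rpn("1 2 3 * +")` returns `[7]`: the order of the tokens does the job that parentheses and precedence rules do in infix notation.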

Forth is known for its extremely compact notation and implementation. The language implementation is extremely small and efficient, because 1) its primitives are very close to machine code and 2) it is built entirely in and on its own mechanisms. A few kilobytes of flash memory and a few hundred bytes of RAM is all it takes to build up an interactive development environment (though nowhere near the same as today’s IDEs!). Source editing, machine code assembly and disassembly, multitasking – it has all been done many times over in Forth.
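The “built in and on its own mechanisms” part can be illustrated with a toy interpreter. The Python sketch below is not real Forth: the class name `TinyForth` is made up, there are just three primitives, and only `:` and `;` are supported for defining new words. The point is that once a handful of primitives exist, every further word is defined in terms of words that are already in the dictionary.

```python
class TinyForth:
    """A toy Forth-like interpreter: integers, three primitives, colon definitions."""

    def __init__(self):
        self.stack = []
        # The primitives are the only words implemented in the host language.
        self.words = {
            "+": lambda: self.stack.append(self.stack.pop() + self.stack.pop()),
            "*": lambda: self.stack.append(self.stack.pop() * self.stack.pop()),
            "dup": lambda: self.stack.append(self.stack[-1]),
        }

    def run(self, text):
        tokens = iter(text.split())
        for tok in tokens:
            if tok == ":":
                # ": name body ;" stores the body under a new dictionary entry.
                name = next(tokens)
                body = []
                for t in tokens:
                    if t == ";":
                        break
                    body.append(t)
                # A defined word simply re-runs its stored body when invoked.
                self.words[name] = lambda src=" ".join(body): self.run(src)
            elif tok in self.words:
                self.words[tok]()
            else:
                self.stack.append(int(tok))
        return self.stack
```

After `f = TinyForth()`, entering `f.run(": square dup * ;")` and `f.run(": cube dup square * ;")` extends the language with two new words, and `f.run("3 cube")` then leaves 27 on the stack. A real Forth compiles definitions instead of re-parsing text, but the way the vocabulary grows is the same.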

AmForth is one example, written specifically for AVR µCs and also ported to the MSP430.

Mecrisp is another Forth implementation for the MSP430, but this one has made the jump to the ARM platform (not just Stellaris). Like many Forth implementations, it is made of a bootstrap portion written in assembler code, and then the rest is in Forth.

Pygmy Forth takes a different approach: there is a host-side application which works together with a small bootstrap running on the embedded µC – it generates the code which the minimal runtime then takes further, while also acting as console and editor.

Chuck Moore, the original inventor of Forth, created what is probably the most extreme (and minimalistic) version, called colorForth. Mentioned here only for completeness.

It’s an oddly fascinating language, which has later been categorised as a concatenative language, spawning interesting languages such as Joy, Factor, and … PostScript.

Forth allows diving in and exercising different parts of a design using just a serial connection. This is crucial during development, but may also come in handy when trying out something new or chasing a bug much later on. The code remains 100% introspectable.

A major drawback of Forth is that it doesn’t mix well with C/C++. There’s no simple way to integrate code from both worlds. The approaches are just too different. In the past, this has led to (commercial) projects re-implementing entire TCP/IP and even Bluetooth (!) stacks.

So there you have it: from a fairly complete and large JavaScript implementation which allows writing code in the same way as you would for a web application, to a wide range of common and not-so-common programming languages made to run on the µC, all the way back to one of the first “embedded” environments ever, using the Forth stack language.

Is any of this a viable path for long-term software development of a bunch of nodes sprinkled around the house? Perhaps not – despite the benefit of being able to tinker directly on µC hardware, this all seems a bit ad-hoc. While getting a specific node to do specific things is important, we also want the benefit of source code control and the ability to manage and replicate nodes to multiple locations. A per-node setup won’t get us there.

But the basic concept is tempting: the more abstract and high-level our little µCs can be made to work, the simpler our instructions can be to make them perform specific tasks.
