Computing stuff tied to the physical world

Fascinating concurrency

In AVR, Software on Feb 4, 2010 at 00:01

There is a new language for the Arduino / JeeNode / ATmega328, called Occam-π.

I found out about it yesterday, at http://concurrency.cc/ – it’s high level, and it supports parallel programming. The current development environment release is for Mac OS X, with Windows and Linux coming soon.

Here is a complete program with four blinking LEDs, one on each DIO pin of the JeeNode ports:

[Screenshot: source of the blink program]

That’s it. Compiles to roughly 2 KB. Each extra blink adds just 20 bytes, btw.

And yes, it really makes four LEDs blink at an independent rate:

[Photo: the four blinking LEDs]

There is slightly more to it than that, but this is mind-blowing stuff. The “parallelism” is simulated, of course. Looks like the ATmega can do around 6000 context switches per second (i.e. parallel task switches).

There is a roughly 20 KB interpreter part that needs to be uploaded once (which is why this requires at least an ATmega328). After that, the IDE will upload just the bytecode for your program, i.e. 2 KB in the above case.

B R I L L I A N T .

Imagine hooking up the RF12 driver to this – there’s plenty of room for the extra 3 KB or so. And for doing all sorts of things… in parallel! My earlier complaint about how awful it is to do several things at once on an ATmega board might just have been wiped off the table.

Looks like I’ve got some very serious learning ahead of me to try and get to grips with all this.

  1. For a bit of background: it seems to be based on the “transterpreter” (http://www.transterpreter.org/), which is a simulator for the INMOS Transputer, which had this type of parallelism on chip. Today we have the XMOS XCore (http://en.wikipedia.org/wiki/XCore_XS1-G4), which, going by the descriptions, is styled as the spiritual successor to the Transputer. The language, OCCAM, is based on Hoare’s CSP (http://en.wikipedia.org/wiki/Communicating_sequential_processes).

    • @Andreas:

      You’re exactly correct. We’re certainly not trying to hide that, but our target audience (artists, makers, beginning students, etc.) doesn’t need that level of detail at the beginning. (That said, I put this in the preface of the book, so… perhaps I’m actually contradicting myself here.)

      The XCore does not yet support occam-pi, and XC is rather convoluted, especially if you wear your “beginner goggles.” We find occam to be a nice, small language for introducing fundamental concepts regarding concurrency and parallelism, and it is then far easier for people to move on to XC, Go, Erlang, etc.

      That, and a lot of these languages require a lot more runtime support, meaning you can’t execute in 20K of flash and a few words of RAM.

  2. Hi Jean-Claude,

    I wanted to drop a note and let you know that we’re glad to support/help in any reasonable way. I’ve been eyeing the JeeNodes for some environmental sensing collaborations here at Allegheny, so seeing our VM running on your boards is a real treat.

    The book that we’re working on will grow, as will the support library. We want things that should be easy to be as simple as possible, no simpler. If you start poking anything with a stick that we don’t have in the library yet, we’re happy to help add it, or (better) help you add it and get it in the tree.

    Keep up all the awesome work you do here.

  3. Heavy. The world of Arduino keeps surprising me. They also have a Freeduino variant board that runs off a single AA battery.

    • @Andreas:

      Yes. We’ll release that soon. Omer Kilic (the PhD student who did the heavy lifting on that design) has ordered our first batch of prototype boards; we’ll post the full design openly (and perhaps make a few available) when it’s done.

  4. If they need a 20 KB interpreter, that looks like a bit of overhead to me. I’m currently working on a small timer/scheduler C++ class for cooperative tasks. It’s something like one thousand lines of code and uses the AVR’s Timer2 with both comparators. I will publish the code in a month or two, once my reimplementation of RF12 with timeouts, etc. is completed. Basically, I run an ISR every 1 ms that walks a linked list of all tasks and flags the expired ones. A loop (in the main context) is responsible for regularly checking for flagged tasks. For realtime tasks, I use Timer2’s second comparator to get more precise timing, and run those tasks within the interrupt’s context.
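As a rough illustration of the tick/flag/dispatch pattern described in this comment, here is a host-side model in Python that runs without any hardware. It is only a sketch under stated assumptions: the names `Task`, `Scheduler`, `tick`, `dispatch` and `simulate` are invented here and are not taken from the (unpublished) C++ class, a plain list stands in for the linked list, a function call stands in for the 1 ms ISR, and the Timer2 real-time path is left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    """One periodic cooperative task (hypothetical structure)."""
    name: str
    period_ms: int
    action: Callable[[], None]
    remaining_ms: int = 0
    due: bool = False

    def __post_init__(self) -> None:
        # Start the countdown at one full period.
        self.remaining_ms = self.period_ms


class Scheduler:
    def __init__(self) -> None:
        self.tasks: List[Task] = []

    def add(self, task: Task) -> None:
        self.tasks.append(task)

    def tick(self) -> None:
        """Stand-in for the 1 ms timer ISR: count down and flag, never run."""
        for task in self.tasks:
            task.remaining_ms -= 1
            if task.remaining_ms <= 0:
                task.due = True
                task.remaining_ms = task.period_ms

    def dispatch(self) -> int:
        """Stand-in for the main loop: run flagged tasks, return how many ran."""
        ran = 0
        for task in self.tasks:
            if task.due:
                task.due = False
                task.action()
                ran += 1
        return ran


def simulate(periods_ms: List[int], duration_ms: int) -> Dict[str, int]:
    """Advance simulated time 1 ms at a time; return how often each task ran."""
    counts = {f"task{i}": 0 for i in range(len(periods_ms))}
    sched = Scheduler()
    for i, period in enumerate(periods_ms):
        name = f"task{i}"

        def bump(name: str = name) -> None:
            counts[name] += 1

        sched.add(Task(name, period, bump))
    for _ in range(duration_ms):
        sched.tick()      # what the ISR would do
        sched.dispatch()  # what the main loop would do
    return counts


if __name__ == "__main__":
    print(simulate([100, 250, 400], 1000))
```

Keeping the tick handler down to counting and flagging is what keeps the interrupt short; the price is that a flagged task only runs when the main loop next gets around to it.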

    • @Laurent:

      We could have used protothreads, or some other scheduler. You’re right that 20K feels heavy, but that gets you a full programming language. What that also means is you get a very mature compiler (which continues to be actively developed), and more importantly, a compiler that can check for aliasing, type errors, and even tell you if you’ve built your concurrent process network incorrectly.

      For example, it is impossible to have a race hazard in occam-pi. If you deadlock, we tell you at runtime. Technically, we could give you a heads up at compile-time, but it doesn’t scale well.

      It is “heavy” because we’ve never worked hard to make it tiny. We’ve worked hard to make it correct, and given time, we can make it smaller/faster/etc. For example, one PhD student in the group did a prototype that back-ends to LLVM (it’s in a branch of our tree). This runs really fast, and hits a number of embedded targets. The VM is smaller, and runs in places the LLVM backend cannot.

      Your library will probably be excellent, but it will not allow me to achieve the goals that I have for supporting beginners in learning to design parallel programs. They will still be responsible for “compiling” parallel code to sequential code, and implementing the concurrency with your library. And your library cannot do any checks to help them out.

      As always, different design criteria will lead to different places. Our tools may not work well for some things, but they will work well for many use cases where concurrency is a boon.
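To make the point about “compiling” parallel code to sequential code concrete, here is a small Python model of what that hand translation tends to look like for the blinking-LED case: one loop, with a deadline per LED tracked by hand. This is an illustrative sketch only; `run_superloop` and its parameters are made up for this example, and simulated milliseconds stand in for polling a real clock.

```python
from typing import List, Tuple


def run_superloop(periods_ms: List[int], duration_ms: int) -> List[Tuple[int, int, bool]]:
    """Blink several LEDs at independent rates from one sequential loop.

    Returns (time_ms, led_index, new_state) for every toggle.
    """
    states = [False] * len(periods_ms)   # all LEDs start off
    next_toggle = list(periods_ms)       # per-LED deadline, kept by hand
    events = []
    for now in range(1, duration_ms + 1):  # stands in for polling a clock
        for led, period in enumerate(periods_ms):
            if now >= next_toggle[led]:
                states[led] = not states[led]
                next_toggle[led] += period
                events.append((now, led, states[led]))
    return events


if __name__ == "__main__":
    for event in run_superloop([200, 300], 600):
        print(event)
```

Every independent activity ends up as a piece of state plus a branch inside the same loop, and nothing checks that the pieces are interleaved correctly, which is the bookkeeping the comment argues a concurrent language takes off the programmer's hands.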

  5. Hi Jean-Claude,

    Love your Jee Node boards, very inspiring, been following your blog for a while. Here is a bit more info that you may be interested in, given this post and some of your previous experiments with code etc.

    Xmos have a range of chips that are event-driven processors. These can be used in areas similar to MCUs, but can also replace FPGAs in some cases; they are a whole new idea for this space. Unlike FPGAs, however, they are easy to program, just like MCUs. This is one of the reasons I chose to use them for the Amino project (http://folknologylabs.wordpress.com/2010/01/03/opensource-hardware-a-way-forward-part-1/). Xmos have a programming language called XC, which is like an extended C with concurrency primitives; it is very easy to use, supports the event-driven metaphors, and it too has roots in Hoare’s CSP. And yes, one of the co-founders of Xmos is David May of Transputer fame. By the way, I do not work for Xmos, but I am working with them to get all of their code open source (some of it already is); their toolchain is already open source, based on GCC/LLVM. You should take a look; I can imagine a Jee Node based on the OSH Amino would be amazing.

    regards Al

  6. Wow – great comments, all of ’em. The one extra issue I’d like to bring into this is low-power use, i.e. sleep modes. At some point it’s going to matter for “nodes” and WSNs, and in my experience you can’t really add that sort of functionality after the fact.

    Great, several ways to go forward. It’s pretty clear that the current “sketch” approach with interwoven activities and timers is reaching its limits, and that there are real alternatives. I actually hope all of the above directions will work out – as Matt said “different design criteria will lead to different places”. Fascinating places!

    • Hi JCW,

      Actually, on this subject, event-driven processors are ideal for power efficiency because their power envelopes can be event driven or event modulated. That is, the cores can be woken by specific IO events, process them, and then return to sleep. This is much simpler and more effective than the old-school interrupt-driven approach, and it’s much easier to get your head around than abstract interrupt processing. You should take a look if dynamic power processing on remote nodes is your thing.

    • @jcw:

      We did low-power for the Blackfin B537. We’re able to put the processor into deep sleep (automatically, in some cases) in our runtime because we know so much about the current state of our scheduling queues. We know 1. which processes are waiting on a timer, 2. which processes are waiting on external events, and 3. which are waiting on communications internal to the VM.

      It is, of course, possible to give the programmer control as well—which is probably useful/necessary in many cases. However, this is another point where I’m excited about getting a student involved (or just exploring it myself) and looking at how we can intelligently do power management when we’re working with a concurrency-aware runtime.

      In other words, another place where we have some clear overlap for collaboration.
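The reasoning in this reply, that a runtime can choose a sleep mode because it knows what every process is waiting on, can be sketched as a small decision function. This is a hedged Python model, not the Transterpreter’s actual code: `Action`, `idle_decision` and the four inputs are assumptions made for illustration, loosely following the three kinds of waiting listed above (timers, external events, internal communication).

```python
from enum import Enum
from typing import Iterable, Optional, Tuple


class Action(Enum):
    RUN = "run"                                 # a process is ready to execute
    SLEEP_UNTIL_TIMER = "sleep until timer"     # idle until the earliest deadline
    SLEEP_UNTIL_EVENT = "sleep until external event"
    DEADLOCK = "deadlock"                       # only internal waits remain
    HALT = "halt"                               # no processes left at all


def idle_decision(
    ready: int,
    timer_deadlines_ms: Iterable[int],
    waiting_on_events: int,
    waiting_on_channels: int,
    now_ms: int,
) -> Tuple[Action, Optional[int]]:
    """Decide what an idle-aware scheduler could do next.

    Returns the action and, for timer sleeps, how many ms to sleep.
    """
    if ready > 0:
        return Action.RUN, None
    deadlines = list(timer_deadlines_ms)
    if deadlines:
        # An external event may still wake the processor earlier.
        return Action.SLEEP_UNTIL_TIMER, max(0, min(deadlines) - now_ms)
    if waiting_on_events > 0:
        return Action.SLEEP_UNTIL_EVENT, None
    if waiting_on_channels > 0:
        # Nothing outside the VM can unblock these processes any more.
        return Action.DEADLOCK, None
    return Action.HALT, None


if __name__ == "__main__":
    print(idle_decision(0, [500, 120], 1, 2, now_ms=100))
```

The last two branches show why the same bookkeeping also lets a runtime report deadlock: if nothing is ready and no timer or external event can arrive, the processes blocked on internal communication can never proceed.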

  7. There are a few people asking about you on Twitter, Jean-Claude. There is a JCW there, but we don’t think it’s you; you should at least register jeelabs. Let us know if you do and I’ll pass it on.

  8. @Matt Jadud: If my post gave the impression of dissing your project, you have my apologies. That was certainly not my intention. By adding the background info I attempted to give something a bit more to the audience here, which I believe/suspect is more technical, and less beginner as well.

  9. Another multi-core chip I came across recently is the GA-nnn series (n in 4, 32, 40, 144 cpus), see http://greenarrays.com/, or http://greenarrays.com/home/products/index.html .

    That is for the Forth people, Chuck Moore is involved, IIRC he designed the core cpu (c18), and the whole cpu array on top.

    No actual hardware ships yet. Software (arrayForth, simulators, …) is available.

    Even less for beginners than XCore, I suspect.

  10. @Matt I’m glad.

  11. humm… nice, this new language. But if I need multi-tasking on my AVR I’ll stick to something like Femto OS. It already does it, it has been around for a while (reliable) and it is small. http://www.femtoos.org/ Unless someone gives me a good reason to change ;-)

    • No good reasons to change. The occam compiler prevents race hazards, and our runtime detects deadlock.

      But I’m not interested in sales; our premise is that it’s better to write concurrent code in a language that lets you express it directly, instead of having to translate fundamentally parallel notions into sequential code yourself. That, and we’re interested in teaching how to directly express parallel ideas in code, not how to implement these things in a sequential language. Hence, our tools have a specific purpose.

  12. Very interesting, thanks for the heads up! Will check out Femto as well. Any other multi-tasking tools for AVR?

  13. On the WinAVR web site you can find some info on the possible OSes: http://winavr.sourceforge.net/WinAVR-user-manual.html (look for section 5.1, Operating Systems). Also look out for the February 2010 issue of Elektor (at least the French edition): on page 14 there is a nice article on Femto OS.
