Computing stuff tied to the physical world

Push-scanning with coroutines

In Software on Oct 18, 2012 at 00:01

The “benstream.lua” decoder I mentioned yesterday is based on coroutines. Since probably not everyone is familiar with them (they do not exist in languages such as C and C++), it seems worthwhile to go over this topic briefly.

Let’s start with this completely self-contained example, written in Lua:

Screen Shot 2012 10 15 at 01 23 15

Here’s the output generated from the test code at the end:

  (string) abcde
  (string) 123
  (number) 12345
  (boolean) false
  (number) 987
  (boolean) false
  (number) 654
  (number) 321
  (number) 99999999
  (boolean) true
  (string) one
  (number) 11
  (string) two
  (number) 22
  (string) bye!

There is some weird stuff in here, notably how the list start, dict start, and list/dict end are returned as “false”, “true”, and the empty list ({} in Lua), respectively. The only reason for this is that these three values are distinguishable from all the other possible return values, i.e. strings and numbers.

But the main point of this demo is to show what coroutines are and what they do. If you’re not familiar with them, they do take getting used to… but as you’ll see, it’s no big deal.

First, think about the task: we’re feeding characters into a function, and expect it to track where it is and return complete “tokens” once enough data has been fed through it. If it’s not ready, the “cwrap” function will return nil.

The trouble is that such code cannot be in control: it can’t “pull” more characters from the input when it needs them. Instead, it has to wait until it gets called again, somehow figure out where it was the last time, and then deal with that next character. For an input sequence such as “5:abcde”, we need to track being in the leading count, then skip the colon, then track the actual data bytes as they come in. In the C code I added to the EmBencode library recently, I had to implement a Finite State Machine to keep track of things. It’s not that hard, but it feels backwards. It’s like constantly being called away during a meeting, and having to pick up the discussion again each time you come back :)

Now look at the above code. It performs the same task as the C code, but it’s written as if it was in control! – i.e. we’re calling nextChar() to get the next character, looping and waiting as needed, and yet the coroutine is not actually pulling any data. As you can see, the test code at the end is feeding characters one by one – the normal way to “push” data, not “pull” it from the parsing loop. How is this possible?

The magic happens in nextchar(), which calls coroutine.yield() with an argument.

Yield does two things:

  • it saves the complete state of execution (i.e. call stack)
  • it causes the coroutine to return the argument given to it

It’s like popping the call stack, with one essential difference: we’re not throwing the stack contents away, we’re just moving it aside.

Calling a coroutine does almost exactly the opposite:

  • it restores the saved state of execution
  • it causes coroutine.yield() to return with the argument of the call

These are not full-blown continuations, as in Lisp, but sort of halfway there. There is an asymmetry in that there is a point in the code which creates and starts a coroutine, but from then on, these almost act like they are calling each other!

And that’s exactly what turns a “pull” scanner into a “push” scanner: the code thinks it is calling nextChar() to immediately obtain the next input character, but the system sneakily puts the code to sleep, and resumes the original caller. When the original caller is ready to push another character, the system again sneakily changes the perspective, resuming the scanner with the new data, as if it had never stopped.

This is in fact a pretty efficient way to perform multiple tasks. It’s not exactly multi-tasking, because the caller and the coroutine need to alternate calls to each other in lock-step for all this trickery to work, but the effect is almost the same.

The only confusing bit perhaps, is that argument to nextChar() – what does it do?

Well, this is the way to communicate results back to the caller. Every time the scanner calls nextChar(), it supplies it with the last token it found (or nil). The conversation goes like this: “I’m done, I’ve got this result for you, now please give me the next character”. If you think of “coroutine.yield” as sort of a synonym for “return”, then it probably looks a lot more familiar already – the difference being that we’re not returning from the caller, but from the entire coroutine context.

The beauty of coroutines is that you can hide the magic so that most of the code never knows it is running as part of a coroutine. It just does its thing, and “asks” for more by calling a function such as nextChar(), expecting to get a result as soon as that returns.

Which it does, but only after having been put under narcosis for a while. Coroutines are a very neat trick, that can simplify the code in languages such a Lua, Python, and Tcl.

  1. Question: how do coroutines compare with e.g. the FSM approach? Does the stack saving/recalling and context switch take much time and memory?

    I think usually with such things you make the choice where to spend the time: in developing/writing code or in executing it: whichever is more precious wins the contest..

  2. I don’t really know – I would expect that with languages where both options are supported it probably doesn’t make that much of a difference. Best is to optimise (only) when it bites, as you say. Either way, it’s useful to know about both techniques, IMO.

  3. These days I much prefer Python, but regarding C, I thought I’d mention ‘Duff’s device’ , just in case some of your readers may not be aware of it:

Comments are closed.