
A multi-tasking kernel

For quite some time now, I’ve been exploring ways to add concurrency to the JeeH library. ChibiOS and FreeRTOS come to mind, but they are too extensive and intrusive for my tastes. It’s the end of 2022, winter is coming, and so I’ve decided to build my own … heck, why not.

Starting from scratch is both fun and daunting. Luckily, there is a lot of inspiration to be had from what already exists. I found the older XINU and TRIPOS designs particularly interesting, due to their simplicity and generality. Given that all this has to run on resource-constrained microcontrollers, the KISS principle really matters here.

The goals of this endeavour are:

  • to provide a simple pre-emptive multi-tasking kernel
  • to run on ARM Cortex family microcontrollers such as STM32
  • to integrate seamlessly with the rest of JeeH (and C++)
  • to keep all potential race conditions out of application code

The choice of ARM Cortex as platform is based on: 1) my familiarity with it, 2) its truly superb hardware support for multitasking, and 3) JeeH’s focus on STM32 µCs. The integration with JeeH means that this code also uses PlatformIO as build environment. As for race conditions: as you will see, these can be fully tamed with a TRIPOS-like message-based kernel design.

For want of a better name, this code will use the os::* namespace convention in C++.

Messages

The central mechanism for concurrency is message-passing. Everything revolves around sending and receiving messages, which are defined in the jeeh::* namespace as:

struct Message {
    int8_t   mDst;  // destination task or device
    int8_t   mTag;  // often used as request code
    uint16_t mLen;  // often used as payload size
    void*    mPtr;  // often used as payload data
    Message* mLnk;  // for internal use in chains
};

There are two system calls involved: os::put() sends a message to a specified “task” or “device” and os::get() retrieves the next incoming message in a task. Sending is non-blocking, but receiving will block as long as no messages are available.

Messages, as defined above, are 12 bytes each: a 1-byte destination, 7 bytes of arbitrary payload data, and 4 bytes used internally for chaining messages in queues. Messages are owned by their senders, and need to stay around until sent back to them (with optional reply values inside). Messages are only passed around by reference, never copied. Because of this, a caller can easily subclass messages in C++ to add custom fields and methods:

struct MyBigMessage : Message {
    MyObject obj;
    AnotherObj *ptr;
    char buffer [100];

    int myMethod (int abc) { ... }
};

As long as senders and receivers agree on types, they can exchange whatever data they like.
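To make that agreement concrete, here is a minimal sketch of the receiving side. The tag value, the ReadingMessage type and the readCelsius helper are all made up for illustration; only the Message layout comes from the definition above.

```cpp
#include <cassert>
#include <cstdint>

struct Message {
    int8_t   mDst;  // destination task or device
    int8_t   mTag;  // often used as request code
    uint16_t mLen;  // often used as payload size
    void*    mPtr;  // often used as payload data
    Message* mLnk;  // for internal use in chains
};

// Hypothetical tag value: both sides must agree on what it means.
constexpr int8_t kReadingTag = 7;

// Hypothetical subclass carrying one extra field.
struct ReadingMessage : Message {
    int celsius;
};

// Receiver side: it is handed a plain Message reference, checks the tag,
// and then downcasts to the agreed-upon type. Nothing is copied.
int readCelsius(Message& m) {
    assert(m.mTag == kReadingTag);
    return static_cast<ReadingMessage&>(m).celsius;
}
```

The tag check is the only safety net here: the kernel just moves references around, so a mismatch between sender and receiver types would go unnoticed without it.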

Tasks and devices

A “task” is an independent process, which has its own stack and can be interrupted and switched by the kernel as needed. A context switch can only happen when a system call is made or an interrupt causes a blocked higher-priority task to become runnable again.

The os::fork() system call creates a new task, given a memory area to use as stack and a function to run. Each task ceases to exist when its function returns.

All messages sent to a task are placed in a queue, to be retrieved by that task in successive calls of os::get. Blocked (i.e. suspended) tasks consume no CPU time.
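Since each message carries its own mLnk field, such a queue needs no extra storage. The following is only a sketch of how an intrusive FIFO could look. MsgQueue and its methods are hypothetical names, not the kernel's actual code.

```cpp
#include <cassert>
#include <cstdint>

struct Message {
    int8_t   mDst;  // destination task or device
    int8_t   mTag;  // often used as request code
    uint16_t mLen;  // often used as payload size
    void*    mPtr;  // often used as payload data
    Message* mLnk;  // for internal use in chains
};

// Hypothetical FIFO: messages are chained through mLnk, so queuing
// never allocates memory and never copies a message.
struct MsgQueue {
    Message* head = nullptr;
    Message* tail = nullptr;

    // Add a message at the end of the chain.
    void append(Message& m) {
        assert(m.mLnk == nullptr);  // a message sits in at most one queue
        if (tail != nullptr)
            tail->mLnk = &m;
        else
            head = &m;
        tail = &m;
    }

    // Remove and return the oldest message, or nullptr when empty.
    Message* pull() {
        Message* m = head;
        if (m != nullptr) {
            head = m->mLnk;
            if (head == nullptr)
                tail = nullptr;
            m->mLnk = nullptr;
        }
        return m;
    }
};
```

In a real kernel, pull would block the calling task instead of returning nullptr, and both operations would have to run atomically with respect to interrupts.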

Messages can also be sent to a device driver (these have negative ids). Drivers interface to built-in µC hardware peripherals, and will manage all their setup and interrupts. Device drivers (and their interrupts) do not have their own stack; they all run on a shared kernel stack. One way to look at this is that tasks run in “user mode”, while drivers run in “kernel mode”.

Device drivers cannot block. They can only act on received messages, (briefly) save them, initiate hardware actions, respond to interrupts, and send replies back to the calling tasks.

This design is literally the same as in TRIPOS, albeit with different terminology.

Blocking vs. non-blocking

To reiterate: os::get always blocks, os::put never does (but its calling task might get context-switched away for a while). There is a standard “timer” device (with fixed id -1), which accepts messages with a timeout value and returns them after that number of milliseconds.

As a basic blocking example, the following code will suspend a task for 100 ms:

Message m {-1, 0, 100};
os::put(m);
os::get();

Note that the os::get return value is ignored. Given that put-followed-by-get is a common idiom, there is an os::call() wrapper which does just that, thus acting as blocking send:

Message m {-1, 0, 100};
os::call(m);

The os::delay() system call is an even nicer shorthand for this common operation.
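How these three calls relate can be sketched on a desktop machine. Everything in the sketch namespace below is a host-side stand-in, not the real os:: code: put "answers" each message at once, get asserts where a real kernel would block, and the pointer return type of get is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

struct Message {
    int8_t   mDst;  // destination task or device
    int8_t   mTag;  // often used as request code
    uint16_t mLen;  // often used as payload size
    void*    mPtr;  // often used as payload data
    Message* mLnk;  // for internal use in chains
};

namespace sketch {  // host-side stand-in, not the real os:: namespace

std::deque<Message*> inbox;  // replies waiting for the current task

// Stand-in for os::put: every message is "answered" immediately.
void put(Message& m) { inbox.push_back(&m); }

// Stand-in for os::get: a real kernel would block while the queue is empty.
Message* get() {
    assert(!inbox.empty());
    Message* m = inbox.front();
    inbox.pop_front();
    return m;
}

// Blocking send: put followed by get.
Message* call(Message& m) {
    put(m);
    return get();
}

// Suspend for the given number of milliseconds via the timer device (id -1).
void delay(uint16_t ms) {
    Message m {-1, 0, ms, nullptr, nullptr};
    call(m);
}

}  // namespace sketch
```

Note that the message in delay lives on the caller's stack. This is safe only because call does not return until the message has come back.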

There is no way to do a non-blocking receive, but an os::put followed by os::delay and then os::get achieves nearly the same thing: the first reply to come back determines whether the original request was completed or the timeout was triggered.

The os::revoke() system call can be used to remove a message which is currently waiting in some task queue. When sent to a device driver, os::revoke will instruct it to abort any operations it may have started for this request. There is no way to know how far the request already made it, but if it was still in some waiting queue then it will not be processed.
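For the "still waiting in a queue" case, revoking boils down to unlinking the message from a chain. Here is a rough sketch of that step, assuming queues are singly-linked through mLnk. The removeFromChain helper is a hypothetical name, and a real kernel would have to do this atomically.

```cpp
#include <cassert>
#include <cstdint>

struct Message {
    int8_t   mDst;  // destination task or device
    int8_t   mTag;  // often used as request code
    uint16_t mLen;  // often used as payload size
    void*    mPtr;  // often used as payload data
    Message* mLnk;  // for internal use in chains
};

// Hypothetical helper: unlink m from the chain starting at head.
// Returns true if m was found (it was still waiting), false otherwise.
bool removeFromChain(Message*& head, Message& m) {
    // p always points at the link which leads to the current message
    for (Message** p = &head; *p != nullptr; p = &(*p)->mLnk) {
        if (*p == &m) {
            *p = m.mLnk;       // bypass m in the chain
            m.mLnk = nullptr;  // m is no longer part of any chain
            return true;
        }
    }
    return false;
}
```

A false result corresponds to the uncertain case described above: the message has already left the queue, and there is no telling how far its processing got.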

Race conditions

This kernel design aims to avoid all nasty interrupt-related race conditions, where you have to constantly “guard” and “lock” things to avoid problems with different bits of code trying to access or modify the same data. It does this by treating tasks as the central unit of all processing, and relying on message-passing for synchronising all actions between them. This is somewhat similar to the design of the Go language, with its channels and goroutines.

The solution hinges on a couple of key design choices:

  • all messages are passed around and queued/unqueued in an atomic way and task queues are only accessible from application code via the os::get and os::put system calls
  • IRQs can run at any time, even during system calls, but to send a reply back to a task they must drop to a low-priority state which runs after system calls are complete

There are no “disable interrupt” calls in the kernel. Everything is done via automatic interrupt priority changes, using ARM’s PendSV and SVC hardware features. Tasks and driver code may get interrupted at any time, but this will never interfere with moving messages around, because those actions always postpone themselves to when it’s safe to do so.

All the hard work w.r.t. hardware interrupts is relegated to their device drivers, and even that code tends to be easy to “get right” due to the low-priority signaling from IRQ handlers.
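The two-stage hand-off from an IRQ handler to a task can be simulated on a desktop machine. This is only an illustration with made-up names (TinyDriver and its members). On real hardware, the flag would be replaced by pending the PendSV exception, and the delivered pointer by the task's message queue.

```cpp
#include <cassert>
#include <cstdint>

struct Message {
    int8_t   mDst;  // destination task or device
    int8_t   mTag;  // often used as request code
    uint16_t mLen;  // often used as payload size
    void*    mPtr;  // often used as payload data
    Message* mLnk;  // for internal use in chains
};

// Host-side simulation of a driver handling one request at a time.
struct TinyDriver {
    Message* active = nullptr;     // request being handled by the hardware
    Message* finished = nullptr;   // completed, but not yet delivered
    Message* delivered = nullptr;  // stands in for the task's queue
    bool deferRequested = false;   // stands in for a pending PendSV

    // Called when a task sends a request to this driver.
    void start(Message& m) {
        assert(active == nullptr);  // this sketch handles one request only
        active = &m;
    }

    // High-priority IRQ handler: it only records the completion and asks
    // for the low-priority pass, it never touches the task's queue.
    void irqHandler() {
        finished = active;
        active = nullptr;
        deferRequested = true;
    }

    // Low-priority pass: runs once no system call is in progress,
    // so it can safely hand the reply to the task.
    void deferredPass() {
        if (!deferRequested)
            return;
        deferRequested = false;
        delivered = finished;
        finished = nullptr;
    }
};
```

The point of the split is that only deferredPass touches state shared with tasks, and it never runs while a system call is halfway through moving a message.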

In a way, messages represent time, and by being able to queue them up, all time-critical work can be managed without the complexity which is so often associated with concurrency.