Multi-tasking example

One design goal of JeeH’s multi-tasking is that things should be as simple as possible, both in the implementation and in how it is exposed for actual use. Here is a minimal example:

#include <jee.h>
using namespace jeeh;

void demo (Message&) {
    Pin led ("B3");
    led.mode("P");

    led = 1;
    os::delay(100);
    led = 0;
}

int main () {
    uint32_t stack [250];           // 1. start multi-tasking
    Tasker tasker (stack);

    uint32_t demoStack [50];        // 2. create a new task
    os::fork(demoStack, demo);

    while (tasker.count > 0)        // 3. wait for it to finish
        os::get();
}

This demo just blinks an LED once and then stops. The essential setup takes place in main().

The main task

The code required to start multi-tasking can be very simple, as demonstrated in this example:

  1. A Tasker object is created, with a stack area for all interrupts and system calls.
  2. Then a task is created, with its own stack and running the demo() function.
  3. Lastly, main waits for demo to end; when it does, a message is sent and the task count is lowered.

The stack area supplied to Tasker will be used for all exceptions: SVC system calls, PendSV context switches, the SysTick periodic interrupt, and all hardware interrupt handlers used in the device drivers. This dual-stack approach reduces stack space requirements in all other tasks.

Once the multi-tasker has been set up, the main() code itself continues as task zero. Just like any other task it can send and receive messages, create new tasks, and call os::delay (which is simply a blocking call to the clock device). The main task’s stack size is limited only by available RAM (minus what’s needed for static data and heap allocations). That makes it a very convenient spot to allocate memory areas for all other task stacks and buffers.

Automatic sleep mode

There is no special “idle task”. Instead, JeeH’s multi-tasker takes advantage of a feature of ARM Cortex-M µCs called sleep-on-exit: the flag is set whenever all tasks are blocked, so that the CPU automatically enters low-power sleep mode on return from an exception handler and stays there until the next interrupt. Once there is a runnable task again, the sleep-on-exit bit is cleared and normal CPU execution resumes.
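
As a rough illustration of the mechanism (not JeeH’s actual code), the sketch below models the decision on a host machine. On Cortex-M parts the flag is the SLEEPONEXIT bit (bit 1) of the System Control Register; here that register is replaced by a plain variable, and the names `fakeScr`, `updateSleepOnExit`, and `runnable` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Bit 1 of the Cortex-M System Control Register (SCB->SCR) is SLEEPONEXIT.
constexpr uint32_t SLEEPONEXIT = 1u << 1;

// Stand-in for SCB->SCR so the sketch runs on a host (hypothetical name).
uint32_t fakeScr = 0;

// Hypothetical helper: call whenever the number of runnable tasks changes.
void updateSleepOnExit (int runnable) {
    assert(runnable >= 0);
    if (runnable == 0)
        fakeScr |= SLEEPONEXIT;     // sleep when the exception handler returns
    else
        fakeScr &= ~SLEEPONEXIT;    // return to normal task execution
}
```

On real hardware the same two lines would act on `SCB->SCR` from the CMSIS headers, typically from inside the scheduler’s exception handler.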

Task functions

The demo() function executes in the stack context of the new task. A task function takes two arguments, although demo() above only declares the first:

  1. a Message reference with some information about this new task
  2. a void* pointer to arbitrary data

The message object contains the following information:

  • mDst: the id of the parent task
  • mTag: the id of this new child task
  • mPtr: a pointer to the stack area specified in the call to os::fork()
  • mLen: the size of this stack area in words

The mPtr and mLen values are the same as in the call to os::fork. This area is now reserved for multi-tasking and may not be changed by application code in any way.
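
To make the field list concrete, here is a host-side mock-up. The struct below is not JeeH’s real `Message` definition: only the four field names come from the text, while the field types, their order, and the `describe` helper are assumptions for illustration. Words are taken to be 32 bits, as with the `uint32_t` stack arrays in the example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Mock of the start-up message; types and layout are assumptions.
struct MockMessage {
    int8_t    mDst;   // id of the parent task
    uint8_t   mTag;   // id of the new child task
    uint32_t* mPtr;   // start of the stack area passed to os::fork()
    uint16_t  mLen;   // size of that stack area, in 32-bit words
};

// Hypothetical helper: summarise what a task learns from its first message.
std::string describe (const MockMessage& m) {
    assert(m.mPtr != nullptr);
    char buf [80];
    std::snprintf(buf, sizeof buf, "task %d, parent %d, stack %u bytes",
                  (int) m.mTag, (int) m.mDst, (unsigned) (m.mLen * 4u));
    return buf;
}
```

With a 50-word stack, task id 1, and parent id 0, `describe` reports a 200-byte stack area.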

The second argument is passed on unchanged from the fourth argument of os::fork(), which is not specified in this example and defaults to nullptr.
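
Passing data through a `void*` follows the usual C/C++ callback pattern. The sketch below shows only that pattern, on a host: `BlinkConfig`, `blinkTask`, `totalMs`, and the empty `MockMessage` placeholder are hypothetical, and the real `os::fork()` call is left out because its full signature is not given here.

```cpp
#include <cassert>

struct MockMessage {};   // placeholder for JeeH's Message; contents not modelled

// Hypothetical per-task settings, handed over via the void* argument.
struct BlinkConfig { int onMs; int offMs; int count; };

int totalMs = 0;   // records what the task computed, so the sketch can be checked

// Task function with the two-argument shape described above.
void blinkTask (MockMessage&, void* arg) {
    assert(arg != nullptr);                          // this task needs a config
    const auto& cfg = *static_cast<BlinkConfig*>(arg);
    totalMs = cfg.count * (cfg.onMs + cfg.offMs);    // total blink time
}
```

Whatever the pointer refers to has to stay alive for as long as the task uses it, so an object in main() or in static memory is a safer choice than a short-lived local.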

When the task function returns, the task is considered to have ended and that first message argument will be handed back to os::put() to signal its parent. Once the parent receives this special “end of life” message, it can extract data from it if needed, and then free the stack area.

Summary

With JeeH’s multi-tasker, it should be easy to add multi-tasking to an embedded application, even after the fact. Everything is based on asynchronous message sends (i.e. os::put()) and synchronous message receives (i.e. os::get()). Tasks can be created with os::fork() and parent tasks can wait for their completion and then release all resources allocated for them.

JeeH does not use or need a memory allocator; that choice is entirely up to the application.

Device drivers are integrated using the same message mechanisms to interface to hardware. Device handlers (tasks) can provide higher-level APIs, e.g. file systems and network sockets.

Additional concurrency mechanisms could be built on top of these primitives, if needed.

Addendum: build sizes

It’s a bit early days to draw conclusions (let alone start optimising!), but it may be interesting to have a brief look at the current build sizes of a demo app which uses this new multi-tasker.

Here is the source code of a few different variations of a very simple LED blinker:

// demo1.cpp: most basic led blinker

#include <jee.h>
using namespace jeeh;

int main () {
    Pin led ("B3");
    led.mode("P");

    for (auto i = 0; i < 5; ++i) {
        led = 1;
        msBusy(100);
        led = 0;
        msBusy(400);
    }
}

// demo2.cpp: multi-tasking led blinker in main task

#include <jee.h>
using namespace jeeh;

int main () {
    uint32_t stack [250];
    Tasker tasker (stack);

    Pin led ("B3");
    led.mode("P");

    for (auto i = 0; i < 5; ++i) {
        led = 1;
        os::delay(100);
        led = 0;
        os::delay(400);
    }
}

// demo3.cpp: multi-tasking led blinker with a separate task

#include <jee.h>
using namespace jeeh;

void demo (Message&) {
    Pin led ("B3");
    led.mode("P");

    for (auto i = 0; i < 5; ++i) {
        led = 1;
        os::delay(100);
        led = 0;
        os::delay(400);
    }
}

int main () {
    uint32_t stack [250];               // 1. start multi-tasking
    Tasker tasker (stack);

    uint32_t demoStack [50];            // 2. create a new task
    os::fork(demoStack, demo);

    while (tasker.count > 0)            // 3. wait for it to finish
        os::get();
}

// demo4.cpp: multi-tasking led blinker sending messages to a blink task

#include <jee.h>
using namespace jeeh;

Pin led ("B3");

void demo (Message&) {
    while (true) {
        os::get();
        led = 1;
        os::delay(100);
        led = 0;
    }
}

int main () {
    led.mode("P");

    uint32_t stack [250];
    Tasker tasker (stack);

    uint32_t demoStack [50];
    os::fork(demoStack, demo);

    for (auto i = 0; i < 5; ++i) {
        Message msg {1};
        os::put(msg);
        os::delay(500);
    }
}

// demo5.cpp: multi-tasking led blinker using messages to trigger each blink

#include <jee.h>
using namespace jeeh;

Pin led ("B3");

void demo (Message&) {
    while (true) {
        os::get();
        led = 1;
        os::delay(100);
        led = 0;
    }
}

void loop (Message&) {
    for (auto i = 0; i < 5; ++i) {
        Message msg {1};
        os::put(msg);
        os::delay(500);
    }
}

int main () {
    led.mode("P");

    uint32_t stack [250];
    Tasker tasker (stack);

    uint32_t demoStack [50];
    os::fork(demoStack, demo);

    uint32_t loopStack [50];
    os::fork(loopStack, loop);

    while (true)
        os::get();
}

// demo6.cpp: multi-tasking led blinker with statically allocated stacks

#include <jee.h>
using namespace jeeh;

Pin led ("B3");

void demo (Message&) {
    while (true) {
        os::get();
        led = 1;
        os::delay(100);
        led = 0;
    }
}

void loop (Message&) {
    for (auto i = 0; i < 5; ++i) {
        Message msg {1};
        os::put(msg);
        os::delay(500);
    }
}

int main () {
    led.mode("P");

    static uint32_t stack [250];
    Tasker tasker (stack);

    static uint32_t demoStack [50];
    os::fork(demoStack, demo);

    static uint32_t loopStack [50];
    os::fork(loopStack, loop);

    while (true)
        os::get();
}

  • demo1.cpp is a basic LED blinker which does not multi-task, using JeeH’s Pin for GPIO
  • demo2.cpp starts up the multi-tasker, using the os::delay() function for timed suspend
  • demo3.cpp launches a task for blinking, then waits in the main task for it to end
  • demo4.cpp uses a single-blink task and sends five timed messages to it from main
  • demo5.cpp launches a 2nd task with a loop sending the timed messages
  • demo6.cpp allocates the task stacks in static memory instead of on the main stack

Here are the corresponding build sizes for the Nucleo-L432:

$ arm-none-eabi-size .pio/build/*/firmware.elf
   text    data     bss     dec     hex filename
   1348      16    1576    2940     b7c .pio/build/demo0a/firmware.elf
   5692     116    1596    7404    1cec .pio/build/demo0b/firmware.elf
   7388     124    1912    9424    24d0 .pio/build/demo1/firmware.elf
   8028     124    1920   10072    2758 .pio/build/demo2/firmware.elf
   8072     124    1920   10116    2784 .pio/build/demo3/firmware.elf
   8076     128    1916   10120    2788 .pio/build/demo4/firmware.elf
   8080     128    1916   10124    278c .pio/build/demo5/firmware.elf
   8088     128    3316   11532    2d0c .pio/build/demo6/firmware.elf
   8580     128    3316   12024    2ef8 .pio/build/demo7a/firmware.elf

  • demo0a is a special build based on demo1.cpp, which defines JEEH_NO_TASKER to omit all the multi-tasking support from JeeH
  • demo0b adds a linker flag to pull in the printf code (and some related parts of the standard I/O library), again based on demo1.cpp
  • demo7a is a build of demo6.cpp, enabling debugging output + assertions in JeeH

The reason for these additional builds is that the multi-tasker includes a fault handler which uses printf(), so all the demo*.cpp builds pull in some 6 kB of additional code that is not really related to JeeH itself. The extra builds make it easier to see which code is responsible for what.

The increase in code size for the multi-tasker is about 700 bytes. The increase in data size is some 320 bytes (mostly due to a few tables for tasks, devices, and interrupt dispatching).

Allocating stacks in static memory moves them into the BSS region. There is no benefit to this; it just adds a bit of overhead at startup to zero out the additional static memory.
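
The jump in the bss column between demo5 and demo6 can be checked against the stack sizes in the listing. This is plain arithmetic on numbers from this page; the constant and function names are made up for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stack sizes from demo6.cpp, in 32-bit words.
constexpr size_t kTaskerWords = 250;
constexpr size_t kDemoWords   = 50;
constexpr size_t kLoopWords   = 50;

// Bytes that move from the main stack into static memory in demo6.
constexpr size_t staticStackBytes () {
    return (kTaskerWords + kDemoWords + kLoopWords) * sizeof (uint32_t);
}

// bss grows from 1916 (demo5) to 3316 (demo6) in the size report above.
static_assert(staticStackBytes() == 1400, "350 words of 4 bytes each");
static_assert(3316 - 1916 == 1400, "matches the reported bss increase");
```

The 1400-byte difference is exactly the three stacks, so nothing else changed in static memory between those two builds.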

These results are a snapshot of the current code base. As you can see, it’s very lightweight.

Addendum: less bloat

Seeing the things pulled in by printf(), it really makes little sense to include so much of the stdio library in an embedded context. While there is definitely a use for performing file I/O and serial I/O, these are rarely used in the same way as on UNIX / POSIX systems, which is what #include <cstdio> was originally meant for.

Here are the same demo builds as above, but with a custom printf() implementation and some extra code to avoid pulling in abort(), kill(), raise(), and a few other tidbits:

$ arm-none-eabi-size .pio/build/*/firmware.elf
   text    data     bss     dec     hex filename
   1348      16    1576    2940     b7c .pio/build/demo0a/firmware.elf
   2356      16    1576    3948     f6c .pio/build/demo0b/firmware.elf
   3872      20    1896    5788    169c .pio/build/demo1/firmware.elf
   4524      20    1896    6440    1928 .pio/build/demo2/firmware.elf
   4568      20    1896    6484    1954 .pio/build/demo3/firmware.elf
   4572      24    1900    6496    1960 .pio/build/demo4/firmware.elf
   4576      24    1900    6500    1964 .pio/build/demo5/firmware.elf
   4584      24    3300    7908    1ee4 .pio/build/demo6/firmware.elf
   5028      24    3300    8352    20a0 .pio/build/demo7a/firmware.elf

This still includes a very usable implementation of printf(), the DMA-based UART driver, and the multi-tasking kernel. It’s just a bit leaner and meaner, allowing all this functionality to be used even in the diminutive 16/32 kByte flash versions of some low-end STM32 µCs.
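
For a quick comparison of the two size reports, the sketch below tabulates the text (flash code) column of each build and computes the difference. The numbers are copied from the two listings above; the `Build` struct and the function names are hypothetical.

```cpp
#include <array>
#include <cassert>

struct Build {
    const char* name;
    int stdioText;    // "text" bytes with the standard library printf
    int customText;   // "text" bytes with the custom printf
};

constexpr std::array<Build, 9> builds {{
    {"demo0a", 1348, 1348},
    {"demo0b", 5692, 2356},
    {"demo1",  7388, 3872},
    {"demo2",  8028, 4524},
    {"demo3",  8072, 4568},
    {"demo4",  8076, 4572},
    {"demo5",  8080, 4576},
    {"demo6",  8088, 4584},
    {"demo7a", 8580, 5028},
}};

// Flash bytes saved by switching to the custom printf.
constexpr int savedText (const Build& b) {
    return b.stdioText - b.customText;
}

// Largest saving across all builds.
int largestSaving () {
    int best = 0;
    for (const auto& b : builds) {
        assert(b.customText <= b.stdioText);
        if (savedText(b) > best)
            best = savedText(b);
    }
    return best;
}
```

By these numbers, every build that includes printf shrinks by roughly 3.3 to 3.5 kB of flash, while demo0a stays the same size in both reports.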