
Loadable extensions

This is an exciting new feature: JeeH is going modular. A proof-of-concept design shows that a device driver can be compiled, saved, and loaded separately from the main application code. This means that an embedded application will be able to load what it needs at runtime, instead of having to be built and flashed in one piece. It also means that a driver can be replaced with an improved version later on.

It’s all done with C++ vtables. This mechanism is very convenient from C++, but it takes some careful maneuvering around the construction and destruction of objects to make it all work.

The main application sets up a “dispatch vector” which external modules then use to tie into that main code on demand.


The dispatch support code will only be included if main() contains a call to jeeh::enablesModules(). Without that call, external modules are not supported, but they also add no overhead to the main application.
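
To make the idea of a dispatch vector a bit more concrete, here is a minimal sketch of the general technique: a table of function pointers which the main application fills in, and which a separately built module calls through instead of linking against the application directly. All names in it (DispatchVector, ticks, putChar, moduleWork) are made up for illustration, and this is not JeeH’s actual design.

```cpp
#include <cstdint>

// Hypothetical sketch: none of these names are taken from JeeH.
// The main application owns a table of function pointers, and a separately
// built module only needs to be handed a reference to that table.
struct DispatchVector {
    uint32_t version;           // lets a module reject a layout it doesn't know
    uint32_t (*ticks)();        // example service: current tick count
    void (*putChar)(char c);    // example service: console output
};

namespace app {
    uint32_t tickCount = 0;
    char lastChar = 0;

    uint32_t ticks() { return tickCount; }
    void putChar(char c) { lastChar = c; }

    // The table which the main application exposes to modules.
    const DispatchVector table { 1, ticks, putChar };
}

// Module side: it calls through the pointer table, so it has no link-time
// dependency on app::ticks or app::putChar.
uint32_t moduleWork(const DispatchVector& dv) {
    if (dv.version != 1)
        return 0;               // unknown layout, refuse to use it
    dv.putChar('!');
    return dv.ticks() + 1;
}
```

The version field is one simple way to let a module detect a main application it was not built for. Whether the real dispatch vector carries something like it is an open question here.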

JeeH modules

In JeeH, an external module must be derived from a Module class, and provide a ModHeader instance with its string name, size, and entry point. Although ARM can in principle support position-independent code (PIC), this is very limited on ARM Cortex µCs without a memory-management unit (MMU). For this reason, modules in JeeH must adhere to some strict rules:

  • A driver module can’t have any static (or global) variables, only code and read-only data (both of which will usually be stored in flash memory).
  • A task module relaxes this so that both code and static data can be used, but with the data either moved out of the way during task switches, or using a soon-to-be-described “tricked” MMU mechanism based on external static RAM.
  • Lastly, there is the simple read-only data module, for which there are no restrictions.

The above is less restrictive than it might seem. As for the lack of position independence, there are a couple of ways to deal with it:

  • One approach is to first figure out where in flash a module is to be loaded, and then adjust the linker script before producing the final object code.
  • Another idea would be to compile the same module at two different addresses and compare the results to generate a relocation table.

For now, I’ll go with the first approach. This means that modules will have to be stored in flash at the precise location they were built for.
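
The second idea, comparing two builds, is not what is used here, but it is easy to sketch. The code below assumes that both images have the same size and that every absolute address is stored as an aligned 32-bit word (as in vtables and literal pools), so any word which differs by exactly the distance between the two link addresses is treated as a relocation. The function names are hypothetical, and encodings which split an address across instructions (such as MOVW/MOVT pairs) are not handled.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Compare two images of the same module, linked at baseA and baseB.
// A 32-bit word whose value shifts by exactly (baseB - baseA) is taken to be
// an absolute address, and its byte offset is recorded for later patching.
std::vector<uint32_t> findRelocs(const std::vector<uint8_t>& imgA, uint32_t baseA,
                                 const std::vector<uint8_t>& imgB, uint32_t baseB) {
    std::vector<uint32_t> relocs;
    if (imgA.size() != imgB.size())
        return relocs;          // images don't line up, give up
    const uint32_t delta = baseB - baseA;
    for (std::size_t off = 0; off + 4 <= imgA.size(); off += 4) {
        uint32_t a, b;
        std::memcpy(&a, &imgA[off], 4);
        std::memcpy(&b, &imgB[off], 4);
        if (a != b && b - a == delta)
            relocs.push_back(static_cast<uint32_t>(off));
    }
    return relocs;
}

// Patch an image built for oldBase so that it can be stored at newBase.
void applyRelocs(std::vector<uint8_t>& img, const std::vector<uint32_t>& relocs,
                 uint32_t oldBase, uint32_t newBase) {
    for (uint32_t off : relocs) {
        uint32_t w;
        std::memcpy(&w, &img[off], 4);
        w += newBase - oldBase;
        std::memcpy(&img[off], &w, 4);
    }
}
```

A word which happens to differ by the same delta for some other reason would be misclassified, so a real tool would probably want a third build, or the linker’s own relocation output, as a cross-check.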

As for not using any static / global variables: this is quite easy in C++, by placing all data (and pointers to data) in the instance as members. Here is the boilerplate setup for such a module:

#include <jee.h>

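struct MyModule : Module {
    // construct in caller-supplied memory, and destroy without freeing it
    static Module* setup (void* ptr) { return new (ptr) MyModule; }
    void teardown () override { this->~MyModule(); }

    MyModule () { ... }
    ~MyModule () { ... }    // deliberately not virtual

    virtual int foo (Blah& blah) { ... }    // externally callable
};
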
MOD_HEADER(MyModule) // see the macro defined in "jee-os.h"

All the externally callable methods have to be virtual, and they can do whatever they like, with all mutable state stored somewhere in their custom MyModule instance.

The setup() and teardown() definitions are required. They circumvent C++’s standard constructor / destructor mechanism, which creates linkage problems for separately-compiled extensions. Note the absence of the virtual keyword for ~MyModule.
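
The same pattern can be tried out on a desktop compiler. The sketch below uses a stand-in Module base class with just a pure virtual teardown(), which is an assumption, since the real class in JeeH may well contain more. The Counter module is equally hypothetical. It shows that all state lives in the instance, so two instances placed in different caller-supplied buffers do not interfere with each other.

```cpp
#include <new>      // placement new

// Stand-in for illustration only: the real Module class lives in JeeH
// and may look different.
struct Module {
    virtual void teardown () =0;
};

// Marked final, since nothing derives from it.
struct Counter final : Module {
    // setup() constructs the instance in memory supplied by the caller
    static Module* setup (void* ptr) { return new (ptr) Counter; }
    // teardown() runs the destructor explicitly, the memory is not freed
    void teardown () override { this->~Counter(); }

    Counter () : count (0) {}
    ~Counter () {}

    virtual int bump () { return ++count; }

    int count;      // all mutable state is a member, never a static or global
};
```

A caller would reserve suitably aligned storage, for example alignas(Counter) unsigned char buf[sizeof(Counter)], pass it to Counter::setup(), and call teardown() before reusing that storage.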

Loading external modules

As a first proof of concept, I’ve compiled the UART-over-DMA driver as an external module. This compiles to under 1.5 kB of code, and has been manually uploaded to a fixed area in flash as a first test. The first few bytes of such a module are the ModHeader prefix, which can then be loaded from the main app as follows:

auto hdr = (ModHeader const*) 0x0800'8000;
if (hdr->magic != ModHeader::MAGIC)
    return ...; // can't proceed

auto uart = (Driver*) hdr->setup(malloc(hdr->size));


UartConfig cfg { "A2:7,A15:3", 115200, USART2.ADDR, EN_USART2, 80,
            Irq::USART2, Irq::DMA1_CH6, Irq::DMA1_CH7, 0, 6-1, 7-1, 2, 2 };

Message msg { 0, 'i', sizeof cfg, &cfg };

Note how the caller allocates memory to hold the module’s instance data (it could also have been placed on the stack), before initialising things further.
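
The check-then-construct sequence can be wrapped in a small helper. The sketch below is a desktop-testable approximation: the ModHeader field layout, the MAGIC value, the Blinker module, and the loadModule() / unloadModule() helpers are all assumptions made up for this example, and only the magic, size, name, and setup members are taken from the description above.

```cpp
#include <cstdint>
#include <cstdlib>
#include <new>

// Stand-in base class, the real one in JeeH may differ.
struct Module {
    virtual void teardown () =0;
};

// Hypothetical header layout: field order and MAGIC value are assumptions.
struct ModHeader {
    static constexpr uint32_t MAGIC = 0x4A654D6F;
    uint32_t magic;
    uint32_t size;              // bytes of RAM needed for one instance
    char const* name;
    Module* (*setup)(void*);    // entry point: constructs in given memory
};

struct Blinker final : Module {
    static Module* setup (void* ptr) { return new (ptr) Blinker; }
    void teardown () override { this->~Blinker(); }
    virtual int toggle () { return state ^= 1; }
    int state = 0;
};

// On a real target this header would sit at the start of the module in flash.
const ModHeader blinkerHeader {
    ModHeader::MAGIC, sizeof (Blinker), "blinker", Blinker::setup
};

// Returns nullptr if addr doesn't hold a valid header or if allocation fails.
Module* loadModule (void const* addr) {
    auto hdr = static_cast<ModHeader const*>(addr);
    if (hdr->magic != ModHeader::MAGIC)
        return nullptr;
    void* mem = std::malloc(hdr->size);
    return mem != nullptr ? hdr->setup(mem) : nullptr;
}

// Assumes the Module pointer equals the address returned by malloc, which
// holds for the single-inheritance layout used here.
void unloadModule (Module* m) {
    m->teardown();
    std::free(m);
}
```

Checking the magic value before calling through hdr->setup matters, since an erased or half-written flash area would otherwise lead to a jump to an arbitrary address.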

At this point, a new driver for UART #2 is active. This is the console on the STM32L432 Nucleo board I’m using for testing. Everything has been set up: interrupts, DMA in both directions, blocking I/O, context switching … the works.

Where to go from here

The code that’s working so far is still rough, and very much proof-of-concept’y.

There are several directions to take this further, apart from the obvious cleanups and working out the exact API, the dispatch vector design, and the final implementation details.

One area which needs more work is managing flash memory, now that it can be used in a far more modular fashion. I have a “Mapped Readonly File Store” implementation waiting in the wings, which supports internal and (mapped) external QSPI flash as file systems. It should be a good fit for storing extension modules, as it supports contiguous files and wear-levelling.

Another big question is how far all this modularity can - and should - meaningfully be taken. Just a few drivers? A generic kernel? The entire application? Feature packs and upgrades?

The source code for all this is at