PendSV+SVC on Cortex M¶
The design of JeeH’s new multi-tasking kernel was heavily
influenced by the hardware support present in ARM Cortex M microcontrollers.
I find the design of the PendSV and SVC exception handlers brilliant, making everything so simple (and so well suited for C/C++).
The problem¶
Multi-tasking is about maintaining the illusion that each task runs independently, and that switching between tasks is just something the kernel does “in the background” to give them each some CPU cycles to work with.
On top of this, the kernel needs to keep interrupts “out of view” as far as tasks are concerned. Tasks should not care or even see any of that activity.
But interrupts and context switches can be brutal: just reading a value from memory into a register, incrementing it, and writing it back can wreak havoc, if an interrupt happens to trigger in the midst of this, changing that same memory location. Not only are such race conditions awfully difficult to avoid, they are also fiendishly hard to detect, isolate, and fix later on.
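To make that read-increment-write hazard concrete, here is a small host-side sketch. It is not JeeH code: the names `counter`, `fakeInterrupt`, and `incrementWithInterruptInBetween` are made up for this illustration, and the "interrupt" is simulated by an ordinary function call placed between the load and the store.

```cpp
#include <cstdint>

// Hypothetical shared counter, touched by both a task and an interrupt handler.
static volatile uint32_t counter = 0;

// Stand-in for an interrupt handler which also increments the counter.
static void fakeInterrupt() { counter = counter + 1; }

// The task's "counter++", split into the three steps the CPU really performs.
// The interrupt is simulated by a plain call between the load and the store.
uint32_t incrementWithInterruptInBetween() {
    uint32_t reg = counter;   // 1. load the value from memory into a register
    fakeInterrupt();          //    ... the interrupt fires here and bumps counter
    reg = reg + 1;            // 2. increment the (now stale) register copy
    counter = reg;            // 3. write back, overwriting the interrupt's update
    return counter;
}
```

Two increments took place, yet the counter only advances by one: the interrupt's update is silently lost. On real hardware this happens only when the timing is just right, which is exactly why it is so hard to reproduce.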
Solutions, solutions, …¶
These problems have been known for decades, and there are numerous ways to try and avoid them in the first place: semaphores, mutexes, critical sections, monitors, spinlocks, … and many more.
At the most basic hardware level, it’s all caused by interrupts, which can (and will) happen at the most inconvenient times if the application is left to run long enough. While every CPU will merrily execute one instruction after the other, it often has very few mechanisms to make some sequences atomic, i.e. executing without interruptions.
The easy way out is also the oldest solution to it all: disable interrupts while executing these specific instruction sequences. It works, but you can get into trouble with nested code which wants to protect a different sequence in the same way (hint: the inner sequence re-enables interrupts at its end, while the outer one still expects them to be off). Worse still, disabling all interrupts because one of them might affect this code is a bit of a big hammer.
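The nesting pitfall can be shown with a small host-side sketch. Everything here is an assumption for illustration: `irqEnabled` stands in for the CPU's global interrupt mask, and `NaiveLock` / `SavingLock` are hypothetical helper types, not part of JeeH.

```cpp
// Simulated global interrupt-enable state (a stand-in for the real CPU mask).
static bool irqEnabled = true;

// Naive critical section: always re-enables interrupts at the end.
struct NaiveLock {
    NaiveLock()  { irqEnabled = false; }
    ~NaiveLock() { irqEnabled = true; }          // wrong when nested
};

// Safer variant: remembers the previous state and restores exactly that.
struct SavingLock {
    bool wasEnabled;
    SavingLock() : wasEnabled(irqEnabled) { irqEnabled = false; }
    ~SavingLock() { irqEnabled = wasEnabled; }
};

// Runs a nested critical section inside an outer one, then reports whether
// the outer section is still protected once the inner one has ended.
template <typename Lock>
bool outerStillProtectedAfterInner() {
    Lock outer;
    { Lock inner; }          // the nested critical section ends here
    return !irqEnabled;      // true if interrupts are still disabled
}
```

With `NaiveLock`, the inner section switches interrupts back on while the outer one is still in progress. With `SavingLock`, the outer section stays protected until it ends itself. The big-hammer objection remains in both cases.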
Another option would be to adopt one of the existing RTOS frameworks. Too complex for me …
I’ve decided to follow another path: take inspiration from the TRIPOS design from the 1970s and implement a purely message-based mechanism in C++, integrated with the rest of JeeH.
Context switching¶
Apart from race conditions, another troublemaker is context switching: the need to pause one task, save its entire CPU state, switch to a different task, and restore state from that context.
The way to do this is to save all registers on the task’s stack, and then change to a different stack and restore all registers from it. There’s some overhead involved, but it works well.
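As a rough sketch of that save-switch-restore sequence, here is a toy model which runs on a host. It assumes an imaginary CPU with 8 registers and a full-descending stack per task; `Regs`, `Task`, `saveContext`, `restoreContext`, and `switchContext` are all hypothetical names, and a real Cortex-M switch would be a few lines of assembly instead.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int NREGS = 8;                          // toy CPU with 8 registers
using Regs = std::array<uint32_t, NREGS>;

struct Task {
    std::array<uint32_t, 64> stack {};            // private stack memory
    uint32_t* sp = stack.data() + stack.size();   // full-descending, starts empty
    Task() = default;
    Task(const Task&) = delete;                   // sp points into our own stack
};

// Push all CPU registers onto the task's stack.
void saveContext(Task& t, const Regs& cpu) {
    assert(t.sp - t.stack.data() >= NREGS);       // enough room left?
    for (int i = NREGS - 1; i >= 0; --i)
        *--t.sp = cpu[i];
}

// Pop all CPU registers from the task's stack, in the opposite order.
void restoreContext(Task& t, Regs& cpu) {
    for (int i = 0; i < NREGS; ++i)
        cpu[i] = *t.sp++;
}

// Pause "from" and resume "to": the CPU state is swapped via the two stacks.
void switchContext(Task& from, Task& to, Regs& cpu) {
    saveContext(from, cpu);
    restoreContext(to, cpu);
}
```

A task which has never run needs an initial context on its stack before it can be switched to, e.g. by calling `saveContext` once with its starting register values.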
But what if the task was in the middle of one of those critical sections? Or had its interrupts disabled? Or was currently in the middle of processing an interrupt? There are clearly a lot of pesky little details to deal with …
Stack requirements¶
Interrupt handlers are similar to context switches: they save some registers on the current stack, do their thing (quickly), restore the registers, and leave with the task back in control.
This needs some spare stack space to work. And it needs it on every task, since interrupts can occur while any of them is running.
On ARM Cortex M, an interrupt stack save needs at least 32 bytes (eight 32-bit registers: R0-R3, R12, LR, PC, and xPSR), plus whatever the interrupt handler itself uses. If hardware floating point is present and enabled, the stack space requirements will increase by another 72 bytes (S0-S15, FPSCR, and one reserved word).
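The arithmetic behind these numbers can be written down as two structs. This is only a sketch of the frame layout as I understand it from ARM's documentation; `BasicFrame`, `ExtendedFrame`, and `fpuOverhead` are names invented here, and the hardware may insert one more word to keep the stack 8-byte aligned.

```cpp
#include <cstddef>
#include <cstdint>

// Registers pushed by the hardware on exception entry, without FPU state.
struct BasicFrame {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
};

// Frame pushed when floating point context is active.
struct ExtendedFrame {
    BasicFrame basic;
    uint32_t s[16];        // S0..S15
    uint32_t fpscr;
    uint32_t reserved;     // keeps the frame a multiple of 8 bytes
};

static_assert(sizeof(BasicFrame) == 32, "8 words");
static_assert(sizeof(ExtendedFrame) == 104, "26 words");

// Extra stack space needed per interrupt when the FPU is in use.
constexpr std::size_t fpuOverhead = sizeof(ExtendedFrame) - sizeof(BasicFrame);
```

So the floating point part adds 72 bytes on top of the basic 32, before the interrupt handler has pushed anything of its own.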
A stack overhead of 100 bytes is not much. But it adds up, not just for tasks but also for coroutines (although these are not –yet– supported in C++). Having lots of small tasks and coroutines can greatly simplify an application, but the stack requirements w.r.t. memory (which can’t be shared and is hardly ever used) are not enticing on µCs with limited memory.
A separate kernel stack¶
The ARM architecture has a number of tricks to help with all this. To start with that last issue of stack space overhead: ARM Cortex M CPUs support a dual-stack mode, whereby all state is saved on the stack addressed by the “process” stack pointer (PSP), but then the CPU switches to a “main” stack pointer (MSP) during the processing of all interrupts and exceptions. All interrupts and even nested interrupts will then use this main stack.
Problem solved: a (single! shared!) main stack for “kernel” mode, and separate task stacks for “user” mode, with no need to reserve lots of space for worst-case interrupts.
But there can be a problem with this approach. What happens if an interrupt triggers, which saves some more registers because it needs them itself and doesn’t want to clobber the values expected by the current task?
These registers will be saved on the current (i.e. main) stack. Then, a context switch is requested, and lastly the interrupt handler returns. Oops … those carefully saved values will end up being “restored” to the wrong task. Chaos ensues.
ARM’s PendSV¶
This is where ARM’s PendSV comes in: it’s an exception which can be triggered by setting a specific bit in the CPU’s hardware registers. The (brilliant) trick is that it’s an asynchronous interrupt request (unlike SVC), which will be triggered ASAP - but not necessarily right away.
The second half of this trick is to make PendSV’s priority the lowest in the system, so that it will only take place when no other interrupt is active (or when all interrupts have returned - same thing). As a result, the PendSV handler can assume that it is the only interrupt currently running. It may still get interrupted itself, but any such interrupt will have fully returned (and cleaned up its stack use) by the time PendSV resumes.
Caveat: on ARM CPUs, a HIGH priority is denoted by a LOW numeric value!
The key point here is that PendSV is always the first stack frame on the main stack. There can not be any other interrupt state on the main stack at this point in time. In other words: the main stack is essentially empty - switching process stack contexts in PendSV is always 100% safe.
In short: when a task requests a context switch via PendSV, it will immediately take place. And when an interrupt handler (or exception) does the same, the switch will take place after the interrupt handler returns, at which point the PendSV handler gets launched and takes control.
The beauty of this is that there are no special cases - as long as PendSV does all the context switching.
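Requesting a PendSV is a single register write. The bit in question is PENDSVSET, bit 28 of the Interrupt Control and State Register (ICSR) at address 0xE000ED04. The sketch below takes the register as a reference so that it can also be tried on a host; `requestContextSwitch` is a hypothetical name, not JeeH's actual API.

```cpp
#include <cstdint>

constexpr uintptr_t ICSR_ADDR = 0xE000ED04;   // System Control Block: ICSR
constexpr uint32_t  PENDSVSET = 1u << 28;     // write 1 to make PendSV pending

// Ask for a context switch. This can be called from a task or from any
// interrupt handler: the PendSV handler runs once nothing more urgent is active.
inline void requestContextSwitch(volatile uint32_t& icsr) {
    icsr = PENDSVSET;   // zero bits in this write leave the other fields alone
}
```

On an actual µC, the call would look like `requestContextSwitch(*reinterpret_cast<volatile uint32_t*>(ICSR_ADDR))`.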
ARM’s SVC¶
PendSV does not address the problem of interrupts messing up atomic sequences. This requires the use of another mechanism: ARM’s supervisor call instruction, SVC. This is a CPU instruction which triggers a synchronous interrupt request to the SVC handler. It’s essentially a “call the SVC handler NOW” request - but unlike a function call, it’s also an interrupt, i.e. it does the same saving-of-registers and switching-to-MSP as all other interrupts.
SVC is used for two reasons: 1) it switches to the main stack, which means all the code inside the SVC handler runs without needing stack space on the task’s stack, and 2) like all interrupt handlers, it runs at a specific IRQ priority level, blocking all lower-priority interrupts from running (until SVC exits).
The second part of this approach is therefore crucial: the IRQ priority of SVC is set to slightly higher than PendSV’s (which was set to the lowest possible).
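Since a LOW number means a HIGH priority, “lowest possible” and “slightly higher” translate to numbers in a slightly counter-intuitive way. Here is a sketch of the arithmetic, assuming the usual Cortex M convention that priorities live in an 8-bit field of which only the top few bits are implemented (2 on M0/M0+, often 4 on M3/M4 parts). The function names are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Lowest urgency = largest number: all implemented priority bits set.
constexpr uint8_t lowestPriority(int bits) {
    assert(bits >= 1 && bits <= 8);
    return static_cast<uint8_t>(0xFF << (8 - bits));
}

// One step more urgent = one step smaller in the implemented bits.
constexpr uint8_t oneStepMoreUrgent(uint8_t prio, int bits) {
    assert(bits >= 1 && bits <= 8);
    return static_cast<uint8_t>(prio - (1 << (8 - bits)));
}
```

With 4 implemented bits, this gives 0xF0 for PendSV and 0xE0 for SVC. All other interrupts can then use anything from 0x00 to 0xD0 and still preempt both.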
Here’s what this accomplishes:
- while SVC is active (i.e. its handler is running), PendSV cannot activate
- in other words, while SVC is active, no task contexts will be switched
- most interrupts can run at a higher priority than SVC, and therefore interrupt at will
- it’s not essential, but the PendSV handler code could use SVCs if it wants to
That last point is a subtle one. Since SVC acts as an interrupt handler, and since it prevents all interrupts of equal or lower priority from running, SVC cannot be used from inside the SVC handler. It’s a synchronous request, which means that it must run without delay - and it can’t, because it’s locking itself out. Executing SVC while already running inside the SVC handler escalates to a HardFault: it’s a big no-no.
In JeeH’s multi-tasker, SVC is used as a “system call”, i.e. a way to enter “kernel mode”. In other words, system calls cannot recursively make system calls. It’s no big deal, but it’s an important detail.
How it all fits together¶
There is still one missing aspect: interrupt handlers cannot make system calls, and yet they need a way to somehow signal tasks that something important has happened. In the case of JeeH’s multi-tasker, interrupt handlers need a way to send a reply message once the device driver has completed the request.
Again, PendSV comes to the rescue: there is a set of flag bits, one per driver, which will be set by interrupt handlers when they return true. In this case, a PendSV request is also posted. When an interrupt handler exits in this way, it will cause the PendSV handler to run ASAP.
In addition, the PendSV handler is extended to perform one additional task: check-and-clear all those flag bits, and call the corresponding driver’s finish function when set. So finish always runs inside PendSV, when no context switching or SVC call can take place. It can do the final non-urgent work, fill in replies, and send the reply message(s) back to its original task(s).
I lied: there’s a subtle difference on ARM Cortex M0 and Cortex M0+ cores. The “setting and clearing” of bit flags is the only place where true atomic protection is needed, because it is done at interrupt time, and (unlike PendSV and SVC) any of these interrupts might interrupt each other. On ARM Cortex M3/M4/M7 cores, gcc provides special “builtin atomic” functions which can perform this task with LDREX and STREX instructions. On Cortex M0/M0+, a super-brief “interrupt disable/enable” section is the only way to guarantee atomicity. So yeah … on those cores, there will be brief global IRQ lockouts.
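The flag handling could look roughly as follows. This is a sketch using `std::atomic` rather than JeeH's actual code; `pending`, `markPending`, `takePending`, and `forEachPending` are hypothetical names. On M3/M4/M7, gcc is expected to turn these operations into LDREX/STREX loops, whereas on M0/M0+ they would need to be wrapped in a brief interrupt-disable section instead.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

static std::atomic<uint32_t> pending {0};   // one flag bit per driver

// Called from a driver's interrupt handler, which then also requests PendSV.
void markPending(unsigned driver) {
    assert(driver < 32);
    pending.fetch_or(1u << driver);         // atomic read-modify-write
}

// Called from PendSV: fetch all flags and clear them in one atomic step,
// so that a flag set by a late interrupt is never lost.
uint32_t takePending() {
    return pending.exchange(0);
}

// Call finish(d) for every driver d whose flag was set.
template <typename F>
void forEachPending(F finish) {
    uint32_t bits = takePending();
    for (unsigned d = 0; bits != 0; ++d, bits >>= 1)
        if (bits & 1)
            finish(d);
}
```

An interrupt which sets a flag after `takePending` has run will simply post a new PendSV request, so its `finish` call happens on the next round.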
Lastly, PendSV ends by doing what it did all along: figure out which task needs to run next, and perform the actual context switch. Note that a reply may affect the choice of this task if it just caused a higher-priority task to become runnable.
The end result of all this effort uses PendSV and SVC in such a way that context switching and IRQ signalling through message replies take place without ever interfering with each other:
- interrupt handlers can run at any priority, unimpeded by any kernel (SVC) activity
- IRQ handlers delegate actual task signalling to finish, which will be called “ASAP”
- PendSV calls finish whenever driver interrupts need it, and never inside an SVC call
- SVC system calls from tasks can juggle message queues without interference from IRQs
- PendSV will only switch contexts when it is safe to do so (i.e. no interrupts, no SVC)
In the meantime, tasks remain blissfully ignorant of it all, just calling os::put and os::get …