Simple variable packet data May 2016
Until now, most of the wireless sensor nodes here at JeeLabs have been using a simple “map C/C++ struct as binary” approach as payload format. The advantage of this is that it simplifies the code in C (once you wrap your mind around how structs and binary data are stored in memory, that is) - but it’s also a bit C-specific, compiler-dependent, and not truly flexible.
With Forth now taking over both sides of the wireless RF link, it’s time to revisit this approach.
Here is a new design, which can efficiently package and send a number of numeric values in a generic form, and which is also easy to decode at the receiving end. This format and design is not language-specific, in fact it has been the basis of several commercial and open-source products since 1991. It’s also used in the p1scanner code in a previous release of HouseMon.
The main idea is to transform integers into variable-length byte sequences. A packet is then simply a series of such values, starting with a packet format ID.
This particular encoding represents small positive integers more compactly than larger ones. And it’ll work with arbitrarily large integers (the current Forth implementation can deal with up to 64-bit values). Negative integers can be handled in several ways, to be described further on.
The following example is for 32-bit integers, the most common size on ARM µC’s, since that’s the size of a native int and of a value on the Forth stack.
Here is how a 32-bit integer is converted to a sequence of 1..5 bytes:
As you can see, bits are grouped 7 at a time, with the high bit set only in the last byte. So all a decoder has to do is to advance through input bytes until it finds one with the high bit set, while accumulating and shifting the result 7 bits at a time.
Here’s the trick: leading zero’s are skipped. Encoded byte sequences never start with a 0-byte:
If the entire value is 0, then this emits a single byte with only the high bit set, i.e. 0x80.
For negative values, with all the high bits set to one, we have several options:
- just ignore the issue, and accept that negative values will always be encoded as 5 bytes
- add an offset to make it positive (and subtract it after decoding) - probably the simplest solution, but it requires knowledge in the decoder about which values to adjust and how
- convert the value
abs(N) * 2 + sign(N)- and convert it back in the decoder
This approach leads to a compact binary data packet for simple integer values. By adding the convention that the first integer (a single byte, i.e. 0..127) represents the type of packet, we can play various tricks to make things more compact for certain cases.
Here is an actual encoded packet sent out by the JeeLabs Energy Monitor:
There are 6 bytes with the high bit set, therefore 6 values:
0x02A8= 2 * 128 + 40 (i.e. 0xA8-0x80) = 296
0x078E= 7 * 128 + 14 (i.e. 0x8E-0x80) = 910
One potentially very useful property of this encoding is that the 0-byte never appears as the first byte in these variable-byte encodings. Which means that the 0-byte can be inserted as special “escape mark” for all sorts of purposes. It could be followed by an alternate representation for a small string for example, or even be the start of a much richer “separator” convention to allow encoding complete JSON data structures. Note that 0-bytes can appear inside byte sequences when some of the intermediate 7 bit groups are all zero.
And since only the last byte of each encoded byte sequence has its high bit set, we could also traverse multiple values in reverse order, if needed.
For now, the special 0-bytes are not used. They leave the door open for future enhancements.