Computing stuff tied to the physical world

Binary packet decoding

In AVR, Software on Dec 7, 2010 at 00:01

The RF12 library used with the RFM12B wireless radio on JeeNodes is based on the principle of sending individual “packets” of data. I’ve described the reasons for this design choice in a number of posts.

Let me summarize what’s going on with wireless:

  • RFM12B-based nodes can send binary packets of 0..66 bytes
  • these packets can contain any type of data you want
  • a checksum detects transmission errors to let you ignore bad packets
  • dealing with packet loss requires an ACK + re-transmission mechanism

Packets have the nice property that they either arrive intact as a whole or not at all. You won’t get garbled or inter-mixed packets when multiple nodes happen to send at (nearly) the same time. Compare this to some other solutions where all the characters sent end up in one big “soup” if the sending happens (nearly) simultaneously.

But first: what’s a packet?

Well, loosely speaking, you could say that a packet is like one line of text. In fact, that’s exactly what you end up with when using the RF12demo sketch as central receiver: a line of text on the serial/USB connection for each received packet. Packets with valid checksums will be shown as lines starting with “OK”, e.g.:

    OK 3 128 192 1 0
    OK 23 79 103 190 0
    OK 3 129 192 1 0
    OK 2 25 99 200 0
    OK 3 130 192 1 0
    OK 24 2 121 163 0
    OK 5 86 97 201 0
    OK 3 131 192 1 0

Let’s examine how that corresponds with the actual data sent by the node.

  • All the numbers are byte values, shown as numbers in the range 0..255.
  • The first byte is a header byte, which usually includes the node ID of the sender, plus some extra info such as whether the sender expects an ACK back.
  • The remaining data bytes are an exact copy of what was sent.

There appears to be some confusion about how to deal with the binary data in such packets, so let me go into it all in a bit more detail.

Let’s start with a simple example – sending one byte:

Screen Shot 2010 12 06 at 22.32.26

I’m leaving out tons of details, such as calling rf12_recvDone() and rf12_canSend() at the appropriate moments. This code is simply broadcasting one value as a packet for anyone who cares to listen (on the same frequency band and net group). Let’s also assume this sender’s node ID is 1.

Here’s how RF12demo reports reception of this packet:

    OK 1 123

Trivial, right? Now let’s extend this a bit:

Screen Shot 2010 12 06 at 22.38.13

Two things changed:

  • we’re now sending a larger int, i.e. a 2-byte value
  • instead of passing length 2, the compiler calculates it for us with the C “sizeof” keyword

Now, the incoming packet will be reported as:

    OK 1 57 48

No “1″, “2″, “3″, “4″, or “5″ in sight! What happened?

Welcome to the world of multi-byte values, as computers deal with them:

  • a C “int” requires 2 bytes to represent
  • bytes can only contain values 0..255
  • 12345 will be “encoded” in two bytes as “12345 divided by 256″ and “12345 modulo 256″
  • 12345 / 256 is 48 – this is the “upper” value (the top 8 bits)
  • 12345 % 256 is 57 – this is the “lower” value (the low 8 bits)
  • an ATmega stores values in little-endian format, i.e. lowest-range bytes come first
  • hence, as bytes, the int “12345″ is represent as first 57 and then 48
  • and sure enough, that’s exactly what we got back from RF12demo

Yeah, ok, but why should we care about such details?

Indeed, on normal PC’s (desktop and mobile) we rarely need to. We just think in terms of our numbering system and let the computer do the conversions to and from text for us. That’s exactly what “Serial.print(12345)” does under the hood, even on an Arduino or a JeeNode. Keep in mind that “12345″ is also a specific representation of the abstract quantity it stands for (and “0×3039″ would be another one).

So we could have converted the number 12345 to the string “12345″, placed it into a packet as 5 bytes, and then we’d have gotten this message:

    OK 1 31 32 33 34 35

Hm. Still not quite what we were looking for. Because now we’re dealing with ASCII text, which itself is also an encoding!

But we could build a modified version of RF12demo which converts that ASCII-encoded result back to something like this:

    OKSTR 1 12345

There are however a few reasons why this is not necessarily a good idea:

  • sending would take 5 bytes instead of 2
  • string manipulation uses more RAM, which is scarce on an ATmega
  • lots of such little inefficiencies will add up, once more data is involved

There is an enormous gap in performance and availability of resources between a modern CPU (even on the simplest mobile phones) and this 8-bit few-bucks-chip we call an “ATmega”. Mega, hah!

But you probably didn’t really want to hear any of this. You just want your data back, right?

One way to accomplish this, is to keep RF12demo just as it is, and perform the proper transformation on the receiving PC. Given variables “a” = 57, and “b” = 48, you can get the int value back with this calculation:

Screen Shot 2010 12 06 at 23.11.04

Sure enough, 57 + 48 * 256 is… 12345 – hurray!

It’s obviously not hard to implement such a transformation in PHP, Python, C#, Delphi, VBasic, Java, Tcl… whatever your language of choice is.

But there’s more to it, alas (hints: negative values, floating point, structs, bitfields).

Stay tuned… more options to deal with these representation details tomorrow!

Update – Silly mistake: the “rf12_sendData()” call doesn’t exist – it should be “rf12_sendStart()”.

  1. A two byte value can be computed more efficiently with bitshifting: result = (((int) a) << 8) | ((int) b); // combine both a and b Or the other way around: a = (char) ((result & 0xFF00) >> 8 ); // take the most significant byte and move it to the right b = (char) result & 0x00FF; // take the least significant byte, it’s already in the right position

    Using bitshifts you can also transfor for example floats; as long as both compilers use the same floating point implementation.

    • It’s a good point to make, but my hunch is that in many cases GCC will generate the same code for both. I haven’t actually checked this case, though.

Comments are closed.