Computing stuff tied to the physical world

Pointer addressing

Now we get to pointers. Ah, those finicky pointers, the curse of C and C++!

There are two sides to the story in C/C++: pointer semantics and pointer syntax.

First the semantics of pointers. Recall that memory is a bunch of bytes:

[Figure: memory as a row of 10 bytes]

The leftmost one is at “address 0” (0x0000), the rightmost one is at “address 9” (0x0009). Suppose you need to store two variables x and y in memory, and an array of 3 bytes, z. Furthermore, suppose variable x is at address 0x0003, y at address 0x0007, and z is mapped to the three bytes at addresses 0x0004..0x0006. Our total memory use is 5 bytes:

[Figure: the same memory, with x, y, and z mapped onto it]

In C, we can use x, y, and z[0], z[1], z[2] as symbolic notation for those memory slots:

byte x, y, z[3]; // this declares two variables and an array

No magic involved. The array is indexable, so we could have a variable “i” and loop over it:

for (int i = 0; i < 3; i++)
    printf("z[%d] is %d\n", i, z[i]);

When writing code, we could pass a specific value of i around to refer to a specific element of z, without passing around z itself. I.e. we pass around the element index, not its value.

This is very important, because now we can write code to change something in z, such as:

void zDouble (int pos) {
    z[pos] = 2 * z[pos];
}

And then call it as:

zDouble(1);

The result is of course that whatever value was in z[1] will be doubled in value when zDouble returns. We have passed around the index as reference to one item of z. The implication is that we can write code which changes some part of z, even though we don’t know or care which part when the code is written. We can also easily double every item:

for (int i = 0; i < 3; i++)
    zDouble(i);

An array is a convenient way to aggregate identical “things”. Pretty trivial stuff.

Pointers

But what if we want to write a routine to double either x or y?

We can’t just pass x in, since the function then receives a local copy, and doubling it changes only that copy:

void wontDouble (int v) {
    v = 2 * v; // whoops, v is a local copy!
}

wontDouble(x); // x will be copied in, but not changed itself
wontDouble(y); // y will be copied in, but not changed itself

We need a way to refer to x or y, not just fetch its value. Here is how:

byte* p; // can also be written as "byte *p" or even "byte*p"

The new variable “p” is declared as a pointer to a byte. That pointer too will have some address in memory – on a 32-bit ARM Cortex, a pointer takes 4 bytes of memory (let’s assume our total memory is larger than those 10 bytes shown above, else we’ll have a problem).

Now we can make p point to x:

p = &x; // set p to the address of x

Looking inside p, we would see the value 0x0003, since that’s where x is in memory. Note that p is not the value. We can’t double p itself. We can only double what it points to:

*p = 2 * *p;

Lots of “*” symbols in there: a dereference, a multiplication, and another dereference. “Dereferencing” means that you take the pointer (0x0003 in this case) and look at what it points to. This is denoted by “*p”, and can be used wherever a variable is allowed. Note in particular that “*p” can be used on both sides of the “=” assignment operator.

To interpret the above C statement, we’ll need to read it from right to left:

  1. fetch the value to which p points, i.e. the contents of address 0x0003
  2. multiply by 2
  3. store the result back into where p points, i.e. again address 0x0003
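The three steps above can be spelled out one statement at a time. This is only a sketch: it assumes “byte” is an alias for unsigned char (plain C has no built-in byte type), and the helper name doubleViaPointer is made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

typedef unsigned char byte; // assumption: "byte" is an 8-bit unsigned type

// Hypothetical helper: doubles a local x through a pointer, one step at a time,
// and returns the new value of x.
byte doubleViaPointer(byte start) {
    byte x = start;
    byte* p = &x;                   // p now holds the address of x

    byte fetched = *p;              // step 1: fetch the value to which p points
    byte result = (byte)(2 * fetched); // step 2: multiply by 2
    *p = result;                    // step 3: store the result back where p points

    assert(x == result);            // x itself changed, because p points at x
    printf("x was %d, is now %d\n", start, x);
    return x;
}
```

Calling doubleViaPointer(21) from main should return 42; the single statement “*p = 2 * *p;” does the same work as the three numbered steps.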

So now we can write the function which doubles an arbitrary byte:

void anyDouble (byte* ptr) {
    *ptr = 2 * *ptr;
}

anyDouble(&x);
anyDouble(&y);

Bingo! We now have a general purpose doubling routine. It even works for z elements:

anyDouble(&z[1]);

Doubling is not very interesting, but the mechanism remains when writing more complex code: by passing pointers, we can refer to data in memory and allow making changes.
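To see the difference between passing a value and passing a pointer side by side, here is a small sketch. The initial values of x, y, and z, the byte typedef, and the checkDoubling helper are assumptions added for illustration.

```c
#include <assert.h>

typedef unsigned char byte; // assumption: "byte" is an 8-bit unsigned type

byte x = 5, y = 7, z[3] = {1, 2, 3}; // sample values, chosen arbitrarily

// Receives a copy of the caller's value; the caller's variable is untouched.
void wontDouble(byte v) {
    v = (byte)(2 * v);
    (void)v; // silence "set but not used" warnings
}

// Receives the address of the caller's variable and changes it in place.
void anyDouble(byte* ptr) {
    *ptr = (byte)(2 * *ptr);
}

// Hypothetical check: returns 1 if every call behaved as described in the text.
int checkDoubling(void) {
    x = 5; y = 7; z[1] = 2;

    wontDouble(x);
    assert(x == 5);     // only the local copy was doubled

    anyDouble(&x);      // x: 5 -> 10
    anyDouble(&y);      // y: 7 -> 14
    anyDouble(&z[1]);   // z[1]: 2 -> 4
    return x == 10 && y == 14 && z[1] == 4;
}
```

Calling checkDoubling() from main should return 1.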

Notations

So far, C’s notation should be reasonably easy to grasp. But it can get complicated, due to various shorthands and equivalent notations. This also doubles all elements of z:

void anyDouble2 (byte* ptr) {
    *ptr *= 2; // shorthand for "*ptr = *ptr * 2"
}

for (int i = 0; i < 3; i++)
    anyDouble2(z+i); // same as "&z[i]"

The reason for this is that for any array, if you omit the index altogether, you get a pointer to the first element of the array, so “z+i” points to element i: “*(z+i)” is 100% equivalent to “z[i]”.

And “*&anyVar” is the same as “anyVar”: taking the address and then dereferencing it is a no-op. That’s the whole idea. Note also that “&*anyVar” is usually meaningless: unless anyVar is itself a pointer, the compiler will flag it as an error.

It’s worth getting really used to this & and * notation, even though it’ll look weird at first.
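The equivalences in this section can be checked mechanically. The following sketch does that with asserts; the byte typedef, the sample values, and the function name checkNotations are assumptions for illustration.

```c
#include <assert.h>

typedef unsigned char byte; // assumption: "byte" is an 8-bit unsigned type

// Hypothetical helper: asserts each equivalence and returns how many were checked.
int checkNotations(void) {
    byte z[3] = {10, 20, 30};
    byte anyVar = 42;
    int checks = 0;

    for (int i = 0; i < 3; i++) {
        assert(z + i == &z[i]);     // same address
        checks++;
        assert(*(z + i) == z[i]);   // same value
        checks++;
    }

    assert(*&anyVar == anyVar);     // address-of, then dereference: a no-op
    checks++;

    byte* p = &anyVar;
    assert(&*p == p);               // "&*" is fine when applied to a pointer
    checks++;

    return checks;
}
```

If any of the notations were not equivalent, one of the asserts would abort the program.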

Null pointers

Pointers are dangerous in C/C++, because errors can have serious consequences. We’re passing a reference to some place in memory around, and if that’s the wrong place, and we were to double what’s stored there, we can completely derail our program. Since on ARM 99.99% of the addresses are reserved, many incorrect pointer uses will crash (and trigger a “fault” exception). More common (and nasty, alas) are off-by-one programming mistakes.

A pointer referring to address 0x0000 is a special case: this is often used in code to signal that the pointer is not to be used. That’s why you’ll see code such as:

if (p != 0) { // or even "if (p) ..."
    ... *p ...
}

Passing zero as value for a pointer is then a way to say: skip doing things during this call.

On most µC’s there will be flash memory at address 0x0000. Fetching it is harmless, though probably also meaningless, but storing something in a null pointer will crash.
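A function that accepts “no pointer” as a valid argument might look like the sketch below. The name doubleIfSet and its return convention are made up for illustration, and byte is again assumed to be an unsigned char.

```c
#include <stddef.h>

typedef unsigned char byte; // assumption: "byte" is an 8-bit unsigned type

// Hypothetical helper: doubles *ptr when ptr is usable.
// Returns 1 if the value was doubled, 0 if the call was skipped.
int doubleIfSet(byte* ptr) {
    if (ptr == NULL)        // same test as "ptr == 0" or "!ptr"
        return 0;           // null pointer: skip doing things during this call
    *ptr = (byte)(2 * *ptr);
    return 1;
}
```

Passing 0 (or NULL) makes the call a no-op, while passing “&x” doubles x as before.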

Pointers to ints and structs

Everything so far used individual bytes, but the same works for integers:

int map[10];
int* ip;

ip = map;
for (int i = 0; i < 10; i++) {
     printf("map[%d] = %d\n", i, *ip);
     ip += 1; // advances the pointer by one int size!
}

When adding 1 to ip, the C/C++ compiler will actually add whatever the size of an int is. With 4-byte ints, if ip points to address 0x0100, repeatedly adding 1 to ip will step it through 0x0104, 0x0108, etc!

Pointer arithmetic depends on the width of the things the pointer points at.
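The scaling can be observed directly by measuring the distance in bytes between a pointer and that same pointer plus one. This is a sketch; the function names intStep and sumInts are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper: number of bytes between ip and ip + 1.
size_t intStep(void) {
    int map[10];
    int* ip = map;
    // Cast to char* so the subtraction counts bytes instead of ints.
    return (size_t)((char*)(ip + 1) - (char*)ip);
}

// Hypothetical helper: sums "count" ints by stepping a pointer, as in the loop above.
int sumInts(const int* first, int count) {
    assert(first != NULL);
    int total = 0;
    const int* ip = first;
    for (int i = 0; i < count; i++) {
        total += *ip;
        ip += 1;            // advances by sizeof(int) bytes, not by 1 byte
    }
    return total;
}
```

On any platform, intStep() equals sizeof(int), which is 4 on a typical 32-bit ARM.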

The result is that a loop like the one over map looks the same and is written in the same style regardless of the type of the data elements. Recalling the previous example, this also works on structs:

MyStruct data[50];
MyStruct* dp;

dp = data;
for (int i = 0; i < 50; i++) {
     printf("data[%d].a = %d, data[%d].b = %d\n", i, dp->a, i, dp->b);
     dp += 1; // advances the pointer by one MyStruct size!
}

One more extremely common shorthand in C/C++: “dp->a” is equivalent to “(*dp).a”.

Pointers are a convenient way to pass a reference to an entire data structure into functions, with the function then internally using “dp->a”, “dp->b”, etc. to fetch or set one of the fields of that data structure. This lets us write code to manipulate more abstract concepts.
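As a sketch of passing a struct by pointer, the code below uses a stand-in definition of MyStruct with two int fields, a and b. The real definition is not shown in this article, so the layout here is an assumption, as are the helper names swapFields and sumA.

```c
#include <assert.h>
#include <stddef.h>

// Assumption: a stand-in for MyStruct with two int fields.
typedef struct {
    int a;
    int b;
} MyStruct;

// Hypothetical helper: swaps the two fields of whatever dp points at.
void swapFields(MyStruct* dp) {
    assert(dp != NULL);
    int tmp = dp->a;    // fetch a field through the pointer
    dp->a = dp->b;      // set a field through the pointer
    (*dp).b = tmp;      // long-hand form of "dp->b = tmp"
}

// Hypothetical helper: sums field a over an array by stepping the pointer.
int sumA(const MyStruct* dp, int count) {
    int total = 0;
    for (int i = 0; i < count; i++) {
        total += dp->a;
        dp += 1;        // advances the pointer by one MyStruct size
    }
    return total;
}
```

The caller hands over “&someStruct” or an array name, and the function changes or reads the original data rather than a copy.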

More shorthand

Here is a particularly terse way of copying a C string from one buffer to another:

char from[10] = "hello", to[10]; // "hello" is just sample content to copy
char* p = from;
char* q = to;
while (*q++ = *p++)
    ;

Let’s decipher that (and hope we rarely come across such a cryptic implementation):

  • at the start of the while, p points to from[0] and q points to to[0]
  • the condition in the while is an assignment expression (“=”, not “==”!)
  • we dereference p (i.e. from[0]), and store it in the dereferenced q (i.e. to[0])
  • the “++” at the end means “increment after use”, also called post-increment
  • so we increment both p and q (now pointing to from[1] and to[1], respectively)
  • the assignment is done, its value is whatever got assigned
  • in C, strings end with a zero byte
  • zero is false when used in an if or while, everything else is true
  • so if we just copied a zero byte, exit the while loop
  • else rinse and repeat, with p and q now both advanced by one position
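The steps above can be unrolled into a long-hand version that does one thing per statement. This is a sketch of the same loop, not a replacement for the standard strcpy; the function name copyString is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical long-hand equivalent of "while (*q++ = *p++) ;".
// Like the terse version, it does not check that "to" is large enough.
void copyString(char* to, const char* from) {
    assert(to != NULL && from != NULL);
    const char* p = from;   // p points to from[0]
    char* q = to;           // q points to to[0]

    for (;;) {
        char c = *p;        // dereference p
        *q = c;             // store it in the dereferenced q
        p++;                // increment after use ...
        q++;                // ... for both pointers
        if (c == 0)         // we just copied the terminating zero byte
            break;          // so exit the loop
    }
}
```

After copyString(to, "hello"), the buffer “to” holds the five letters followed by the zero byte.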

With pointers, you have to slow down and take the time to understand what’s going on. But once you become familiar with their notation and all the C/C++ shorthands, it really is an extremely powerful concept. It’s also very close to what the machine code is doing.

Conclusion: pointers are tricky but not magic. And certainly not worth being afraid of!
