Computing stuff tied to the physical world

Avoiding memory use

In Hardware on May 25, 2011 at 00:01

On an ATmega328, memory is a scarce resource, as I’ve mentioned recently. Flash memory is usually OK, I’ve yet to run into the 30..32 Kb limit on code. But the crunch comes in all other types of memory – especially RAM.

This becomes apparent with the Graphics Board, which needs a 1 Kb buffer for its display, and with the EtherCard, which can often not even fit a full 1.5 Kb Ethernet packet in RAM.

The Graphics Board limitation is not too painful, because there’s the “JeePU“, which can off-load the graphics display to another JeeNode via a wireless connection. Something similar could probably be done based on I2C or a serial interface.

But the EtherCard limitation is awkward, because this essentially prevents us from building more meaningful web interfaces, and richer web server functionality, for example.

The irony is that there’s plenty of unused RAM memory, just around the corner in this case: the ENC28J60 Ethernet controller chip has 8 Kb RAM, of which some 3.5 Kb could be used for further Ethernet packet buffers… if only the code were written differently!

In fact, we have to wonder why we need any RAM at all, given that the controller has so much of it.

The problem with the EtherCard library, is that it copies an entire received frame to RAM before use, and that it has to build up an entire frame in RAM to send it out.

I’d like to improve on that, but the question is how.

A first improvement is already in the EtherCard library: using strings in flash memory. There’s also basic string expansion, which you can see in action in this code, taken literally from the etherNode.pde example sketch:

Screen shot 2011 05 24 at 22 57 26

The $D’s get expanded to integer values, supplied as additional arguments to buf.emit_p(). This simplifies generating web pages with values (and strings) inserted, but it doesn’t address the issue that the entire web page is still being constructed in a RAM buffer.

Wouldn’t it be nice if we could do better than that, especially since the packet needs to end up inside the ENC28J60 controller anyway?

Two possibilities: 1) generate directly to the ENC28J60’s RAM, or 2) generate a description of the result, so that the real output data can be produced when needed.

For now, this is all just a mental exercise. It looks like option #1 could be implemented fairly easily. The benefit would be that only the MAC + IP header needs to stay in RAM, and that the payload would go directly into the controller chip. A huge RAM saving!

But that’s only a partial solution. The problem is that it assumes an entire TCP/IP response would fit in RAM. For simple cases, that would indeed be enough – but what if we want to send out a multi-packet file from some other storage, such as an external Memory Plug or an SD card?

The current EtherCard library definitely isn’t up to this task, but I’d like to think that one day it will be.

So option #2 might be a better way: instead of preparing a buffer with the data that needs to be sent, we prepare a set of instructions which describe what is needed to generate the buffer. This way, we could generate a huge “buffer” – larger than available RAM – and then produce individual packets as needed, i.e. as the TCP/IP session advances, packet by packet.

This way, we could have some large “file” in a Memory Plug, and its contents would be copied to the ENC28J60’s RAM on-demand, instead of all in advance.

It seems like a lot of contortions to get something more powerful going for the EtherCard, but this approach is in fact a very common one, called Zero Copy. With “big” computers, the copying is done with DMA and special-purpose hardware, but the principle is the same: don’t copy bytes around more than strictly needed. In our case, that means copying bytes from EEPROM to ENC28J60, without requiring large intermediate buffers.

Hm, maybe it’s time to create a “ZeroCopy” library, sort of a “software based generic DMA” between various types of memory… maybe even to/from external I/O devices such as a serial port or I2C device?

It could perhaps be modeled a bit like Tcl’s “channels” and “fcopy” command. Not sure yet… we’ll see.

  1. This is huge! I like the thorough thinking which you put into this.

    I like the idea of the ZeroCopy library, nobody did something like this?

    Would such a library also be useful in the opposite way, like receiving a stream of tcp/ip packets? At least it would be possible to parse such a stream for a specific symbol … Or do I get there something wrong?

  2. Yes, could work for the receiving end too. You have to decide what to do with incoming data. Searches, and copying to EEPROM or such would be a possibility – random access to various parts of a large incoming request would be harder. For web requests, we could create a bit of utility code to pick up URL arguments from a GET or POST request, and put the values in RAM. Scanning for one a specified set of strings would also be possible. Basically, anything stream-like could probably be made to work.

Comments are closed.