The Graphics Board is going to enable a bunch of fun projects around here.
Unfortunately, the ST7565 library is a bit slow in one specific way – display updates. Since everything is buffered in RAM, most other operations are actually quite fast, but getting that data onto the GLCD takes time.
I had to find out how much time, of course. Here’s my test sketch:
All the loop does is send the same RAM buffer contents to the display, over and over again. The time it takes in microseconds is sent to the serial port, and the result is quite bad, actually:
- 126 milliseconds, i.e. 8 refreshes per second, max!
The good news is that it’s a clear bottleneck, so chances are that it can be found and then hopefully also avoided. Sleuthing time!
The ST7565 core primitives, responsible for getting the data out to the display, were coded as follows:
Guess what: the shiftOut() in the Arduino library is written with calls to digitalWrite(). That rings a bell.
Ah, yes, good OLD digitalWrite() again, eh? Of course …
So I rewrote this code using fixed pin numbers and direct port access:
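As a rough sketch of the technique (an illustration, not the actual library code), bit-banging one byte with fixed pins and direct port writes could look like this. Assumptions: `fakePort` is a plain variable standing in for an AVR register such as PORTD so that the example also runs on a desktop, the bit positions `SID_BIT` and `SCLK_BIT` are hypothetical, and data goes out MSB first with the display sampling on the rising clock edge.

```cpp
#include <cstdint>
#include <vector>

// Stand-in for a memory-mapped AVR output register (assumption: on real
// hardware this would be something like PORTD, not an ordinary variable).
static uint8_t fakePort = 0;

// Hypothetical fixed pin assignments, expressed as bit masks in that port.
constexpr uint8_t SID_BIT  = 1 << 1;  // serial data line
constexpr uint8_t SCLK_BIT = 1 << 4;  // serial clock line

// Demonstration only: records the data bit present at each rising clock edge.
static std::vector<int> sampledBits;

static inline void clockPulse() {
    fakePort |= SCLK_BIT;                                  // rising edge
    sampledBits.push_back((fakePort & SID_BIT) ? 1 : 0);   // what the display would see
    fakePort &= static_cast<uint8_t>(~SCLK_BIT);           // falling edge
}

// Shift one byte out, MSB first, touching only the two fixed bits.
void spiWriteFast(uint8_t value) {
    for (uint8_t mask = 0x80; mask != 0; mask >>= 1) {
        if (value & mask)
            fakePort |= SID_BIT;
        else
            fakePort &= static_cast<uint8_t>(~SID_BIT);
        clockPulse();
    }
}
```

Because the masks are compile-time constants, each set or clear on a real AVR can compile down to a single instruction, whereas digitalWrite() has to look up the port and bit for its pin number on every call.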
And sure enough, it’s almost exactly TEN times faster:
- 12.3 milliseconds, i.e. 80 refreshes per second.
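To put those two numbers in perspective, here is a small back-of-the-envelope calculation. It assumes a full refresh sends a 1 KB buffer (1024 bytes, 8192 bits) and ignores the few command bytes sent along the way, so the per-bit figures are only approximate.

```cpp
// Assumption: one refresh transfers the whole 1024-byte display buffer.
constexpr double kBufferBits = 1024.0 * 8.0;

// Refreshes per second for a given full-frame transfer time in milliseconds.
double refreshesPerSecond(double frameMs) {
    return 1000.0 / frameMs;
}

// Average time spent per transmitted bit, in microseconds.
double microsecondsPerBit(double frameMs) {
    return frameMs * 1000.0 / kBufferBits;
}
```

With the measurements above, 126 ms works out to roughly 15 µs per bit, and 12.3 ms to roughly 1.5 µs per bit.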
Needless to say, I’m going to leave these changes in my copy of the ST7565 library – even though that means it’s no longer general purpose since the pin assignments are now hard-coded. A 10-fold performance increase of something which really needs to be snappy and responsive is non-negotiable for me.
Here is a copy of the ST7565 code I use.
Could this bottleneck have been avoided?
The ST7565 library was clearly written as a general-purpose library, so making it usable with any set of pin assignments makes a lot of sense. The price paid in this case was a 10-fold drop in performance, plus a few extra bytes of RAM used in the ST7565 class.
I’ll revisit this topic some other time, to discuss the trade-offs and implications of compile-time vs run-time logic, as well as tricks such as: putting the pin choices in a header file, using #include for source files, pre-processing and code generation, and C++ templates.
For now, I’m just happy with the 80 Hz refresh rate :)
Update – I had to slow down the SPI clock a bit, because the display was not 100% reliable with the code shown above. The fix is in the ZIP file.
Woooo, that’s a serious improvement!
Fast enough to simulate grey scales. Next step, mpeg4 playback ;o)
Except that the LCD itself is pretty slow. Don’t expect too much from any kind of animation…
The main reason I did this was to reduce the processing load on the ATmega.
Will have to make this change in my copy of the ST7565 library as well. I have a DOGM GLCD on a breadboard right now that uses the ST7565 controller, but the page order and the column numbers are different from those of the GLCD that both you and Adafruit use. So I had to make those changes in the library to get things to display correctly.
You can get some of the flexibility of variable pins back by storing pointers to the port registers (DDRx, PINx and PORTx) and the mask or bit offset inside the class for each of the pins you need. This will cost several cycles as you load the pointer, dereference it and then run the set/clear instructions, but it will still be faster than Arduino’s digitalWrite(). See the avr-libc FAQ entry on passing ports as parameters to functions[0] for a short example that should help you out.
[0] http://avr-libc.nongnu.org/user-manual/FAQ.html#faq_port_pass
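A minimal sketch of that idea, assuming a small helper class (here called `FastPin`, a made-up name) that holds a pointer to the output register and a precomputed bit mask. In the example a plain variable stands in for the register; on an AVR you would pass the address of a real one, e.g. `&PORTD`.

```cpp
#include <cstdint>

// Hypothetical helper: a pin chosen at run time, but resolved only once.
class FastPin {
public:
    // 'port' points at the output register, 'bit' is the bit number (0..7).
    FastPin(volatile uint8_t* port, uint8_t bit)
        : port_(port), mask_(static_cast<uint8_t>(1u << bit)) {}

    // Read-modify-write through the stored pointer to set the bit.
    void high() const { *port_ = static_cast<uint8_t>(*port_ | mask_); }

    // Same, but clearing the bit.
    void low() const { *port_ = static_cast<uint8_t>(*port_ & ~mask_); }

    bool isHigh() const { return (*port_ & mask_) != 0; }

private:
    volatile uint8_t* port_;  // which register to touch
    uint8_t mask_;            // which bit within it
};
```

The pin lookup happens once in the constructor, so each `high()` or `low()` call is just a pointer load, a dereference and a bit operation. That is slower than a hard-coded pin, but it keeps the pin assignment configurable.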
Good point. Thanks for the pointer and examples on that page.
In case it’s of use to you – I think there is a (tiny) error on that page:
Shouldn’t that be?
The white space is significant here, I think …
I have a question: what is the extra RAM in the µC for? The GLCD has its own RAM, so isn’t it faster to write directly to the GLCD RAM instead of constantly copying a shadow RAM to the GLCD?
You can only write to this GLCD, so without access to the contents, you couldn’t change individual pixels and do things like draw lines or circles. Keeping a 1 KB buffer in RAM is actually an advantage when you need to do lots of read-modify-write operations.
Display update could be optimized further by not sending all of RAM each time, but that requires a bit more logic. One idea would be to track the smallest enclosing rectangle that needs to be refreshed, for example.
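Here is one way the bookkeeping for such a rectangle might look. This is only a sketch: `DirtyRect` is a made-up name, the display is assumed to be 128×64 with 8-pixel-high pages, and the actual transfer of the affected bytes is left out.

```cpp
#include <algorithm>

// Tracks the smallest rectangle enclosing all pixels changed since the
// last refresh. Starts out "empty" (x1 < x0).
struct DirtyRect {
    int x0 = 128, y0 = 64, x1 = -1, y1 = -1;

    bool empty() const { return x1 < x0; }

    // Grow the rectangle to include pixel (x, y).
    void mark(int x, int y) {
        x0 = std::min(x0, x);
        y0 = std::min(y0, y);
        x1 = std::max(x1, x);
        y1 = std::max(y1, y);
    }

    // Forget everything, e.g. right after a refresh.
    void reset() { *this = DirtyRect(); }

    // Bytes a refresh would have to send: the display is written in
    // whole bytes of 8 vertical pixels, so rows are rounded to pages.
    int bytesToSend() const {
        if (empty()) return 0;
        int columns = x1 - x0 + 1;
        int pages = y1 / 8 - y0 / 8 + 1;
        return columns * pages;
    }
};
```

Changing only pixels (10,3) and (20,17), for example, gives 11 columns across 3 pages, i.e. 33 bytes instead of the full 1024.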
When you write to the display you have to write an entire byte, 8 pixels. I believe the bytes are arranged vertically on this display. So if you wanted to set all 8 pixels (which were in the same byte) to a specific pattern, then yes, you could write them directly to the LCD without any need to read the current contents.
The problem occurs if you are not writing all 8 pixels in a byte, but maybe 6 from one byte, and 2 from the following byte below, or maybe not even 8. Maybe you just want to set 2. If you just write the bytes as they are, any other pixels set on the bytes you write to will be turned off.
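The read-modify-write that the RAM buffer makes possible can be sketched as follows. Assumptions: a 128×64 display, one byte per column within each 8-pixel-high page, and bit 0 as the topmost pixel of the byte (the real library may well order the bits the other way round).

```cpp
#include <cstdint>

constexpr int LCD_WIDTH  = 128;
constexpr int LCD_HEIGHT = 64;

// Shadow copy of the display contents: 128 * 64 / 8 = 1024 bytes.
static uint8_t frameBuffer[LCD_WIDTH * LCD_HEIGHT / 8];

// Change a single pixel without disturbing the other 7 in the same byte.
void setPixel(int x, int y, bool on) {
    if (x < 0 || x >= LCD_WIDTH || y < 0 || y >= LCD_HEIGHT)
        return;                                   // ignore off-screen pixels

    uint8_t& cell = frameBuffer[x + (y / 8) * LCD_WIDTH];  // byte holding the pixel
    uint8_t bit = static_cast<uint8_t>(1u << (y % 8));     // assumed bit order

    if (on)
        cell = static_cast<uint8_t>(cell | bit);   // read, set one bit, write back
    else
        cell = static_cast<uint8_t>(cell & ~bit);  // read, clear one bit, write back
}
```

Without the buffer, the "read" half of this operation would have nothing to read from, since the display's own RAM cannot be read back over the serial interface.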
Now here’s where some shortcuts and speed increases might be available. So far JC has shown us text information being displayed. He’s using a font based on the 6×8 layout (ah-ha, 8 pixels high), and so far he’s been placing characters at positions where Y%8=0. In that case we don’t care what is already in the byte: each character uses all 8 vertical pixels in each of its 6 columns and is aligned to the byte boundaries of the display. So in this situation, using the display as an 8-row text device, you could bypass the RAM buffer and write directly.
I used to do a lot of direct display programming back in the days before Windows, so as soon as my panel arrives I will be digging into the library like a man possessed (and discovering how bad my memory has become!).
Awesome, indeed! I have a KS0108-based GLCD with a parallel 8-bit interface. In theory this should be more efficient. In practice, anything but! No way to reach 20 fps, even with the new Arduino GLCD library (beta) http://www.arduino.cc/cgi-bin/yabb2/YaBB.pl?num=1279128237/0 which is speedy when compared to the old ks0108 lib…
And yours can do 80 fps!