LCDs need a lot of bandwidth
Mar 2016

So far, we have created two display implementations for Mecrisp Forth: a 128x64 OLED display, connected via (overclocked) I2C, and a 320x240 colour LCD, connected via hardware SPI clocked at 9 MHz. While quite usable, these displays are not terribly snappy.

Fortunately, there are much faster options available, even on low-end STM32F103 chips. They are based on STM’s Flexible Static Memory Controller (FSMC), a hardware peripheral which can map various types of external memory into the ARM’s address space. This requires a lot of pins, because such interfaces to external memory are either 8 or 16 bits wide, plus a handful of control lines.

But the results can be quite impressive. To access an LCD controller connected in this way, you can now simply write to specific memory addresses in code.

Let’s try it out, using the Hy-MiniSTM32V board from Haoyu. It has an STM32F103VC µC on board, i.e. a 100-pin package with 80 GPIO pins, 256K flash, and 48K RAM. Still not enough to keep a complete display copy in RAM (320x240 pixels at 16 bits each would need 150 KB), but as you’ll see, this no longer matters. The implementation is available on GitHub.

The code is just under 100 lines, a bit lengthy for inclusion in this article. Some of the highlights:

: tft-pins ( -- )
  8 bit RCC-AHBENR bis!  \ enable FSMC clock

  OMODE-AF-PP OMODE-FAST +  \ pin mode for all FSMC pins: alternate function, push-pull, fast
  dup PE7  io-mode!  dup PE8  io-mode!  dup PE9  io-mode!  dup PE10 io-mode!
  dup PE11 io-mode!  dup PE12 io-mode!  dup PE13 io-mode!  dup PE14 io-mode!
  dup PE15 io-mode!  dup PD0  io-mode!  dup PD1  io-mode!  dup PD4  io-mode!
  dup PD5  io-mode!  dup PD7  io-mode!  dup PD8  io-mode!  dup PD9  io-mode!
  dup PD10 io-mode!  dup PD11 io-mode!  dup PD14 io-mode!  dup PD15 io-mode!
  drop ;

As mentioned, we need to set up a lot of GPIO pins for this, and of course they have to match the actual connections on this particular board.

Next, we need to set up three registers in the FSMC hardware (that last write enables the FSMC):

: tft-fsmc ( -- )
  [...] FSMC-BCR1 !
  [...] FSMC-BTR1 !
  [...] FSMC-BWTR1 !
  1 FSMC-BCR1 bis! ;

For full details, see GitHub and the 1,100-page STM32F103 Reference Manual (RM0008).
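The register names used in these snippets are not defined in the excerpts shown here. As a rough sketch, and assuming the addresses listed in RM0008, they could be set up along these lines (the actual source on GitHub may name or organise them differently, and RCC may already be defined by the underlying library):

```forth
\ Sketch only: addresses as listed in RM0008, names chosen to match the snippets above.
$40021000 constant RCC
RCC $14 + constant RCC-AHBENR   \ AHB peripheral clock enable, bit 8 = FSMCEN

$A0000000 constant FSMC-BCR1    \ bank 1 chip-select control, bit 0 = MBKEN (bank enable)
$A0000004 constant FSMC-BTR1    \ bank 1 timing (read timing when extended mode is on)
$A0000104 constant FSMC-BWTR1   \ bank 1 write timing, only used in extended mode
```

The elided values in tft-fsmc depend on the bus width and on the timing the LCD controller needs, so they are best taken from the real source rather than guessed.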

So much for the FSMC. We also need to initialise this particular “R61505U” LCD controller on our board, which requires sending it just the right magic mix of config settings on startup:

hex  \ the table below is written in hexadecimal
create tft:R61505U
    E5 h, 8000 h,  00 h, 0001 h,  2B h, 0010 h,  01 h, 0100 h,  [...]
decimal align

: tft-init ( -- )
  tft-pins tft-fsmc
  tft:R61505U begin
    dup h@ dup $200 < while  ( addr reg )
    over 2+ h@ swap  ( addr val reg )
    dup $100 = if drop ms else tft! then
  4 + repeat 2drop ;
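The table format may not be obvious at first glance: it is a list of 16-bit register/value pairs, where "register" $100 is a pseudo-entry meaning "wait this many milliseconds", and any register number of $200 or above ends the list. Here is a small sketch which walks a table the same way, but prints each step instead of sending it to the LCD. The table contents and the names demo:table and show-table are made up for illustration, this is not a real init sequence:

```forth
\ Hypothetical table in the same reg/value layout that tft-init expects.
hex create demo:table
     07 h, 0133 h,   \ ordinary entry: write $0133 to register $07
    100 h, 0032 h,   \ pseudo-register $100: pause for $32 = 50 ms
    200 h,           \ any "register" >= $200 ends the table
decimal align

: show-table ( addr -- )  \ same walk as tft-init, printing instead of writing
  begin
    dup h@ dup $200 < while      ( addr reg )
    over 2+ h@ swap              ( addr val reg )
    dup $100 = if
      drop ." wait " . ." ms"    \ delay entry: value is a time in ms
    else
      ." reg " hex. ." = " hex.  \ normal entry: show register, then value
    then cr
  4 + repeat 2drop ;

demo:table show-table
```

Keeping delays inside the table means the whole power-up sequence, including its timing, lives in one place and the loop itself never needs to change.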

And that’s about it. But here is the interesting bit with respect to the FSMC:

: tft! ( val reg -- )  LCD-REG h! LCD-RAM h! ;

That little definition is our sole interface to the LCD, and it just writes two values to two different memory addresses, now mapped by the FSMC.
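The two addresses are not shown above. A plausible sketch, assuming the LCD sits on FSMC bank 1 (chip select NE1, which starts at $60000000) and that its register-select line is wired to FSMC_A16 (PD11, one of the pins set up in tft-pins):

```forth
\ Assumption: LCD on FSMC bank 1 (NE1), register-select line on FSMC_A16 (PD11).
$60000000 constant LCD-REG   \ A16 low: selects the controller's index register
$60020000 constant LCD-RAM   \ A16 high: data; with a 16-bit bus, A16 is CPU address bit 17

\ Hypothetical read counterpart of tft!, not part of the code shown in this article.
: tft@ ( reg -- val )  LCD-REG h! LCD-RAM h@ ;
```

The offset is $20000 rather than $10000 because in 16-bit mode the FSMC shifts the CPU address right by one bit before driving the external address lines. A read word like tft@ would only work if the FSMC read timing is slow enough for the controller, so treat it as something to verify against the datasheet.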

This same approach can probably be used with a huge variety of LCD displays out there, as long as they are connected via a parallel bus and the µC has support for FSMC. You “just” need to connect the LCD properly, set up all the GPIO pins and the FSMC to match (including proper read/write timing), and initialise the LCD controller with its matching power-up sequence.

The rest is mostly boilerplate to provide the three definitions needed by the display-independent graphics.fs library from Mecrisp:

$0000 variable tft-bg
$FFFF variable tft-fg

: clear ( -- )
  0 $20 tft!  0 $21 tft!  $22 LCD-REG h!
  tft-bg @  320 240 * 0 do dup LCD-RAM h! loop  drop ;

: putpixel ( x y -- )  \ set a pixel in display memory
  $21 tft! $20 tft! tft-fg @ $22 tft! ;

: display ( -- ) ;  \ update tft from display memory (ignored)
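To get a feel for how these words are used on their own, here is a small sketch. It assumes the panel has been configured for 16-bit RGB565 colours (5 bits red, 6 green, 5 blue), which is common but depends on the init sequence. The words rgb565 and hline are hypothetical helpers, not part of graphics.fs:

```forth
\ Assumption: 16-bit RGB565 pixel format (red in the top 5 bits, blue in the bottom 5).
: rgb565 ( r g b -- colour )  \ pack three 0..255 components into one 16-bit value
  3 rshift swap              ( r b5 g )
  2 rshift 5 lshift or swap  ( gb r )
  3 rshift 11 lshift or ;

: hline ( x y len -- )  \ hypothetical helper: horizontal run of pixels via putpixel
  0 ?do 2dup putpixel swap 1+ swap loop 2drop ;

0 0 64 rgb565 tft-bg !  clear   \ dark blue background
255 255 0 rgb565 tft-fg !       \ yellow foreground
20 100 200 hline                \ 200 pixels starting at x=20, y=100
```

Since every putpixel ends up as three tft! calls, i.e. six plain memory writes, even a naive helper like this runs quickly.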

And here’s the result of running all this code with the Mecrisp graphics demo:

(with apologies for the low image quality of this snapshot)

So now we’re back to displaying stuff on the screen, just like the previous two display implementations. But with the above FSMC-based code, a clear screen takes just 30 ms!

As you can see, the “clear” word above simply brute-forces its way through, by setting each screen pixel in a big loop. That’s 320 x 240 = 76,800 16-bit writes in 30 ms, or about 2,560 writes per millisecond, i.e. a cycle time of roughly 390 ns.
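For anyone wanting to repeat the measurement, something along these lines should do. This is a sketch which assumes a millis word returning the elapsed time in milliseconds, as provided by the JeeLabs library code; clear-ms, ns/write, and clear-report are names invented for this example:

```forth
\ Assumption: millis ( -- u ) returns a free-running millisecond counter.
: clear-ms ( -- ms )  \ time one full-screen clear
  millis clear millis swap - ;

: ns/write ( ms -- ns )  \ average time per 16-bit write over 320x240 pixels
  1000000 * 320 240 * / ;

: clear-report ( -- )
  clear-ms dup . ." ms, " ns/write . ." ns per write" cr ;

clear-report
```

The result includes the loop overhead of the Forth code itself, so it is an upper bound on the time the FSMC needs per bus cycle.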

Which goes to show that performance is the result of optimising (only) the right things!

Weblog © Jean-Claude Wippler.