Composite video from ARM
Oct 26, 2016

Some weblog posts tend to take me further away from the usual topics. No worries… it’s just that I want to explore some “outer edges” to better understand complexity & speed trade-offs between µCs and FPGAs.

So… let’s generate a video signal with a µC!

This is the HyTiny-based STM32F103 RF Node Watcher I used before, re-purposed here to generate a composite video signal.

Black-and-white composite video is very easy to generate – it needs just two output pins and two resistors. There’s an amazing TVout project for Arduino that shows how.
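To see why two pins and two resistors are enough, here is a rough Python sketch of the three voltage levels such a resistor network produces. The resistor values (1 kΩ on the sync pin, 470 Ω on the video pin), the 75 Ω load and the 3.3 V supply are assumptions for illustration, not values taken from this build.

```python
# Assumed values, for illustration only (not taken from this build):
VCC = 3.3          # logic-high voltage of the µC pins
R_SYNC = 1000.0    # resistor on the sync pin, in ohms
R_VIDEO = 470.0    # resistor on the video pin, in ohms
R_LOAD = 75.0      # input impedance of the composite video input


def level(sync_high, video_high):
    """Voltage at the shared output node for a given pin state.

    A pin that is low still pulls the node towards ground through its
    resistor, so all three conductances appear in the denominator.
    """
    current = ((VCC if sync_high else 0.0) / R_SYNC
               + (VCC if video_high else 0.0) / R_VIDEO)
    conductance = 1 / R_SYNC + 1 / R_VIDEO + 1 / R_LOAD
    return current / conductance


levels = {
    "zero": level(False, False),   # sync tip: both pins low
    "black": level(True, False),   # sync pin high, video pin low
    "white": level(True, True),    # both pins high
}

for name, volts in levels.items():
    print(f"{name:5s} {volts:.2f} V")
```

Three distinct, increasing levels are all a monochrome signal needs: one below black for sync, then black, then white.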

In this case, I want to use ARM and Forth, as a coding exercise and to find out about the performance of this setup. It took a little tinkering to get all the timing loops right:

omode-pp pa2 io-mode!  \ vout
omode-pp pa3 io-mode!  \ sync

: zero  pa3 ioc! pa2 ioc! ;
: black pa3 ios! pa2 ioc! ;
: white pa3 ios! pa2 ios! ;

: zr-us ( u -- ) pa3 ioc! pa2 ioc! us2 ;
: bl-us ( u -- ) pa3 ios! pa2 ioc! us2 ;
: wh-us ( u -- ) pa3 ios! pa2 ios! us2 ;

: pulses ( u -- ) 0 do zero 2 us2 black 29 us2 loop ;
: vsyncs ( u -- ) 0 do zero 29 us2 black 2 us2 loop ;

: bbb  0 do  4 zr-us  6 bl-us  1 bl-us  48 bl-us  1 bl-us  4 bl-us  loop ;
: www  0 do  4 zr-us  6 bl-us  1 wh-us  48 wh-us  1 wh-us  4 bl-us  loop ;
: wbw  0 do  4 zr-us  6 bl-us  1 wh-us  48 bl-us  1 wh-us  4 bl-us  loop ;

: try  \ draw a big rectangle
  begin
      5 vsyncs
      5 pulses
     32 bbb
     16 www
    236 wbw
     16 www
      2 bbb
      4 pulses
  key? until ;


The full code is here, with 4 different tests. The us2 function is a modified microsecond delay which doesn’t use the systick timer, because all interrupts have to be disabled to get a glitch-free signal.

There’s quite some overhead in the current ios! and ioc! routines to set and clear an I/O pin. This limits the attainable horizontal resolution, since each scan line has to be generated in exactly 64 µs.
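The 64 µs budget can be checked against the delays in the Forth words above. This small Python sketch simply adds up the nominal `us2` arguments; it ignores the pin-toggling and loop overhead, so the real timing will differ somewhat.

```python
# Nominal delays in µs, copied from the arguments in the Forth words above.
LINE_SEGMENTS = [4, 6, 1, 48, 1, 4]   # one scan line in bbb / www / wbw
HALF_LINE_US = 2 + 29                 # one iteration of pulses or vsyncs

line_us = sum(LINE_SEGMENTS)

# (word, repeat count, duration per iteration) in the order used by "try"
FIELD = [
    ("vsyncs", 5, HALF_LINE_US),
    ("pulses", 5, HALF_LINE_US),
    ("bbb", 32, line_us),
    ("www", 16, line_us),
    ("wbw", 236, line_us),
    ("www", 16, line_us),
    ("bbb", 2, line_us),
    ("pulses", 4, HALF_LINE_US),
]

field_us = sum(count * duration for _, count, duration in FIELD)
refresh_hz = 1e6 / field_us

print(f"scan line: {line_us} µs")
print(f"one pass of the loop: {field_us} µs, about {refresh_hz:.1f} Hz")
```

The scan-line segments add up to exactly 64 µs, and one pass through the loop comes out close to a 50 Hz refresh.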

Since the signal is being generated with an infinite loop, the µC is fully consumed by this activity. To make this usable would require using interrupts, freeing up cycles during the flyback “blanking” intervals.

But that was never really the point of this exercise – as will become clear next week!

For comments, visit the forum.

Debugging an FPGA via a µC
Oct 19, 2016

When trying out things in Verilog, one of the struggles I have is understanding what my “noob” code is doing. There’s no such thing as a printf here, obviously. There’s simulation, but that only goes so far when tying into the real world, and there are LEDs and 7-segment displays, which are trivial to attach to some internal signals.

But that’s not enough: I want a transcript of a series of events, to see what happened, in the proper context and sequence.

My solution to this is something I’m calling an “SPI peek” device, which can be included in the FPGA and talks to an external µC.

SPI is a brilliant data exchange mechanism. It’s simple and it’ll handle > 20 Mbit/sec:

SPI is basically a circular shift register split into two parts: one half lives in the master, the other half lives in the slave. Three wires carry all the information, and a fourth enable wire (SS) delimits message boundaries.

The SPI peek idea is to have, say, 32 bits on both sides, rapidly sending them across as often as needed. The enable signal (which is active low) is used as follows: on the falling edge, 32 bits are latched into the FPGA’s slave register. On the rising edge, the slave register is latched into an output register.

The result is that we can connect up to 32 internal FPGA signals to the slave’s input, and that we get 32 output signals to tie back into the FPGA to control it.
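The latching scheme described above can be modelled in a few lines of Python. This is only a behavioural sketch: the class and method names are made up for illustration, and MSB-first shifting is an assumption. It is not the actual Verilog of the SPI peek device.

```python
MASK32 = 0xFFFFFFFF


class SpiPeekModel:
    """Hypothetical behavioural model of the slave side, not the real Verilog."""

    def __init__(self):
        self.shift = 0     # the slave's half of the circular shift register
        self.outputs = 0   # output register driving signals inside the FPGA

    def select(self, inputs):
        """Falling edge of SS: capture 32 internal signals."""
        self.shift = inputs & MASK32

    def clock_bit(self, mosi):
        """One SCK cycle: shift out the top bit, shift in one bit (MSB first)."""
        miso = (self.shift >> 31) & 1
        self.shift = ((self.shift << 1) | (mosi & 1)) & MASK32
        return miso

    def deselect(self):
        """Rising edge of SS: copy the received word to the output register."""
        self.outputs = self.shift


def exchange(device, inputs, word):
    """Master side: send 32 bits and collect the 32 bits coming back."""
    device.select(inputs)
    received = 0
    for bit in range(31, -1, -1):
        received = (received << 1) | device.clock_bit((word >> bit) & 1)
    device.deselect()
    return received


fpga = SpiPeekModel()
seen = exchange(fpga, inputs=0xCAFEBABE, word=0x12345678)
print(hex(seen), hex(fpga.outputs))
```

After one exchange the master holds the 32 signals captured at the falling edge, and the FPGA's output register holds the word the master sent.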

As a first test, I’ve “connected” these virtual output pins to the 4-digit 7-segment display on my starter board: 4 pins to digit select, and 8 pins to the individual segments.

The attached µC is a HyTiny STM32F103, running this Forth code:


  %0000000001011100 SPI1-CR1 !  \ clk/16, i.e. 4.5 MHz, master
\ %0000000001001100 SPI1-CR1 !  \ clk/4, i.e. 18 MHz, master (max supported)

: >fpga> ( u -- u )  \ exchange 32 bits with attached FPGA, takes ≈ 7 µs
  +spi
    dup 24 rshift >spi> 24 lshift swap
    dup 16 rshift >spi> 16 lshift swap
    dup  8 rshift >spi>  8 lshift swap
                  >spi>
  -spi  or or or ;

\ verify that data comes back when loopback is set
depth . $12345678 >fpga> hex. depth .
depth . $90ABCDEF >fpga> hex. depth .

: scan1 ( u -- )  not  $FFF and  >fpga> drop  5 ms ;

: scanner
  begin
    $808 scan1
    $404 scan1
    $202 scan1
    $101 scan1
  key? until ;


In other words, the µC is acting as if it had the 7-segment display attached directly to its I/O pins, and does all the multiplexing and segment setup. The FPGA just passes the signals on to the real display hardware.
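To make the multiplexing concrete, this Python sketch reproduces what `scan1` sends for each of the four values in `scanner`. The split into an upper 4-bit digit select and a lower 8-bit segment field is an assumption based on the "4 pins plus 8 pins" description, and the inversion suggests active-low signals.

```python
def scan_word(u):
    """Same bit manipulation as `not $FFF and` in scan1: invert, keep 12 bits."""
    return ~u & 0xFFF


STEP_MS = 5  # delay per digit, from "5 ms" in scan1

for u in (0x808, 0x404, 0x202, 0x101):
    word = scan_word(u)
    digit_select = word >> 8   # assumed: upper 4 bits select the digit
    segments = word & 0xFF     # assumed: lower 8 bits drive the segments
    print(f"${u:03X} -> ${word:03X}  digits={digit_select:04b} segments={segments:08b}")

refresh_hz = 1000 / (4 * STEP_MS)
print(f"full display refresh: {refresh_hz:.0f} Hz")
```

Each value clears exactly one bit in each field after inversion, i.e. one digit and one segment are active at a time, and the four 5 ms steps give a full refresh about 50 times per second.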

Once the proper segments lit up on each of the digits, I knew that this new SPI peek interface was working. Now, it can be used to develop new logic in the FPGA.
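As a side note on the SPI setup in the Forth code above, the `SPI1-CR1` values and the "≈ 7 µs" figure can be cross-checked with a little arithmetic. The sketch below decodes the relevant STM32F1 `SPI_CR1` bits and computes the raw clocking time for 32 bits. The 72 MHz peripheral clock is an assumption, chosen because it matches the "clk/16, i.e. 4.5 MHz" comment.

```python
PCLK_HZ = 72_000_000  # assumption: matches "clk/16, i.e. 4.5 MHz" in the comment


def decode_cr1(cr1):
    """Decode the SPI_CR1 fields used here (STM32F1 bit layout)."""
    baud_bits = (cr1 >> 3) & 0b111           # BR[2:0]: 000 = /2, 001 = /4, ...
    return {
        "enabled": bool(cr1 & (1 << 6)),     # SPE
        "master": bool(cr1 & (1 << 2)),      # MSTR
        "lsb_first": bool(cr1 & (1 << 7)),   # LSBFIRST
        "divider": 2 << baud_bits,
    }


def transfer_us(bits, clock_hz):
    """Raw time to clock out `bits` bits, without any software overhead."""
    return bits / clock_hz * 1e6


for cr1 in (0b0000000001011100, 0b0000000001001100):
    cfg = decode_cr1(cr1)
    sck_hz = PCLK_HZ // cfg["divider"]
    print(f"/{cfg['divider']}: {sck_hz / 1e6} MHz, "
          f"32 bits in {transfer_us(32, sck_hz):.2f} µs")
```

At 4.5 MHz the 32 clock cycles alone take a bit over 7 µs, which is in line with the figure quoted in the comment of `>fpga>`.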

As a next test, I’ll send a 16-bit value and have the FPGA display it as a hex number.

Here is the Forth side, acting as test driver:


[... same as above ...]
: >fpga> ( u -- u )
[... same as above ...]

: counter  0  begin  1+ dup  >fpga> drop  250 ms  key? until  drop ;


It’s nothing but a counter, sending a new 32-bit count to the FPGA every 250 ms.

Now, the FPGA does a bit more: converting each hex nibble to a 7-segment pattern, and rapidly multiplexing each of the digits:

(that’s the display, after 1343 seconds…)
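The two steps the FPGA now performs, hex-to-segment conversion and digit multiplexing, can be sketched in Python as follows. The segment encoding (bit 0 = segment a up to bit 6 = segment g, active high) and the one-hot digit select with the leftmost digit on bit 3 are common conventions assumed here; the Verilog on GitHub may well use a different bit order or polarity.

```python
# Common 7-segment patterns for 0..F, bit 0 = segment a ... bit 6 = segment g.
# Assumed encoding; the actual FPGA code may differ.
SEGMENTS = [
    0x3F, 0x06, 0x5B, 0x4F, 0x66, 0x6D, 0x7D, 0x07,
    0x7F, 0x6F, 0x77, 0x7C, 0x39, 0x5E, 0x79, 0x71,
]


def hex_digits(value):
    """Split the low 16 bits into four nibbles, most significant first."""
    return [(value >> shift) & 0xF for shift in (12, 8, 4, 0)]


def scan(value):
    """One multiplexing pass: (one-hot digit select, segment pattern) per digit."""
    return [(1 << (3 - i), SEGMENTS[nibble])
            for i, nibble in enumerate(hex_digits(value))]


# Example value: 1343 s at a nominal 4 counts per second, ignoring overhead.
count = 1343 * 4
for select, pattern in scan(count):
    print(f"digit {select:04b} -> segments {pattern:07b}")
```

In hardware the four steps of `scan` are repeated continuously, fast enough that all digits appear lit at once.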

For the SPI peek device code, see GitHub. Here is a Verilator (PDF) simulation run.

SPI peek works up to ≈ 1/10th the FPGA’s clock rate (tested w/ 18 Mb/s @ 200 MHz).

What I really like about this setup is that I can prototype a bit of logic in Forth on the µC side, setting up the FPGA to simply pass through all its signals, and then gradually convert parts into Verilog code to make the FPGA take over specific functionality.

Verilog synthesis is a very slow process. With Forth’s interactive peeking, poking, and fast-paced coding cycle, the whole process becomes a lot more… enjoyable!

Update - SpiPeek has been optimised to use 30% fewer LEs and can run 2x as fast.


PDP-8/L & DF32 disk on FPGA
Oct 12, 2016

As the final entry in this PDP-8 saga (for now), I’d like to present an FPGA implementation of a small but very functional setup by Ian Schofield of a PDP-8/L with serial console and a whopping 32-kiloword disk drive.

All available for $25K at the time (PDF).

Ian’s project is for the Cyclone III of an old’ish Nios II Embedded Evaluation Kit. Luckily, the DE0-Nano has enough on-chip BRAM to take its place. Since the DE0-Nano has no serial port, I had to hook up an extra FTDI-BUB to its I/O pins:

Here’s Altera Quartus’s synthesis summary:

And then the (old-fashioned) fun begins…


That’s it – the command prompt of DEC’s Disk Monitor System (PDF) for the PDP-8 series. DMS reserves the top two “pages” (i.e. 2x 128 words) in memory, and loads everything else on demand. You really have to be careful when total RAM is 4K words (in the PDP-8’s smallest configuration).
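The memory arithmetic behind that remark is easy to spell out. This Python sketch only uses the figures mentioned above (4K words, 128-word pages, top two pages reserved) and prints addresses in octal, as is customary for the PDP-8.

```python
TOTAL_WORDS = 4 * 1024     # smallest PDP-8 configuration: 4K 12-bit words
PAGE_WORDS = 128           # one PDP-8 memory page
RESERVED_PAGES = 2         # top two pages are kept by the Disk Monitor System

pages = TOTAL_WORDS // PAGE_WORDS
reserved_words = RESERVED_PAGES * PAGE_WORDS
first_reserved = TOTAL_WORDS - reserved_words
available_words = first_reserved

print(f"{pages} pages in total")
print(f"reserved: {first_reserved:04o}..{TOTAL_WORDS - 1:04o} (octal)")
print(f"left for everything else: {available_words} words")
```

That leaves 3840 words for whatever program is loaded on demand.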

Here’s a directory listing:





PIP .SYS (0) 0025
EDIT.SYS (0) 0016
LOAD.SYS (0) 0011
.CD..SYS (0) 0007
PALD.SYS (0) 0037
DDT .SYS (0) 0002
.DDT.USER(0) 0022
.SYM.USER(0) 0022
FORT.SYS (0) 0010
.FT..SYS (0) 0035
.OS..SYS (0) 0025
FOSL.SYS (0) 0010
STBL.SYS (0) 0001
DIAG.SYS (0) 0004
TF  .ASCII   0001
TX  .FTC BIN 0001
FCL .SYS (0) 0037


An editor, assembler (PALD), debugger, Fortran (!), Focal, and… 15 (octal?) disk blocks free, i.e. under 2 KB. Ouch …

Let’s look at that “TF” source code:


	DO 10 I=1,10
	TYPE 20,I

Now let’s compile and run it:


HELLO: 1    
HELLO: 2    
HELLO: 3    
HELLO: 4    
HELLO: 5    
HELLO: 6    
HELLO: 7    
HELLO: 8    
HELLO: 9    
HELLO: 10   !

Amazing. There’s even a way to dump this Fortran compiler’s internal symbol table:


  5230  7574

I     7576

0020  5212
0010  5226


Now imagine consuming a thousand times more power, running the terminal I/O at a thousand times slower pace, having to accommodate a ginormous cupboard in the room, and hearing the (very loud!) sound of a rattling ASR-33 “teletype” all day – then you’ll have an idea of how computers were starting to invade the office in the 1960s …

The PDP-8 was before my time, but I did get a chance once to play with one. It made a huge impression. And it made history!


Weblog © Jean-Claude Wippler. Generated by Hugo.