TFoC - A minimal computer Nov 8, 2017

(This article is part of the The Fabric of Computing series: in search of simplicity)

With today’s programming languages, it’s easy to ignore what goes on inside. Maybe you’ve never even bothered to find out.

The trouble is that there are only so many hours in a day. As long as the chip is usable from a high-level language, who cares?

Well… I do. It’s fun to open up the hood and figure out how things are built and configured internally. Often with lots of clever and creative engineering.

There’s an older but delightfully minimal CPU by Tim Böscke, called the MCPU. His design fits into a CPLD with a mere thirty-two macrocells plus 64 bytes of RAM.

The MCPU project now lives on GitHub, including a 6-page PDF. But this post is not about duplicating his design, or simulating it at the logic level - instead, I’d like to show how to create an assembler and emulator for this CPU with only a few dozen lines of code. At the end, we’ll use this to run a little prime-number generator written in MCPU-assembler, and then re-establish the earth-shattering fact that the 52nd prime is 239!

The CPU is the computer’s brain. It steps through code stored in memory, performs computations, and makes decisions. Here’s a tiny example program for this MCPU:

3E 45 7F C2 C4 FA 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 FF 01

These machine instructions are specific for MCPU, but every computer has ‘em, and they all look like this sort of gibberish.

Here is an emulator, written in Python, which follows the MCPU’s rules and executes / interprets the above program:

$ python emu.py <test.obj
pc,ir,acc,cf:   0   0x3e    0   0
pc,ir,acc,cf:   1   0x45    0   0
pc,ir,acc,cf:   2   0x7f    250 0
pc,ir,acc,cf:   3   0xc2    251 0
pc,ir,acc,cf:   2   0x7f    251 0
pc,ir,acc,cf:   3   0xc2    252 0
pc,ir,acc,cf:   2   0x7f    252 0
pc,ir,acc,cf:   3   0xc2    253 0
pc,ir,acc,cf:   2   0x7f    253 0
pc,ir,acc,cf:   3   0xc2    254 0
pc,ir,acc,cf:   2   0x7f    254 0
pc,ir,acc,cf:   3   0xc2    255 0
pc,ir,acc,cf:   2   0x7f    255 0
pc,ir,acc,cf:   3   0xc2    0   1
pc,ir,acc,cf:   4   0xc4    0   0
pc,ir,acc,cf:   4   0xc4    0   0
pc,ir,acc,cf:   4   0xc4    0   0
pc,ir,acc,cf:   4   0xc4    0   0
pc,ir,acc,cf:   4   0xc4    0   0
pc,ir,acc,cf:   4   0xc4    0   0
$

There’s a counter in there, which counts to 255, “overflows” to 0, and then stops.

The above will not make much sense so far, but perhaps this will help:

        org 0

        lda count
cloop   add one
        jcc cloop

done    jcc done

count   dcb 250

That’s the same code, in “MCPU-assembly” language: load a count in the accumulator, add one, and loop until the carry flag is set. Then we enter an infinite loop.

Half a century ago, this would probably have been state of the art. The assembly language freed programmers from looking up codes and entering them by hand. Even more importantly, the assembler figured out all the memory addresses and jump distances, with support for human-friendly labels (cloop / done / count in this case).

The big jump is that the assembler is itself a computer program. In the old days these were very complex, but with the help of a bit of Python scripting, we can now boil it down to just a few dozen lines of code:

# MCPU assembler, see https://github.com/cpldcpu/MCPU
#   python asm.py <test.asm >test.obj

from __future__ import print_function
import sys

def valof(arg):
    if arg in syms:
        arg = syms[arg]
    try:
        return int(arg)
    except:
        return 0

# pseudo ops
def org(arg):
    global pc
    pc = valof(arg)
def dcb(arg):
    global pc
    mem[pc & 0x3F] = valof(arg)
    pc += 1

# instruction set
def nor(arg): dcb(valof(arg) | 0x00)
def add(arg): dcb(valof(arg) | 0x40)
def sta(arg): dcb(valof(arg) | 0x80)
def jcc(arg): dcb(valof(arg) | 0xC0)

# combined instructions
def jcs(arg): jcc(pc+2); jcc(arg)
def jmp(arg): jcc(arg); jcc(arg)
def lda(arg): nor('allone'); add(arg)
def sub(arg): nor('zero'); add(arg); add('one')
def out(arg): dcb(0xFF)

code = sys.stdin.readlines()
syms = {}

for _ in range(2):
    pc = 0
    mem = 64 * [0]
    for line in code:
        fields = line.split()
        if len(fields) > 0 and line[0] != '#':
            if line.lstrip() == line:
                syms[fields[0]] = pc
                del fields[:1]
            f = globals()[fields[0]]
            f(fields[1])

s = '\n'
print(s.join([format(x, '02X') for x in mem]))

That’s the entire two-pass assembler. We need two passes because lines may refer to labels defined later on, so the first pass figures out all label values and the second pass then generates the final code.

The emulator for this charmingly-simple MCPU processor is equally small:

# MCPU emulator, see https://github.com/cpldcpu/MCPU
#   python asm.py <... | python emu.py
# optional arg is # of cycles to run
# show instruction trace if cycles < 1000

from __future__ import print_function
import sys

try:
    cycles = int(sys.argv[1])
except:
    cycles = 20

mem = [int(x, 16) for x in sys.stdin.readlines()]
assert len(mem) == 64

pc = 0
acc = 0
cf = 0

for _ in range(cycles):
    ir = mem[pc & 0x3F]
    pc += 1
    op, arg = ir >> 6, ir & 0x3F

    if cycles < 1000:
        print('pc,ir,acc,cf:', pc-1, hex(ir), acc, cf >> 8, sep='\t')

    if op == 0:
        acc = (acc | mem[arg]) ^ 0xFF
    elif op == 1:
        acc += mem[arg]
        cf = acc & 0x100
        acc &= 0xFF
    elif op == 2:
        mem[arg] = acc
    else:
        if ir == 0xFF:
            print(acc)
        elif cf == 0:
            pc = arg
        cf = 0

With these tools, we can now tackle a more substantial task - generate prime numbers:

        org 61
zero    dcb 0
allone  dcb 255
one     dcb 1
        org 0

        lda two
        out *

start   lda allone
        add allone
        sta subs

loop    lda number
inner   add subs
        jcs inner

        sub subs
        add allone
        jcc noprime

        lda subs
        add allone
        sta subs

        add number
        add allone
        jcs loop

        lda number
        out *

noprime lda number
        add two
        sta number

        jmp start

two     dcb 2
subs    dcb 0
number  dcb 3

Assembling the above code produces this:

3E 62 FF 3E 7E 7E A3 3E 64 63 CC C9 3D 63 7F 7E
DC 3E 63 7E A3 64 7E D9 C7 3E 64 FF 3E 64 62 A4
C3 C3 02 00 03 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 FF 01

If we feed that into the emulator, we get:

$ python asm.py <prime.asm | python emu.py 200000
2
3
5
[...]
229
233
239
$

Fifty two lines of output, and sure enough the last line is 239, which we can easily verify to be correct with Wolfram Alpha.

There you have it: an assembler + emulator which are simple enough to understand with just a little bit of code-reading.

(The way prime numbers are “computed” is quite a brain teaser in itself, by the way.)

Sooo… next time you start writing software on your computer, note that inside it, all of the above is happening - the pioneers who came up with these ideas live on as the giants whose shoulders we now stand on…

PS. The source files are available as gist.

Weblog © Jean-Claude Wippler. Generated by Hugo.