Polish notation and tiny words Dec 2016
Ok, so what is this “Forth” thing like, as a programming language?
One way to answer this question is to point to the “Forth in 7 easy steps” article, published a while back on this weblog. Go ahead and read it; it’ll definitely give you a first impression.
Forth was invented by Charles Moore, an independent thinker who is clearly not bound by “mainstream conventions”. Here’s an interview with him about Forth. And here is another one. Forth is an old language (just like C), but it’s still evolving in surprising directions, as shown in his 2013 presentation of an ultra-efficient asynchronous 144-core Forth chip.
It’s hard to capture the essence. If it had to be one word, perhaps that word should be “less”. Forth needs fewer resources and its notation is very minimalistic. That doesn’t make it crude or limited - but it does force you to think about accomplishing your tasks with less machinery.
A lot of what algorithmic programming languages brought into this world will be of little use when programming in Forth. Expressions are not written in algebraic form (“2*a+1”) but in Reverse Polish Notation (“2 a * 1 +”). Local state is not in locally named variables, but on the data stack. Calls use an explicit return stack, which can be manipulated within Forth.
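To make the notation a bit more concrete, here is a minimal sketch in standard Forth. The constant a (given the value 5 here) and the word names twice-plus-one and swap-under are made up for this illustration; they are not taken from any library.

```forth
\ Assumption for this sketch: a is a constant with the value 5.
5 constant a

\ The algebraic expression 2*a+1, written in RPN.
2 a * 1 + .              \ prints 11

\ The same calculation as a word: its "local state" lives on the data stack.
: twice-plus-one ( n -- 2n+1 )  2 * 1 + ;
5 twice-plus-one .       \ prints 11

\ The return stack can serve as temporary storage inside a definition:
\ >r moves the top item over to it, r> brings it back.
: swap-under ( x1 x2 x3 -- x2 x1 x3 )  >r swap r> ;
1 2 3 swap-under . . .   \ prints 3 1 2
```

Whatever goes onto the return stack with >r has to come off again with r> before the word ends, since that same stack holds the address the word returns to.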
The reason for this is not to make life harder (though it might well feel that way if you’ve grown used to thinking that everything has to be done in a certain way), but that Forth is a very carefully crafted compromise between capability and complexity. “Make things as simple as possible, but no simpler” could easily have been Forth’s guiding principle.
So the way to look at Forth is not “why on earth does Forth do it differently?” but: “how far can you extend a really simple and clean design without dragging in complexity?” - and the result is Forth: two stacks, a dictionary of words, RPN, if-else-then, short-but-rich names, and more.
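As a small taste of the if-else-then construct, here is a sketch; the word name describe is hypothetical. Note that in Forth, then marks the end of the conditional rather than the start of the “true” branch, which takes some getting used to.

```forth
\ The flag produced by 0< is consumed by if, RPN-style.
: describe ( n -- )
  0< if ." negative" else ." zero or positive" then ;

-3 describe   \ prints negative
 7 describe   \ prints zero or positive
```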
The price paid is that it can take quite some effort to get used to all this. The conciseness is a consequence of the extreme expressiveness of the language. Math can’t be written in prose. Forth can’t be parsed and skimmed in the traditional sense - it’s not a series of statements.
Forth code tends to have few comments, and even then, many comments will be describing “stack effects”, not a sentence explaining what the code does. To clarify what the code does, you break it up into separate definitions (“words”) and give each one a descriptive name.
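Here is a sketch of what that looks like in practice. The words square and sum-of-squares are illustrative only. The text between parentheses is the stack-effect comment: to the left of the double dash is what the word expects on the data stack, to the right is what it leaves behind.

```forth
: square ( n -- n*n )  dup * ;

\ No prose needed: the names and the stack effects carry the meaning.
: sum-of-squares ( a b -- a*a+b*b )  square swap square + ;

3 4 sum-of-squares .   \ prints 25
```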
Forth is a concatenative programming language, which means – vaguely speaking – that you can take any phrase (i.e. sequence of Forth words) and replace it with a single new definition, without altering the behaviour of the code. When you see code such as this:
...
... a b c ...
...
... a b c ...
...
Then you can in general replace it with this:
: blah a b c ;
...
... blah ...
...
... blah ...
...
Even without having a clue what a, b, and c do! Artificial as it may look, this turns out to be very useful - once you start shuffling code around like this, you will often discover that there is a perfectly good name for blah, which describes its intent. Re-factoring becomes a fascinating way to gain more insight into the code you’ve already written. Instead of the mantra “turn everything you plan to re-use into a function”, in Forth you end up writing code in whatever way makes sense right now, and then later discovering that there is more generality to be extracted and turned into new words.
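A concrete (and entirely made-up) instance of this kind of re-factoring might look as follows. The words show-used, show-free, and percent are hypothetical names chosen for this sketch. Defining a name a second time is allowed in Forth; most systems merely print a notice that the word was redefined.

```forth
\ Before: the phrase "swap 100 * swap /" appears twice.
: show-used ( used total -- )  swap 100 * swap / . ." % used" ;
: show-free ( free total -- )  swap 100 * swap / . ." % free" ;

\ After: the repeated phrase gets a name, and the intent becomes visible.
: percent ( part whole -- pct )  swap 100 * swap / ;
: show-used ( used total -- )  percent . ." % used" ;
: show-free ( free total -- )  percent . ." % free" ;

30 120 show-used   \ prints 25 % used
```

Nothing about the phrase had to be understood to pull it out; the name percent only suggested itself afterwards.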
The risk of top-down design is over-generalising before there is a need, and Forth constantly nudges you to stop doing that. Write what you need now. Do not parameterise a word just because it might be re-used later. When the need for re-use does come up, you’ll either remember that you already wrote something similar and re-use it, or you won’t - in which case the visual similarity of the code will shout out at you later on.
Design is about patterns. Patterns of logic and patterns of notation. So is Forth.
Next up: installing Mecrisp Forth on an F103 and trying it out for real!