Closures versus Objects

Jul 26, 2017

For a recent project I’ve been working on, I had to deal with lots of small components, each with somewhat different behaviour, and had tried several times (unsuccessfully) to come up with a good design for them.

Solving such puzzles tends to depend surprisingly deeply on the programming language used: in C, you’d need to create an ad-hoc object system and muck around with function pointers. In Java and C++ you’d set up an elaborate class hierarchy. Both are fairly tedious, to put it mildly…

In a scripting language such as Python or Ruby, it’s much easier to play games with the data structures, thanks to dynamic typing, and to use tricks such as monkey patching.

And in JavaScript, there’s also prototypal inheritance, which lets you tweak methods on a per-object basis. This is very flexible.

Although all of these are doable, they tend to rest on a very centralised, mutable-state mindset. Allow me to clarify this:

OOP has objects with state and methods which use and operate on that state. This is a great way to tackle complexity, by offering encapsulation (hiding data differences) and polymorphism (hiding code differences).

Methods need a backdoor into the object they’re working for, using a “this” pointer, because all of the state lives in the object.

Now imagine you need a new object variant which differs only in a single method, say calculating and tracking a running average.

This requires new state to hold that running average, which then gets updated on each call. With the object-oriented model, you need to extend the object, i.e. create a subclass which has that extra state, and then override the method to access that state - not the end of the world, right?
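To make the cost concrete, here is a minimal sketch of that extension step. Go has no subclassing, so struct embedding stands in for it here; the Sensor and AveragingSensor types and their methods are hypothetical names, made up purely for illustration:

```go
package main

import "fmt"

// Sensor is a hypothetical base type with several methods,
// of which only Process needs to vary.
type Sensor struct {
	Name string
}

// Describe is unaffected by the variation below.
func (s *Sensor) Describe() string { return "sensor " + s.Name }

// Process is the default behaviour: pass the reading through unchanged.
func (s *Sensor) Process(x int) int { return x }

// AveragingSensor exists only to give Process a place to keep lastAvg.
type AveragingSensor struct {
	Sensor      // embedded, so Describe comes along unchanged
	lastAvg int // extra state, used by exactly one method
}

// Process shadows Sensor.Process and tracks a running average.
func (a *AveragingSensor) Process(x int) int {
	a.lastAvg = (9*a.lastAvg + x) / 10
	return a.lastAvg
}

func main() {
	plain := &Sensor{Name: "a"}
	avg := &AveragingSensor{Sensor: Sensor{Name: "b"}}
	fmt.Println(plain.Describe(), plain.Process(100)) // sensor a 100
	fmt.Println(avg.Describe(), avg.Process(100))     // sensor b 10
}
```

A whole new named type, just so one method has somewhere to put an int.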

But step back for a moment to think about what happened: just because a single method (and possibly a single object instance!) needs to track extra state, we have to come up with a new object data structure, even though it doesn’t concern or affect any other method, or state for that matter.

Assuming most non-trivial objects have lots of methods, why do we have to invent a new class, come up with a new name for it, or rely on (redundant) inheritance - just to make one method do something which requires purely local state?

Worse still, repeat the process a few times, with several methods being extended independently: what sort of messy class hierarchy are we going to end up with?

There are GoF patterns to deal with this, but hey, there’s in fact a much simpler solution in languages which support a mechanism called function closures.

The idea is that functions can access the context in which they’ve been defined. Here’s an example in Go, which should be readable even if you haven’t ever used it:

var lastAvg int

func Average(x int) int {
    lastAvg = (9*lastAvg+x) / 10
    return lastAvg
}

Wait a minute: that’s silly. Using globals? Where’s the OO? Even C code can do this!

True. I only included the above to show the weakest form of lexical scoping: access to a static / global variable from inside a function. It illustrates how a function can manage state, i.e. how it can access data for its own private use (this is not a closure!).

Now compare the above code to this:

func Averager() func(int) int {
    var lastAvg int

    return func(x int) int {
        lastAvg = (9*lastAvg+x) / 10
        return lastAvg
    }
}

This is a function returning a function and also a peek into functional programming. By calling Averager() (note the extra “r”), we get a function back which does averaging, using a state variable which is inaccessible from the outside (very OO-like!) and which has exactly the same signature as that first version, i.e. taking and returning an int:

var Average = Averager()
var result = Average(123)

Note that we can now also set up multiple independent averagers, again very OO-like:

var Average1 = Averager()
var Average2 = Averager()

var result1 = Average1(123)
var result2 = Average2(456) // does not affect result1
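For anyone who wants to try this out, here is a small self-contained sketch of the same idea as a complete program. The Averager function is the one from above; the variable names a and b and the input values are just picked for illustration:

```go
package main

import "fmt"

// Averager returns a function with its own private lastAvg.
func Averager() func(int) int {
	var lastAvg int

	return func(x int) int {
		lastAvg = (9*lastAvg + x) / 10
		return lastAvg
	}
}

func main() {
	a := Averager()
	b := Averager()

	fmt.Println(a(100)) // 10: (9*0 + 100) / 10
	fmt.Println(a(100)) // 19: (9*10 + 100) / 10
	fmt.Println(b(100)) // 10: b starts from its own zeroed lastAvg
}
```

Each call to Averager() allocates a fresh lastAvg, so the two returned functions never see each other's state.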

One way to look at closures is as functions which act as if an ad-hoc object has been created for them and passed in through some invisible mechanism (which is roughly how closures are implemented under the hood).
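That “ad-hoc object” view can be spelled out by hand. The sketch below is only a conceptual model, not what the Go compiler literally generates; averagerEnv, call, and ExplicitAverager are hypothetical names invented for this illustration:

```go
package main

import "fmt"

// averagerEnv plays the role of the invisible object:
// it holds the variables the closure would have captured.
type averagerEnv struct {
	lastAvg int
}

// call is the closure body, with the environment passed in explicitly
// as the receiver.
func (e *averagerEnv) call(x int) int {
	e.lastAvg = (9*e.lastAvg + x) / 10
	return e.lastAvg
}

// ExplicitAverager behaves like Averager, but builds the
// environment by hand and returns a method value bound to it.
func ExplicitAverager() func(int) int {
	env := &averagerEnv{}
	return env.call
}

func main() {
	avg := ExplicitAverager()
	fmt.Println(avg(100)) // 10
	fmt.Println(avg(100)) // 19
}
```

The closure version simply lets the language do this bookkeeping, and saves us from having to name the struct.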

So how does this fancy machinery help us w.r.t. the original problem of this post?

Some Go code to illustrate the approach:

type MyObj struct {
    // ... state ...
    MyMethod func(int) int
}

func MakeNewType() *MyObj {
    o := new(MyObj)
    var lastAvg int
    o.MyMethod = func(x int) int {
        lastAvg = (9*lastAvg+x) / 10
        return lastAvg
    }
    return o
}

func Example() {
    myObj := MakeNewType()
    println(myObj.MyMethod(123))
    println(myObj.MyMethod(456))
    println(myObj.MyMethod(789))
}

This code is admittedly fairly contrived, but note how easy it is to create an object with a custom MyMethod just for this variant, and how the local state required for averaging is not part of the object. Only MakeNewType() knows about the extra state, introducing it just before it gets used. This approach is just as encapsulated and polymorphic as class-based OOP, without needing an elaborate class hierarchy for objects that require slight variations in behaviour (i.e. code).
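To show the polymorphic side as well, here is a rough sketch with two variants of the same struct living side by side. The Name field is a stand-in for the elided state, and MakePlain and MakeAveraging are hypothetical constructors, not part of the code above:

```go
package main

import "fmt"

// MyObj has one pluggable method; Name stands in for the shared state.
type MyObj struct {
	Name     string
	MyMethod func(int) int
}

// MakePlain builds a variant whose method needs no extra state.
func MakePlain() *MyObj {
	return &MyObj{
		Name:     "plain",
		MyMethod: func(x int) int { return x },
	}
}

// MakeAveraging builds a variant whose method keeps a private lastAvg.
func MakeAveraging() *MyObj {
	var lastAvg int
	return &MyObj{
		Name: "averaging",
		MyMethod: func(x int) int {
			lastAvg = (9*lastAvg + x) / 10
			return lastAvg
		},
	}
}

func main() {
	// Callers treat both variants identically.
	for _, o := range []*MyObj{MakePlain(), MakeAveraging()} {
		fmt.Println(o.Name, o.MyMethod(100), o.MyMethod(200))
	}
	// plain 100 200
	// averaging 10 29
}
```

Both values have the same type, so no hierarchy is needed, and the averaging state is visible nowhere except inside MakeAveraging.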

Both Go and JavaScript treat functions as first-class values and support this kind of closure. In the coming weeks, I’ll explore another language which encourages this technique - stay tuned!

Weblog © Jean-Claude Wippler. Generated by Hugo.