Contemplations on functional programming libraries, part 1

2013-09-28

Since wading into Erlang and functional programming last summer, I’ve struggled with defining library interfaces. There are several mechanisms for allowing a developer to customize the behavior of a library function, and it’s rarely evident to me which approach is best.

I’m facing these choices in a new library right now, so it seemed an opportune time to think out loud and treat you, dear reader, as my rubber duck while I find my way through this maze.

Are you not flattered?

C what you made me do?

My first serious programming endeavours were in C, which made this question largely moot. If alternative behaviors were needed, multiple function names with the same basic functionality would do the trick, but more common were static options as additional arguments to the function.

The POSIX regcomp function is an example of the latter approach, using the convention loved by C programmers around the world: bits crammed into an integer argument.

This…lacks a certain panache in functional programming, but certainly lives on; see the Erlang re module for the same tactic with atoms¹ instead of bits.
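As a quick illustration of the atom-based options, here is a minimal sketch using re:run/3 from the standard library. The module name, subject string, and pattern are made up for the example; only the re calls themselves are real.

```erlang
-module(re_options_demo).
-export([match_default/0, match_caseless/0]).

%% No matching options, so the comparison is case-sensitive.
%% {capture, none} just asks for a plain match/nomatch answer.
match_default() ->
    re:run("Hello, World", "hello", [{capture, none}]).

%% The atom caseless plays the role that the REG_ICASE bit
%% plays for regcomp in C.
match_caseless() ->
    re:run("Hello, World", "hello", [caseless, {capture, none}]).
```

The first call returns nomatch and the second returns match; the only difference is one atom in the options list.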

OOh, interesting

In one (very hand-wavy) sense, object-oriented polymorphism achieves a similar effect to offering different function names for custom behavior in C. In fact, I believe the first C++ compiler (Cfront) worked by translating C++ into C, turning methods into C functions with unique names.

And reconfiguring an object so that future method invocations give different results is not meaningfully different from passing options to functions, just much more opaque (typically in bad ways).

Passing objects as arguments for callback purposes is quite similar to passing functions, which we’ll get to shortly.

Pattern recognition

Erlang pattern matching² in function heads is a form of function overloading. In short, any given function name can have an effectively unlimited number of different behaviors depending on the type³ and shape of the arguments.
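For anyone who hasn’t seen this in practice, here is a small sketch. The shapes module and its tagged tuples are hypothetical, chosen only to show one function name dispatching on the structure of its argument.

```erlang
-module(shapes).
-export([area/1]).

%% One function name, several clauses. The first clause whose
%% head matches the argument is the one that runs.
area({square, Side}) ->
    Side * Side;
area({rectangle, Width, Height}) ->
    Width * Height;
area({circle, Radius}) ->
    math:pi() * Radius * Radius;
%% Catch-all clause for anything the clauses above did not match.
area(Other) ->
    {error, {unknown_shape, Other}}.
```

Calling shapes:area({rectangle, 3, 4}) picks the second clause, while shapes:area(triangle) falls through to the last one. The atom at the front of each tuple does the job an option flag or a distinct function name would do in C.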

It would be quite interesting (and perverse) to see a non-trivial problem solved using only one function name.

Processing with processes

Erlang offers a way to modify future function calls to a library by wrapping it inside a process and tweaking the state, much like you can tweak an object in OO programming to change method calls.

One example would be a Twitter library that needs authentication information; the library is initialized with the user’s credentials once, instead of having them passed to each library call.

Whether an Erlang library should be encapsulated in a process is a key decision; it gives the library author more tools, but means that there’s yet another process that must be managed and restarted should it fail. Fortunately OTP makes it straightforward to implement such supervision.
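A rough sketch of what that could look like with gen_server follows. Everything here is hypothetical (the twitter_session module, its function names, and the fake post reply); a real library would make an authenticated HTTP request where this one simply echoes its stored state.

```erlang
-module(twitter_session).
-behaviour(gen_server).

-export([start_link/2, whoami/1, post/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

%% Credentials are supplied once, when the process starts.
start_link(User, Token) ->
    gen_server:start_link(?MODULE, {User, Token}, []).

%% Later calls only need the pid, not the credentials.
whoami(Pid) ->
    gen_server:call(Pid, whoami).

post(Pid, Text) ->
    gen_server:call(Pid, {post, Text}).

%% The process state is simply the credentials tuple.
init({User, Token}) ->
    {ok, {User, Token}}.

handle_call(whoami, _From, {User, _Token} = State) ->
    {reply, User, State};
handle_call({post, Text}, _From, {User, _Token} = State) ->
    %% Placeholder for the real, authenticated request.
    {reply, {ok, {User, Text}}, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info(_Info, State) ->
    {noreply, State}.
```

The cost mentioned above is visible even in this toy: someone has to start the process, hold on to the pid, and decide what happens when it dies.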

Fun with funs

Now we’re getting to the heart of functional programming.

I enjoyed playing with function pointers in C, but it was never more than a curiosity.

In Erlang (and obviously even more so in Lispy languages) anonymous functions offer extremely powerful ways to customize library behaviors, especially for list processing.

For the benefit of anyone who’s unfamiliar with Erlang or FP in general, here’s an example of iterating over a list of integers using Erlang’s lists module and an anonymous function.

1> lists:foreach(fun(X) -> io:format("~B~n", [X]) end, [1, 2, 3, 4, 5]).
1
2
3
4
5
ok

Specifically, fun(X) -> io:format("~B~n", [X]) end is an anonymous function that is applied by lists:foreach against each element of the list [1, 2, 3, 4, 5].

Unsurprisingly, most uses of anonymous functions are a bit more interesting.

Canned funs

Something the lists module in Erlang does not do, because it’s a library with broad applicability, but which I find useful for more focused libraries, is supplying ready-made functions that can be passed as arguments to other functions.

We’ll return to this idea when we look at the library I’m working on.
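In the meantime, here is a minimal sketch of the idea. The canned module and its two predicate builders are invented for illustration; lists:filter/2 and lists:prefix/2 are the only real library calls.

```erlang
-module(canned).
-export([longer_than/1, starts_with/1, demo/0]).

%% Each exported function returns a ready-made predicate fun.
longer_than(N) ->
    fun(Str) -> length(Str) > N end.

starts_with(Prefix) ->
    fun(Str) -> lists:prefix(Prefix, Str) end.

%% The caller picks a canned fun instead of writing one by hand.
demo() ->
    Words = ["erlang", "elixir", "c", "ocaml"],
    {lists:filter(longer_than(4), Words),
     lists:filter(starts_with("e"), Words)}.
```

The user can still pass their own fun to lists:filter/2 when none of the canned ones fit, so nothing is taken away by offering them.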

This composition needs a conductor

And now we arrive at the approach which gives me the most heartburn, because it feels like the obvious way to make libraries as flexible as possible, and simultaneously can place too much of a burden on the developer who is using the library.

In all of our discussions above, we’ve looked at ways to allow a user of the library to influence the behavior of a library function.

An alternative is to not supply a general-purpose function, but rather to provide a selection of smaller filtering and transformation functions from which the user can compose a custom pipeline.

Taken to its extreme, however, a library built this way feels less than helpful, because the glue code for error handling and branching in such a pipeline is typically not trivial, or at least shouldn’t need to be repeated every time someone uses the library.
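To show the kind of glue I mean, here is one possible sketch, assuming every step returns either {ok, Value} or {error, Reason}. The pipeline module and the three steps are hypothetical; the point is only that someone has to write run/2, and it might as well be the library.

```erlang
-module(pipeline).
-export([run/2, demo/0]).

%% Thread a value through the steps in order. Once a step
%% returns {error, Reason}, the remaining steps are not called
%% and the error is passed through unchanged.
run(Input, Steps) ->
    lists:foldl(fun step/2, {ok, Input}, Steps).

step(Fun, {ok, Value}) -> Fun(Value);
step(_Fun, {error, _Reason} = Error) -> Error.

%% Hypothetical small steps a user might compose.
parse(Str) ->
    case string:to_integer(Str) of
        {Int, []} -> {ok, Int};
        _ -> {error, {not_an_integer, Str}}
    end.

positive(N) when N > 0 -> {ok, N};
positive(N) -> {error, {not_positive, N}}.

double(N) -> {ok, N * 2}.

demo() ->
    Steps = [fun parse/1, fun positive/1, fun double/1],
    [run("21", Steps), run("-3", Steps), run("abc", Steps)].
```

Without run/2, every user of the small functions ends up writing the same nested case expressions to check each result before calling the next step.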

What’s next?

What I expected to be a 2-3 hour task of taking code I’d already written and turning it into a library has become an 8+ hour project of trying to make it general enough to be a useful library, and then writing this blog post when I struggled with architectural decisions.

With luck, I’ll have a follow-up to this soon to look at my library and talk about the choices I’ve made. The library itself isn’t particularly interesting, but hopefully the thought process is.

Disclaimer

As anyone who has read earlier posts by me has probably gleaned, nothing I write on software development should be treated as authoritative. I’ve been writing UNIX software off and on for 20 years now, but with relatively little formal education and little exposure to collaborative programming with real developers.

¹ Of all the features of Erlang which I find liberating, I think atoms may be the most immediately empowering. No more worrying about defining global variables, or about which #define preprocessor directives to include in all of your C source code, or enums, or worse: relying on shared knowledge of the meaning of integer values without defining them anywhere. And no extra syntax (usually) since variables are always capitalized. So so nice.
² Ok, on second thought, pattern matching may be the most immediately empowering Erlang feature.
³ Yes, I know Erlang doesn’t have much of a type system; certainly, you can’t overload functions by declaring one to take an integer and another to take a floating point number, at least directly, but atoms and tuples offer flexible alternatives.
