Observations on the Future of Programming

Over the past year, I have been writing a large amount of infrastructure using old technology in novel ways. Rather than focusing on building applications that have large scale structure, I have been focusing on writing lots of little actors which communicate through messaging. Effectively, I have been writing Erlang code in a half dozen languages (including Erlang). Lacking the OTP for the other languages, I've found it necessary to use an Erlang broker to effectively manage some of the finer details of building a reliable distributed system. By delegating the responsibilities for tracking different states to different actors on a graph of communications, the system is capable operating with a high degree of separation of concerns. By not worrying about global state at any point, it has also been liberating allowing for implementation details to change radically without impacting the stability of the entire system or its design.

As these developments move into the future, I am increasingly convinced that certain these general techniques will gain popularity over the next 20 years. While there has been an increasing awareness of message passing architectures, most of the language tooling has not yet caught up with the needs of wide scale distributed system. Even new languages like Go suffer from a cultural legacy which makes message passing a second class citizen, where as old languages like Smalltalk have the capability of making the shift, but only at the expense of a major mental shift for all of the code. The real mental leap which I found made actor model programming easy was discarding the notion of return values and functions. In my new programs, no method attached to a function returns a value. Every change of flow control is a message send.

Take for the typical identify of:

f(x) -> x

There is actually an implicit message send which is more obvious if you write in a continuation passing form:

f(x,cc) -> cc(x)

Where if we replace function application with a message send so cc(x) is actually send x to the actor cc. When we add the concept of topologies in which we can have collections of continuations:

f(x, cc1, cc2, cc3, ... ccn ) -> cc1(x), cc2(x), cc3(x), ... ccn(x)

We can now model each operation as a parallel message send, without needing to define a strict order of operation for the "," operator. We can efficiently write parallel code without having to explicitly state when or if ever these continuations will synchronize. In fact, each of these operations could be executed on separate hardware at different clock frequencies, intervening networks, or delay. It can also be modeled simply on any single core system as a list of events in an event loop, further reducing the dependencies of each expression.

Now passing along a list of continuations is just maddening in practice. One would need to model all of the interaction of all future state machines and pass all possible flows to each function. But the reality becomes much simpler. Rather than pass a continuation, we merely name our state transitions, and make that state transition the primary identifier for the message. Each actor then needs only model how its own finite state machine reacts to the various state transitions of the entire system. And for nearly all actors the default transition for any given message is:

A(message) -> A

That is to say, the state of A is unaffected by the message, or A ignores the message. No operation by default provides a sort of global safety net, wherein the system may undergo many state transitions, and most objects will not be modified in any way shape or form. You can think of a variable:

var a

Now if the message sent to this variable is:

b = 10

Well "var a" doesn't know what to make of the message asserting "b = 10" and can safely ignore it. Should you send it a message:

a = 20

Then it would more rightly think that the state of "var a" should actually change because the message is asserting something about a, and that it's state should now change. It would be totally reasonable for the programmer to expect that the internal state of this variable may in some how reflect this change in the application. And at the same time, the variable "b" if it existed, would be rather unperturbed by this declaration of state change in the system.

Modeling an entire application by broadcasting each statement to every actor in the system and then allowing 99.9999% of them to ignore every statement would be while feasible, entirely too wasteful. It is actually an approach I had successfully implemented first around 1994, and would later use around 2004 in a number of game engines to allow each AI to react to changes in the program. Several of my front end web frameworks work exactly this way still today, where there are generally few enough objects that the extra effort of routing messages is actually more expensive than just sending to all of them. But as the programs grow more distributed, and more servers are involved adding a directed graph to describe the topology of the message sends between actors becomes a positive boon.

With a directed graph of message sends you can replicate all of the typical flow control structures. A loop is just a cycle in the graph, where in an actor may send itself a message or through a series of coroutines.

foo(x) -> bar(x)
bar(x)-> foo(x)

Simiarly, you can achieve the equivalent of conditional branch by having a true and false message handler:

foo( x % 2 == 0)
foo( true ) -> console.log("even")
foo( false ) -> console.log("odd")

The graph can be extended to handle various error conditions without even needing to raise an exception, as a default message handler (represented by the message ) can trap the state of a wayward system:

, debug ) -> debugger(here)
foo( , production ) -> exit("foo received an impossible message, I swear I debugged this thing before shipping!")

One can also use the graph to indicate many parallel actions

foo(x) -> bar(x), baz(x + 2), borf(x), narf(x), foo(x+1)

Even generating sequences, and other explosions like Unix fork bombs. When programming in this fashion, the program is the flow of message between the actors. The actors merely encapsulate the local state, deciding if it needs to change or not in response to the current context of the program.

Back in 2005, I used this technique to model poker games in which the rules of the game could change based on which cards had appeared during the game. There were also game variants where in the scoring of the game would change based on which cards were visible. In order to respond to each of these conditions, the game engines were written as finite state machines which could rewrite the continuation of the program on the fly, allowing the scoring, direction of play, and AI strategy to all change during runtime. Since we wanted to support home games wherein the rules could be injected at runtime, we also had to account for the fact we didn't know the full rule set ahead of time. By being able to alter the flow of messages on the fly, we also could allow the AI engine to reprogram itself based on the current rule set. The new values of the cards when one-eyed Jacks were wild, would influence the probability any given hand was worth playing / bluffing based on the prior seen count. You can't account for all the permutation with a static set of rules, but you can dynamically build a rules engine that adds and removes rules based on the context of the current game. I see the future of programming in much the same light.