Named Continuations and Distributed State Machine

In my last entry, I described a style of programming where in the program was the flow of messages through a system. Parallelism was a direct side effect of the capacity to send multiple copies of a message to multiple actors at once. Depending on your concurrency the behaviors resulting from the reception if a message would either be invoked contemporaneously or in sequence. But core to this model was the binding of messages to behaviors, as each message name was indicative of a state change desired across a local graph of the system (local as in the communicating nodes). Each state change has locality, in that each recipient models the change in its own way. State is not transferred in its entirety, as that would require replicating the entire state of the system in every message. Similarly, state is not shared between objects, although two or more objects in the system may model identical states.

This encapsulation of state among distributed objects makes it possible to localize processing to where the action is. It also allows for mirroring of state across many geographic regions, as long as there is sufficient network between the different locals. If one can reliably replay the same set of events on a remote node, as long as the processing is deterministic the results will be the same. The core observation to make here is that it is not the system being modeled that needs to be deterministic, but the behavior of the modeling application itself. At scale, however, as the aggregation of components increases, the system as a whole will generally start to exhibit non-linear characteristics, as the flow of messages dampen and drive components in the system that generate signals.

The interesting thing about building non-linear programs is you have many futures. The continuation of the program is dependent on the possible future states and availability of each class of component, though not necessarily any specific component. Fault tolerance forces designs where in local state must be mirrored for critical state machines. But at the same time, availability mandates that components which mutate those states be elastic and often ephemeral. These two competing forces shape the designs of all distributed systems that work in practice. Fault tolerance requires that we duplicate data and work across the system, whereas availability forces us to minimize the amount of work any component does any unit time.

To cope with this, one technique that has had a resurgence in popularity is event driven programming. This style of programming is characterized by a core loop which dispatches functions or methods based on a queue of event objects. As the queue is double ended, adding and removing can be done in parallel safely with a minimal amount of checking. Device events, software interrupts, and program threads can all be accommodated with this very simple state machine.

The down side of event driven programming is that it can be very difficult to trace program flows. The simplicity of the event loop belies the complex flows that can be generated. Loops can be constructed by directly injecting an additional event of the same type or by injecting an event which starts a chain leading back to the originating event. Without a directed graph depicting the possible flows through an application, maintaining a sufficient mental model becomes difficult. But with one, your program behaves similarly to a Markov Chain, and detecting and explaining various program flows begin to yield to graph theory.

My programming tool graphie provides a simple means to generate a distributed program configuration via a text based directed graph. Each node is a named continuation tied to an event of the same name. The collection of arrows form a graph describing the possible interactions / state changes in the system. Each node can store information and local state information, but all of the state transitions occur based on messages sends which invoke the next state of the system. From the point of view of the programmer attempting to understand the interactions between the components of the system, the graph provides a summary view, that help understand the possible state changes. But ultimately, one needs to read all of the code to understand why states transition from one to the next.