Message Flow Programming with Parsing Expression Grammars Supporting Preconditional Dependencies

If you look at a typical PEG, you have a pattern-matching engine which takes an input stream (typically of bytes) and generates productions when a match is found. As the behavior of a production could be anything from executing code, to compiling objects, to modifying the PEG itself, this flexible structure allows one to start with a small set of rules and define a language for discussing a given application. Missing from the typical formulations of Parsing Expression Grammars is a notion of context. They are largely inspired by context-free grammars, and resemble BNFs in most implementations. This fear of context means that each rule either holds equal weight or is given a priority ordering when implemented. Much of this has to do with having the rules evaluate in a predictable manner so that the resulting grammar makes sense to a human, and not just a machine.

Two innovations would break from these traditions and accommodate a wide variety of new real-world applications for PEGs. First, we could allow for preconditions on matches, which generate productions based upon a context-sensitive view of the world, in which the prior history of the input stream is matched against necessary preconditions before a section of the grammar is exposed. So if we start with a typical production rule in which:

pattern -> product

Each time the pattern on the left hand side is matched the resulting product on the right hand side is evaluated. This construction of a rule provides a clear mapping, and the composition of rules allows for defining matches for rather complex input streams. If we define a precondition rule producing a dependency such that:

precondition <- pattern

We would establish that the precondition pattern must be matched before the pattern itself becomes matchable. By keeping a simple lookback history, and applying the precondition pattern across it for each match, we could restrict the execution of the original rule if the precondition pattern had not previously been found. The precondition need not even have an explicit product attached to it to be useful. When a set of preconditions is attached to a pattern:

( P ) <- pattern

The backtrack could be performed in parallel for each precondition, restricting the operation of the pattern if any of the patterns in the set of preconditions P fail to match. This allows for sequencing multiple events within the modified PEG and modeling the transitions of a state machine within the grammar itself. By adding the context of history, we enable the PEG to shape the grammar to only the allowable options. This means we can restrict matches that only make sense in a given context, and need not worry about an ever increasing number of potential rule conflicts. A simple precondition rule can effectively implement modularity within a PEG, meaning that a rule is only in effect if the rule's module declaration was encountered. Adding the ability to nest contexts, by having preconditions which themselves have preconditions, gives us the ability to define nested contexts within our grammars, further approximating the encapsulation mechanisms found in more traditional languages.
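To make the mechanics concrete, here is a minimal sketch in Python of a rule evaluator with precondition sets and a lookback history. It simplifies patterns to single whitespace-separated tokens rather than full parsing expressions, and all of the names (Rule, evaluate, module_io) are illustrative, not part of any existing PEG library:

```python
class Rule:
    """A production rule: pattern -> product, gated by a set of preconditions."""
    def __init__(self, pattern, product=None, preconditions=()):
        self.pattern = pattern
        self.product = product            # None models a bare precondition rule
        self.preconditions = set(preconditions)

def evaluate(rules, stream):
    """Scan the token stream; a rule fires only when its pattern matches AND
    every precondition pattern already appears in the lookback history."""
    history = []
    produced = []
    for token in stream.split():
        for rule in rules:
            if token == rule.pattern and rule.preconditions <= set(history):
                if rule.product is not None:
                    produced.append(rule.product)
        history.append(token)
    return produced

# Modularity via a bare precondition: "open" only fires after "module_io"
# has been seen, mimicking a module declaration gating part of the grammar.
rules = [
    Rule("module_io"),                                       # precondition, no product
    Rule("open", product="open_file", preconditions={"module_io"}),
]

print(evaluate(rules, "open module_io open"))   # first "open" is suppressed
```

Note that the history check makes this context sensitive in exactly the limited sense described above: the rule set itself is static, but which rules are live depends on what has already been matched.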

Once you have multiple preconditions, it also makes a lot of sense to look at having multiple rules match a single input stream. Rather than artificially serialize the evaluation of the productions by defining an ordering on the rules, or by simply evaluating the first product for a matching rule, if we allow multiple rules to match a single pattern we can also gain explicit concurrency in the evaluation of productions:

pattern -> product1
pattern -> product2
pattern -> product3

In this example, pattern will match all 3 rules and, if encountered, will produce all 3 products (product1, product2, product3). And rather than rely upon the ordering of the rules to serialize the evaluation of the products, we can simply make them explicitly concurrent, allowing each to evaluate within its own timeline. We could introduce a continuation operator ; in which the left hand side is first executed and its result discarded, and then the right hand side is evaluated as the continuation of the program:

pattern -> product1 ; product2
pattern -> product3

But that is less interesting, as we could simply redefine product1 to be both product1 and product2 in sequence in our production rule itself. It is, however, an example of where an explicit precondition (that product1 be produced before product2) is a terribly useful feature to add to grammars, and even more so when we have explicitly parallel grammars.
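The explicitly concurrent reading of multiple rules matching one pattern can be sketched with threads, one per firing product. This is a toy illustration, not a proposed runtime; the names are hypothetical:

```python
import threading

def fire_concurrently(rules, token):
    """Every rule whose pattern matches the token fires; each product runs
    in its own thread, with no ordering guaranteed between them."""
    results = []
    threads = [
        threading.Thread(target=lambda p=product: results.append(p()))
        for pattern, product in rules
        if pattern == token
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

# Three rules share the same pattern, as in the example above.
rules = [
    ("wash", lambda: "product1"),
    ("wash", lambda: "product2"),
    ("wash", lambda: "product3"),
]

print(sorted(fire_concurrently(rules, "wash")))   # all three fire, in no set order
```

The sort in the final line is the point: because the products evaluate on independent timelines, only the set of results is deterministic, not their order.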

So how would one use this sort of thing in the real world? Consider a typical commercial laundry machine which requires you to put in $4.50 in change, shut the door, and select a washing cycle before it begins to wash. The washing machine doesn't care what order you do these in, but it won't do anything when you press the wash button until you meet all of the necessary preconditions. From the standpoint of the washing machine, we can look at the user's input as an input stream defining a program written in our grammar. We have a limited set of words in our grammar, as the interface to the washing machine is only a coin slot which accepts quarters, a set of radio buttons which select the wash cycle, a door lock, a compartment for soap, and a start button. Let's say you push the start button, and the function of that button is to send the wash string into the PEG. Each time you insert a quarter, the coin slot sends a coin message. And each time you shut the door, it sends a lock message. Now the PEG reads these messages as a space separated stream of bytes and will match on them using a simple set of grammar rules:

coin -> update_money_display()
hot -> wash_cycle_hot()
cold -> wash_cycle_cold()
warm -> wash_cycle_warm()

These are all very self-explanatory: if we see a coin we update the display, if we see the hot button pushed we set the wash cycle to hot, and so on. Now let's consider what happens when we push the start button:

wash -> agitate_drum()
wash -> fill_with_water(current_wash_cycle())
wash -> release_soap()

Now all 3 of these activities happen in parallel and take different amounts of time to execute. The soap is dispensed in a few seconds, the water takes a minute or two, and the agitation can take more than 20 minutes to complete. But that's a fairly good description of what it means to wash clothes. Now let's consider the preconditions:

coin18 <- wash # we need $4.50 in quarters before we start the wash
lock <- wash # door must be locked or the laundry would go all over the place and the floor would be soaked
(hot|cold|warm) <- wash # at least one must be selected | is an or operator
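The three preconditions above can be checked directly against the lookback history. A sketch, assuming coin18 is modeled by counting coin messages and with ready_to_wash as an illustrative name:

```python
def ready_to_wash(history):
    """Return True only if all three preconditions for `wash` hold
    against the space-separated message history."""
    tokens = history.split()
    return (tokens.count("coin") >= 18                            # coin18 <- wash ($4.50 in quarters)
            and "lock" in tokens                                  # lock <- wash
            and any(c in tokens for c in ("hot", "cold", "warm")))  # (hot|cold|warm) <- wash

print(ready_to_wash("coin " * 18 + "lock hot"))   # all preconditions met
print(ready_to_wash("coin lock hot"))             # only one quarter: blocked
```

Because each check is independent of message order, the user really can pay, lock, and select in any sequence, exactly as described.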

This set of preconditions ensures that the device is not put into an unsafe state, does not cost the laundry money, and does not ruin the clothes. We're not checking for soap, because our washer doesn't really care if you put it on the clothes ahead of time or dumped it in the little trap door which it will open when it starts to wash. By using a set of preconditional matches, we make trivial the synchronization of the at least 20 steps in the input program which must occur before the washer will even start. As this grammar can actually process an infinite number of variations of user behavior, it is fairly safe. Even if the user spent all of eternity changing which wash cycle he wanted, this grammar would produce a functional wash if and only if all of the necessary prerequisites were met. And what gets even cooler is when you think of using prerequisites to define postconditions:

wash -> start_timer()
wash <- spin
minute20 <- spin
minute -> update_display()
minute -> spin()
spin -> enter_spin_cycle()

This captures even the internal state machine of the washer, in which the hardware timer is started and generates a minute message each minute. We then define the preconditions for the spin cycle start message spin to be both a wash and a minute repeated 20 times. Each time we receive the minute message, we update the display to indicate the progression of time (usually counting down from 20) and send the spin message back into our input stream. Should the necessary preconditions be met, this message will end the wash cycle and enter the spin cycle. This allows us to use the flexibility of our PEG to generate new inputs into our grammar and evaluate the resulting generated program. Further refinements may involve adding explicit dependencies on the termination of productions, cycle guards, and postconditional guards such as:

spin <- (wash -> noop() )
end <- ( wash -> agitate_drum(); wash -> fill_with_water(current_wash_cycle()); wash -> release_soap() )

Wherein the grammar itself can be reset based on a set of preconditional matches, acting as guards which prevent prior rules from having adverse side effects in new contexts. But these are thoughts for another day.
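As a closing sketch, the minute/spin feedback loop described above can be simulated end to end with a message queue standing in for the input stream. One assumption is made beyond the grammar as written: a guard so the spin cycle is entered only once, since otherwise every re-emitted spin message would fire after the 20th minute. All names here are illustrative:

```python
from collections import deque

def run_washer(initial_messages):
    """Simulate the washer's message stream; productions can append new
    messages (spin) back onto the stream, as the grammar above describes."""
    queue = deque(initial_messages)
    history, log = [], []
    while queue:
        msg = queue.popleft()
        if msg == "minute":
            log.append("update_display")     # minute -> update_display()
            queue.append("spin")             # minute -> spin()
        elif msg == "spin":
            # preconditions: wash <- spin, minute20 <- spin
            if ("wash" in history
                    and history.count("minute") >= 20
                    and "enter_spin_cycle" not in log):   # fire-once guard (assumption)
                log.append("enter_spin_cycle")            # spin -> enter_spin_cycle()
        history.append(msg)
    return log

# A wash followed by 20 timer ticks: display updates 20 times, spin fires once.
log = run_washer(["wash"] + ["minute"] * 20)
print(log.count("update_display"), log.count("enter_spin_cycle"))
```

The fire-once guard is precisely the kind of cycle guard the refinements above gesture at: without it, the self-generated spin messages keep satisfying their own preconditions.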