Programming In the Concrete
One of the techniques that I have been employing with great success the past few years has been thinking of objects in the concrete. Rather than writing about objects in an abstract fashion, talking about them in terms of class hierarchies, type systems, or connected nodes in a theoretical graph, I treat them as if they were physical objects. The technique requires that you strive for a 1-to-1 relationship between the object in the program and the thing it represents in the physical world. The focus of programming shifts away from the structure and methods of the supporting language towards the elements of the problem at hand. And with a steady hand and a special ruthlessness, it turns out it is possible to strip away most of the unnecessary scaffolding that obscures the beautiful solution underneath.
Take for example a typical MVC application; we will keep a log of how many steps our pedometer measures day by day, and display a user-friendly set of reports. To make matters clear, let's put this on the web and assume our pedometer has a nice CSV export format that it stores in a flat file on an internal flash drive. The tasks that a person will need to complete to make use of the application are:
- plug pedometer into a micro USB cable attached to their laptop
- sign in on the website
- upload their latest data
- view their dashboard with several different graphs and tables
- sign out
For our purposes, we will hide from the user all of the tricky bits we can. We will make sure they don't:
- upload the same data more than once
- have to choose what sorts of reports they would like to see
- need to look up instructions on how to plug their device in
None of these tasks are particularly hard for the user, but they are annoying and unnecessary. Anyone who wanted fine grained reports they could tailor would already import the CSV into a spreadsheet and build their report themselves. Uploading the same data twice is a problem that is more easily handled by the computer than by the human, especially if the human uploads a week's worth of data at a time, and doesn't remember the last time they plugged in. And no one ever has the fucking manual handy, especially those crappy little pieces of paper written double sided in 15 different languages. So with these tasks in mind we will have:
- a two panel welcome screen with a place to sign in on the right and a set of instructions with pictures on the left
- when you sign in the right panel has a big shiny upload button, the instructions shift to how to find your files, and your old reports load below
- when the file upload is complete the two panel shifts left revealing the updated reports and historical data
Underlying this system, we will need:
- a web server
- a data store for collecting data on a per-user basis
- a backend process to parse the CSV file and update the data files
- a monitoring service that keeps track of the availability of these pieces
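The backend parsing step, together with the earlier promise that nobody uploads the same data twice, can be sketched roughly as follows. This is only a minimal sketch: the `date` and `steps` column names are my assumption about the export format (the pedometer's real layout is not specified), the sample rows are made up, and `merge_upload` is a hypothetical name.

```python
import csv
import io


def merge_upload(history, csv_text):
    """Merge an uploaded CSV into a user's history, skipping days already stored.

    `history` maps a date string to a step count. The column names
    `date` and `steps` are assumptions about the export format.
    Returns the list of dates that were new in this upload.
    """
    added = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        day = row["date"].strip()
        if day in history:
            continue  # already uploaded; ignore the duplicate
        history[day] = int(row["steps"])
        added.append(day)
    return added


history = {}
upload = "date,steps\n2014-03-01,8042\n2014-03-02,10311\n"
print(merge_upload(history, upload))  # first upload adds both days
print(merge_upload(history, upload))  # same file again adds nothing
```

Keying on the day is the simplest thing that could work. A device that logs several readings per day would need a finer key, such as a timestamp.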
With this decomposition in mind we can start breaking down the components into things we can actually think about in concrete terms. Since we are going to limit ourselves to things that have some tangible reality to them, let's focus on the UI for starters. It is always a good idea to show your client something, and if your client can't see it, it isn't tangible enough. All computer UIs are made up of 4ish resource types:
- Text
- Image
- Sound
- Video
You can think of these as a hierarchy, or a set of degenerate cases of a fundamental base type, but none of that hand wringing will actually build you a functional UI. If one really wanted to model the problem, there is only Video, with an Image being a single frame Video without sound, Sound a Video without images, and Text just a special case of Image. Similarly, a Button is just one of Text, Image, or Video that responds to a click. You can even have a Sound button, but its bounding box is in time not space. And as you can see, worrying about these metaphysics of GUIs does a great job of filling pages, but provides little immediate gain.
Back to our instruction panel, we will construct it of Text, Images, and possibly a Video to demonstrate finding the right end of the micro USB cable, and plugging it in with the correct orientation. The play/pause button will just be an Image that sends "play" and "pause" messages to the Video object. For screen readers and other assistive technologies, we will use Text objects to contain the copy of the instructions, but will not focus on structuring it beyond what is necessary to make it legible. Little to no semantic markup is necessary because few to none of our users for this data will make use of it. The remainder of the UI is just Text or Image objects that respond to the Mouse or the Keyboard driven messages. The animations and transitions are managed by the Timer object which maintains the animation heartbeat. By convention, I tend to use "tick" messages each frame at a fixed rate (say 20-60Hz) and a "tock" message on second intervals (1Hz frequency). This allows each piece of the application's UI to manage its own acceleration and velocity. The only thing a Text or Image element needs to animate is a handler for the "tick" message. By changing the position, size, orientation, or contents, we can produce any animation effect we can imagine.
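To make the tick/tock convention concrete, here is a minimal sketch of a Timer and an element that animates itself. The class and attribute names are my own illustrative choices, and `step` is called by hand here where a real Timer would be driven by a clock.

```python
class Timer:
    """Sends "tick" every frame and "tock" once per second's worth of frames."""

    def __init__(self, hz=20):
        self.hz = hz
        self.frame = 0
        self.listeners = []

    def step(self):
        """Advance one frame. A real Timer would call this from a clock."""
        self.frame += 1
        for obj in self.listeners:
            obj.tick()
            if self.frame % self.hz == 0 and hasattr(obj, "tock"):
                obj.tock()


class Image:
    """An element that animates itself: position plus velocity and acceleration."""

    def __init__(self, x=0.0, velocity=0.0, acceleration=0.0):
        self.x = x
        self.velocity = velocity
        self.acceleration = acceleration

    def tick(self):
        # Each element integrates its own motion once per frame.
        self.velocity += self.acceleration
        self.x += self.velocity


timer = Timer(hz=20)
panel = Image(x=0.0, velocity=0.0, acceleration=1.0)
timer.listeners.append(panel)
for _ in range(4):
    timer.step()
print(panel.x)  # 1 + 2 + 3 + 4 = 10.0
```

The Timer knows nothing about easing curves or layout. All it does is keep the heartbeat, and the element decides what a frame means for it.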
By allowing each component to manage its own display, the complexity of coordinating the objects falls to the flow of messages across the interfaces of each object. To make this efficient, a number of techniques borrowed from electrical and network engineering are used to route messages to the correct objects. Wire objects allow for point to point communication between any two objects. Message passing on a Wire is usually half duplex, though synchronous protocols often require either two Wires or a full duplex channel. Locking and mutexes are implemented as latches on a Wire, which allow programmatic access to enable or disable message passing between two objects. For one to many broadcast messages, typically a Hub object is used to pass the message to all objects attached to the Hub. Keyboard, Mouse, Touchpad, and Timer objects typically communicate to all of the views via a Hub. The process of receiving a message is typically done by exposing a method of the same name to the Hub, which will then call that method on the specific object, passing the event as data. This model means that there is only one object listening for events of any one type: a Keyboard controller listens for key related events like press and release, and forwards them on to all subscribers to its Hub. Likewise, Mouse and Touchpad events fire off messages to their Hub; these two share a Hub as they tend to generate pointer related events like down, move, and up. The Timer too broadcasts tick and tock to the Hub to manage animation states and help synchronize various timeouts. For example, a double click is two down events within the same region, usually within 200ms of each other (4 ticks @ 20Hz, 12 ticks @ 60Hz). Since individual views may listen for events outside of their normal hit box, they may also respond to events directed at other objects.
A typical use case for this is a focus change when another object is selected, which is particularly useful in a full screen application you hit with your thumbs on a subway.
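A rough sketch of the Wire, the Hub, and the tick-counted double click described above might look like this. The method names (`send`, `attach`) and the `DoubleClickView` class are hypothetical. The sketch ignores the "same region" check, and puts pointer and Timer messages on one Hub purely for brevity.

```python
class Wire:
    """Point-to-point link; the latch enables or disables delivery."""

    def __init__(self, target):
        self.target = target
        self.latched = False

    def send(self, name, data=None):
        if self.latched:
            return False  # latch closed: message is dropped
        getattr(self.target, name)(data)
        return True


class Hub:
    """One-to-many broadcast: calls the method named after the message."""

    def __init__(self):
        self.subscribers = []

    def attach(self, obj):
        self.subscribers.append(obj)

    def send(self, name, data=None):
        for obj in self.subscribers:
            handler = getattr(obj, name, None)
            if callable(handler):  # objects only hear messages they expose
                handler(data)


class DoubleClickView:
    """Counts ticks between "down" messages to spot a double click."""

    def __init__(self, window_ticks=4):  # 4 ticks at 20Hz is 200ms
        self.window_ticks = window_ticks
        self.ticks_since_down = None
        self.double_clicks = 0

    def tick(self, _data=None):
        if self.ticks_since_down is not None:
            self.ticks_since_down += 1

    def down(self, _data=None):
        recent = self.ticks_since_down is not None
        if recent and self.ticks_since_down <= self.window_ticks:
            self.double_clicks += 1
            self.ticks_since_down = None
        else:
            self.ticks_since_down = 0


hub = Hub()
view = DoubleClickView()
hub.attach(view)
hub.send("down")
hub.send("tick")
hub.send("tick")
hub.send("down")
print(view.double_clicks)  # 1
```

Because the Hub dispatches by method name, a view opts in to a message simply by having a method for it, and everything else passes it by.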
A Router/Switch object allows for many-to-many dispatching, often applying a pattern match against specific elements of the message itself to subselect a group of objects to which it will forward the message. In general, Router/Switch objects are mostly used to process data from Network objects which handle message passing to objects outside of the immediate application's context. WebSockets, AMQP messages, and AJAX all provide methods for performing remote procedure calls that benefit from a Router object. Often a local Cache object, the Timer, and one or more views are coordinated through a Router when processing data coming in from a Network object. The ability to multiplex a typical onData event is crucial since the remote application typically has no way to safely dispatch to local objects. The Router is a way to define policy to hide or expose local interfaces based on the state of the application. As objects send messages to the Router to negotiate the routes a message may take, they may add or remove routes as necessary, and leave the delivery, retry, and cancellation logic to the Network.
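As a sketch of the pattern matching idea, the Router below treats a route as a dictionary of fields that a message must match, and forwards to every target whose pattern fits. The `add_route`, `remove_route`, and `on_data` names are assumptions for illustration, and delivery, retry, and cancellation are left out since those belong to the Network.

```python
class Router:
    """Many-to-many dispatch: routes pair a pattern with a target object."""

    def __init__(self):
        self.routes = []

    def add_route(self, pattern, target):
        self.routes.append((pattern, target))

    def remove_route(self, pattern, target):
        self.routes.remove((pattern, target))

    def send(self, message):
        """Forward `message` to every target whose pattern it matches."""
        delivered = 0
        for pattern, target in list(self.routes):
            if all(message.get(k) == v for k, v in pattern.items()):
                target.on_data(message)
                delivered += 1
        return delivered


class Recorder:
    """Stand-in for a Cache or a view; it just remembers what it was sent."""

    def __init__(self):
        self.seen = []

    def on_data(self, message):
        self.seen.append(message)


router = Router()
cache, report_view = Recorder(), Recorder()
router.add_route({"type": "report"}, cache)
router.add_route({"type": "report", "user": 7}, report_view)

router.send({"type": "report", "user": 7, "steps": 8042})  # reaches both
router.send({"type": "report", "user": 9, "steps": 120})   # reaches cache only
print(len(cache.seen), len(report_view.seen))  # 2 1
```

A message that matches no route is simply not delivered, which is how the Router hides local interfaces from the remote side.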
When combined together, Text, Images, Sounds, Videos, Keyboards, Mice, Touchpads, Timers, Networks, Wires, Hubs, and Routers act as building blocks that connect together to form applications. The messages they pass and expose act like the pins and sockets of Lego, latching together at well defined points. The interactions of the various message passing structures allow for complex behaviors to be modeled using a handful of simple constructs. By mimicking the tangible and visual elements of the computer experience, the programming model for these controllers allows the programmer to cut through the bullshit of web technology and focus on the actual user experience, comforted by a model that matches the hardware.
Taking this microcosm/macrocosm idiom to the backend, we can repeat this process to model our infrastructure in a similar fashion. Our data store will contain:
- People
- Pedometers
- Reports
- Instructions
- Credentials
- Logs
- Workers
The behavior of the backend is composed of the same principles as the front end. Each model in our data store represents a thing that expresses some set of behaviors. Each behavior is expressed when a message is passed to the applicable object. For example, let's take sign in to the application and how it sends messages throughout the backend. When the user enters their email address and password, a collection of objects is notified:
- People are queried for the associated email address
- Pedometers are queried for the associated devices
- Reports are queried for the associated reports
- Instructions are queried for the instructions that go with those Pedometers
- Credentials are checked against the user's salted password
- Logs are notified of the sign in attempt, and possible credential failure
- Workers are updated as a new backend process is spun up to prepare for updating the user's reports
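The fan-out in the list above can be sketched as one message delivered to every model. The `Model` class and `broadcast_sign_in` function are hypothetical stand-ins, the email address is a placeholder, and each real model would of course do its own query or update rather than just log the message.

```python
class Model:
    """Hypothetical backend model that records the messages it receives."""

    def __init__(self, name):
        self.name = name
        self.inbox = []

    def sign_in(self, email):
        self.inbox.append(("sign_in", email))


# Model names taken from the list above.
MODEL_NAMES = ["People", "Pedometers", "Reports", "Instructions",
               "Credentials", "Logs", "Workers"]


def broadcast_sign_in(models, email):
    """Deliver one sign-in message to every model and report who heard it."""
    for model in models:
        model.sign_in(email)
    return [model.name for model in models]


models = [Model(name) for name in MODEL_NAMES]
notified = broadcast_sign_in(models, "walker@example.com")
print(len(notified))  # 7
```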
So a single sign in message may trigger a flurry of activity across every model in the backend infrastructure. In more complex applications, notifications may be sent to coworkers or friends, saved searches may be updated, caches primed, and offline data synchronized with the shared repository. The more that can be prepared and baked initially, the faster and more responsive the application will feel. By performing incremental updates as soon as information is available, the backend can avoid the single most infuriating thing a user experiences: Waiting. When it comes to user friendliness, managing the perception of the passage of time is the single most important task of the developer. Animations and transitions distract users from the passage of time. Too long of a transition, however, and the user grows bored and realizes the time elapsed. Like with the Timer object in the UI, the backend must treat time with the same degree of care and attention. 200ms is your sweet spot for most network interactions. It is not an accident that most multiplayer online games become unplayable above 200ms round trip times. 200ms is the equivalent of 5 actions per second, or 300 actions per minute. As a skilled typist may enter 120-200 words per minute, this lag is effectively imperceptible to ordinary users.
Since each of these models is supposed to act as if it were a real thing in the real world, we have an expectation that they will update roughly with the speed that we experience in the real world. Most real state does not update at a rate slower than human perception. We can with a clever UI trick the user into perceiving changes happening far faster than they do, but we won't get away with running our backend update frequency slower than 5Hz. It is for this reason I tend to prefer persistent workers that process a user's requests over on-demand per request process or thread allocations. It is often cheaper to provision your load to reduce spikiness. When you provision for peak on demand, you are gambling that day to day usage will not substantially change AND that changes to your application or usage patterns will not degrade performance beyond your margin of safety. Treating one or more Workers as User dedicated resources provides a simpler model of scaling. If the user does not require that processing power at a moment in time, their allocation remains accounted for, ensuring that when it is demanded resources are available. Servers can then be scaled on an on demand basis using public/private clouds, ensuring that the cost per user accurately reflects the desired level of responsiveness. While it may be tempting to "overbook" a flight of servers, this practice will eventually lead to catastrophic failure for usually little to no marginal gain. The reason is the cost of emergencies to the focus of the business, and the opportunity cost of time spent maintaining past sales at the expense of growing the business in the future.
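The accounting side of dedicated Workers can be sketched as a pool that refuses to overbook. This is only an illustration under my own assumptions: `WorkerPool` and its methods are hypothetical names, and raising `capacity` stands in for bringing more servers online.

```python
class WorkerPool:
    """Finite pool of Workers; each user holds a fixed, dedicated allocation."""

    def __init__(self, capacity):
        self.capacity = capacity      # total Workers the servers can run
        self.reserved = {}            # user -> Workers reserved for them

    def available(self):
        return self.capacity - sum(self.reserved.values())

    def reserve(self, user, workers=1):
        """Reserve Workers for a user; refuse rather than overbook."""
        if workers > self.available():
            return False  # signal that more servers are needed first
        self.reserved[user] = self.reserved.get(user, 0) + workers
        return True

    def release(self, user):
        """Free a user's allocation, e.g. when their account is closed."""
        return self.reserved.pop(user, 0)


pool = WorkerPool(capacity=2)
print(pool.reserve("ann"))   # True
print(pool.reserve("bob"))   # True
print(pool.reserve("cy"))    # False: the pool will not overbook
pool.capacity += 2           # scale the servers up, then retry
print(pool.reserve("cy"))    # True
```

An idle user's reservation still counts against capacity, which is the point: the cost of a user is known up front instead of discovered during a spike.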
With that in mind, modeling server resources as finite, tangible things, like slices of an apple, makes accounting and planning for growth (or decline) much more manageable. It also has the advantage of using the same Router, Hub, and Wire controllers for managing message flows, conceptually congruent with the infrastructure itself. Distributing load across a server farm requires flattening the message passing volumes and physically locating Workers across different physical resources. By treating the worker as a unit of computer resources, it becomes possible to encode