Beautiful Code

One thing to keep in mind when reading other people's code is that there is a difference between clever code and simply beautiful code. Quite a few, very intelligent, astoundingly capable programmers, find themselves, day after day, writing clever and elegant, but ugly as hell code. It is the sort of code that is a pleasure to write, but you'd rather burn for an eternity than have to spend a week reading it.

Making Reading Code Pleasant

Face it. We all have to do it sometime. We all have to read code. While many programmers feel they can do their jobs with their eyes shut, it often seems that they do. For example, if you fire up Xcode and start a new OpenGL application for the iPhone, you'll see some pre-generated code stubbed out to handle the basic tasks of getting a surface setup on which you can display your models. Here's a particularly telling piece of code:

eaglLayer.drawableProperties = NSDictionary dictionaryWithObjectsAndKeys:

[NSNumber numberWithBool:NO">NSDictionary dictionaryWithObjectsAndKeys:

[NSNumber numberWithBool:NO, kEAGLDrawablePropertyRetainedBacking, kEAGLColorFormatRGBA8, kEAGLDrawablePropertyColorFormat, nil];

You got that right? It is all perfectly clear exactly what this code does. If you've never seen Objective-C code before, this might be a little intimidating, but trust me this code follows nearly every convention intended to make code more readable. For example:
  • uses long descriptive variable and function names
  • repeats type information in the names of various variable and function names
  • prefixes predefined constant values with a k
  • uses an abbreviated prefix to clearly indicate which framework each constant is from
By any measures, a great deal of thought went into exactly how to name each element in this line of code. It is very consistent and makes use of no less than 3 distinct naming conventions. Now you might ask what is eaglLayer, but I'll give you a hint by showing the 2 lines that come before:

CAEAGLLayer eaglLayer = (CAEAGLLayer )self.layer;

eaglLayer.opaque = YES;

This clearly demonstrates the power of naming conventions over reason. For those who don't know what a CAEAGLLayer is it is a Core Animation Embedded Apple (Open) Graphics Library Layer. (GL as in OpenGL, which derives from IRIS GL which was the Integrated Raster Imaging System Graphics Library). Now the reason we're typecasting this into a new pointer is that the view's base layer is a CALayer (Core Animation Layer), and as such does not provide access to the extended drawableProperties parameter our CAEAGLLayer actually has.

How do we know that our layer is actually a CAEAGLLayer and not just a plain ordinary CALayer? Well we don't, we're just assuming this, and hoping somewhere, hidden in some configuration file, we set the initialization to the right thing. That's ok, it is all just black magic, and you'd never know this from reading any of the actual code. There's a comment that hints at it though:

//The GL view is stored in the nib file. When it's unarchived it's sent -initWithCoder:

And if you look through the nib file long enough you might just find out how this thing is initialized, but since the nib file was also auto-generated, and is edited using a GUI most novice Objective-C developers would just pull their hair out, not understanding why something does or doesn't work. Nevermind this comment is actually a bit misleading as the terms stored and unarchived don't exactly mean what you'd think they mean. As for what a Coder is, that's better left to another blog post entirely.

What would you rather say:

Now while this code functions, it is clearly the product of many seemingly good ideas gone horribly awry. It is not inherently the fault of some failure in the Objective-C language that this code is so difficult to read. Nor is it that what this code is doing is particularly difficult. The problem is rather that along multiple steps in the design process, a failure to think occurred. Rather than thinking about the code in a holistic fashion, focusing on the aggregate, the developers focused on minute aspects of the design without attention to unintended consequences.

Let's look at a few of the problems:
  1. Redundant naming conventions obscure semantics.
  2. Excessively Long Words
  3. Inappropriate use of class inheritance.
  4. Redundant type information.
  5. Semantic mismatch between method and product.
  6. Inversion of syntax against reasonable expectations.
1. Redundant Naming Conventions Obscure Semantics:

When API designers require naming conventions that produce obtuse and unwieldy code, they impose a large tax upon all future development. Looking at the two object instantiations using NSDictionary and NSNumber, you may be surprised to discover that these two class methods are both constructors. In the old NextStep convention, a class side constructor usually repeated the name of the class as the first word in the name of a constructor. The second part of this convention has to deal with the word With which serves to indicate that additional values are going to be supplied to the constructor.

2. Excessively Long Words

When reading through the code, the words kEAGLDrawablePropertyRetainedBacking, kEAGLColorFormatRGBA8, and kEAGLDrawablePropertyColorFormat, all make extensive use of naming conventions which help eliminate the remotest possibility of legible code. These words consist of 36, 21, and 32 characters each. In comparison, antidisestablishmentarianism has only 28 characters, and is not generally considered suitable for everyday speech, excluding perhaps discussions concerning overly long words.

3. Inappropriate Use of Class Inheritance

When you realize that the layer object that we are dealing with is a subclass of the layer object referenced by the object we're subclassing, most of this code makes perfect sense. The reason that we're typecasting the object to the type we "know" it to actually be is because our use of inheritance has gone awry. More over, our layer's particular subclass exposes a single property which is a catch all property for all of the properties we'd like to set on the object. Rather than modify actual properties of the object, we supply a dictionary containing the properties we would like to set in a slap dash fashion, and hope it all works underneath the hood. While this may make the ABI a little more stable, it won't prevent us from attempting to set properties that don't exist, or use grossly inaccurate values.

4. Redundant type information:

Most of the naming conventions are redundant, especially when combined with the type information in the declarations, and the names of the parameters. For example, if you looked up numberWithBool: in the API reference you would find the amazingly redundant:
+ (NSNumber )numberWithBool:(BOOL)value
If you consider what this line actually says, you'll notice that type information is declared twice, without providing any additional information. The icing on the cake is the choice of the word value to represent the value we are passing to the method. Why not go for the hattrick and use aBoolean instead?

5. Semantic Mismatch Between Method and Product:

If you consider the phrasing of these constructors, they appear to be a bit awkward. numberWithBool seems to imply that you can conjoining a number with a boolean, resulting in some sort of hybrid datatype, rather than deriving one from the other. In English, we tend to indicate the genitive case using the prepositions from or of or by using an 's not the preposition with. This choice of words generates a semantic mismatch between what is said and what is actually meant. Similarly, the phrase dictionaryWithObjectsAndKeys not only suffers from the same semantic mismatch, but also accepts a null terminated list of object, key pairs which does not naturally flow from the language. In English, we typically introduce a list with phrases "such as the following:".

6. Inversion of Syntax Against Reasonable Expectations:

Most programmers familiar with other languages would at first glance miss one critical piece of information regarding the design of dictionaryWithObjectsAndKeys and that is the argument list consists of object, key pairs. In most other languages, we write associative lists using key,value pairs. This strange inversion of convention may be a product of the implementation details, but clearly does not conform even to other common idioms in the Objective-C language. If one were to look at a representation of dictionaries in Smalltalk, one would see a set of key -> value associations. In the F-Script programming environment, NSDictionary objects can be created using the #{ key1 -> value1, key2 -> value2, ...} syntax.

Final Verdict

Not only is it easy to critique this code, and find plenty of fault with its formulations, it is equally easy to demonstrate several better ways of writing the exact same thing. With a little attention to what we are trying to do, and what it is that we are trying to say, we could eliminate much of the bother from this code. Unfortunately for us, to do this would require either writing an unreasonable amount of shim code, or going to work for Apple and redesigning some of the underlying implementation. So for sake of argument, let's look at ways we could improve the overall design.

Booleans are Opposites:

Whenever you encounter a place that you are tempted to use a boolean property, it is just as easy to define two opposite methods which set the internal state of your state machine. In our code example we use two booleans:

eaglLayer.opaque = YES;
[NSNumber numberWithBool:NO], kEAGLDrawablePropertyRetainedBacking

In both of these cases, we could dramatically improve the readability of the code by introducing two separate methods:
[ layer opaque ] vs [ layer transparent ]
[[ layer backing ] retained] vs [[ layer backing ] discarded ]
Effectively, both of these idioms could be combined, if at each stage the method returned self rather than void or some nonsensical value. The code could be written to say:
[[[ layer opaque ] backing ] discarded ];
Settings are Properties

We could even do a similar thing with the color format of our layer, by simply adding a method to set the color format to the desired value:
[ layer rgba8 ];
Odds are slim that any of our color formats are going to dramatically conflict with other desirable method names. However, if we wished to delegate the formatting behavior to a separate state machine we could simply use:
[[ layer format ] rgba8 ];
Regardless of how we chose to structure our data, the point is that any finite set of settings can be expressed in an equally finite number of methods. If you have already gone to the great lengths of implementing a message passing architecture, you might as well pass the simplest messages possible.

Putting It All Together:

So what I would like to be able to say, in basically any language, would look something like:

layer rgba8 ; opaque ; backing discarded .

which incidentally is actually perfectly valid Smalltalk. As such we can write the same thing in Objective-C with minor translational changes:
[[[[ layer rgba8 ] opaque ] backing ] discarded ];
If this were javascript, I could write the same functionality using the same idiom:


With C code, you can do exactly the same thing assuming that each returns a pointer to a structure, which contains the necessary function pointers:
In all four cases, in all four languages, we are equally capable of saying the exact same thing. What these statements have in common is that they're simply a compound sentence containing a list of properties we wish the layer to have. It is much like the English sentence where we say:
The layer is RGBA8, opaque, and its backing store discarded.
And that is beautiful code.