DAVE'S LIFE ON HOLD

Puppet vs Chef vs Image-based Deployment

Full disclosure up front: I wrote a system that operated much like Chef or Puppet, but in Perl. It was used to install hundreds of thousands of machines, and it did dependency management on a scale Puppet and Chef could only dream of. I know their methodology well, and I have used Puppet in production at three ventures.

That said, the approach that Puppet and Chef take to system maintenance is fundamentally flawed. The reason largely has to do with their inability to guarantee that every recipe a sysadmin devises is in fact idempotent, or that the behavior of third-party software won't invalidate the environment during install. This might sound like an 80/20 problem, where I am quibbling about the 20%, and it is, but that 20% prevents you from achieving even ONE 9 worth of SLA.
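
To make the idempotency point concrete, here is a minimal sketch in Python (not taken from any real recipe; the file name and sysctl line are purely illustrative). The first function is what many hand-rolled recipes amount to; the second is the guarantee a tool would need to make for every resource, on every run:

    LINE = "net.ipv4.ip_forward = 1\n"

    def naive_apply(path="sysctl.conf"):
        # Not idempotent: every run appends another copy of the line.
        with open(path, "a") as f:
            f.write(LINE)

    def idempotent_apply(path="sysctl.conf"):
        # Idempotent: running it twice leaves the file exactly as after one run.
        with open(path, "a+") as f:
            f.seek(0)
            if LINE not in f.readlines():
                f.write(LINE)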

Problem 1: Packaging

Ignoring issues with configs not being updated, developers failing to package all of their software, or busted spec files, package installs do not guarantee a repeatable process. If you go straight tarball (the best option), you get a clean-install-only method which mostly preserves file permissions. If you aren't fastidious about maintaining consistent uids and gids across your systems, this will fail, but so will most other packaging tools. If you go deb-based (next best), you get proper uninstall and correct dependency-handling semantics, but at the cost of locking yourself out of certain markets (government, for example). If you go RPM (the enterprise standard), you are trusting an accident of fate to ensure the same packages get installed on all systems. This is because RPM resolves dependencies based on strings embedded in the spec file. These strings are generated by the tools but can be overridden by the spec writer. The order in which these strings get resolved depends on the order of the repos in your config, your cache, and their manifest files. If any of these things change, or were different because of historical build order, you will get inconsistent builds. On a large enough system, this can have devastating effects. (I ran into this yesterday, in fact.)
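
One way to hedge against repo-order-dependent resolution is to pin each build to an explicit manifest of exact package versions and verify hosts against it. A minimal sketch, assuming RPM-based hosts and a hypothetical build-manifest.txt of name-version-release-arch strings:

    import subprocess

    def installed_packages():
        # Exact name-version-release-arch strings for every package on the box.
        out = subprocess.check_output(
            ["rpm", "-qa", "--qf", "%{NAME}-%{VERSION}-%{RELEASE}.%{ARCH}\n"],
            text=True)
        return set(out.split())

    def check(manifest_path="build-manifest.txt"):
        with open(manifest_path) as f:
            wanted = set(f.read().split())
        have = installed_packages()
        return sorted(wanted - have), sorted(have - wanted)   # (missing, unexpected)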

Problem 2: Repositories

If your org works like most, you probably don't have your own CPAN mirror, your own gems repository, a copy of every pip package, or even know where all of your vendors' code comes from. If you use RVM for managing Ruby or npm for managing Node, you probably aren't fetching all of those package sources from your own repo. Well guess what: you should! One of the things I see far too often in companies is sysadmins cleverly automating the process of building servers by pulling arbitrary versions of code from public repositories. The developers obviously don't know what their dependencies are (the thinking goes), so whatever is latest must work best.
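
As a small illustration of how easily versions drift, here is a sketch that flags anything in a pip requirements file that isn't pinned to an exact version; the same idea applies to Gemfiles and package.json. The file name is illustrative:

    def unpinned(path="requirements.txt"):
        loose = []
        for raw in open(path):
            line = raw.split("#", 1)[0].strip()   # drop comments and blank lines
            if line and "==" not in line:
                loose.append(line)
        return loose

    print(unpinned())   # anything listed here will drift with the public index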

Yesterday one of our developers ran into an issue where the newest version of glibc was the root cause of a new double-free bug, resulting in an intermittent segfault (about once every 7 hours). This means that, as an institution, we now have code that is only validated to run on a specific, known-buggy version of glibc. To make matters worse, the program cannot be fixed: we isolated the error condition to a 3rd-party binary, but that code base is hundreds of thousands of lines, is not owned by us, and would require too much time and money to fix for a problem we created ourselves by upgrading part of the system. The correct business decision is to maintain a new image and a secondary repository.

When you add up all of the RPMs, modules, and sundry 3rd-party vendor packages in a typical installation, you cannot afford to track the dependencies between all of the various pieces of software in your Puppet configs or Chef recipes. While documentation helps, the process of validating a build is so expensive that it is not something one should do often. And upgrading core libraries without an extensive burn-in process is doomed to create new failures.

Problem 3: Configuration

One of the supposedly best features of Puppet and Chef is their ability to templatize configs. This sounds great when you have a few dozen boxes and need to periodically roll out hardware. If you occasionally spin up VMs, having your configs templatized can save a fair amount of frustration. There is a problem with customizing configs for specific instances, however, and it quickly breaks down when the quantity of hardware and the dynamic nature of a flexibly provisioned fabric come into play.

If you move into the world of cloud computing, wherein the computing resource is a logical entity provided by a compute fabric utility, you don't want to manage your configuration on a per-node basis at all. It is too expensive to validate and configure the network fabric, the system stack, and the application code with hardcoded endpoints and network-layer details. Simply put, if you have 3 teams working across these three layers of your stack, communication will break down and something will get misconfigured or changed without sufficient foresight of the implications. Rather, configuration must itself move to a shared configuration service which provides discovery capability to all applications. It shouldn't matter where an endpoint is added to the network, or how the fabric is switched, only that an application can discover where all of its dependent services are running.

Eleven years ago, the Zeroconf working group came up with two great ideas: link-local addressing and DNS service discovery (DNS-SD). These two ideas have been heavily used in the consumer electronics space to provide network device configuration for plug-and-play printers, Apple Bonjour networking, and ad hoc network creation. In the data center, however, this approach has remained the domain of large organizations and is rarely used intentionally by small to mid-sized companies. This is a shame, as DNS is one of the most reliable and proven distributed databases for network configuration ever built. Running local DNS in your datacenter with separate zones for production, staging, QA, and development can greatly improve your ability to flexibly deploy system images. Rather than retaining per-node configurations, the services themselves are tied to identities (hostnames); these hostnames are assigned via DHCP (static leases linked to MAC addresses) and can be shifted around between physical hardware and VMs as demand requires.
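
From the application side, discovery can look as simple as the following sketch, which assumes the third-party dnspython package and an internal zone (prod.local here, purely illustrative) that publishes SRV records for each service:

    import dns.resolver  # third-party: pip install dnspython

    def discover(service, proto="tcp", zone="prod.local"):
        answers = dns.resolver.resolve(f"_{service}._{proto}.{zone}", "SRV")
        # Take the lowest-priority record; a real client would also honor weight.
        best = min(answers, key=lambda r: r.priority)
        return str(best.target).rstrip("."), best.port

    host, port = discover("memcache")   # no hardcoded endpoint in the app config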

If you templatize your nodes, however, you can't propagate or promote nodes between environments, and the act of reconfiguring for a specific environment invalidates your testing. If you have hosts configured like mysql.QA.local and memcache.QA.local in your configs and need to change each instance of QA to prod or staging, you introduce the risk that something will get missed. Worse still is having different configs for each environment which trigger changes in the code paths taken. This means you never validate against a representative system (i.e. you never really tested your code in the first place). Puppet and Chef are really poor tools at this point, as their greatest strength is preserving a poor configuration practice in the first place!
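
If the only thing that differs between environments is the DNS zone the resolver searches, the same image and the same config work everywhere. A minimal sketch, assuming each environment's DHCP hands out its own search domain (qa.local, prod.local, and so on):

    import socket

    def endpoint(service, port):
        # Unqualified name: the host's resolver search list qualifies it to
        # mysql.qa.local in QA and mysql.prod.local in production, so nothing
        # environment-specific ever lands in the application config.
        family, type_, proto, canonname, sockaddr = socket.getaddrinfo(
            service, port, proto=socket.IPPROTO_TCP)[0]
        return sockaddr[0], sockaddr[1]

    db_host, db_port = endpoint("mysql", 3306)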

Solution: Images + Systems Architecture

Systems Architecture is where the application hits the fabric. Designing this interaction such that each component is an interchangeable part is a key concept. In an ideal environment there is one system image that has the capability to perform all functional roles. Rather than having a database image, a webserver image, a mail queue image, a chat server image, a proxy server image, a cache server image, a search indexer image, a data acquisition image, an FTP dropbox image, a storage node image, and a monitoring machine image, you have one image with all of the software installed.

In the ideal world this image contains only the software and core libraries: none of the data, kernel, or configuration. Kernel images are maintained based on hardware sets, data is stored in separate volumes (so that it can grow and shrink as needed), and configuration is stored in a separate volume and selected via DNS-SD, providing a consistent method for keeping code and configuration separate. The main system image is always mounted read-only and can be swapped out when software is upgraded. Rather than altering images in production, code changes are released by spinning up new images and cutting over at the network layer. Using consistent hashing and routing rules, you can release to subsets of the customer base without risking the entire infrastructure on a big release gamble.
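
Here is a minimal sketch of that kind of cutover; the host names, the 5% share, and the ring parameters are all illustrative. Customers hash deterministically, so a fixed subset lands on hosts running the new image while everyone else stays on the old one:

    import bisect, hashlib

    def _h(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    class HashRing:
        def __init__(self, nodes, vnodes=64):
            self._ring = sorted((_h(f"{n}#{i}"), n)
                                for n in nodes for i in range(vnodes))
            self._keys = [k for k, _ in self._ring]

        def node_for(self, key):
            idx = bisect.bisect(self._keys, _h(key)) % len(self._ring)
            return self._ring[idx][1]

    old = HashRing([f"web-{i:02d}.prod.local" for i in range(19)])
    new = HashRing(["web-canary.prod.local"])   # hosts running the new image

    def route(customer_id, canary_share=0.05):
        # Deterministically send ~5% of customers to the new image.
        ring = new if (_h(customer_id) % 1000) < canary_share * 1000 else old
        return ring.node_for(customer_id)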

GM (golden master) images in this context have the marvelous property of being truly idempotent. If I copy this image over that image, then copy this image over the same image again, it is still this image. Rsync allows one to update only the bits of an image that have changed. This is actually more efficient than running Puppet or Chef over a network, and it ensures each server is binary-identical in terms of its software stack. If all configs are the same as well, you know that you are running the exact same code in prod as in staging, QA, and dev. And if your environments are partitioned via DNS configuration and physically unified in a single fabric (say prod, staging, QA, and dev are all in EC2), you can be certain migrating between environments will not introduce additional risk. In fact it becomes possible to view the migration of system images as an image lifecycle. Dev images are not yet set in stone; QA images are undergoing validation but will not change; staging is the set of images that have passed validation and are awaiting customers to start using them; production is the set of images being used by customers.
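
A sketch of that push, assuming the golden master tree lives under a path like /images/gm-current/ and hosts are reachable over ssh (paths and host names are illustrative). Re-running it against an already-updated host transfers almost nothing and changes nothing:

    import subprocess

    def push_image(host, image="/images/gm-current/", target="/srv/system/"):
        # -a preserves permissions/ownership/links, -H preserves hardlinks,
        # --delete drops files that are no longer part of the image.
        subprocess.check_call(
            ["rsync", "-aH", "--delete", image, f"root@{host}:{target}"])

    for host in ("web-01.staging.local", "web-02.staging.local"):
        push_image(host)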

Notice that in this description it is not the images that change, only their identity with respect to how they fulfill business needs. Puppet and Chef don't have a role beyond development in this configuration, and even then it is a tenuous one.