DAVE'S LIFE ON HOLD

Towards a Better HTTP Server

So I have been playing around with different implementations of my Self HTTP server, and have a couple of different parsers. This is the "I can't tell you how many"th web server I've written. C, C++, Perl, Python, Ruby, Lua, JavaScript, Erlang, OCaml, Forth, Lisp, Smalltalk, and Java are just a few of the languages I've written web servers in over the years. Not because there aren't perfectly serviceable production web servers out there, but because I view that functionality as the de minimis for any networked application. Being able to handle a request for a resource over a TCP/IP connection just isn't so hard as to require a ton of code and a battle-worn library. Sometimes all you end up using it for is to say "ping".

With the Self implementation, I am using mmap, writev, and kqueue/epoll for all of the low-level socket wrangling. Based on my tests, I saturate my machine with 66% of the CPU in system space just flooding my network card (which craps out before anything else). So as with most projects, using a slow application language has no appreciable effect on production throughput. I saw similar behavior 12+ years ago with game libraries and Perl. SDL Perl with OpenGL bindings would crush my sound card and graphics chipset, adding only about 3% overhead to the runtime, but saving hundreds of hours in compile/run/debug cycles, which made it possible to build game-building tools on a budget as well. When time to market is directly correlated with the remaining funds in your personal bank account, you will take the 3% hit all the way to the bank.
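
For the curious, here is a rough C sketch of what that low-level plumbing looks like, assuming a Linux epoll loop and a canned response; the port and reply are made up for illustration, and it is an analogue of the approach rather than the actual Self implementation (a kqueue variant has the same shape):

    /* Rough sketch of the low-level plumbing: a listener driven by epoll,
     * answering each connection with writev(). Linux only; the port and
     * canned response are made up for illustration. */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/epoll.h>
    #include <sys/socket.h>
    #include <sys/uio.h>
    #include <unistd.h>

    int main(void) {
        int lfd = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr = { .sin_family = AF_INET,
                                    .sin_port = htons(8080),
                                    .sin_addr.s_addr = htonl(INADDR_ANY) };
        bind(lfd, (struct sockaddr *)&addr, sizeof addr);
        listen(lfd, SOMAXCONN);

        int ep = epoll_create1(0);
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = lfd };
        epoll_ctl(ep, EPOLL_CTL_ADD, lfd, &ev);

        const char head[] = "HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\n";
        const char body[] = "ping\n";

        for (;;) {
            struct epoll_event events[64];
            int n = epoll_wait(ep, events, 64, -1);
            for (int i = 0; i < n; i++) {
                int cfd = accept(lfd, NULL, NULL);
                if (cfd < 0) continue;
                /* Header and body live in separate buffers; writev()
                 * hands both to the kernel in a single system call. */
                struct iovec iov[2] = { { (void *)head, sizeof head - 1 },
                                        { (void *)body, sizeof body - 1 } };
                writev(cfd, iov, 2);
                close(cfd);
            }
        }
    }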

In doing this parser, I've been playing around with using built-ins vs. "doing what I'd do in a LLL". Most of these trade-offs have to do with canonicalization of strings. In Self, immutable strings are made canonical by storing them in a central lexicon. Like atoms in Lisp, identity comparisons can be used to short-circuit linear scans. This can be huge, as dynamic dispatch based on discovering a selector in a string can be worrisomely expensive. Self's two-stage compilation approach helps remedy this cost for code that is run more than once, but for data that is often never even looked at, I would prefer to never pay that cost at all.
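
The lexicon idea, shrunk down to a C toy with a hypothetical intern() helper and a fixed-size table, just to show why an identity comparison can stand in for a string comparison:

    /* The lexicon in miniature: run every string through intern() and
     * equal contents always come back as the same pointer, so later
     * comparisons are a single identity test instead of a linear scan.
     * Toy fixed-size table; the names here are made up. */
    #include <stdlib.h>
    #include <string.h>

    #define LEXICON_MAX 1024
    static const char *lexicon[LEXICON_MAX];
    static size_t lexicon_len;

    const char *intern(const char *s) {
        for (size_t i = 0; i < lexicon_len; i++)
            if (strcmp(lexicon[i], s) == 0)    /* pay the strcmp once */
                return lexicon[i];
        if (lexicon_len == LEXICON_MAX)
            abort();                           /* toy table is full */
        return lexicon[lexicon_len++] = strdup(s);
    }

    /* "GET" has exactly one canonical address, so dispatching on the
     * request method becomes a pointer comparison. */
    int is_get(const char *method) {
        static const char *GET;
        if (!GET) GET = intern("GET");
        return intern(method) == GET;
    }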

In C I would tend to use an offset and length struct to map out the interesting bits of the buffer. This means in a single linear scan of an HTTP request, I can parse the whole thing into tokens, and then use a handful of macros to access elements in a (char *, size_t) token array. It doesn't modify the source data, and it works well with partial requests, as you can always pause and resume parsing when you have more data. The size of the token array can be fixed to a hardware page size, and you can let the OS and paging hardware help map additional pages if you need more than a few hundred header entries.
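
Something along these lines, with hypothetical names, treating spaces and CRLF as the only separators of interest:

    /* One pass over the request buffer records where each token starts
     * and how long it is, without copying or touching the source data.
     * Hypothetical sketch, not the actual parser. */
    #include <stddef.h>

    typedef struct { size_t off, len; } token;

    /* Accessor macros: turn a token back into a (char *, size_t) pair. */
    #define TOK_PTR(buf, t)  ((buf) + (t).off)
    #define TOK_LEN(t)       ((t).len)

    static int is_sep(char c) { return c == ' ' || c == '\r' || c == '\n'; }

    /* Returns how many tokens were found. Because tokens are just offsets,
     * a partial request simply stops the scan, and the caller resumes from
     * *pos when more bytes arrive. */
    size_t tokenize(const char *buf, size_t len, size_t *pos,
                    token *toks, size_t max) {
        size_t n = 0, i = *pos;
        while (i < len && n < max) {
            while (i < len && is_sep(buf[i])) i++;      /* skip separators  */
            size_t start = i;
            while (i < len && !is_sep(buf[i])) i++;     /* scan the token   */
            if (i == len && start < len) {              /* token may be cut */
                *pos = start;                           /* re-scan it later */
                return n;
            }
            if (i > start) { toks[n].off = start; toks[n].len = i - start; n++; }
        }
        *pos = i;
        return n;
    }

Fixing the array to a page then just means deriving max from the page size, and letting another page get mapped in when a request turns up with an absurd number of headers.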

The Self-like parser makes pretty heavy use of the built-in tokenizing methods. asTokensSeparatedByCharactersIn: is a verbose, if not simple, way of doing things. It alone can slice and dice the tokens out of an HTTP header, though it is not at all useful for parsing a chunked data set. I haven't done much with pattern matching in Self, but that could be an interesting challenge in and of itself. The main downside I see with these messages is that you need to know ahead of time the completeness of your dataset. Backtracking requires copying portions of the buffer into new string objects and parsing those. In the trivial case where you receive all of the request in a single payload, the code is much simpler.
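
For contrast, a rough C analogue of that split-everything style (strtok standing in for the Self built-in, with a made-up helper name): it assumes the whole request is already in the buffer, and every token comes back as a fresh allocation, much like minting new string objects:

    /* Cut a complete buffer into malloc'd copies of each token. Simpler
     * than the resumable scanner above, but it assumes you already have
     * all of the data, and every token costs an allocation. */
    #include <stdlib.h>
    #include <string.h>

    /* Returns a NULL-terminated array of copies; *count gets how many. */
    char **split_all(const char *buf, const char *seps, size_t *count) {
        char *work = strdup(buf);            /* strtok needs a writable copy */
        size_t cap = 16, n = 0;
        char **out = malloc(cap * sizeof *out);
        for (char *t = strtok(work, seps); t; t = strtok(NULL, seps)) {
            if (n + 1 >= cap) out = realloc(out, (cap *= 2) * sizeof *out);
            out[n++] = strdup(t);
        }
        out[n] = NULL;
        free(work);
        *count = n;
        return out;
    }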

I look forward to evaluating how the creation and extension of Self objects impacts the performance. If the simplest solution is also the fastest, that poses a very interesting possibility. But if the complex solution is both faster and more robust, that would lead me to believe that an alternate set of abstractions might be in order.