DAVE'S LIFE ON HOLD

PgProc.lua and the Joy of Reflection

Tonight I've released a bit of code derived from the some of the work that went into Evangelizmo. Back in 2010, I wrote a set of C functions which bound the stored procedures in an application specific schema for Jawas to the globall namespace of the Lua or JavaScript runtime with the same names. With the LuaJIT FFI extension, I have ported the C + Lua code of the original to a pure Lua implementation as pgproc.lua. The design of the API is to minimize those layers of code necessary to maintain in multiple places for any application. Consider the code:

require('pgproc').bind('object')
print(object.create(' ')'create'))

This code will automatically connect to the database as specified by e DB_CONNECT_STRING environment variable, read the stored procedures out of the objects schema, and generate the appropriate Lua global object table with the associated functions as closures wrapping the stored procedure function call. If we create a object.create stored procedure in our database, that functionality is the immediately available by restarting the Lua process. It is possible to call bind(schema) at runtime to pick up additional changes as they occur. A long running process may periodically rebind the stored procedures every hour or so with little impact on performance.

If you've never used a stored procedure, you're missing out on one of the greatest database design tools available. Rather than relying upon every programmer being a SQL ninja who can explain and analyze every query and data model to optimum performance, you can separate concerns between different programmers. Let's say your Big Data Architect has a crazy shared database architecture with dozens of individual machines coordinated via a set of proxies. As modern Postgres servers can define external resources, which themselves may be other Postgres databases, your simple join statement in your SQL query could be unintentionally spanning multiple datacenters. How are you to know you can't safely make that join in production with risking taking down the company? After all, you're just trying to get a report to work for finance. Enter the stored procedure. Rather than dealing with the data model directly and writing SQL, you invoke a function stored in the database which had been hand crafted by a domain expert. As the complexity of the infrastructure grows and the volume of queries exceeds what a handful of smiple master/slave databases can handle, it becomes impractical for every member of every team which may consume that data to have to become a subject matter expert. Instead, the only way tomscalemthe human knowledge gap is to architect APIs with the limitations of the humans in mind. Rather than know how it all works, all you need to know is the signature of the function.

With that in mind, pgproc.lua is designed to make the transition from a DB engine running on one machine to a globally distributed cluster of machines as painless as possible. Typically stored procedures still require you write the tedious wrapper functions to interface with your database. Since the catalog tables in Postgres already know all about the stored procedures and we can query the catalog tables from SQL, we can automate the binding process. Similarly, as LuaJIT FFi knows how to automate lithe binding process to the C routines of libpq, pgproc.lua even doesn't go any further than translating the minimal number of C header bits to support building the wrappers. Most of the methods only exist to make it slightly easier to integrate the database connections with different configuration schemes. Once bound, however, you need never even think of the database again.

One of the cool applications you can create with pgproc.lua is a simple CRUD application. By implementing create, read, update, and delete procedures and binding to a schema associated to a data type, you can trivially store objects in the DB. When used in conjunction with projects like amqp.lua, you can further extend he reach of the simple CRUD pgproc.lua wrapper to a highly available distributed application. In about 10 lines of code, you can wire up your entire world.