DAVE'S LIFE ON HOLD

Building Another Compiler In JavaScript

So I have begun writing another compiler in JavaScript. The prime motivation is that I just built a low-level data manipulation extension to Self.js, and felt the need to put it through its paces. I have Mach-O and ELF header generation code done from another compiler I did a few years ago, and have both 32-bit and 64-bit Intel machine code generators. So really all I need to do is put together a translation layer that packs the machine code into an ArrayBuffer object and uses the XHR2 support for POSTing ArrayBuffer data to a KV store. I can also use a base64-encoded data URL to produce download links in the browser like I did for Newscript's compiler, but most browsers have a practical limit to how large that buffer can be in memory.

The interesting bit of all this is that one could build a pretty printer that colorizes and formats source code in a web page. My Self.js HTML toolkit is built around the idea of extracting strings out of a flat file and generating the markup from them. It seems reasonable that I could build a site that has a program, can compile it, and documents it. With a little work, it could be possible to build bootable disk images that would run in VirtualBox. This could be as simple as generating a floppy image with a boot sector. USB boot devices would take substantially more work, but aren't impossible.
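To give a sense of how little the floppy case involves, here is a minimal sketch. It assumes a standard 1.44MB image (2880 sectors of 512 bytes) and a legacy BIOS boot, where the first sector must end with the bytes 0x55 0xAA. The name buildFloppyImage is hypothetical, and the two-byte "boot code" is just a jump-to-self placeholder, not real compiler output.

```javascript
const SECTOR_SIZE = 512;
const FLOPPY_SECTORS = 2880; // 1.44MB 3.5" floppy

// Build a zero-filled floppy image whose first sector holds bootCode
// and ends with the BIOS boot signature.
function buildFloppyImage(bootCode) {
  // The last two bytes of the sector are reserved for the signature.
  if (bootCode.length > SECTOR_SIZE - 2) {
    throw new RangeError("boot code does not fit in one sector");
  }
  const image = new Uint8Array(SECTOR_SIZE * FLOPPY_SECTORS);
  image.set(bootCode, 0);
  image[SECTOR_SIZE - 2] = 0x55; // boot signature, low byte
  image[SECTOR_SIZE - 1] = 0xaa; // boot signature, high byte
  return image.buffer;
}

// Placeholder boot code: "jmp $" (EB FE), an infinite loop.
const image = buildFloppyImage([0xeb, 0xfe]);
```

The resulting ArrayBuffer is just another binary blob, so it can be shipped off the same way as any other compiler output.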

What I find annoying in all this is that the browser still has no way to generate and save an untyped stream of bytes, but will download any file on the Internet and save it to disk. There is no security benefit to this arrangement; it merely makes it impossible to build useful offline web apps. Sure, I can store base64-encoded binaries in local storage, but only about 2MB worth. But if I have a memory image on a remote server, the browser will happily allow me to download gigs of binaries.

So, crazy thought time: what if I wrote a browser plugin that mmap'd a memory region PROT_READ|PROT_WRITE|PROT_EXEC and then jumped to it? I already have that code for one of my Forth VMs, with MacOSX and Linux versions. I could easily have the plugin fork a process and share the in-memory image. Sure, there's no sandbox, but I can whitelist only those sites you manually add to the module config. If someone infects a whitelisted site, it is no different from downloading and running a trojan. A clever scheme would even notify the user and allow them to abort any invocation.

After all, the problem isn't that people can run arbitrary code on their machines; the problem is that they can't.