Red Echo

October 1, 2015

System programming is fun: introducing FLEET

I couldn’t sleep the other night, so I spent a few hours coding up the foundation of a kernel for this new exokernel-style operating system concept I’ve been talking about, which I’ve decided to call ‘fleet’. (Trindle was the microkernel idea, now dead.) It’s a lot of fun – it feels like working on an embedded device, except the board just happens to have been designed by lunatics. I feel satisfied with my progress: the kernel boots the machine, configures memory and interrupts, spews log messages to the serial port, and enumerates the devices on the PCI bus.
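Just for flavor, here’s a minimal sketch of the kind of PCI enumeration involved, using the legacy configuration-space access mechanism at ports 0xCF8/0xCFC; the outl/inl port wrappers and the klog routine are placeholder names, not the actual kernel’s helpers:

    #include <stdint.h>

    extern void outl(uint16_t port, uint32_t value);
    extern uint32_t inl(uint16_t port);
    extern void klog(const char *fmt, ...);

    /* Read one 32-bit register from a device's PCI configuration space. */
    static uint32_t pci_read32(uint8_t bus, uint8_t slot, uint8_t func, uint8_t offset)
    {
        uint32_t address = (1u << 31)              /* enable bit */
                         | ((uint32_t)bus << 16)
                         | ((uint32_t)slot << 11)
                         | ((uint32_t)func << 8)
                         | (offset & 0xFC);        /* dword-aligned register */
        outl(0xCF8, address);
        return inl(0xCFC);
    }

    /* Walk every bus/slot and log whatever answers. */
    void pci_enumerate(void)
    {
        for (unsigned bus = 0; bus < 256; ++bus) {
            for (unsigned slot = 0; slot < 32; ++slot) {
                uint32_t id = pci_read32(bus, slot, 0, 0);
                if ((id & 0xFFFF) == 0xFFFF)
                    continue;   /* no device in this slot */
                klog("pci %u:%u vendor %04x device %04x\n",
                     bus, slot, id & 0xFFFF, id >> 16);
            }
        }
    }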

Since I’m treating the PC as an embedded device dedicated to a single application, this “rump kernel” is really more like a new flavor of the C runtime library than a traditional kernel. I don’t have to worry about paging, memory protection, or user/supervisor mode switches, and most of the usual concurrency problems just disappear. An application which needed those services could link them in as libraries, but I’ll worry about that later.
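To make that concrete, a boot path in this style can be little more than the following sketch – placeholder names throughout – which sets the machine up and then jumps straight into the application’s main() in a single flat, ring-0 address space:

    /* Hypothetical setup routines; the real kernel's names will differ. */
    extern void gdt_init(void);    /* flat segments, ring 0 only */
    extern void idt_init(void);    /* hook up interrupt handlers */
    extern void heap_init(void);   /* one big flat heap, no paging */
    extern int main(int argc, char **argv);

    void _start(void)
    {
        gdt_init();
        idt_init();
        heap_init();
        main(0, 0);        /* the application *is* the system */
        for (;;) { }       /* nothing to return to, so just spin */
    }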

Once upon a time, when the world was young and people were still trying to figure out what you could do with a computer network, people tried to build abstractions that would represent remote services as though they were local ones. “Remote procedure call” was the concept of the day, and this really took off in the early days of OOP: the idea was that you’d have local proxy objects which transparently communicated with remote ones, and you’d just call methods and get property values and everything would be shuttled back and forth automatically.

This just plain doesn’t work, because the semantics of a remote call are nothing like those of a local one. You simply can’t make the fundamental constraints of concurrency, latency, and asynchrony disappear just by throwing a lot of threads around.

Modern interfaces are focused not on procedure calls, but on data blobs. Instead of making lots of granular, modal, stateful requests, machines communicate by serializing big blobs of data and streaming them back and forth at each other. This emphasizes bandwidth over latency, and focusing on large transactions rather than small interactions simplifies the problem of concurrent changes to remote state.

My plan is to take this idea out of the network and apply it inside a single PC. The rise of multicore computing has demonstrated that the traditional approaches don’t even scale within a single machine, once that machine is full of asynchronous processes competing for shared resources! In the ‘fleet’ world, rather than trying to represent remote resources with local proxies, we’ll represent local resources as though they were remote. There will be no DLLs and no system calls: the system API will be a folder full of wire protocol and data format specifications.

This solves the problem of network transparency from the opposite direction: since programs will already be communicating with local services through some network datastream interface, remote services will look exactly the same, except for the higher latency and lower reliability.

I believe that this approach will substantially improve the security picture, since the absence of any shared memory or common filesystem limits the damage a single compromised program can do to the rest of the machine. Hypervisors seem to be holding up well in practice as isolation boundaries. Of course there’s nothing that would prevent a single ‘fleet’ process from spawning its own subprocesses and reintroducing all those concerns – the fleet shell would be perfectly happy to run Linux as a subprocess, for that matter – but it’ll be easier to use the hypervisor interface and spawn “sub”-processes as independent virtual machines.

Requiring each program to include drivers for every possible hardware device would be madness, and slow madness, since device emulation is tricky and expensive. These programs are never going to be run on bare metal anyway, so I’m going to ignore all legacy PC devices and define the ‘fleet’ system interface as consisting solely of virtio devices. These devices all have a simple, standardized I/O interface, so it should be no problem to build drivers for six or eight of them into my kernel-library. I’ll offer an efficient low-level I/O API for nonblocking DMA transfers. All the clunky, synchronous, blocking C APIs can be implemented on top of that.
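One plausible shape for that split – purely a sketch, with every name and signature invented for the example – is an asynchronous transmit primitive that queues a buffer and returns immediately, plus an ordinary blocking write() layered on top:

    #include <stddef.h>
    #include <stdint.h>

    typedef void (*io_callback)(void *context, size_t bytes_done);

    /* Queue a DMA transfer on a virtio device and return without waiting.
     * The callback fires from the device's interrupt handler once the
     * buffer has been consumed. (Hypothetical interface.) */
    extern int io_transmit(int device, const void *buf, size_t len,
                           io_callback done, void *context);

    /* Park the caller until *flag becomes nonzero. (Hypothetical.) */
    extern void io_wait(volatile int *flag);

    struct write_state { volatile int done; size_t written; };

    static void write_done(void *context, size_t bytes_done)
    {
        struct write_state *s = context;
        s->written = bytes_done;
        s->done = 1;
    }

    /* A traditional blocking write(), built on the nonblocking primitive. */
    long write(int device, const void *buf, size_t len)
    {
        struct write_state s = { 0, 0 };
        if (io_transmit(device, buf, len, write_done, &s) != 0)
            return -1;
        io_wait(&s.done);
        return (long)s.written;
    }

The point of the split is that nothing in the kernel-library ever has to block on a caller’s behalf; blocking is just a convenience wrapper the application opts into.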

Looking at this system from above, it’s clear that making this fleet of VMs do useful work is going to involve a lot of datastream routing. I’m still working on the details, but I’m thinking that each program will have to include a compiled-in manifest describing the connections it wants to make and accept and the protocols it wants to speak over them. Fixed connections like ‘stdin’ and ‘stdout’ can be represented as serial ports, while other traffic can be identified by IP port numbers.
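For illustration only, such a manifest might boil down to a compiled-in table along these lines; the structure, field names, and protocol strings are all invented for the example:

    #include <stdint.h>

    enum link_kind {
        LINK_SERIAL,   /* fixed streams like stdin/stdout, carried as serial ports */
        LINK_IP_PORT   /* everything else, identified by an IP port number */
    };

    struct link_decl {
        enum link_kind kind;
        uint16_t port;          /* IP port number, or serial port index */
        const char *protocol;   /* name of the wire protocol spoken on this link */
    };

    /* Connections are declared at build time; whatever does the datastream
     * routing (the shell, presumably) would read this table when wiring
     * VMs together. */
    static const struct link_decl manifest[] = {
        { LINK_SERIAL,  0,  "text/plain" },   /* stdin  */
        { LINK_SERIAL,  1,  "text/plain" },   /* stdout */
        { LINK_IP_PORT, 80, "http"       },   /* serves HTTP to whoever connects */
    };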

I have no idea how far I’ll get with all this, but this kind of low-level hackery is my old stomping ground and I’m having a great time at it, so I’ll probably stick with it long enough to build a proof of concept. Something that boots into a shell where you can manipulate a filesystem and pipe data between programs, with a little monitor that lets you see what all the VMs are doing – that should be fun.