Red Echo

October 24, 2015

A few solid hours hacking on fleet last night got me to a basic C “hello world”. The C library has everything from string.h and the simple character I/O functions from stdio.h, and the kernel has just enough of a driver interface to make simple file operations work. I’m using the legacy serial ports for now, since the drivers are trivial; stdin maps to COM1, stdout is COM2, and stderr writes to the host console via port E9.
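The E9 path is the easy one, by the way: QEMU (with its debug console enabled) and Bochs (with the port-E9 hack switched on) will echo any byte written to I/O port 0xE9 straight to the host console, so the whole "driver" is one port write per character. A rough sketch of what the stderr backend might look like – names are mine, not the actual fleet interface:

```c
/* Sketch of a port-E9 stderr backend (the QEMU/Bochs "debug console" hack).
 * Illustrative only; the real fleet hookup may differ. */
#include <stddef.h>

static inline void outb(unsigned short port, unsigned char val)
{
    __asm__ volatile ("outb %0, %1" : : "a"(val), "Nd"(port));
}

/* What the stdio layer might call when stderr needs to emit bytes. */
static void debugcon_write(const char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        outb(0xE9, (unsigned char)buf[i]);
}
```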

This is the part where it starts to get really interesting. If you handwave past the fact that most of the C library isn’t there yet, it’s now possible to compile a couple of ordinary C programs, link them against the fleet library, run them in VMs, and pipe data between them in grand old Unix shell style. It’s all very normal, except that these processes are just as comprehensively isolated from each other as if they were running on separate physical machines.

October 21, 2015

This fleet project is a lot of fun, combining a shiny new idea with an excuse to take a crack at a lot of classic problems.

The next layer after the startup code should be something to do with drivers and the low-level kernel apparatus, but it all felt a bit vague, so I decided to start with the C standard library interface and work my way down, letting that drive the rest of the kernel architecture.

There are dozens of free C library implementations available, but I have not been able to find one that will work for my project. I don’t want POSIX support, don’t need hardware portability, and won’t have a Unix-style system call interface underneath. And while I’m building this in the style of an embedded firmware project, it’s actually designed to run on a purely virtual machine, so I don’t need or want a lot of code dealing with legacy PC hardware.

Oh, well, I’m writing my own C library. Of course I’ll fill it in with a lot of existing components, but this architecture is apparently weird enough that the framework is up to me.

I did write the string library myself, though, because I thought it would be fun. There sure is a lot of weirdness in there – it’s been 23 years since I learned C, and I can’t say I had ever noticed the existence of strspn, strxfrm, or strcoll – but now I’ve written ’em and built test suites too.
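strspn is a good specimen of the genre: it counts how many leading characters of a string are drawn from a given set. This isn’t fleet’s actual code, just the textbook approach:

```c
#include <stddef.h>

/* Length of the initial segment of s consisting only of chars in accept. */
size_t strspn(const char *s, const char *accept)
{
    size_t n = 0;
    for (; s[n] != '\0'; n++) {
        const char *a = accept;
        while (*a != '\0' && *a != s[n])
            a++;
        if (*a == '\0')     /* s[n] is not in accept: the run ends here */
            break;
    }
    return n;
}
```

So `strspn("127.0.0.1 rest", "0123456789.")` returns 9, the length of the leading dotted-digit run.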

October 19, 2015

I posted a thing: a piece of fleet called ‘startc’

I factored the lowest-level portion of the fleet code out as a standalone library which I’ve named ‘startc’ and posted on github. I also announced it on reddit.com/r/programming and on hackernews. Of course it feels a wee bit nerve-racking to post something experimental like this for the world to examine, but it’s a good exercise as it forces me to get all the loose ends tied up and to really think carefully about the interfaces between modules. So far the reception has been generally positive, which is nice. I have no idea whether anyone will actually use the library, but perhaps someone will get through the early stages of a similar project more quickly by looking at its source code, and that would make me feel good.

October 16, 2015

Treasure


October 15, 2015

According to The Death Clock, I have about a billion seconds left.

That… seems reasonable.


October 12, 2015

I spent four hours hanging a TV on the wall yesterday. Yes, really. I thought I’d simplify the project and save myself a bunch of work by purchasing a wall-mount swivel arm for the TV instead of building what I wanted from scratch.

As soon as I got started, it was clear that the wall-mount was designed to be mounted on a solid wood or brick wall (seriously? how many of those do you find in the USA?), so off I went to the hardware store for a plank and some lag screws. After some careful measuring and a lot of exploratory drilling, I found the right spot and bolted the anchor panel firmly into the studs.

Next, I discovered that the wall-mount was a little bit too small for the TV. What!? I thought I’d measured it before I ordered it! Well… the wall mount listed a diagonal measurement range which included the size of my TV, and its mounting bracket style was the same as that of the bracket I’d formerly used to attach the TV to the entertainment center, but it was designed for TVs with square bolt patterns and it just didn’t spread out far enough.

So… back to the hardware store, for another handful of bolts and some aluminum bars. I cut and drilled until I had a workable pair of adapter brackets.

Finally, I bolted the adapter brackets onto the TV, bolted the swivel-arm brackets onto the adapter brackets, screwed the swivel-arm brackets onto the arm head, and bolted the swivel-arm base onto the anchor panel, which I’d previously bolted onto the wall.

Sure saved myself a lot of work there!

October 6, 2015

The hypervisor is the new kernel.
The virtual machine is the new process.
The process is the new thread.
Virtual PCI devices are the new POSIX.

Shared mutable state does not scale.

October 1, 2015

Text editing as a wire protocol

I spend a lot of my computer time editing text files, and so I’ve thought a lot about how one might go about that in a system like Fleet. One approach would pack all possible editing services into a single, monolithic IDE, which could run within a single VM. It would mount the disk containing the files you want to work on, present a file browser, and let you edit away to your heart’s content.

There’s nothing wrong with that approach, and it wouldn’t be hard to build out of existing components, but it doesn’t really satisfy my sense of elegance. I’d rather find a way to plug my editing tools together like Lego bricks.

It’d be really convenient, for example, to separate the code that renders text on screen from the code that manages all the data and performs the edits. Text can be displayed in lots of different ways depending on the context (code? email? notepad? letter writing?), but the process of editing a text buffer is the same. Wouldn’t it be neat if I could write the editing engine once and just slap a bunch of different interfaces on it depending on context?

The Fleet philosophy says that every connection between components has to take the form of a wire protocol, but what kind of wire protocol would represent a text editor? That really isn’t the sort of thing client/server processes typically do!

It occurred to me, however, that Unix is full of command-line apps which accept commands typed in through a serial connection and produce output as text. There is an ancient program called ‘ed’, part of Unix since the 60s, whose user interface is basically a little line-oriented command language. What if we just redefined its interface as a wire protocol? A text-editing interface program would become a bridge, with one end connected to an “edit buffer service” and the other connected to a “terminal display service”.

This would allow multiplexing: one could have an arsenal of tiny, single-purpose editing tools which do their work by sending commands to an edit-buffer service. No need to keep reimplementing the edit buffer in every tool – just send some ed commands down the wire.
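Concretely, a tool would just write the same commands a human would type at ed and read back the replies. Something like this sketch, where the plumbing that connects `to_buffer` to a hypothetical edit-buffer service is hand-waved:

```c
#include <stdio.h>

/* Sketch: driving an edit-buffer service with plain ed commands.
 * How the FILE* gets connected to the service is glossed over. */
static void delete_lines(FILE *to_buffer, int first, int last)
{
    fprintf(to_buffer, "%d,%dd\n", first, last);      /* ed: delete a range */
}

static void append_after(FILE *to_buffer, int line, const char *text)
{
    fprintf(to_buffer, "%da\n%s\n.\n", line, text);   /* ed: append, end with '.' */
}
```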

The `ed` program was designed to edit text files, but considering its command language as a wire protocol, what we’re looking at in the abstract is simply an array of text chunks. There’s no reason the actual bits on disk have to be a flat text file: one could implement a different edit-buffer service for each kind of file format, allowing one to mix and match editor interfaces and buffer services.

We can take it further. `ed` commands consist of a line reference, an identifying char, and optional parameters if the command needs them. What if we could extend the line reference syntax and use the same protocol to manipulate multidimensional data?

The syntax currently makes no use of the colon character ‘:’, so I suggest that the editor wire protocol could be extended by allowing a sequence of indexes delimited by colons:

Current: `<line-reference>'X'<parameters>\n`

2D extension: `<line-reference>:<column-reference>'X'<parameters>\n`
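Parsing the extension is trivial. A sketch of the address-reading loop might look like this – purely illustrative, not a real fleet parser, and it ignores ed’s symbolic addresses like `.` and `$`:

```c
#include <stdlib.h>
#include <stddef.h>

/* Read up to max_dims colon-separated numeric indexes from *p, leaving *p
 * pointing at the command character. Returns how many indexes were found. */
static size_t parse_address(const char **p, long *idx, size_t max_dims)
{
    size_t n = 0;
    while (n < max_dims) {
        char *end;
        idx[n] = strtol(*p, &end, 10);
        if (end == *p)          /* no digits here: the address list is over */
            break;
        n++;
        *p = end;
        if (**p != ':')         /* no colon: that was the last dimension */
            break;
        (*p)++;                 /* skip ':' and go read the next index */
    }
    return n;
}
```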

One could thus write a generic table viewer which would speak this protocol, then plug it into an edit-buffer service representing a CSV spreadsheet file or an NCSA server log file. And of course there’s no reason you couldn’t continue stacking dimensions arbitrarily if you wanted an edit service backed by JSON or some other hierarchical format.

It might be worthwhile to define a read-only subset of the protocol, since some tools will be content to view data, and it would be useful to develop buffer services which present a common interface for exploring structured data even if it’s not practical to perform edits.

System programming is fun: introducing FLEET

I couldn’t sleep the other night so I spent a few hours coding up the foundation of a kernel for this new exokernel-style operating system concept I’ve been talking about, which I’ve decided to call ‘fleet’. (Trindle was the microkernel idea, now dead.) It’s a lot of fun – it feels a lot like working on an embedded device, except the board just happens to have been designed by lunatics. I feel satisfied with my progress; the kernel boots the machine, configures memory and interrupts, spews log messages to the serial port, and enumerates the devices on the PCI bus.

Since I’m treating the PC as an embedded device dedicated to a single application, this “rump kernel” is really more like a new flavor of the C runtime library than a traditional kernel. I don’t have to worry about paging, memory protection, or user/supervisor mode switches, and most of the usual concurrency problems just disappear. An application which needed those services could link them in as libraries, but I’ll worry about that later.

Once upon a time, when the world was young and we were still trying to figure out what you could do with a computer network, people tried to build abstractions that would represent remote services as though they were local ones. “Remote procedure call” was the concept of the day, and the idea really took off in the early days of OOP: you’d have local proxy objects which transparently communicated with remote ones, and you’d just call methods and get property values and everything would be shuttled back and forth automatically.

This just plain doesn’t work, because the semantics are totally different. You simply can’t make the fundamental constraints of concurrency, latency, and asynchrony disappear just by throwing a lot of threads around.

Modern interfaces are focused not on procedure calls, but on data blobs. Instead of making lots of granular, modal, stateful requests, machines communicate by serializing big blobs of data and streaming them back and forth at each other. This emphasizes bandwidth over latency, and focusing on large transactions rather than small interactions simplifies the problem of concurrent changes to remote state.

My plan is to take this idea out of the network and apply it inside a single PC. The rise of multicore computing has demonstrated that the traditional approaches don’t even scale within a single machine, once that machine is full of asynchronous processes competing for shared resources! In the ‘fleet’ world, rather than trying to represent remote resources with local proxies, we’ll represent local resources as though they were remote. There will be no DLLs and no system calls: the system API will be a folder full of wire protocol and data format specifications.

This solves the problem of network transparency from the opposite direction: since programs will already be communicating with local services through some network datastream interface, remote services will look exactly the same, except for the higher latency and lower reliability.

I believe that this approach will substantially improve the security picture, since the absence of any shared memory or common filesystem limits the damage a single program can do to the rest of the machine should it become compromised. Hypervisors seem to be holding up well in practice. Of course there’s nothing which would prevent a single ‘fleet’ process from spawning its own subprocesses and reintroducing all those concerns – the fleet shell would be perfectly happy to run linux as a subprocess, for that matter – but it’ll be easier to use the hypervisor interface and spawn “sub”-processes as independent virtual machines.

Requiring each program to include drivers for every possible hardware device would be madness, and slow madness since device emulation is tricky and expensive. These programs are never going to be run on bare metal anyway, so I’m going to ignore all legacy PC devices and define the ‘fleet’ system interface as consisting solely of virtio devices. These devices all have a simple, standardized IO interface, so it should be no problem to build drivers for six or eight of them into my kernel-library. I’ll offer an efficient low-level I/O API for nonblocking DMA transfers. All the clunky, synchronous, blocking C APIs can be implemented on top of that.
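I haven’t pinned that API down yet, but the shape I have in mind is roughly “submit a buffer, get a callback when the device is done with it”. Something like this, with every name invented purely for illustration:

```c
#include <stddef.h>

/* Speculative sketch of the low-level nonblocking transfer interface.
 * None of these names are final; this is just the shape of the idea. */
struct io_request {
    void   *buffer;                 /* memory the device will read or fill */
    size_t  length;
    void  (*on_complete)(struct io_request *req, size_t transferred);
    void   *context;                /* caller's cookie, handed back untouched */
};

/* Queue a transfer on one of a virtio device's queues; returns immediately.
 * The blocking C calls would sit on top: submit, then wait for
 * on_complete to fire before returning to the caller. */
int io_submit(int device, int queue, struct io_request *req);
```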

Looking at this system from above, it’s clear that making this fleet of VMs do useful work is going to involve a lot of datastream routing. I’m still working on the details, but I’m thinking that each program will have to include a compiled-in manifest describing the connections it wants to make and receive and the protocols it wants to use over them. Fixed connections like ‘stdin’ and ‘stdout’ can be represented as serial ports, while other traffic can be specified using IP port numbers.
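Maybe something along these lines, though it’s all speculation at this point and every name and field here is made up for the sake of the example:

```c
/* Speculative sketch of a compiled-in connection manifest. */
struct fleet_connection {
    const char *name;       /* e.g. "stdin", "stdout" */
    const char *protocol;   /* which wire protocol / data format spec applies */
    int         port;       /* IP-style port number, or -1 for a serial port */
};

static const struct fleet_connection fleet_manifest[] = {
    { "stdin",  "text-stream", -1 },
    { "stdout", "text-stream", -1 },
    { "editor", "ed-wire",     7001 },
};
```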

I have no idea how far I’ll get with all this, but I’m back in my old stomping grounds with all this low-level hackery and having a great time at it, so I’ll probably stick with it long enough to build a proof of concept. Something that boots into a shell where you can manipulate a filesystem and pipe data between programs, with a little monitor that lets you see what all the VMs are doing – that should be fun.