Christopher B. Browne's Home Page
cbbrowne@acm.org

3. Microkernel-based OS Efforts

So what is a Microkernel and why would you run Unix on top of another OS?

A Microkernel is a highly modular collection of powerful OS-neutral abstractions, upon which can be built operating system servers. In MACH , these abstractions (tasks, threads, memory objects, messages, and ports) provide mechanisms to manage and manipulate virtual memory (VM), scheduling, and interprocess communication (IPC).

This modularity enables scalability, extensibility, and portability not typically found in monolithic or conventional operating systems. Because the MACH microkernel is OS-neutral, different OS "personalities" such as Linux can be hosted on the microkernel. Perhaps the most significant advantage is that there is a focus on portability in the microkernel design and in decoupling that from the hosted OS.

Microkernels move many of the OS services into "user space" that on other operating systems are kept in the kernel. This has significant effects in the following areas:

The microkernel approach has been taken by Apple in creating a version of Linux that runs as a "personality" on top of MACH on PowerMacs. This should make it considerably easier to port Linux to additional system architectures, as MACH is considerably smaller than Linux.

Are there any downsides to the use of microkernels? Yes, indeed.

Probably the fundamental consideration encouraging microkernel development is the growth of interest in SMP and other multiprocessing applications. Simplifying ports to new architectures and encouraging compatibility with other OSes being close behind. And then there's the Hurd arguments...

Discussions of implementing Linux over a microkernel come up periodically in the Linux kernel/system design newsgroup; MkLinux represents some concrete steps in this direction. Many of these ideas can be applied to a non-microkerneled OS in at least limited ways; Linux "kernel modules" being a good case in point.

3.1. Another Comparison of Microkernels versus Other Architectures

On 17 Mar 1998 21:12:52 GMT, brianm@csa.bu.edu Brian Mancuso wrote:

The fundamental attribute that distinguishes monolithic vs. microkernel vs. exokernel architectures is what the architecture implements in kernel space (that which runs in supervisor mode on the cpu) vs. what the architecture implements in user space (that which runs in non-supervisor mode on the cpu).

The monolithic architecture implements all operating system abstractions in kernel space, including device drivers, virtual memory, file systems, networking, device/cpu multiplexing, etc.

The microkernel architecture abstracts lower-level os facilities, implementing them in kernel space, and moves higher-level facilities to processes in user space. Usually what distinguishes higher vs. lower-level os facilities is that which can be implemented in a platform independent manner vs. that which cannot. Following from that another distinguishing characteristic is what is sufficiently general to provide for various operating-system "personalities" and what is not. In microkernel architectures device-drivers, virtual memory, process/task/thread management/scheduling, and other such facilities are implemented in the kernel, and parallel facilities that specialize those facilities for the operating system's personality are implemented in user-space processes. Also implemented in user space are file systems, networking, etc. that employ the lower-level facilities provided by the microkernel (like device drivers).

In contrast, the exokernel architecture implements nothing in kernel space. The exokernel's sole purpose is to securely multiplex hardware resources among user-space processes. Device drivers, virtual memory, even cpu multiplexing and process management are implemented in user space. Supervisor-mode hardware events, like timer ticks, page faults, etc., activate stub handlers in the kernel that simply pass the event to a user-level process that implements the relevant facility's policy. The same system can simultaneously implement forward and inverted page tables, compute-job-friendly or interactive-job-friendly process scheduling, and an application can pick and choose which ever policies will provide it with the best performance.

The exokernel architecture is essentially the extension of the philosophy of RISC cpu architecture to the operating system level. The only exokernel architecture that I know of (MIT Aegis) has come up with some very novel ways to implement this.

Were I to implement my dream system it would be of exokernel architecture.

3.2. The Fundamental Challenge of Microkernels

In a 2003 discussion on alt.os.multics , there was some disagreement between Linus Torvalds and some of the Multicians. One of the very best comments was the following...

 

I can understand you pet peeve about microkernels and message passing when looking at Mach or Minix (not as well when looking at QNX , which performs quite well, and isn't bloated). On the other hand, a lot of "system" work on Linux gets done by efficient message passing; it's just a part of the system you are not involved in at all (hint: it's the X Window System). Why does message passing work quite well between X and the client? Because X manages to pack a lot of messages together before it actually switches tasks. That's because X's calls are asynchronous (with a few exceptions, some of them mistakes like XInternAtom()).

Unix's syscalls all are synchronous. That makes them a bad target for a microkernel, and the primary reason why Mach and Minix are so bad - they want to emulate Unix on top of a microkernel. Don't do that.

If you want to make a good microkernel, choose a different syscall paradigm. Syscalls of a message based system must be asynchronous (e.g. asynchronous IO), and event-driven (you get events as answers to various questions, the events are in the order of completion, not in the order of requests). You can map Unix calls on top of this on the user side, but it won't necessarily perform well.

 
--Bernd Paysan 

This seems pretty convincing. The L4 designers found indeed that message passing overhead led to their microkernels operating more slowly than monolithic kernels, running the same Unix APIs. If applications were redesigned to use asynchronous message passing, they might perform very well, but when they use the synchronous Unix APIs, performance will suffer.

This is also consistent with embedded applications running on QNX performing well: if you design the application to use its message passing APIs, it will work well.

3.3. Exokernels

Exokernels are a further extension of the microkernel approach where the "kernel" per se is almost devoid of functionality; it merely passes requests for resources to "user space" libraries.

This would mean that (for instance) requests for file access by one process would be passed by the kernel to the library that is directly responsible for managing file systems. Initial reports are that this in particular results in significant performance improvements, as it does not force data to even pass through kernel data structures.

They are presently still largely in "research mode;" there are not, at this time, any readily available implementations that one could run at home.

Google
Contact me at cbbrowne@acm.org