Christopher B. Browne's Home Page
cbbrowne@acm.org

8. Novel OS Work

Some OS work has produced more unusual results. This stuff is really not ready for users, as it takes quite a bit of thought to get your mind around the consequences of the design concepts.

The big concept that I keep seeing these days is that of making OSes "object oriented," and then having data persistence, which potentially eliminates the need for "files" and "file systems." (The other aspect is that people keep thinking that they need to reimplement Linux in C++ "to make it object-oriented." Unfortunately, G++ is not nearly as robust as GCC, and not much value would be provided by the exercise overall.)

While the notion of not needing to load things is pretty neat, I am not convinced that file systems are truly dinosaurs that ought to be discarded.

My thesis in favor of file systems is that in order to organize the sets of objects in a coherent fashion so that they are manageable, it is necessary to have some sort of universal "object hierarchy," and have a "universal" set of tools that can manipulate those objects.

A file system is a very powerful sort of hierarchy (a rooted tree) that should be able to represent anything that can be put into a hierarchy. Plan 9 is the seminal system that tries to represent "everything" as a hierarchy looking like a file system. This is discussed in the paper The Use of Name Spaces in Plan 9. Hurd extends this by proposing to attach DBMS-like "views" to file hierarchies. A hierarchy presented this way can be called a "Virtual File System," or "VFS."

Similarly, a file can be treated as either the source or destination of a stream of data. The Unix Philosophy urges the use of this abstraction, which is useful in a way that is quite different from the "object oriented" notion of grouping objects.
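To make the stream idea concrete, here is a minimal sketch in Python. The function name upper_filter is made up for illustration; the point is only that the filter knows nothing about where its data comes from or goes to, so long as both ends behave like files.

```python
import io

def upper_filter(source, dest):
    """Copy lines from source to dest, upper-casing each one.

    Both arguments are only assumed to be file-like: an open disk file,
    sys.stdin/sys.stdout, a pipe, or an in-memory buffer all work.
    """
    for line in source:
        dest.write(line.upper())

# Here the "files" are in-memory streams, but the filter does not care.
src = io.StringIO("plan 9\nhurd\n")
dst = io.StringIO()
upper_filter(src, dst)
result = dst.getvalue()
```

Swapping src for sys.stdin and dst for sys.stdout turns the same function into a classic Unix filter, with no change to the function itself.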

It seems to me that an object system that can't be rendered in a VFS fashion is likely a spaghetti of not-particularly organized material whose representation is problematic at best. I have observed this in action; SAP's R/3 software stores all programs in a database in the form of a single enormous table. Unfortunately, you lose all sense of hierarchy in the process. There are a variety of other tables that cross-reference related components, but one loses the advantage of having a single powerful representation.

Traditional FORTH systems also suffer somewhat from this problem; they store both programs and data in the form of "BLOCKS," which typically are 1024-byte blocks of disk space. A neat aspect of this is that it is trivial to create virtual memory systems on top of this; with a decent-sized BLOCK cache, "disk" accesses become extremely fast. Unfortunately, anyone who touches the system has to expend quite a lot of effort on "block management," manually creating whatever hierarchy might be wanted.
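The BLOCK cache idea can be sketched roughly as follows. This is a Python illustration under my own assumptions, not how any particular FORTH does it: the class name BlockCache, the least-recently-used eviction policy, and the capacity parameter are all invented here, and update takes an explicit block number, whereas FORTH's UPDATE marks the most recently accessed block.

```python
import io
from collections import OrderedDict

BLOCK_SIZE = 1024  # the typical FORTH block size mentioned above

class BlockCache:
    """Hypothetical LRU cache of fixed-size blocks over a file-like 'disk'."""

    def __init__(self, backing, capacity=4):
        self.backing = backing        # seekable binary file-like object
        self.capacity = capacity      # maximum number of cached blocks
        self.cache = OrderedDict()    # block number -> [bytearray, dirty flag]
        self.disk_reads = 0

    def block(self, n):
        """Return the buffer for block n, reading it from disk only on a miss."""
        if n in self.cache:
            self.cache.move_to_end(n)          # mark as most recently used
            return self.cache[n][0]
        self.backing.seek(n * BLOCK_SIZE)
        raw = self.backing.read(BLOCK_SIZE).ljust(BLOCK_SIZE, b" ")
        self.disk_reads += 1
        self.cache[n] = [bytearray(raw), False]
        if len(self.cache) > self.capacity:
            old_n, (data, dirty) = self.cache.popitem(last=False)
            if dirty:                          # write back before forgetting
                self._write(old_n, data)
        return self.cache[n][0]

    def update(self, n):
        """Mark block n as modified so that it gets written back later."""
        self.block(n)
        self.cache[n][1] = True

    def flush(self):
        """Write every modified block back to the backing store."""
        for n, entry in self.cache.items():
            if entry[1]:
                self._write(n, entry[0])
                entry[1] = False

    def _write(self, n, data):
        self.backing.seek(n * BLOCK_SIZE)
        self.backing.write(data)

# Usage: an in-memory "disk" of eight blocks stands in for real storage.
disk = io.BytesIO(bytes(8 * BLOCK_SIZE))
cache = BlockCache(disk, capacity=2)
buf = cache.block(3)
buf[0:5] = b"HELLO"
cache.update(3)
cache.block(3)      # second access is served from the cache
cache.flush()
```

Note what is missing: nothing here says what block 3 is for. That bookkeeping is exactly the "block management" burden left to the user.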

There are some SNMP tools that allow you to look at network configuration as if it were a VFS. You can head to "subdirectories" that represent the hierarchy of network "features." This is a very nice example of the "wonders" of "everything as a file hierarchy."
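As a rough illustration of browsing management data like a directory tree, the Python sketch below walks a small table of path-like names. Everything in it is an assumption made for illustration: the SAMPLE data, its layout (simplified, not the real MIB tree), and the listdir and read helpers. It does not talk to any SNMP agent.

```python
from pathlib import PurePosixPath

# Made-up sample data; a real tool would fetch these values from an agent.
SAMPLE = {
    "system/sysName": "gateway",
    "system/sysLocation": "machine room",
    "interfaces/ifNumber": "2",
    "interfaces/ifTable/1/ifDescr": "lo",
    "interfaces/ifTable/2/ifDescr": "eth0",
}

def listdir(tree, directory=""):
    """Return the sorted names found directly under `directory`."""
    base = PurePosixPath(directory).parts
    names = set()
    for path in tree:
        parts = PurePosixPath(path).parts
        if parts[:len(base)] == base and len(parts) > len(base):
            names.add(parts[len(base)])
    return sorted(names)

def read(tree, path):
    """Return the value stored at a leaf 'file'."""
    return tree[path]

top = listdir(SAMPLE)
ifaces = listdir(SAMPLE, "interfaces/ifTable")
name = read(SAMPLE, "interfaces/ifTable/2/ifDescr")
```

With a presentation like this, ordinary tools such as ls, cat, and grep would be enough to explore the configuration.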

It would be nice if RDBMSes such as Oracle and Sybase had a "file system" representation that presented system configuration, including tuning parameters, in the form of a VFS. Note that some people working with PostgreSQL are pursuing much this sort of thing via what they are calling Storage. It is not clear whether they intend to merely provide read-only access, or if they will provide the ability to create records, tables, and such by creating "files" on the filesystem. Ensuring consistency when creating a complex database object would be quite a challenge.

The docfs proposal, Unified Documentation Storage and Retrieval for Linux Systems, would use a VFS to manage system documentation; documentation would be created using SGML and placed in an appropriate directory; file requests to /usr/man would result in a daemon creating manual pages from the SGML sources. (Probably with assorted caching to keep the system quick...)
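The "render on request, with caching" idea might look something like the following Python sketch. This is my guess at the general shape, not the docfs design: the DocCache class, the choice to key the cache on a hash of the source, and the use of str.upper as a stand-in for a real SGML-to-man-page renderer are all assumptions.

```python
import hashlib

class DocCache:
    """Hypothetical on-demand renderer that re-renders only changed sources."""

    def __init__(self, sources, render):
        self.sources = sources   # page name -> source text (stand-in for SGML files)
        self.render = render     # function turning source text into a page
        self.cache = {}          # page name -> (source digest, rendered page)
        self.renders = 0

    def read(self, name):
        """Return the rendered page, rendering only if the source has changed."""
        source = self.sources[name]
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        cached = self.cache.get(name)
        if cached is not None and cached[0] == digest:
            return cached[1]
        self.renders += 1
        page = self.render(source)
        self.cache[name] = (digest, page)
        return page

# Usage: str.upper is only a placeholder for a real renderer.
sources = {"ls.1": "list directory contents"}
docs = DocCache(sources, render=str.upper)
first = docs.read("ls.1")
second = docs.read("ls.1")            # served from the cache
sources["ls.1"] = "list files"        # editing the source invalidates the entry
third = docs.read("ls.1")
```

A real daemon would hook read into the file system layer, so that opening a file under /usr/man triggers it.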

Another approach that various projects have been considering is to store files in a fairly static fashion (sometimes with cryptic, made-up names so that users are not tempted to fiddle with them in that file structure), and then collect as much metadata as possible about those files and stow it in some form of database, whether a text management system, SQL relational database, or some other nonrelational database. You can then search for documents based on their various characteristics. If this includes some sort of near-full-text search, it can be quite potent. Beagle is a GNOME system made for this purpose; one front end for it is Dashboard, which monitors what you type, and then uses Beagle to search your documents for others that may be related.
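A toy version of the metadata-in-a-database approach, sketched in Python with SQLite, might look like this. It is not how Beagle works; the docs table, its columns, the cryptic stored names, and the crude LIKE-based substring search are all assumptions chosen to keep the example small.

```python
import sqlite3

def open_index():
    """Create an in-memory metadata index (a real one would live on disk)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE docs ("
        " stored_name TEXT PRIMARY KEY,"   # cryptic on-disk name
        " title TEXT, author TEXT, body TEXT)"
    )
    return db

def add(db, stored_name, title, author, body):
    """Record the metadata (and text) for one stored file."""
    db.execute(
        "INSERT INTO docs VALUES (?, ?, ?, ?)",
        (stored_name, title, author, body),
    )

def search(db, word):
    """Return (stored_name, title) pairs whose title or body mentions word."""
    pattern = f"%{word}%"
    rows = db.execute(
        "SELECT stored_name, title FROM docs"
        " WHERE title LIKE ? OR body LIKE ? ORDER BY title",
        (pattern, pattern),
    )
    return rows.fetchall()

db = open_index()
add(db, "f3a9c1", "Notes on Plan 9", "cbb", "Everything is a hierarchy of names.")
add(db, "07be44", "BLOCK management in FORTH", "cbb", "Blocks are 1024 bytes each.")
hits = search(db, "hierarchy")
```

The user never needs to know that the first document lives in a file called f3a9c1; the database is the only sensible way in, which is both the appeal and the risk of the approach.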
