The Unix Operating System

Christopher Browne


Table of Contents
1. The Unix Philosophy
2. Unix Trademark and Standards
3. Unix FAQs and General Resources
4. Unix Shells - csh, ksh, bash, zsh, ...
5. Screen
6. Unix Hardware Vendors
7. Interoperability Tools
8. Networking Stuff
9. Unix-based Faxing
10. Unix-based Alphanumeric Paging
11. Random Unix Links
12. 4.4 BSD Lite-based Operating Systems
13. init
14. NFS - Network File System
15. Unix Haters
16. List of Unix Variants

There is not much doubt that I am something of a "Unix Bigot." To that end, here are a number of references to "Unix Stuff" out on the Internet.

Seeing as the Internet was mostly created and implemented on Unix-based computers (yeah, with some TOPS-10 and ITS and Multics and such, but they didn't survive the VAX days...), there is far more Unix material out there than any list could cover, so this one is inherently and extremely incomplete. Many of the products/software components available under Linux are also usable on a variety of Unix operating systems.

1. The Unix Philosophy

The following principles are extracted from the book The Unix Philosophy by Mike Gancarz, which I would commend to anyone who wants to understand why Unix is useful and powerful. They represent fundamental ways of taking advantage of Unix's strengths.

1.1. Dogmatic Tenets

  • Small is beautiful

    Small functional programs are readily understood, applied, and reused, and there is not as much complexity amongst which bugs can hide.

  • Make each program do one thing well

    Multiple useful programs can then be composed together to perform more complex functions.

  • Prototype as soon as possible

    Early confirmation or refutation of your assumptions is really valuable.

  • Choose portability over efficiency

    Effectiveness beats efficiency. If software works adequately, it will work even better on next year's hardware. And the real performance wins come out of the selection of better algorithms, which has little to do with portability.

  • Store numerical data in flat ASCII files

    Portable data is as important as portable code... Using text data files makes it easy to use powerful text-processing tools to manipulate the data.

    Note that in the realm of spreadsheets, the spreadsheet from Applix follows this rule, which can prove useful. You can examine files from the ApplixWare spreadsheet module (as well as the word processor module) using an ordinary text editor, and find them fairly readable.

  • Use software leverage to your advantage

    Don't force yourself or others to reimplement functionality; make it easy to reuse functionality.

  • Use shell scripts to increase leverage and portability

    Shell scripts don't need to be compiled or recompiled, and they automatically run faster and better whenever:

    • Hardware improves

    • Software improves (e.g. someone optimizes the shell itself; this happened with ksh, and it really needs to happen with Bash)

  • Avoid captive user interfaces

    Captive interfaces, which insist on interacting directly with a human, are something people will have to work around, since they cannot readily be scripted or composed with other programs. Tools like Expect may allow a developer to work around this, but it's painful and fragile.

  • Make every program a filter

    This makes it easy to build a set of filter components into "one big filter." Programs can instead communicate by explicitly reading and writing files, but that is much more complex to manage and requires a lot of disk I/O for interim output. A set of filters may pass data straight from memory buffer to memory buffer, avoiding the disk altogether, as the sketch below illustrates.
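
A minimal sketch of several of these tenets working together, assuming a hypothetical flat ASCII data file sales.txt with whitespace-separated fields "region product amount":

    # Sum the amount column per region, then report the top three regions.
    # Each stage is a small filter reading standard input and writing
    # standard output; no interim results ever touch the disk.
    awk '{ total[$1] += $3 } END { for (r in total) print total[r], r }' sales.txt \
        | sort -rn \
        | head -n 3

Dropped into an executable file beginning with #!/bin/sh, the same pipeline becomes a shell script that automatically benefits whenever the underlying tools or hardware improve, and its pieces recombine freely into entirely different pipelines.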

1.2. Lesser Unix Tenets

  • Allow the user to tailor the environment

  • Make OS kernels small and lightweight

  • Use lower case and keep it short

  • Save trees

    Keep text online, and use powerful tools to manipulate it.

  • Silence is golden

  • Think parallel

    SMP and networked parallel processing are easier to harness if we have a bunch of small pipe-connected components, rather than a single large monolithic process; a small sketch follows this list.

  • The sum of the parts is greater than the whole

    A set of programs that can be composed in unexpected ways is far more flexible than one monolithic one.

  • Look for the 90% solution

    Solving the next 5% probably costs more than the previous 90%; solving the next 5% after that costs more still.

  • Worse is better

    Cheap and effective beats technically superior but expensive. See the Gabriel essay, Lisp: Good News, Bad News, and How to Win Big. It describes a principle called "Worse is Better," and argues that this is why less-elegant (i.e. "worse") Unix systems beat out Lisp systems.

    There is an argument that Good Enough is Best (the essay has regrettably been lost), and that this implies that Windows NT will take over the world. There is some validity to the notion, but I am not so sure that Windows NT really is good enough to fulfil that promise...

  • Think hierarchically

    The notion of directory and file hierarchies is really powerful, and can be extended to the naming of devices, network services, graphical resources...
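
As a small sketch of "think parallel": the stages of any pipeline already run as concurrent processes, and tools like xargs can fan independent work out across several processes. (The -P flag is a GNU/BSD extension rather than strict POSIX, and the file names here are assumptions.)

    # Compress every .log file, running up to four gzip workers at once.
    # Each gzip is a small independent process; xargs does the coordination.
    printf '%s\n' *.log | xargs -n 1 -P 4 gzip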

1.3. Contrast Unix with Mainframes

Useful contrasts may be found between the above notions and those used in mainframe systems such as MVS.

Unix prefers to treat data as a "stream," whereas MVS prefers to work with "blocked" data. Different as these approaches are, both can deliver good performance, albeit by different means.

Unix "streaming" allows fast performance, since processes that "stream" can process data quite efficiently until they run out of input, at which point they stop, waiting for input from whatever process is feeding them.
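
A tiny illustration of the streaming style (some_producer stands in for any hypothetical upstream program):

    # The loop consumes lines the moment they arrive, and the read simply
    # blocks whenever the producer has nothing more to send.
    some_producer | while read -r line; do
        printf 'processed: %s\n' "$line"
    done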

Mainframe "blocking" similarly allows fast performance, albeit involving a fairly different approach to programming.

But Unix systems also let users use "signals" to change contexts, which allows the system to change course very quickly. "Unix aficionados" tend to dislike the mainframe "3270" interface, which doesn't feel very responsive in comparison to the instant feedback of Unix curses and X applications. However, the high level of interactivity comes at a price:

  • Interactive Unix programs run in what amounts to an "interrupt-driven" fashion, which can get rather expensive, as the CPU must manage lots of interrupts.

    Many Unix systems "change threads of execution" in a highly dynamic fashion. Interactive text editors and X-based applications are good examples of this sort of "dynamic" processing. This "dynamism" comes at a cost of added overhead; an interrupt-driven system that can manage 30 simultaneous connections could probably manage ten times as many connections if all requests were queued up and submitted in a "block" mode.

    Comparing that to the "real world," a traditional, fairly powerful Unix host could probably handle 30-odd users connected simultaneously doing interactive "text manipulation." That might drop to a half dozen if users are running X applications that generate context switches every time the mouse moves. An MVS host of roughly equivalent power could probably handle 300 simultaneous user connections without bogging down from an I/O perspective.

    This is because on the Unix box, every time a user strikes a key, an interrupt and a screen update result. If done over a network, a mere byte of input and a couple of bytes of output also result in on the order of 50-60 bytes of data transfer, with the associated network load. On the MVS box, the changes are all collected up and submitted as a single "block" when the user hits "Enter." It may not be user friendly, but it sure is quick.

    This suggests that, to improve the efficiency of applications, it is advantageous (when possible) to group many updates together and submit them to the host all at once; a slightly smarter client can thus provide substantial performance improvements (a small shell-level sketch of this batching idea follows this list). I hear that "little" Linux boxes that might have trouble supporting 20 interactive (telnet) users can support hundreds of users connecting to a suitably designed MUD server. The same is probably true for IRC (Internet Relay Chat) servers.

    Recent client/server applications such as SAP R/3, as well as other systems integrating Unix-based transaction processors/transaction monitors (such as BEA's Tuxedo product), try to take this approach, grouping database updates together and processing them in a sort of "block mode."

    Another example of this would be The Sabre Group's web-based travel reservation system, Travelocity. The Travelocity web site accepts user input from HTML "forms" (which "block" together a sort of transaction) and translates that into various "block mode" commands that are submitted to the mainframe-based STIN (Sabre Travel Information Network) system. Context switches on the mainframe are thus kept to a minimum.

    The typical MVS "blocking" scheme is highly analogous to the way Web CGI requests work with HTML forms. With CGI, form updates are collected together on the user's workstation without consuming any network or server resources, and are submitted en masse when the user hits the Submit button.

    Criticisms of CGI tend to relate to the fact that typical implementations "fork" a process to handle each form submission, which can get quite expensive. A suitably configured web server (such as Apache) in which the form processors are integrated into the server itself can handle this processing quickly and efficiently.

  • Unix systems allow dynamic resource allocation, whilst "mainframe" systems pre-allocate resources. This is true for CPU, memory, and disk usage.

    Dynamic resource allocation means that additional work must be done on a continual basis to manage the growing/shrinking set of resources, which can be costly.

    Of course, pre-allocation has two offsetting disadvantages:

    • If more resource is allocated than was needed, this diminishes the resources available to the rest of the system, perhaps slowing or blocking other work from being accomplished. (Which is why mainframe computers are indeed big iron.)

    • If insufficient resources were allocated, it may prove necessary to restart processing or even to redesign a process.

    Mainframe systems tend to have fairly sophisticated "resource queues" to allocate the various resources; maintenance of both queues and policies for "job submission" represents a lot of expensive "human" effort at mainframe sites.

    These costs are a substantial part of why companies try to move processes off of mainframes when possible. Unfortunately, in avoiding these costs, companies also give up the functionality those queues and policies provided.
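
A small shell-level sketch of the batching idea described above (the file names and record count are arbitrary assumptions):

    # "Interactive" style: the output file is opened, appended to, and
    # closed once per record -- many separate small operations.
    i=0
    while [ "$i" -lt 1000 ]; do
        echo "record $i" >> chatty.out
        i=$((i + 1))
    done

    # "Block" style: one redirection for the whole loop, so the output is
    # buffered and written in large chunks.
    i=0
    while [ "$i" -lt 1000 ]; do
        echo "record $i"
        i=$((i + 1))
    done > batched.out

The same data is written either way; the second form simply groups the work, which is the essence of the "block mode" argument.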

There is a product called FLEX-ES, an IBM System/370 emulator that runs on SCO Unix systems. It uses instruction-level emulation, and on "modern" Intel Pentium Pro processors it apparently performs roughly comparably to IBM's System/370 hardware.

According to mainframes.com, mainframes combine four important features:

  • Reliable single-thread performance, which is essential for reasonable operations against a database.

  • Maximum I/O connectivity, which means mainframes excel at providing for huge disk farms.

  • Maximum I/O bandwidth, so connections between drives and processors have few choke-points.

  • Reliability - mainframes often allow for "graceful degradation" and service while the system is running.

1.4. Another Take on Why Unix is Powerful

From a post by Nathan Hand:

The basic Unix model is that files are bags of bytes, and that the number of system calls is kept down by implementing things as files.

Modern hardware, be it scanners or printers or modems, all works with streams of bits, and such devices can quite easily be presented as files. The fact that they have been implemented that way on Unix proves the point: bags of bytes are a sufficient abstraction for a whole lot of stuff.
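
A minimal illustration, assuming a typical Linux system where /dev/urandom and /proc/cpuinfo exist:

    # A device node and a piece of kernel data, both read with the same
    # ordinary tools used on plain text files.
    head -c 16 /dev/urandom | od -An -tx1
    grep '^model name' /proc/cpuinfo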

Famous operating systems have made the system interface increasingly complex in attempts to address new technology. Unix has intentionally tried to keep the operating system extremely simple: at the simplest level that can abstract all the hardware.

You can always add complexity (object models, visual interfaces and multimedia types) at a higher level using libraries. The OS presents things at the lowest level possible: files and streams of bytes.

This level of abstraction is low enough to be both flexible and efficient, yet high enough to be easy to implement further code layers for additional abstraction. By making the OS simple, you also make it easier to make the system stable and fast.

1.5. Unix as Literature

Still another analysis, The Elements Of Style: Unix As Literature, suggests that Unix's preference for processing text goes along with literary interests.

I wouldn't argue with that; a former boss at Liberty RMS came from a background of studying and even teaching philosophy.

The Cognitive Style of Unix suggests, from a psychological angle, that the presence of a "learning curve" is a good thing, and that we attempt to remove it at the peril of preventing people from learning.
