Christopher Browne's Web Pages
Prev	The Internet	Next

2. HTTP - The Protocol of the World Wide Web

The Web involves three really important things:

Web Browsers, that generate queries for information,
Web Servers, that respond to the queries, providing responses, and
The HTTP protocol by which this communication of information takes place.

HTTP is not the same thing as the HTML language used to represent information; HTTP primarily involves passing the names of "objects" and their values back and forth. While there may be disagreement as to how HTML ought to be defined, people are pretty clear on the HTTP definition...

There is, however, a proposal for HTTP-NG - Next Generation Hypertext Transfer Protocol.

2.1. Web Servers

Dmoz.org on WWW Servers
Apache
Apache is the most popular web server in use on the Internet today, and is the software used by approximately 41% of all web servers. (Based on the Netcraft survey, February 1997.) It is not unlikely that the fact that common Linux distributions such as Red Hat, Debian, and Slackware install Apache by default has assisted in its growth. It is commonly run on Intel "boxes" running Linux or one of the BSDs, and provides fine performance for most purposes. Several SSL (Secure Sockets) implementations are available from commercial vendors.
There are a number of related projects that include "Apache on OS/2," integration of scripting languages, a news weekly called Apache Week, and the Apache Module Registry listing developments using the Apache API. This last indicates a cause for Apache's popularity. "Standard" modules provide support for such things as enhanced authentication, server side includes, and some support for external relational databases. Contributed modules provide support for integrated execution of code in languages such as Java, Perl, and Python, enhanced logging, and many ways of managing authentication that allow security to be handled using cookies and database queries.
PHP/FI
DAV Resources
GROUP.lounge
A tool for collaborative production of (apparently) documentation across the web, with WebDAV support.
sws
sws is a very simple web server implemented as a small shell script. No CGI support, but it nicely feeds static content of various types.
WN - Web Server
This is a "highly featureful" web server notable for providing fine-grained control over security. It tries to provide as "built-ins" functionality in the areas of document 'rewriting/inclusion' and text searching that obviates the need for the use of CGI for these sorts of purposes. Documents can present different "views" to clients based on such things as the client's IP address, domain name, browser type, browser "Accept" capabilities, and more. The reduced dependence on CGI improves performance while cutting down on possible security holes.
Patches are available to allow use of SSL in conjunction with WN.
Boa Web Server
This is a web server intended to provide high performance with limited hardware resources. It does this by having a small RAM footprint and minimizing the number of processes it has to fork(); it will only fork when running CGI code. Boa does not support SSL at this time. I've tended to use Boa rather than Apache due to it being tiny, and since it does not have the same security vulnerabilities. (Although the lack of SSL support might lead one to conclude other things...)
Mongrel2
The Language Agnostic Web Server
Nanoweb - The PHP Web Server
WEBrick - an HTTP server toolkit
:: lighttpd ::
thttpd
Cherokee Web Server
Xitami
Xitami is a combination web server/ftp server that runs on various platforms including Win32.
Google search for +Web +Servers
AllegroServe is a web server developed for use with Common Lisp, particularly ACL .
The main purpose of AllegroServe is to serve dynamic pages using an html generator. It can dynamically generate web pages and "web-enable" existing applications with a browser front-end.
The code is licensed under the LGPL, and presumably could be fitted to work with one of the free Common Lisp implementations.
Lisp Server Pages - LSP
LSP is a Common Lisp- based dynamic content generation facility. It features a syntax that simultaneously offers the semantics of SGML and Common Lisp in the source LSP page, and the ability to compile the page.
Araneida - web server implemented in Common Lisp
HTTP server inside Linux kernel
This web server may provide the fastest conceivable service for static web pages.
- kHTTPD for Linux
- AOLserver
  A multithreaded web server with integrated Tcl .
- Twisted Matrix Labs
  A networking engine implemented in Python
- Koala
  HTTP Server written in Dylan . Provides a template engine analogous to JSP , and an XML-RPC server.
- AWS - Ada Web Server
  Web server software for developing web applications in Ada . This includes a SOAP implementation.
- publicfile
  The djb HTTP server... Minimalist, difficult to exploit... difficult to figure out licensing issues...

2.2. Special Purpose Web Servers

Squid home page - an HTTP Object Cache
Squid caches web pages in memory, spilling to disk as needed, so that if a page is requested multiple times, this can be satisfied by one access to the original page. This cuts down on overall traffic. ISPs should run something like Squid to improve performance for users and cut down on use of scarce communications resources.
Caching Tutorial for Web Authors and Webmasters
Caching for Web Authors and Designers
Junkbuster
Its purpose in life is to filter out gratuitous advertising and cookies from web pages. Configuration files describe sets of "blocking" rules for both URLs and cookies. By blocking sites that are purely used for advertising, you eliminate the images that take so long to load. Junkbuster can be used in sequence with other proxy servers; I have configured it as the proxy "nearest" to the user.
More sophisticated approaches would be to, rather than blocking URLs, rewrite the HTML so as to:
- Remove the blank space resulting from the blocked image,
- Substitute another "more desirable" image in place of the blocked image, or even
- Reformat the web page in some other manner to make the contents more usable.
Parsers such as WebFilter have been created to do so. Unfortunately, this is a much more complex, CPU-intensive, and error-prone approach which also has the potential to bring up legal questions as it involves modifying the contents of copyrighted works.
Glimpse
A text search engine freely available in source code form.
GlimpseHTTP installation.
An interface that uses Glimpse to search documents, using web FORMs to provide a "graphical" front end.
The Anonymizer
A web server that eliminates identifying information about you when accessing other web sites.
NetShell - Customized Handling of WEB Information
EWS Home - Excite WorkStation
The Excite Search Engine is now making available a (probably stripped-down) version of their web search software that can be used to build indexes of local servers. Not unlike Glimpse
Transparent Proxy.
This is an HTTP Proxy that uses ipfwadm to intercept HTTP requests and pass them to the proxy. As a result, there is no need to set any environment (e.g. http_proxy) or internal browser variables in order to make Linux-based programs make use of the proxy server.
CL-HTTP Project List
CL-HTTP is a web server written in Common LISP
Acknowledgements for CL-HTTP efforts
GNU httptunnel
SMLserver
A web server plugin for AOLserver allowing use of Standard ML programs for building web sites.
PS-HTTPD - Web Server implemented in PostScript
Winning points as about the most bizarre implementation...
HTTPD in Awk
Also in the running is a web server implemented in Awk .
sedhttpd
If there is anything more bizarre than a web server in Postscript, it would be one in sed .
Make A Shorter Link
Sometimes you want to post a web link in a news article. Unfortunately, if it is a reference to (say) a Google -based news article, the URL may get exceedingly long. The above web site basically generates 9 digit lookup codes, so that if your URL is terribly long, you can shorten it to easily fit onto a line.
TinyURL.com - where tiny is better!
Which produces somewhat shorter links than makeashorterlink's...
APL Web Server
A web server implemented in APL...

2.3. The Joy of Proxy Servers

Things that a proxy server can buy you include:

Not blocking for DNS hostname resolution (probably provided by all proxies...)
Anonymizing your connection - JunkBusters
Changing the claimed client name - JunkBusters
Controlling the transfer of cookies - JunkBusters
Blocking images - JunkBusters
Caching Squid
Allowing a non-SSL browser to use SSL - EDSSL
Allowing end-to-end compression
e.g. - the "Apache knows how to gzip things when talking to a browser that knows how to gzip -d them...") - There's no such proxy just yet...

2.4. Webbed Library Applications

Open Lending Database Project
PHP / MySQL Private Library Management System
GLIBS - GNU LIBrary Management Software
This is coded as a web application using PHP, storing data using PostgreSQL.
It looks as though it would be fairly capable of being used to run a full service library, albeit with a couple of things making it unattractive:
- The code is pretty verbose, with huge amounts of HTML tagging hard-coded in echo() statements;
- When I installed it, the user authentication code didn't seem to work as planned to the point to which I disabled it.
- There are no facilities for pulling bibliographic information from external sources.
  It you have an ISBN identifier, it is possible to pull a lot of data from public sources such as book vendors, and this should be able to considerably streamline the data entry process, which involves a whole lot of manual data entry as currently constituted.
Mind you, if one were to do some scrubbing on it, it might not be difficult to clean up...

2.5. How I Use the World Wide Web...

I typically use the Lynx web browser to read HTML documents. I also fairly commonly make use of Netscape Communicator.

I use Lynx in conjunction with Software Agents written in scripting languages such as Perl, Python, and Guile to search the web for information on such things as stock and mutual fund prices and weather forecasts.

In order to have a single, common, web cache that is persistent and provides service for any web browser software that I might use, I have installed the Squid web cache package (aka "proxy server"). I also have used JunkBuster to "cleanse" my web feed somewhat...

2.6. Blogging Software

For blog fans, there are plenty of options out there...

Many use the combination of MySQL and PHP , such as b2evolution , WordPress Free Blog Tool and Weblog Platform
Fewer are database agnostic, allowing use of PostgreSQL and such, such as Serendipity Weblog System , Movable Type: Publishing Platform for Business Blogs and Professionals .
It is not uncommon for blog systems to eschew databases in favor of flat files, such as blosxom :: the zen of blogging :: .
There are also crazy people building blogs in Common Lisp such as cl-blog .
Into the Personal-Website-Verse
This essay points out that "social media" platforms like Facebook, Twitter, and such have become something of a "dumpster fire", where these centralized services are targets for "sensation, lies, hate speech, and noise". The major platforms have become "bad neighbourhoods."
It proposes that individuals may get their own signals better heard by publishing your own stuff at your own web site, not entrusting those folk to be your publisher. You can't expect to get "top rating" from Facebook, but that was never likely anyways.