Christopher B. Browne's Home Page
cbbrowne@acm.org

PostgreSQL

Christopher Browne


Table of Contents
1. PostgreSQL
2. Using PostgreSQL URIs
3. Backing up PostgreSQL Databases

1. PostgreSQL

PostgreSQL is an extensible "object-relational" database system, the most sophisticated and probably the best supported of the free databases. It has been chosen as Linux Journal "Editors' Choice" in 2001, 2003, 2004, and 2005.

It is an enhancement of the Postgres research RDBMS engine developed by Michael Stonebraker (creator of Ingres, one of the first commonly available relational databases, more recently a "senior technical dude" at Informix, and of late teaching at MIT). People liked Postgres enough that some decided to make its query language SQL-compliant, at which point, in keeping with other product names of the time, it was called Postgres95. Subsequent work dropped the "best before" dating.

There was a commercialized version of Postgres called Illustra; the company was bought out by Informix, and its object-oriented features have been integrated into Informix Universal Server. People occasionally try to draw out a "family inheritance" among the code bases of PostgreSQL, Informix, and Ingres due to Stonebraker's involvement with them all. While there may be some internal similarities, the respective projects developed largely independently of one another, and real similarities aren't likely to be overly conspicuous.

More "free third-party tools" have been created for PostgreSQL than for any of the other free databases, and include:

Strengths include:

Here are a bunch of interesting links:

1.1. Replication Systems

  • Slony-I

    A single-master, multi-slave replication system implemented in a combination of C and pl/pgsql, compatible with PostgreSQL versions 7.3.3 and later.

  • PeerDirect (Replication Software)

  • Whitebeam Replicate

    This is a reimplementation of the Perl-based DBMirror.pl replication system in C++.

  • Postgres-R

    Replication of PostgreSQL databases using Spread and Ensemble. Unfortunately, this code hasn't been integrated with recent versions of PostgreSQL. It is likely that Slony-I will be a more useful option.

    There's also a paper on this: [Don't be lazy, be consistent: Postgres-R, a new way to implement Database Replication]

  • PostgreSQL Replicator

  • Replication hooks for PostgreSQL GORDA

  • Challenges Involved in Multimaster Replication

    This article is mostly about the issues surrounding replication with Oracle; the challenges are quite representative of why multimaster replication is troublesome in general.

  • Why Clusters Usually Don't Work

    This article isn't directly about replication; it's about the kinds of problems characteristic of clusters which suggest that they are not panaceas, notably (and not all of this is directly in the article):

    • Hardware costs tend to multiply

    • Cluster behaviour is not the same as the single server alternative, so QA effort is multiplied

    • Due to the need to do extra work to synchronize data between nodes, performance is compromised.

    • Due to the need for more complex configuration and extra nodes, clusters are comparatively fragile. Careless administration can make the cluster "shatter."

    Sometimes particular applications are already designed in such a way that they cluster cleanly with little trouble. DNS is an example of such an application; its normal protocols define a consistent way of handling clusters of DNS servers. But this is highly unusual.

1.2. Interesting Extensions

  • pgpool

    pgpool is a connection pool server for PostgreSQL, standing between PostgreSQL clients and a server. Any PostgreSQL client may connect to pgpool as if it were a real PostgreSQL server. It can serve a number of functions:

    • pgpool caches database connections to reduce the overhead involved in establishing connections.

      It may also be used to diminish the number of connections that need to be opened, which can improve the efficiency of server usage.

    • pgpool may be used for failover purposes. If the "main" server goes down, pgpool can automatically switch connections over to the secondary server.

    • Replication

      Update queries may be applied to multiple servers, providing a sort of "poor man's replication" scheme.

    • Load balancing

      SELECT queries can be submitted in some form of "round robin" against multiple servers.

    There are plans afoot to build a more sophisticated scheme to dynamically configure pgpool in conjunction with Slony-I. With Slony-I, you may submit a request to shift the "master" role from one database to another, perhaps to handle a system failure, or to take a server out for maintenance. A useful enhancement to pgpool would be to have Slony-I tell it to defer all requests when that switchover process starts, and then to start submitting them to the new master as soon as it is ready. That would potentially allow applications to be switched to a new "master" more or less invisibly to the users.

  • ExtenDB - Cost-effective Data Warehousing and BI

  • Enumerated Fields in PostgreSQL

  • pgmemcache

    This ties PostgreSQL to Memcached. The general idea is that you have a transactional database as the "official store" for vital information.

    If you have applications that access this information very frequently, and where it isn't a disaster if what they see is not completely up to date, you can improve performance by having a distributed cache take the "heat" of the heavy queries. Readers first try to hit the cache; if they miss, they hit the database and draw the results into the cache. Updates must naturally invalidate the cache; either your applications explicitly mark entries as dead or repair them, or triggers on the tables are used for cache entry invalidation (a sketch of the trigger approach appears at the end of this item).

    This sort of algorithm has proven very useful for various sorts of query-heavy database-driven applications where entries are much more frequently read than they are written:

    • Encyclopedia-like applications like Wikipedia;

    • Discussion boards like Slashdot ;

    • The WHOIS service

    Another "home" for pgmemcache may be found here.
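
    As a rough sketch of the trigger-based invalidation approach, one might attach a function along the following lines to a table's update/delete triggers. This assumes the memcache_delete() function that pgmemcache is understood to provide; the table, trigger, and key scheme are purely hypothetical:

    create or replace function invalidate_customer_cache () returns trigger as '
    begin
        -- discard the cached copy keyed by primary key; the next reader will
        -- miss the cache and repopulate it from the table
        perform memcache_delete(''customer:'' || old.id);
        return old;
    end;
    ' language plpgsql;

    create trigger customer_cache_invalidate
        after update or delete on customer
        for each row execute procedure invalidate_customer_cache();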

  • PostgreSQL pgbash

    An extension of Bash that allows fairly direct scripted database access.

  • pgqueryspy - spy on PostgreSQL queries

  • pgnotifyd

    A daemon to assist monitoring LISTEN/NOTIFY activity.

  • pqc - PostgreSQL Query Cache

  • tPostgres

    This is a project (the degree of proprietary versus free is not entirely clear) to build extensions to Postgres that introduce SQL Server compatibility.

1.3. Tools

1.4. Programming PostgreSQL

PostgreSQL provides an interface for tying languages to the database server, allowing the Gentle User to develop stored functions in quite a variety of languages.

  • SQL - included internally in standard distribution

    For very simple requirements, this is doubtless the quickest and easiest "stored function" language.

  • PL/pgsql - included internally in standard distribution

    This is very well suited for running code that consists of sets of inter-referenced database queries.
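
    As a small illustration of that (the table and column names here are hypothetical), a PL/pgsql function can chain a couple of queries together entirely inside the server:

    create or replace function account_balance (integer) returns numeric as '
    declare
        total   numeric;
        pending numeric;
    begin
        -- one query feeding the next, without round trips to the client
        select coalesce(sum(amount), 0) into total
          from ledger where account_id = $1;
        select coalesce(sum(amount), 0) into pending
          from pending_entries where account_id = $1;
        return total + pending;
    end;
    ' language plpgsql;

    select account_balance(42);   -- invoked like any other SQL function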

  • C - included internally in standard distribution

    Writing C/SPI procedures is almost certainly the hardest way of writing stored functions, but since PostgreSQL is written in C, these procedures can access things deep inside the database engine that are inaccessible by any other means.

    In principle, the ability to compile code into optimized machine language might make this the fastest option for cases where stored functions need to run tight loops of integer or floating point arithmetic. That is not usually how one wants to use a database in a database application, though, so it would be unusual to be able to meaningfully harness that advantage.

  • Tcl - included internally in standard distribution

  • Perl - included internally in standard distribution

  • Python - included internally in standard distribution

  • Ruby

    Scripting languages tend to offer highly tuned string processing "engines" providing convenient ways to quickly munge around with strings.

  • PL/R User's Guide - R Procedural Language

    On using the R statistical language in conjunction with PostgreSQL.

  • Embedding R in Postgres

    Articles on how R was embedded in the PostgreSQL engine so that statistical functions implemented in R could directly access database data and then be exposed as SQL functions.

  • plPHP - programming in PHP

  • PL/sh handler

  • PL/Java

  • PL-J

    A Java Stored Procedure execution system for PostgreSQL.

  • PL/Scheme

    A Scheme Stored Procedure execution system for PostgreSQL using Guile.

  • PL/mono

    plMono is a PostgreSQL procedural language using the embedded Mono runtime. Would-be, at least; it seems more fable than reality...

    In principle, it could provide support for writing functions in C#, or any other language that targets .NET. At the time of this writing that includes: APL, C++, COBOL, Eiffel, F# (a Caml derivative), Forth, Fortran, Haskell, ML, and probably others.

  • pglib - Python /Twisted implementation of PostgreSQL protocol v3

  • Writing PostgreSQL Functions with PL/pgSQL

1.5. Tuning PostgreSQL

PostgreSQL often gets accused of being rather slow. Some of that reflects the way things were years ago; where it is observed nowadays, it is typically the result of running an installation with all of the defaults generated by the source distribution.

Note

The defaults are configured to be able to function on all platforms on which PostgreSQL runs, and are definitely not optimal for typical usage.

Fortunately, it is quite straightforward to achieve massive improvements by looking at a very few things. Here is a sort of "top ten" list of things that should be done to improve performance, and why:

  • Increase, in postgresql.conf, the value of shared_buffers.

    The defaults are low because on some platforms (Solaris and SGI, notably) it requires invasive action like recompiling the kernel in order to use larger values.

    The shared buffers are used to cache the database pages being read and updated. The default of 512K is quite likely to be low. The "rule of thumb" is to estimate the "right value" at between 5% and 15% of total system memory.

    Note that this value does not include any OS file caching that may take place; look below at effective_cache_size, which addresses that. Note also that any time records are updated, they must be pushed out to disk, invalidating both this cache and the OS's filesystem cache.

  • Increase, in postgresql.conf, the value of effective_cache_size.

    This parameter indicates to the query optimizer how much OS filesystem cache you expect to have available. On a server used solely for running PostgreSQL, this might represent most of your system's physical memory.

  • Run VACUUM on database tables frequently.

    Any time records are deleted or updated in a table, this leads to generating "dead tuples" for this table. They are not automatically cleaned out; that requires running VACUUM. If the table is not vacuumed, the postmaster will have to rummage through the dead data, only to discard it as useless. It may be no big deal if there are 10,000 dead tuples in a table with a million entries, but performance will be bad if (say) a table of financial balances has a few hundred live entries combined with thousands of obsolete dead tuples.

    At one time, VACUUM locked whatever table was being cleaned; that has long changed, so that it now does not block activity on the system. Seemingly paradoxically, the more often you run VACUUM, the faster it tends to run, and the more likely it is that it will do some good.

    If there get to be a really large number of dead tuples in a table, there may be more free space than the free space map can track, and a proper cleanup may require the other variant, VACUUM FULL.

    A recently developed program, pg_autovacuum, monitors database updates, and automatically initiates VACUUMs on just those tables that need cleaning.

  • Run ANALYZE on the database periodically.

    PostgreSQL's query optimizer is a cost-based system, and hence needs to have some statistics on what data the various tables on the system contain.

    Until you run ANALYZE, the system may have no better ideas than to do sequential scans of all the tables involved in queries.

    The same is true for other cost-based optimizers; I have also seen Oracle bog down horribly on queries until some statistics were built up, at which point those queries sped up immensely.
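
    Putting this and the previous item together, a nightly maintenance job might simply run something like the following (the table name is hypothetical); VACUUM ANALYZE does both steps in one pass:

    -- reclaim dead tuples and refresh the planner's statistics for one table
    vacuum analyze account_balances;

    -- or, across the whole database
    vacuum analyze;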

  • Move the WAL files to a separate physical disk.

    That is, set pg_xlog to point to a directory on a separate disk from the one the data is stored on. These files, which contain the most recent updates to the database, are heavily written, and if you can separate these updates from those concurrently taking place on database tables, you can get an easy performance boost, often on the order of 25%.

    It may also be helpful to ensure text logs are being written to a separate physical device.

    If your requirements are more sophisticated, it may even be worthwhile to move individual database files to separate disks if you have enough disk drives to do so.

    Mind you, if you have such sophisticated requirements, it is likely to make even more sense to build fairly sizable RAID arrays, at which point all the disk activity will automatically be heavily striped. In such a situation, it is fairly likely that separating certain database files onto specific disks would lead to less use of striping, and perhaps lower performance a little.

    My boss did some benchmarking where he compared having WAL and the rest of the data all "shared" on a big disk array with splitting off a couple of disks to just do WAL; he couldn't see any measurable performance difference. It does appear to be the case that the smarter the disk array, the less significant it is to try to finely attune your database's configuration to the exact set of files and filesystems.

  • Increase, in postgresql.conf, the value of sort_mem

    If you do a lot of complex sorts, and have a lot of memory, then increasing this parameter allows PostgreSQL to do larger in-memory sorts which, unsurprisingly, will be faster than disk-based equivalents.

    Mind you, this size is applied to each and every sort done by each user. Set it to 50MB, and have 30 users submitting queries, and you are soon using 1.5GB of real memory.

    This is where "Data Warehouse" systems, where users are submitting very large queries, can readily make use of multiple gigabytes of memory. 32-bit architectures, where there is a hard limit of 2GB per process, are starting to look pretty creaky. If AMD x86-64 systems make it inexpensive to deploy systems with tens of gigabytes of memory, that will be quite a big deal...

  • Reduce, in postgresql.conf, the value of random_page_cost

    If you have particularly fast disks, as commonly found with RAID arrays of SCSI disks, it may be appropriate to lower this figure, which will encourage the query optimizer to use random access index scans.

  • Increase, in postgresql.conf, the value of max_fsm_pages and max_fsm_relations

    The Free Space Map is used to track where there are dead tuples that may be reclaimed. You will only get effective nonblocking VACUUM queries if the dead tuples can be listed in the Free Space Map.

    As a result, if you do not plan to run VACUUM frequently, and if you expect a lot of updates, you should ensure these values are usefully large. It should be easy enough to set max_fsm_relations high enough; the problem that will more typically occur is when max_fsm_pages is not set high enough. Once the Free Space Map is full, VACUUM will be unable to track further dead pages. In a busy database, this needs to be set much higher than 1000...
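
Pulling the postgresql.conf items above together, the excerpt below sketches the sort of values one might try on a hypothetical dedicated database server with 2GB of memory. The numbers are purely illustrative; in the versions current as of this writing, shared_buffers and effective_cache_size are expressed in 8kB pages and sort_mem in kilobytes, so check the units for your release before copying anything:

    shared_buffers = 25600          # about 200MB, roughly 10% of system memory
    effective_cache_size = 131072   # about 1GB of expected OS filesystem cache
    sort_mem = 8192                 # 8MB per sort, per backend
    random_page_cost = 2            # below the default of 4, for fast disk arrays
    max_fsm_pages = 500000          # room to track dead tuples between VACUUMs
    max_fsm_relations = 5000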

This represents a set of actions that are highly specific to PostgreSQL; if you go through these activities, you should see substantial improvements in performance over the "default" settings.

After that, it also makes sense to look to additional techniques to tune specific queries. Typically, this involves examining the query plans for particular queries, and experimenting to see what changes improve those plans. This commonly involves adding indices that allow your queries to be efficiently selective about what records they retrieve. Evaluating the efficiency of queries is a bit of a "black art" where it simply takes time and practice to recognize the goodness of a query plan.
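
The usual starting point for that examination is EXPLAIN, or EXPLAIN ANALYZE, which actually runs the query and reports real timings alongside the plan; the table and index names below are hypothetical:

    -- show the plan the optimizer picks, along with actual run times
    explain analyze
        select * from transactions where customer_id = 1234;

    -- if this reveals a sequential scan over a large table, an index on the
    -- selective column is the usual first remedy
    create index transactions_customer_id_idx on transactions (customer_id);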

There is a growing set of literature on performance tuning:

1.6. Using SEQUENCES

When you define a UNIQUE column in a relation, there needs to be some way of assigning values to that column. And it is an exceedingly unattractive idea to force this on your application. In the domain registry arena, I have observed cases where domain registrars have been expected to assign unique IDs to objects, and there are always one or two that have trouble doing so coherently.

If we are creating a new employee, or a new transaction, or a new "whatever it is," and require that the object be unique, it is an excellent idea for the database to handle this. There are two typical methodologies:

  • Sybase "auto increment" fields

    This approach, also used by Microsoft SQL Server, Microsoft Access, and MySQL AB's MySQL, attaches this as an attribute of a column.

    The semantics may vary somewhat, but may be approximated by the notion that when a new tuple is created, the column is given the value MAX(column) + 1.

    This approach can lead to performance bottlenecks, since the database has to continually recalculate this aggregate. More problematically, it can lead to concurrency issues if there are a great many concurrent transactions on the table, leaving open the question of when the actual value for the column can be assigned to the tuple.

  • Sequence objects

    This approach is used by PostgreSQL and Oracle, and establishes a "sequence object" which provides an increasing sequence of values on demand.

    Unlike "auto increment", the value is maintained solely in the sequence object, eliminating aggregate-related issues.

    In the case of PostgreSQL, a sequence object can be configured to cache multiple values within a single backend process, so that if you are generating large numbers of values, a particular backend might "lay claim" to a range of 100 values all at once, making clashes between concurrent processes go away.

    A disadvantage to caching is that values can and will get skipped along the way. If you were expecting them to fall in some exact serial order, that expectation will fail.

SQL standards did not have any such concept until recently; the approach ISO took was to define sequence objects fairly similar to what PostgreSQL uses. You can create columns in relations of the SERIAL type; that automatically generates a sequence object and uses it to generate default values for the column.
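
For instance (names hypothetical), a SERIAL column amounts to a sequence plus a column default drawn from it:

    create table employee (
        -- SERIAL implicitly creates a sequence (here, employee_id_seq) and
        -- defaults the column to nextval() on it
        id serial primary key,
        name text not null
    );

    insert into employee (name) values ('Sample Name');
    select currval('employee_id_seq');   -- the id just assigned in this session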

There are two other rather clever effects that may be achieved with PostgreSQL sequences:

  • A sequence may be shared across multiple columns in multiple relations.

    So if you have multiple relations that may contain "transactions" that should all be uniquely identifiable, you might set up a transaction_id value that defaults based on a shared sequence object.
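
    A sketch of that arrangement (all names hypothetical):

    create sequence transaction_id_seq;

    create table invoices (
        transaction_id integer primary key default nextval('transaction_id_seq'),
        amount numeric
    );

    create table refunds (
        transaction_id integer primary key default nextval('transaction_id_seq'),
        amount numeric
    );

    -- ids drawn from the shared sequence never collide across the two tables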

  • You might want a sequence to cycle

    If you have five printers, and want to use them all equally, you might select a printer based on creating a sequence thus:

    create sequence printseq minvalue 1 maxvalue 5 cycle;

    To select a printer, you simply do select nextval('printseq'); which will cycle through the five values.
