XML, the "eXtensible Markup Language," may be best described as a "simplified" version of SGML. It is a somewhat less "generalized" system for describing document tagging systems. It is less powerful than SGML but, being simpler, is far easier to parse, which is likely to make it more widely used, particularly out on the Web.
If XML is powerful enough to do useful things, and is easier to use to build HTML-like languages, that may seriously diminish the role of SGML.
On the other hand, if XML makes "good tagged descriptive mark-up" more accessible to "the masses," this may have the converse effect of promoting SGML.
In either case, it can provide a simplified route to the use of structural markup, which is certainly a Good Thing, particularly if it can discourage the use of proprietary binary data formats.
Initial reports suggest that Microsoft's support of XML will take the form of replacing RTF, the format used (often fairly poorly) to move documents between MS-Word and other tools, with some XML instance. That appears potentially rather subversive, which comes as no great surprise...
Or: "Why XML is Technologically Terrible, but You Have to Use it Anyway"; notice a fair bit of Lisp advocacy in this...
The SGML/XML Toolbook with worked examples and software of DTDs.
Bringing the Filesystem into the File: Making Data more Accessible
XML: The Annotated Specifications
A book in the "Charles Goldfarb Series" that documents the XML specifications in great detail.
Including The Next 700 Markup Languages.
Ten best bets for XML applications - IBM "White Paper"
Graydon Hoare ranting about XML...
So against this backdrop of incompatible image formats, word processor formats, illustration formats, financial data formats, hell even incompatible email systems, the announcement of XML as this sort of general purpose description format put the wrong idea in a lot of people's heads. The idea is this: if XML is an extensible standard, and we're stuck with a very bad headache of incompatible non-standards, why not try to convert all these non-standards into extensions of XML? It almost sounds like a reasonable question, but I'm going to try to illustrate here why you might decide not to use XML for something.
...
What all these DTDs have in common is that they need additional logic. They need supporting programs in the background to make sense of the XML.
...
This brings me to my central issue: many uses of XML right now benefit not at all from being encoded in XML. Frequently the encodings are of tabular, non-tree-structured data, and frequently it is numerical data for which there is a stunning speed and size penalty to be paid for stringification. Furthermore the encoding in XML encourages people to believe in 2 extremely dangerous falsehoods:
That once the DTD is written, the software somehow already exists to interpret the semantics
That as a result, the semantics can become much more complex without causing any trouble
This seems to apply eminently well to SOAP, where the assumption seems to be made that by using XML, they can trivially add all sorts of complicated layers without making the result so complex that it will crash down under its own weight.
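The size penalty for "stringification" of numerical data mentioned in the rant above is easy to see in a small experiment. The following is only a rough sketch: the element names `samples` and `sample` and the little binary layout (a 32-bit count followed by IEEE 754 doubles) are made up for illustration, and it measures size only, not speed.

```python
import struct
import xml.etree.ElementTree as ET


def encode_xml(values):
    """Encode floats as text inside hypothetical <sample> elements."""
    root = ET.Element("samples")
    for v in values:
        # repr() gives a string that converts back to exactly the same float
        ET.SubElement(root, "sample").text = repr(v)
    return ET.tostring(root, encoding="utf-8")


def decode_xml(data):
    return [float(e.text) for e in ET.fromstring(data).iter("sample")]


def encode_binary(values):
    """Encode floats as a little-endian count followed by 8-byte doubles."""
    return struct.pack(f"<I{len(values)}d", len(values), *values)


def decode_binary(data):
    (count,) = struct.unpack_from("<I", data)
    return list(struct.unpack_from(f"<{count}d", data, 4))


values = [i / 7 for i in range(1000)]
xml_bytes = encode_xml(values)
bin_bytes = encode_binary(values)

print("XML bytes:   ", len(xml_bytes))
print("binary bytes:", len(bin_bytes))
```

Both encodings round-trip the same thousand numbers; the XML form simply spends most of its bytes on tags and decimal digits. Note too that neither decoder knows what the numbers mean, which is the rant's point about DTDs needing supporting programs.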
XML.com: Adventures with OpenOffice and XML - on the set of XML schema for LibreOffice.
A wonderful characterization of the problem with "XML everywhere" is thus:
"I really am willing to eat humble pie here and admit that I'm mistaken if someone can give me a similar list of good reasons to not use XML for off-line hierarchically structured data." Any file is a hierarchy of some sort. We often see a file being a sequence of lines, a line being a sequence of fields or tokens, and tokens being a sequence of characters. In many, many, really many applications, this organisation in lines and fields is wholly satisfactory. Reusing the enumeration above, it is easy to parse, easy to validate, easy to edit, easy to query, easy to transform and easy to store. Let's be honest. People are comfortable with lines and fields, examples and tools merely abound. XML becomes more sensible when you have a lot of structure, something which is complex, difficult, and which you have to exchange with away parties. For simple things, it is just annoying and heavy overkill, really...
-- Francois Pinard
In effect, XML is mainly really useful when you get into applications involving complex structure where the data and code are all pretty much guaranteed to be hairy and ugly.
An XML-based language for addressing parts of XML documents, somewhat analogous to an "SQL for XML."
XLink Filter Project - Creating an Open Source XLink SAX Parser Filter
Read XML document, parse, and turn into a DOM tree
A tool for transforming relational databases into XML documents.
It takes an SQL Query, and then constructs both a DTD to represent the structure of the result table requested, as well as XML data that uses that DTD.
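The general idea can be sketched in a few lines of Python. This is not the tool's actual implementation, only an illustration of the technique: the function name `query_to_xml`, the `resultset` and `row` element names, and the use of SQLite are all assumptions, and the sketch further assumes that every column name in the query is a valid XML element name.

```python
import sqlite3
import xml.etree.ElementTree as ET


def query_to_xml(conn, sql, root_tag="resultset", row_tag="row"):
    """Run a query; return (dtd_text, xml_text) describing its result table."""
    cur = conn.execute(sql)
    columns = [d[0] for d in cur.description]

    # DTD: the root holds any number of rows, and each row holds
    # exactly one text-only element per result column, in order.
    dtd_lines = [
        f"<!ELEMENT {root_tag} ({row_tag}*)>",
        f"<!ELEMENT {row_tag} ({', '.join(columns)})>",
    ]
    dtd_lines += [f"<!ELEMENT {c} (#PCDATA)>" for c in columns]

    # Data: one row element per record, following the DTD above.
    root = ET.Element(root_tag)
    for record in cur:
        row = ET.SubElement(root, row_tag)
        for name, value in zip(columns, record):
            # SQL NULL is written as an empty element in this sketch
            ET.SubElement(row, name).text = "" if value is None else str(value)

    return "\n".join(dtd_lines), ET.tostring(root, encoding="unicode")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (name TEXT, size INTEGER)")
conn.executemany("INSERT INTO files VALUES (?, ?)",
                 [("impdom", 5387), ("notes", 12)])

dtd, xml_doc = query_to_xml(conn, "SELECT name, size FROM files ORDER BY name")
print(dtd)
print(xml_doc)
```

The DTD is derived purely from the column list of the result, so a different query yields a different DTD.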
TDTD Emacs Major Mode for editing SGML/XML DTDs
Conglomerate is a project to create a complete structured information authoring, management, archival, revision control and transformation system. Conglomerate uses XML semantics and powerful graphical editing, coupled with a centralised storage model and a flexible transformation language to create an environment which is easy to use, produces high-quality structured output, and lets the user target several output media with a single source document.
There are Win32 and Unix code bases, and this includes a graphical editor for documents.
An XML parser that represents 5K of Java code (hosted at SourceForge)
Cocoon is a 100% pure Java publishing framework that relies on new W3C technologies (such as DOM, XML, and XSL) to provide web content.
Vex is an editor for XML documents, based on the Eclipse platform. The "visual" part comes from the fact that Vex hides the raw XML tags from the user, providing instead a wordprocessor-like interface. Because of this, Vex is best suited for "document-style" XML documents such as XHTML and DocBook rather than "data-style" XML documents.
A Proposal for the Representation of XML DTDs as XML documents
This web site presents a proposal for Linux to use "XML everywhere," instead of the present "lines of text everywhere."
An xterm would be able to display XML in more intelligent ways, offering the option of allowing the screen to contain hyperlinks based on the contents of the file.
The canonical example would be that ls, rather than generating something like:

    total 30
    drwxr-xr-x   2 hd1adm   sapsys       512 Feb  4 17:01 ./
    drwxr-x---   7 hd1adm   sapsys      7680 Mar  1 10:42 ../
    -rwxr-xr-x   1 hd1adm   sapsys      5387 Jan 22 14:13 impdom*

would instead generate:

    <?xml version="1.0" encoding="utf-8"?>
    <dir>
      <size>30</size>
      <directoryentry type="dir">
        <attribs>drwxr-xr-x</attribs>
        <owner>hd1adm</owner>
        <group>sapsys</group>
        <size>512</size>
        <dates>99/02/04 17:01</dates>
        <name>./</name>
      </directoryentry>
      <directoryentry type="dir">
        <attribs>drwxr-x---</attribs>
        <owner>hd1adm</owner>
        <group>sapsys</group>
        <size>7680</size>
        <dates>99/03/01 10:42</dates>
        <name>../</name>
      </directoryentry>
      <directoryentry type="file">
        <attribs>-rwxr-xr-x</attribs>
        <owner>hd1adm</owner>
        <group>sapsys</group>
        <size>5387</size>
        <dates>99/01/22 14:13</dates>
        <name>impdom</name>
      </directoryentry>
    </dir>
An XML sort would sort based on named fields rather than forcing the user to parse out the physical positions. Thus, you might sort this by group ID via: ls | sort -recordkey directoryentry -by group-id
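To make the idea concrete, here is a minimal sketch of such a field-aware sort in Python. The function name `sort_records` and its parameters are hypothetical (they merely mirror the imagined `-recordkey` and `-by` options), and the embedded listing is a trimmed copy of the directory example above.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the directory listing shown earlier.
LISTING = """<dir>
  <size>30</size>
  <directoryentry type="dir">
    <group>sapsys</group><size>512</size><name>./</name>
  </directoryentry>
  <directoryentry type="dir">
    <group>sapsys</group><size>7680</size><name>../</name>
  </directoryentry>
  <directoryentry type="file">
    <group>sapsys</group><size>5387</size><name>impdom</name>
  </directoryentry>
</dir>"""


def sort_records(xml_text, record_tag, field, numeric=False):
    """Reorder the record_tag children of the root by a named child field."""
    root = ET.fromstring(xml_text)
    records = root.findall(record_tag)

    def key(record):
        text = record.findtext(field, default="")
        # A missing or empty field sorts as 0 when comparing numerically
        return float(text or 0) if numeric else text

    # Detach the records, then re-attach them in sorted order;
    # other children of the root (such as the total size) stay in place.
    for record in records:
        root.remove(record)
    root.extend(sorted(records, key=key))
    return root


sorted_root = sort_records(LISTING, "directoryentry", "size", numeric=True)
names = [r.findtext("name") for r in sorted_root.findall("directoryentry")]
print(names)
```

The caller names the record element and the field, so nothing depends on column positions or on how many spaces separate the fields.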
The other "neat thing" would be that the Xterm could be written to recognize and do useful things with XML tags. For instance, it could use different colors to display different file types, and associate actions with files. Thus, selecting the directory ../ and clicking a mouse button could be reinterpreted as cd ../. Other displayed things could similarly have useful actions associated with them.
ontology.org - A mostly-academic consortium proposing making XML representations of "everything."
Look up RAX - Record API for XML; initial implementation using Python.
Several Scheme implementations provide XML processors...
MzScheme includes an XML processing module.
WAP (Wireless Application Protocol) Binary XML Content Format
XEXPR - XML Expression Language
This essentially provides a Scheme language embedded in XML tags.
x++: The World's First XML-Based Programming Language
Fortunately, it's only available for Microsoft Windows at this point.