Chapter 30. Introduction to XML
XML (Extensible Markup
Language) is a document encoding or markup standard that has been
approved by the World Wide Web Consortium (W3C). XML is not so much a
language in itself (as HTML is) as a set of rules for
creating other markup languages. In other words, it is a
metalanguage.
If this all sounds highfalutin to you, think of it this way: XML
provides a way for you to make up your own tags! This is a powerful
new tool for exchanging meaningful information.
Consider these two examples, the first using standard HTML markup,
the second using a markup language written according to the rules of
XML:
<p>Bobby Five</p>
<p>4456</p>
<p>111.32</p>
<name>Bobby Five</name>
<accountNumber>4456</accountNumber>
<balance>111.32</balance>
The XML version tells you a lot more about the information contained in
the tags. With meaningful markup tags, elements on the page aren't
just headings and paragraphs: they become useful data. So while this
information can be displayed on a page, it can just as easily be
stored in a database (which is a common use of XML-formatted
information). Using XML, various communities -- business groups,
scientists, trade associations -- may now define a markup language
to suit their particular needs for information exchange and
processing over the Web.
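
To make the "useful data" point concrete, here is a minimal sketch in Python (using the standard library's xml.etree.ElementTree module) of how a program might pull the values out of the XML example by tag name. The <account> wrapper element and the parse_account function are assumptions added for illustration; they are not part of the original example, but an XML parser expects a single root element around the data.

```python
import xml.etree.ElementTree as ET

# The three elements from the example above, wrapped in a hypothetical
# <account> root element (an assumption: a parser needs a single root).
xml_text = """<account>
  <name>Bobby Five</name>
  <accountNumber>4456</accountNumber>
  <balance>111.32</balance>
</account>"""


def parse_account(text):
    """Turn the XML markup into a plain dictionary, keyed by meaning."""
    root = ET.fromstring(text)
    return {
        "name": root.findtext("name"),
        "account_number": int(root.findtext("accountNumber")),
        "balance": float(root.findtext("balance")),
    }


record = parse_account(xml_text)
print(record["name"], record["balance"])
```

Because each value is labeled by its tag, the program can ask for the balance directly. With the HTML version, it would have to guess which of the three paragraphs holds which value.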
XML can also be used to indicate the structure of specialized
information that could not be represented using HTML alone, such as
musical notation and mathematical formulas. Chapter 27, "Introduction
to SMIL", illustrates how the XML-based language SMIL is
used to assemble multimedia presentations. Chapter 31, "XHTML",
discusses how the rules of XML have been
applied to the HTML authoring language. We'll look at other
examples of XML applications later in this chapter.
30.1. Background
The example at the beginning of this chapter highlights the
limitations of HTML. HTML was designed specifically for displaying
content in a browser, but isn't good for much else. When the
creators of the Web needed a markup language that told browsers how
to display web content, they used SGML guidelines to create HTML.
SGML (Standard
Generalized Markup Language) is a comprehensive set of syntax rules
for marking up documents and data that has existed since the 1980s.
It is the big kahuna of metalanguages! For information on SGML,
including its history, see http://www.oasis-open.org/cover/general.html.
As the Web matured, it became clear that there was a need for more
versatile markup languages. SGML provided a good model, but it was
too vast and complex; it had many features that were unnecessary and
wouldn't be used in the Web environment. XML is a simplified
and reduced form of SGML, tailored just for the needs of sharing
information over the Internet. It is powerful enough to describe
data, but light enough to travel across the Web. Much of the credit
for XML's creation goes to Jon Bosak of Sun
Microsystems, Inc., who started the W3C working group responsible for
scaling down SGML to its portable, Web-friendly form.
As of this writing, XML is in Version 1.0, which was first issued in
February 1998 and revised in October 2000. Various aspects and
modules of XML are still in development. For more information and
updates on the progress of the standard, see the W3C's site at
http://www.w3.org/XML.
One of the first things the W3C did once it had XML in place was to
apply it to the existing HTML specification. The resulting language
is XHTML, which is just HTML rewritten according to the stricter, yet
more extensible, rules of XML. For more information on XHTML, see
Chapter 31, "XHTML".
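
As a rough illustration of what "stricter" means in practice, the following Python sketch feeds two small snippets to an XML parser. Both snippets and the is_well_formed helper are made up for this example. Well-formedness is only one of the requirements XHTML places on a document, so treat this as a taste of the idea rather than a validity check.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# HTML-style markup: the <br> element is never closed.
html_style = "<p>First line<br>second line</p>"

# XHTML-style markup: the empty element is explicitly closed.
xhtml_style = "<p>First line<br />second line</p>"

print(is_well_formed(html_style))   # False
print(is_well_formed(xhtml_style))  # True
```

A browser would display both snippets the same way, but only the second follows XML's rule that every element must be closed.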
Copyright © 2002 O'Reilly & Associates. All rights reserved.