1.1. The Web's Fall from Grace
Back in the dimly remembered early years of the Web (1990-1993),
HTML was a fairly lean little language. It was almost entirely
composed of structural elements that were useful for describing
things like paragraphs, hyperlinks, lists, and headings. It had
nothing even remotely approaching tables, frames, or the complex
markup we assume is a necessary part of creating web pages. The
general idea was that HTML would be a structural markup language,
used to describe the various parts of a document. There was very
little said about how these parts should be displayed. The language
wasn't concerned with appearance. It was just a clean little
markup scheme.
Then came Mosaic.
Suddenly, the power of the World Wide Web was obvious to almost
anyone who spent more than ten minutes playing with it. Jumping from
one document to another was no harder than pointing the mouse cursor
at a specially colored bit of text, or even an image, and clicking
the mouse button. Even better, text and images could be displayed
together, and all you needed to create a page was a plain text
editor. It was free, it was open, and it was cool.
Web sites began to spring up everywhere. There were personal
journals, university sites, corporate sites, and more. As the number of
sites increased, so did the demand for new HTML tags that would allow
one effect or another. Authors started demanding that they be able to
make text boldfaced, or italicized.
At the time, HTML
wasn't equipped to handle these sorts of desires. You could
declare a bit of text to be emphasized, but that wasn't
necessarily the same as being italicized -- it could be boldfaced
instead, or even normal text with a different color, depending on the
user's browser and their preferences. There was nothing to
ensure that what the author created was what the reader would see.
As a result of these pressures, markup elements like
<B> and <I> started
to creep into the language. Suddenly, a structural language started
to become presentational.
1.1.1. What a Mess
Years later, we are still living with the flaws inherent in this process.
Large parts of HTML 3.2 and HTML 4.0, for example, are devoted to
presentational considerations. The ability to color and size text
through the FONT element, to apply background
colors and images to documents and tables, to space and pad the
contents of table cells, and to make text blink on and off are all
the legacy of the original cries for "more control!"
If you want to know why this is a bad thing, all it takes is a quick
glance at any corporate web site's page markup. The sheer
amount of markup in comparison to actual useful information is
astonishing. Even worse, for most sites, the markup is almost
entirely made up of tables and FONT tags, none of
which conveys any real semantic meaning to what's being
presented. From a structural standpoint, these pages are little
better than random strings of letters.
For example, let's assume that for page titles, an author is
using FONT tags instead of heading tags like
H1, like this:
<FONT SIZE="+3" FACE="Helvetica" COLOR="red">Page Title</FONT>
Structurally speaking, the FONT tag has no
meaning. This makes the document far less useful. What good is a
FONT tag to a speech-synthesis browser, for example? If an author
uses heading tags instead of
FONT tags, the speaking browser can use a certain
speaking style to read the text. With the FONT
tag, the browser has no way to know that the text is any different
from other text.
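To make the difference concrete, here is a minimal sketch of what a
structure-aware program sees in each case. It uses Python's standard
html.parser module as a stand-in for a speaking browser; the
HeadingFinder class, the find_headings function, and the two sample
strings are hypothetical names invented for this illustration, not
part of any real browser.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingFinder(HTMLParser):
    """Collects the text found inside heading elements (H1 through H6)."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._open_heading = None  # name of the heading we are inside, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <H1> arrives as "h1".
        if tag in HEADING_TAGS and self._open_heading is None:
            self._open_heading = tag
            self._text = []

    def handle_data(self, data):
        # Only text inside a heading is kept; everything else is ignored.
        if self._open_heading is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == self._open_heading:
            self.headings.append("".join(self._text).strip())
            self._open_heading = None


def find_headings(markup):
    """Return the text of every heading element in the markup."""
    finder = HeadingFinder()
    finder.feed(markup)
    finder.close()
    return finder.headings


presentational = (
    '<FONT SIZE="+3" FACE="Helvetica" COLOR="red">Page Title</FONT>'
    "<P>Some text.</P>"
)
structural = "<H1>Page Title</H1><P>Some text.</P>"

print(find_headings(presentational))  # []
print(find_headings(structural))      # ['Page Title']
```

The two pages might look much the same on screen, but only the second
one tells a program which text is the title.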
Why do authors run roughshod over structure and meaning like this?
Because they want readers to see the page as they designed it. To use
structural HTML markup is to give up a lot of control over a
page's appearance, and it certainly doesn't allow for the
kind of densely packed page designs that have become so popular over
the years.
So what's wrong with this? Consider the following:
- Unstructured pages make content indexing inordinately difficult. A
truly powerful search engine would allow users to search just page
titles, or only section headings within pages, or only paragraph
text, or perhaps only those paragraphs that are marked as being
important. In order to do this, however, the page contents must be
contained within some sort of structural markup -- exactly the
sort of markup most pages lack.
- A lack of structure reduces accessibility. Imagine that you are
blind, and rely on a speech-synthesis browser to browse the Web.
Which would you prefer: a structured page that lets your browser
read only section headings so you can choose which section you'd
like to hear more about; or a page so lacking in structure that your
browser is forced to read the entire thing with no indication of
what's a heading, what's a paragraph, and what's important?
- Advanced page presentation is only possible with some sort of
document structure. Imagine a page in which only the section headings
are shown, with an arrow next to each. The user can decide which
section is of interest and click on its heading, thus revealing the
text of that section. A browser can offer this only if the markup
says which text is a heading and which text belongs to each section.
- Structured markup is easier to maintain. How many times have you
spent long minutes hunting through someone else's HTML (or even
your own) in search of the one little error that is messing up your
page in one browser or another? How much time have you spent writing
nested tables and FONT tags, just to get a sidebar
with white hyperlinks in it? How many line-break tags have you
inserted trying to get exactly the right separation between a title
and the following text? By using structural markup, you can clean up
your code and make it easier to find what you're looking for.
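The indexing point in the first item can be sketched in a few lines.
The following is an illustration only, not a description of how any
real search engine works: it uses Python's standard html.parser
module, and the StructureIndexer class, the search function, and the
sample page are all hypothetical. It also assumes every indexed
element is explicitly closed, which much real-world HTML does not
guarantee.

```python
from html.parser import HTMLParser


class StructureIndexer(HTMLParser):
    """Files text under the name of the structural element containing it."""

    INDEXED = {"title", "h1", "h2", "p", "strong"}

    def __init__(self):
        super().__init__()
        self.index = {tag: [] for tag in self.INDEXED}
        self._stack = []  # indexed elements currently open, with their text

    def handle_starttag(self, tag, attrs):
        if tag in self.INDEXED:
            self._stack.append((tag, []))

    def handle_data(self, data):
        # Text belongs to every indexed element that currently encloses it.
        for _, chunks in self._stack:
            chunks.append(data)

    def handle_endtag(self, tag):
        # Assumes well-nested, explicitly closed elements (see lead-in).
        if self._stack and self._stack[-1][0] == tag:
            _, chunks = self._stack.pop()
            self.index[tag].append("".join(chunks).strip())


def search(index, word, within):
    """Return the entries of one element type that contain the word."""
    word = word.lower()
    return [text for text in index[within] if word in text.lower()]


page = """
<HTML><HEAD><TITLE>Widget Catalog</TITLE></HEAD>
<BODY>
<H1>Widgets</H1>
<P>We sell widgets of every size.</P>
<H2>Ordering</H2>
<P><STRONG>Orders ship within two days.</STRONG> Call us about widgets in bulk.</P>
</BODY></HTML>
"""

indexer = StructureIndexer()
indexer.feed(page)
indexer.close()

print(search(indexer.index, "widget", within="title"))  # ['Widget Catalog']
print(search(indexer.index, "widget", within="h1"))     # ['Widgets']
print(search(indexer.index, "ship", within="strong"))   # ['Orders ship within two days.']
```

Had the same page been built from FONT tags and table cells, every
one of these searches would come back empty, because nothing in the
markup would say which text is a title, a heading, or important.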
Granted, a fully structured document is a little plain. Because of that
one fact alone, a hundred arguments in favor of structural markup
wouldn't sway a marketing department away from the kind of HTML
so prevalent at the end of the twentieth century. What was needed was
a way to combine structural markup with attractive page
presentation.
This concept is nothing new. There have been many style sheet
technologies proposed and created over the last few decades. These
were intended for use in various industries and in conjunction with a
variety of structural markup languages. The concept had been tested,
used, and generally found to be a benefit to any environment where
structure had to be presented. However, no style sheet solution was
immediately available for use with HTML. Something had to be done to
correct this problem.