HTML coding style


So you've written your first web page, and it looks good. Congratulations.

Now it's time to see if we can break it.

The web is, above everything else, a cross-platform communications tool. Documents on the web are read by people using PCs running Windows, Macintoshes, UNIX workstations ... even IBM 3270 terminals on mainframes.

There's an important lesson here:


Documents look different on different computers!


Let's take that lesson as an example of its own subject. I've used two half-width horizontal rules, centred, above and below a sentence in red text.

The tags I used look like this:

<P><HR WIDTH="50%"><P>
<CENTER>
<FONT COLOR="#FF0000">
Documents look different on different computers!
</FONT>
</CENTER>
<P><HR WIDTH="50%"><P>

What can go wrong with it on different computers?

No <CENTER> tag
The <CENTER> tag was invented by Netscape Communications. It is not an official part of the HTML 2.0 standard. HTML 3.0 was designed to use the ALIGN option to the <P> tag (for example, <P ALIGN=CENTER>). <CENTER> eventually found a home in the HTML 3.2 standard, but it is not universally supported, especially by older browsers. (Lynx 2.5 can handle it; Lynx 2.4 can't.)

The same problem applies to the <FONT> tag as well; it simply isn't part of HTML prior to 3.2.
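
As a rough sketch of a more conservative way to get a similar effect, the fragment below drops <CENTER> and <FONT> and leans on tags that HTML 2.0 browsers already know. It assumes that a browser which doesn't understand ALIGN will simply ignore it and leave the sentence left-aligned; using <STRONG> in place of the red colour is an illustrative choice, not part of the original example.

```html
<!-- Plain <HR> is in HTML 2.0; with no WIDTH it spans the window. -->
<HR>
<!-- ALIGN=CENTER is honoured where supported, ignored elsewhere. -->
<P ALIGN=CENTER>
<STRONG>Documents look different on different computers!</STRONG>
</P>
<HR>
```

The sentence loses its colour and the rules lose their half width, but nothing here depends on a particular browser to stay readable.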

WIDTH="50%" option to <HR> is corrupt SGML
Netscape also had the cool idea of making it possible to specify the width and weight of a horizontal rule. While this is a good thing in principle, their execution was terrible: the percent symbol '%' is a reserved delimiter in SGML declarations (it introduces parameter entity references), and the HTML 2.0 DTD defines no WIDTH attribute for <HR> at all. Because HTML is defined in terms of SGML, this means that Netscape's HTML is, to put it mildly, on shaky ground. A strict HTML browser or validator - by which I mean one that sticks to the official HTML 2.0 Document Type Definition - has no way to make sense of this tag; at best it ignores the attribute, at worst it chokes.

COLOR values are arbitrary
Is the fully-saturated red colour that I see on my screen the same as the fully-saturated red colour that you see on yours?

The answer is probably "no". First of all, different monitors display colours differently. To some extent this depends on physical factors like the age of the monitor, convergence of the electron guns, colour gun focus, and so on. All of these are liable to change over time. It also depends on factors such as the brightness of the tube (adjustable, but only within a predetermined range).

But it gets worse. Not all computers can display the same number of colours. Many older machines are limited to black and white, or to 16 colours in high-resolution modes; more recent machines can display many (over 16,000) colours, or 256 colours in high-resolution mode. Laboratory workstations with 24-bit displays can show about 16.7 million colours.

The GIF image format limits each image to a palette of 256 colours; put a GIF on a true-colour system and it will tend to look a bit grainy and artificial. The Red Green Blue (RGB) triplets used to specify text colours go the other way: each of the three components takes one of 256 levels, so a triplet can name any of about 16.7 million colours, far more than most screens can actually display.

In general, computers work around the difference between assumed and actual colour depth by 'mapping' between the colours used in an image and the colours available on their screen. Sometimes they dither a clump of pixels so that an approximate match to a precise colour value can be achieved. Dithering and colour map translation result in some nauseating combinations at times. Moreover, the colour model used by the Macintosh differs fundamentally from that used by Windows, and UNIX workstations differ again. So a background colour that looks good on a Windows machine may be indistinguishable from the foreground on a Mac, or vice versa.

It is possible to find colour triplets in the RGB space that are acceptable on all the main platforms, but you shouldn't assume that just because you can read text in a strange colour, your readers share your ability to do so.
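
One cautious approach, sketched below, is to set the background, text and link colours together rather than one at a time, and to pick strongly contrasting values. The particular colours are illustrative assumptions only; they stick to the component values 00, 33, 66, 99, CC and FF, which are commonly recommended as the ones least likely to be dithered on 256-colour displays.

```html
<HTML>
<HEAD>
<TITLE>Colour example</TITLE>
</HEAD>
<!-- Set every colour explicitly, so that a reader's default
     background can't clash with a text colour chosen here. -->
<BODY BGCOLOR="#FFFFFF" TEXT="#000000"
      LINK="#0000CC" VLINK="#663399" ALINK="#CC0000">
<P>Dark text on a light background tends to stay legible even
when the colours are remapped or dithered.</P>
</BODY>
</HTML>
```

Browsers that don't understand these BODY attributes should fall back to their own defaults, which the reader has presumably already found tolerable.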

This isn't to say that HTML documents can't be made portable.

This page is part of a site that has been built with portability in mind. If you look at it in a text-only browser under UNIX (Lynx, for example), it should be perfectly legible. If you look at it with a fairly old graphical web browser like Mosaic 1.1, it should work for the most part. However, if you use Netscape Navigator 3 or Microsoft Internet Explorer 3, you get whizzy extras: a frame-based Java-assisted navigation tool, background colour, and so on. Moreover, you get the same background colour on Macintosh, Windows 95, and (if it has enough colour cells free) X11R6 on UNIX.

There are additional features that make this page useful, without breaking portability. At the top of this document, in the <HEAD> section, there are some special tags:

<META NAME="Description"
      CONTENT="How to write portable HTML documents">
<META NAME="Keywords"
      CONTENT="html sgml coding sex portability">

These don't really affect the way your users will see the document. They do affect the way users find it, though. The META tag is used to indicate metainformation: information that describes an HTML file. NAME indicates the name of the information; CONTENT indicates what it is. In this case, the "Description" and "Keywords" metainformation is presented for the use of web indexing services such as Lycos or AltaVista. These search engines specifically check for descriptive metainformation and file it away.

(I mentioned 'sex' in the keywords to demonstrate another point. People using a search engine to hunt for pages mentioning a certain keyword will find those pages that contain it. You can jack up your amount of web traffic - or annoy passers-by - by putting inappropriate but attractive keywords in your headers. But it isn't a good idea to make a habit of it, as a well-known women's magazine discovered when they put keywords like "cunnilingus", "BDSM", and "urolagnia" in the KEYWORDS information for their free dating service home page!)
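
For context, here is a minimal sketch of where such tags sit in a complete document. The title and body text are placeholders, and the keyword list is adapted from the one above with the attention-grabbing keyword left out.

```html
<HTML>
<HEAD>
<TITLE>Placeholder title</TITLE>
<!-- Metainformation belongs in the HEAD, alongside the TITLE. -->
<META NAME="Description"
      CONTENT="How to write portable HTML documents">
<META NAME="Keywords"
      CONTENT="html sgml coding portability">
</HEAD>
<BODY>
<P>The text of the page goes here.</P>
</BODY>
</HTML>
```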

Some general points to remember when writing documents for portability:

Always plan for the lowest level of compatibility
It's easier to make a document or web site navigable by text-only browsers if you start from the beginning. In contrast, trying to retrofit compatibility and portability onto an HTML document that uses lots of toys like JavaScripts and applets is a losing proposition -- it's easier to use a tool like html2text to suck the raw text out of the file, and start over.

Don't put images in crucial positions
The first place where portability breaks down is image handling. If you write a page that relies on images to work, you've just written a non-portable page. This page has a navigation toolbar (see below) that doesn't contain any images. Is it any the less useable for not having brushed-aluminium effect buttons that glow green when you press them? No; but it is useable with Lynx, or with image loading turned off.

There is an image on this page - the logo at the top - but it's non-essential. You can use this page even if you can't see it.
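
A sketch of what this can look like in markup, using hypothetical file names (logo.gif, comments.html and so on are placeholders, not this site's real paths): the image carries ALT text for browsers that can't show it, and the toolbar is nothing but ordinary text links.

```html
<!-- Decorative logo: the ALT text stands in when images are off. -->
<IMG SRC="logo.gif" ALT="[Logo]">

<!-- Text-only navigation toolbar: no images needed to use it. -->
<P>
[ <A HREF="comments.html">Comments</A> ]
[ <A HREF="copyright.html">Copyright</A> ]
[ <A HREF="index.html">Main contents</A> ]
</P>
```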

Bells and whistles should degrade gracefully
<FRAMESET>s are a curse of Netscape. They are useful for some specialized purposes - notably, control of a website or a complex program with a CGI interface - but they are incompatible with most browsers other than Netscape or Microsoft Internet Explorer. To give them their due, the programmers at Netscape added the <NOFRAMES> tag; as browsers that can't handle frames don't recognize it either, they skip the tag itself and display the text enclosed in it normally. So it's essential, if you use frames, to use the <NOFRAMES> tag - and use it well. Telling your readers to go away or download Netscape 3 is not going to make you friends or gather influence! It's a sign that you don't understand the web - that you're thinking in terms of television (broadcasting, with a program set by the broadcaster) as opposed to publishing (browsing through bookshelves, with an agenda set by the reader).
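
As an illustration, a frameset document with a usable fallback might look something like the sketch below. The file names nav.html, main.html and contents.html are hypothetical, and the two-column layout is just one plausible arrangement.

```html
<HTML>
<HEAD>
<TITLE>Example frameset</TITLE>
</HEAD>
<FRAMESET COLS="150,*">
  <FRAME SRC="nav.html" NAME="nav">
  <FRAME SRC="main.html" NAME="main">
  <NOFRAMES>
  <BODY>
  <!-- Shown by browsers that ignore the frame tags: give them
       real links into the site, not a demand to upgrade. -->
  <H1>Example site</H1>
  <P>This site also works without frames. Start from the
  <A HREF="contents.html">table of contents</A>.</P>
  </BODY>
  </NOFRAMES>
</FRAMESET>
</HTML>
```

The fallback is only as good as what you put in it, so treat the NOFRAMES section as a real entry point to the site rather than an apology.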

Clean, not Fancy
A clean-looking page of text beats a cramped grid with loads of tables and sub-tables and fonts and images and frames. It's easier to maintain, it doesn't make assumptions about the reader's browser, and it can be made to look good quite easily. It's also faster to write. Take this page as an example. I could have used tables to lay out the headers and add running call-outs and coloured side-bars. But it would have taken twice as long to say what I meant to say, wouldn't be readable on most browsers more than 9 months old, and would be a pain in the neck to edit or update. (I write using vi, the UNIX text editor. Not because I can't run or won't use a WYSIWYG editor, but because I am familiar with HTML and it gives me much more fine control. But by the same token, trying to maintain consistency between three or four embedded tables isn't a path to a peaceful life.)
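
To make that concrete, a page built in this plain style might be sketched as below, using nothing beyond headings, paragraphs, a definition list and a rule, all of which are in HTML 2.0. The headings, wording and the index.html link are placeholders.

```html
<HTML>
<HEAD>
<TITLE>A plain page</TITLE>
</HEAD>
<BODY>
<H1>A plain page</H1>
<P>An opening paragraph that says what the page is about.</P>
<!-- A definition list gives each point a heading and a body
     without any tables. -->
<DL>
<DT>First point
<DD>A paragraph explaining the first point.
<DT>Second point
<DD>A paragraph explaining the second point.
</DL>
<HR>
<P>[ <A HREF="index.html">Main contents</A> ]</P>
</BODY>
</HTML>
```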

Content rules!
Many websites are beautiful examples of graphic design, pushing the limits of the practical, demonstrating that despite the fact that it was never designed for such a purpose, it is possible to build complex pages using HTML. However, most of the time such over-designed pages have very little to say. Design is NOT a substitute for content. If you don't have anything to say, dressing it up with a thousand pretty animated GIF images isn't going to convince anyone.

I've got a dozen styles and I'm going to use them all
The DTP revolution of the mid-eighties unleashed a tidal wave of bad design. People who would never have dreamed of typesetting a publication before sat down at their PCs and Macs and moused around, because they thought of DTP in terms of software applications rather than in terms of design tools. The software was seen as an enabling technology for the graphically illiterate, rather than as a power tool for the proficient. The results were hideous. We're seeing a repeat of this syndrome today, with people who are barely capable of writing a letter in Microsoft Word trying to slap together home pages. The distinguishing sign of such publications is frequently the egregious over-use of software features that are not only unnecessary, but that detract from the impact of the document.

Just because you can do something, it does not follow that you should do it.

Professionalism
The pursuit of excellence takes time. Nobody wrote a classic work of literature over a weekend. Nobody should expect to be able to design a classic web site in a couple of days, either. Nor should you expect to accumulate the skills to do so overnight. Learning the web takes time, and lots of it. Don't be too disheartened; just don't fool yourself that your site is already as good as it is possible to be. Someone will come and poke holes in it, sooner or later. Continual maintenance is the watch-word.

And remember: we're all in this together.


[ Comments ] [ Copyright ] [ Main contents ]