

Non-conversion Can Cost More Later

Have you ever gone through your old floppy disks and run across a file you wanted to look at again? Perhaps you’re copying all your old 400K Mac disks to newer disks, and they seem to copy without a problem. However, you can’t open the old documents in the new version of your word processor. The word processor is made by the same company and is the same product, but it’s a couple of versions later than the one with which you created the document. Nobody told you your old files couldn’t be read anymore; it just happened, and you only found out by chance.

HTML is changing in the same way. Like software products, it’s getting better with each revision (well, at least that’s how software is supposed to be!). But, because it is only one DTD, it can never contain everything everyone wants, and new versions will continue to appear. As new element types are defined and browsers start supporting them, you run into the same maintenance problem as with software upgrades.

You can already find pages all over the Web that use outmoded HTML tagging, or that have features that only work in some particular version of some particular browser. The problem will likely get worse. When browser makers create proprietary tags, they create incompatibility. This is especially true if those tags represent some formatting effect rather than some meaningful unit. But even in the best circumstances (namely, if everyone participates in the HTML revision process and every version of HTML manages to maintain backward compatibility), old files won’t be able to take advantage of whatever new capabilities come from the new tags.


Note:  
One good example of a new HTML capability is the DIV element added in HTML 2.1. It works like the chapter and section elements in many other DTDs. That is, it’s a container for whole units (in contrast, H1, H2, and so on are titles of containers). Using containers makes it much easier for software to manipulate them. A browser or server can generate an outline view, an editor can provide a “move section” command, and so on.
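As a rough illustration of why containers help software, here is a minimal Python sketch of the outline view mentioned above. It assumes the headings have already been wrapped in DIV containers as this note describes. The class and function names (OutlineBuilder, build_outline) are made up for this example, and the code is a sketch rather than a full HTML processor. The point is that each heading’s depth comes from the DIV nesting, not from guessing based on the number in H1 or H2.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class OutlineBuilder(HTMLParser):
    """Collect (depth, title) pairs; depth is the DIV nesting level."""

    def __init__(self):
        super().__init__()
        self.depth = 0            # number of DIVs currently open
        self.in_heading = False   # True while inside an H1..H6 element
        self.outline = []         # list of (depth, title) tuples

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <DIV> arrives as "div".
        if tag == "div":
            self.depth += 1
        elif tag in HEADINGS:
            self.in_heading = True
            self.outline.append((self.depth, ""))

    def handle_endtag(self, tag):
        if tag == "div":
            self.depth -= 1
        elif tag in HEADINGS:
            self.in_heading = False

    def handle_data(self, data):
        # Accumulate heading text; ignore everything else.
        if self.in_heading:
            depth, title = self.outline[-1]
            self.outline[-1] = (depth, title + data)


def build_outline(markup):
    """Return the list of (depth, title) pairs found in the markup."""
    parser = OutlineBuilder()
    parser.feed(markup)
    parser.close()
    return parser.outline


sample = """
<DIV><h1>How to use the Web</h1>
 <DIV><h2>Reading WWW pages</h2>
  <p>Fire up a browser and click</p>
 </DIV>
</DIV>
"""

# Print each title indented by its container depth.
for depth, title in build_outline(sample):
    print("  " * (depth - 1) + title)
```

An editor’s “move section” command could rely on the same nesting: everything between a DIV’s start tag and its end tag moves as one unit.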

To use DIVs in HTML 2.1 or later, open a <DIV> right before each heading element (H1, H2, and so on) and close it right before the next heading of the same or a higher level (that is, one with the same or a lower number), or at the end of the document if no such heading follows. You really do need to include the end tags for these. To be extra clear, you can also add a TYPE attribute with a value such as CHAPTER, SECTION, LEVEL1, LEVEL2, or whatever.

Flat HTML like this:

    <h1>How to use the Web</h1>
    <h2>Reading WWW pages</h2>
    <p>Fire up a browser and click</p>
    <h2>Creating your own WWW pages</h2>
    <p>Type lots of pointy brackets</p>
    <h1>How to use other Internet services</h1>

becomes hierarchical (there’s added indentation here to make the structure stand out, but there’s no reason you have to do this):

    <DIV><h1>How to use the Web</h1>
     <DIV><h2>Reading WWW pages</h2>
      <p>Fire up a browser and click</p>
     </DIV>
     <DIV><h2>Creating your own WWW pages</h2>
      <p>Type lots of pointy brackets</p>
     </DIV>
    </DIV>
    <DIV><h1>How to use other Internet services</h1>
    </DIV>
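If you have many flat pages, the wrapping rule described above can be applied mechanically. The following Python sketch shows one way it might be done. It is an illustration only, not a general HTML converter: the function name nest_sections is invented for this example, and the code assumes that every heading start tag begins its own line and that the input contains no DIVs already.

```python
import re

# Matches the start of a heading tag such as <h1> or <H2 ...>.
HEADING = re.compile(r"<h([1-6])\b", re.IGNORECASE)


def nest_sections(lines):
    """Wrap each heading and the content under it in a DIV container."""
    out = []
    open_levels = []   # heading levels of the DIVs that are currently open
    for line in lines:
        text = line.strip()
        if not text:
            continue   # drop blank lines in this simple sketch
        match = HEADING.match(text)
        if match:
            level = int(match.group(1))
            # Close every open DIV whose heading is at the same or a deeper level.
            while open_levels and open_levels[-1] >= level:
                open_levels.pop()
                out.append(" " * len(open_levels) + "</DIV>")
            out.append(" " * len(open_levels) + "<DIV>" + text)
            open_levels.append(level)
        else:
            # Ordinary content is indented to sit inside the current DIV.
            out.append(" " * len(open_levels) + text)
    # Close whatever is still open at the end of the document.
    while open_levels:
        open_levels.pop()
        out.append(" " * len(open_levels) + "</DIV>")
    return out


flat = [
    "<h1>How to use the Web</h1>",
    "<h2>Reading WWW pages</h2>",
    "<p>Fire up a browser and click</p>",
    "<h2>Creating your own WWW pages</h2>",
    "<p>Type lots of pointy brackets</p>",
    "<h1>How to use other Internet services</h1>",
]

print("\n".join(nest_sections(flat)))
```

The list open_levels acts as a stack of the sections currently open, so a new H1 closes any H2 section still open beneath the previous H1 before closing that H1 section itself. The indentation in the output is cosmetic, as in the hand-written example.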

When you set up your data in an SGML DTD that fits it, changes have to happen only when you really want them to, not because you need to accommodate a change to one particular DTD that isn’t designed for your kind of data. You can stick with the DTD you have regardless of how HTML does or doesn’t change.

You still may want to re-tag eventually; for example, you might discover some completely new kind of processing you want to do with the data for which your tagging isn’t enough. The point is that you control the process. You can define the new element type, add it to the DTD, define it through stylesheets or a similar mechanism, and use it. And as browsers appear that aren’t limited to the HTML DTD, you’ll be able to use them.

Poor DTD Design Is Very Costly

Having your documents in a DTD of your choice preserves your freedom and control over your data. But as always, freedom has its own responsibilities and risks. HTML has a lot of design effort in it, and problems in its early versions were flushed out by a huge number of people using it and reporting whatever problems they ran into. This is also true for many other established DTDs, but it won’t be true for one you write from scratch, or for any modifications you make to a standard DTD.

When you start making your own DTDs, be very careful, and think far ahead about the consequences for your data. Chapters 4 through 10 in this book show you how to go about designing a good DTD. Books and courses in DTD design are also helpful (Appendix B, “Finding Sources for SGML Know-How,” lists some useful books). Another source of help is Internet discussion groups such as comp.text.sgml, where many experts participate and help others solve problems ranging from the simple to the very complex.

