Special Edition Using SGML:Following Good SGML Practice

Table of Contents

Identifying Internal and External Objects (ID and IDREF). When you create cross-references or hypertext links between elements, you can point to a specific, unique target by pairing an ID attribute on the target with an IDREF attribute on the reference. Parsers check that every IDREF value matches an ID declared somewhere in the document, so you get some extra validation of your links. You can also reference external files by using ID and IDREF pairs.

Consider figure 14.5. It assumes that you have a <P> element, a <GRAPHIC> element, and a <GRAPHICREF> element: you need one element to make the reference and one element to be referred to. You also need an element for the graphics object itself, <OBJECT>. Here is how the DTD would look. It’s simplified a little so that you can see the ID and IDREF features more clearly.

Fig. 14.5 An example of using attributes with ID and IDREF.
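
A simplified DTD along these lines might read as follows. This is only a sketch: the content models, the EMPTY declarations, and the #REQUIRED defaults are assumptions made for illustration, while the element and attribute names are the ones used in this section.

    <!-- Paragraph text may contain references to graphics -->
    <!ELEMENT P          - - (#PCDATA | GRAPHICREF)* >

    <!-- The referring element: idref must match an id in the document -->
    <!ELEMENT GRAPHICREF - O EMPTY >
    <!ATTLIST GRAPHICREF idref      IDREF #REQUIRED >

    <!-- The element referred to: id must be unique in the document -->
    <!ELEMENT GRAPHIC    - - (CAPTION, OBJECT) >
    <!ATTLIST GRAPHIC    id         ID    #REQUIRED >

    <!ELEMENT CAPTION    - - (#PCDATA) >

    <!-- The graphics object itself, identified by an art file number -->
    <!ELEMENT OBJECT     - O EMPTY >
    <!ATTLIST OBJECT     artfilenum CDATA #REQUIRED >

Declaring GRAPHICREF and OBJECT as EMPTY means neither takes an end tag. A parser reading these declarations reports an error if an idref value has no matching id, or if the same id value is used twice.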

The document instance might look like:

    <P>When looking at meteors through a telescope, you should know how to
    calibrate the instrument. See <GRAPHICREF idref="fig59-3">Figure 59-3 for
    the location of the calibration tuner widget.</P>

In Chapter 59, you could locate the part of the document that talks about calibrating a telescope and has a picture that locates the tuner widget. The markup for that graphic might be:

    <GRAPHIC id="fig59-3"><CAPTION>Here’s the telescope’s calibration tuner
    widget</CAPTION><OBJECT artfilenum="12a345"></GRAPHIC>

The object can reside in the file called Chapter 59, or you can use a macro to find fig59-3 and load it into the document on demand.

Tip:
If the information you want to capture is one of the following:

•  Descriptive information about an element
•  Formatting information
•  Pointers and links
•  Nontext data, such as encoded graphics or compressed film clips

use an attribute instead of creating a new element. It will keep your SGML installation simpler and keep you out of trouble.

Common Mistakes with Attributes

There are several ways to misuse attributes. Here are some of the more common ones.

Attributes as Documentation. Do not use attributes for documentation in your DTD, and do not use an attribute for what should be a comment. For example, if you want to make a note about an element’s structure, don’t do it with an attribute. Figure 14.6 shows an example of what not to do.

Fig. 14.6 It is a poor idea to use attributes as documentation inside elements.
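
A declaration of the kind this warning has in mind might look like the following sketch. The TABLE, TITLE, and ROW names and the content model are assumptions made for illustration; only the tabletype attribute and its values a through d are taken from this section.

    <!-- Not recommended: one element for every kind of table, with an
         attribute acting as a note about which structure is meant -->
    <!ELEMENT TABLE - - (TITLE?, ROW+) >
    <!ATTLIST TABLE tabletype (a | b | c | d) a >

The parser accepts this, but it validates every table against the same content model, so the attribute documents a structural difference that the DTD does not enforce.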

The tabletype attribute is being used to describe what is really a different structural element. Presumably, table type a differs structurally from table types b, c, and d. If the table types are that different, each deserves its own element. Do not overload a single structural element just because you think attributes can save you, and don’t try to make an attribute serve as a structural comment. If you need to make a comment, do what is done in figure 14.7.

Fig. 14.7 It’s a good idea to document information about elements and their content models, but do it this way rather than using attributes.
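
As a rough sketch of this approach, each kind of table gets its own element, its own comment, and its own content model. All of the names below (SIMPTBL, WIDETBL, TITLE, COLHEAD, ROW, simpcols, widecols) are hypothetical and chosen only for illustration.

    <!-- Simple table: an optional title followed by one or more rows -->
    <!ELEMENT SIMPTBL - - (TITLE?, ROW+) >
    <!ATTLIST SIMPTBL simpcols NUMBER #IMPLIED >

    <!-- Wide table: column heads are required before the rows -->
    <!ELEMENT WIDETBL - - (TITLE?, COLHEAD+, ROW+) >
    <!ATTLIST WIDETBL widecols NUMBER #IMPLIED >

Because the structural difference now lives in the content models, the parser can enforce it, and the comments carry the documentation.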

Notice how the comments introduce each element as a separate structure with a separate content model. The attribute names are unique to each element. This is a much better way to handle tables than using attributes to distinguish between different types of tables.

Identical Attribute Names in the Same Element. No two attributes in one element’s attribute definition list can have the same name. This mistake is particularly easy to make when you use parameter entities, because it’s easy to lose track of what all those entities do inside your DTDs. Figure 14.8 shows an example of what you should not do.

Fig. 14.8 Identical attribute names in the same element are a mistake.
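
The problem can be sketched roughly as follows. The team attribute and the chessent entity are the names used in this section; the MOVE element, the piece attribute, and the declared values are assumptions made for illustration.

    <!-- A parameter entity that quietly carries its own team attribute -->
    <!ENTITY % chessent "team CDATA #IMPLIED" >

    <!ELEMENT MOVE - - (#PCDATA) >

    <!-- Invalid: team is declared here and again inside %chessent; -->
    <!ATTLIST MOVE
              piece CDATA #REQUIRED
              team  CDATA #REQUIRED
              %chessent; >

Nothing in the ATTLIST itself shows the duplicate, which is why entity-based attribute lists deserve a second look.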

In effect, the attribute team is declared twice for the same element. When the parser resolves the content of the %chessent entity, it will stumble on the team attribute again. This is a mistake.

Identical Attribute Values in the Same Element. Sometimes two attributes declared for the same element list an identical value in their respective declared value groups. This creates a problem. Consider figure 14.9.

Fig. 14.9 Don’t use identical attribute values in the same element.
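
A declaration with this problem might look like the sketch below. The noteams and time2mov attributes and the shared value 2 are the ones discussed in this section; the GAME element, its content model, and the other values in each group are hypothetical.

    <!ELEMENT GAME - - (#PCDATA) >

    <!-- Invalid: the token 2 appears in both value groups -->
    <!ATTLIST GAME
              noteams  (1 | 2 | 4) 2
              time2mov (1 | 2 | 5) 2 >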

SGML parsers will have a problem keeping the value straight for the noteams and time2mov attributes. SGML allows attribute name omission: when a value comes from an enumerated group, a document can give the value alone and leave out the attribute name. If the value “2” appears in the groups for both noteams and time2mov, the parser cannot tell which attribute a bare “2” belongs to. For this reason, the same token must not appear in more than one value group in an element’s attribute list.

To fix this problem, replace the enumerated value groups with a token type. In this case, you might use NUTOKEN as the declared value, as shown in figure 14.10.

Fig. 14.10 Use the NUTOKEN declared value to overcome this problem.
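
A minimal sketch of such a declaration, assuming a hypothetical GAME element with the noteams and time2mov attributes and a default of 2 for each:

    <!ELEMENT GAME - - (#PCDATA) >

    <!-- Each attribute accepts any number token, so no value group
         exists for the two attributes to share a token in -->
    <!ATTLIST GAME
              noteams  NUTOKEN 2
              time2mov NUTOKEN 2 >

The trade-off is that the parser no longer restricts either attribute to a fixed list of values; it checks only that the value is a number token, so any range checking has to happen in the application.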

Number tokens and other tokens are handy and effective in attribute value lists. Use them for just such occasions.

• See “Types of Data for Attributes,” p. 136
• See “Attributes: Their Use and the ATTRIBUTE Declaration,” p. 182

If you steer clear of these name collisions, you will go a long way toward ridding your DTDs, and especially your attribute lists, of parser problems. Even if you find a parser that somehow does not catch these errors, your attributes will be much clearer and more meaningful if you follow the conventions covered here.

Correctly choosing whether to use an element or an attribute also helps keep you out of trouble.
