Special Edition Using SGML: SGML Terminology

Comparing Declarations in a Document and a DTD

The DTD is where the heavy-duty declaration happens. Any declarations in the SGML document are usually only echoes of the DTD. In fact, the primary declaration found within document instances is simply the SGML declaration. The following tables summarize the two sets of declarations, one for SGML documents and one for DTDs.

**Table 3.5** Declarations Found in an SGML Document

Declaration	Purpose and How Different

Doctype declaration	Says which DTD applies to the document. It can also carry entity declarations that are specific to the document.
SGML declaration	Contains information for the SGML processing system that says which features are supported and which are not.
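
To make the table concrete, here is a minimal sketch of how a doctype declaration might look at the top of a document. The file name memo.dtd, the company entity, and the element names are all assumptions made up for illustration. The SGML declaration is not shown, because it usually sits ahead of this part of the document or is supplied by the processing system.

```sgml
<!-- Hypothetical example: memo.dtd and the names below are illustrative -->
<!DOCTYPE memo SYSTEM "memo.dtd" [
  <!-- An entity declared for this document only -->
  <!ENTITY company "Example Widgets, Inc.">
]>
<memo>
<body>
<para>All &company; staff meet on Friday.</para>
</body>
</memo>
```

The part between the square brackets is where a document can add its own entities on top of whatever the DTD already provides.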

**Table 3.6** Declarations Found in a DTD

Declaration	Purpose and How Different

Markup declaration	Purpose covered earlier. Includes keywords like ATTLIST, ELEMENT, and ENTITY. These normally appear in the DTD rather than in the document instance.
Element declaration	Sets up each element so it can be used in individual documents.
Attribute declaration	Sets up all attributes for each element so they can be used in individual documents.
Entity declaration	“Initializes” entities so they can be used in documents. (Some entities can be set up in individual documents first, however, without having been set up in a DTD.)
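
The three kinds of declarations in the table could look something like the following in a DTD. This is only a sketch under assumed names: the memo elements, the status attribute, and the company entity are invented for illustration and are not taken from any DTD in this book.

```sgml
<!-- Element declarations: a memo holds these elements, in this order -->
<!ELEMENT memo - - (date, to, from, subject, body)>
<!ELEMENT body - - (para+)>
<!ELEMENT (date | to | from | subject | para) - - (#PCDATA)>

<!-- Attribute declaration: a status attribute on memo, defaulting to draft -->
<!ATTLIST memo status (draft | final) draft>

<!-- Entity declaration: shorthand text that any document can reference -->
<!ENTITY company "Example Widgets, Inc.">
```

A document that uses this DTD could then write &company; in its text and set status="final" on its memo element without declaring either one itself.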

As you can see, most declarations live in the DTD, while a document instance carries only a few. For further details on how to write declarations in DTDs and in document instances, see Chapter 10, “Following General SGML Declaration Syntax.”

Blending Content and Structure in Diagrams

As discussed in Chapter 2, blending content and structure in diagrams represents one of the most difficult challenges in SGML. But there are tools that you can use to help. Structure diagrams are one such tool.

Structure diagrams are much like tree diagrams that you read downward and to the right. The “blocks” on the left are structural blocks, while those on the far right are content blocks. The structure represents containers that the content fits into. When you take the content out of the structural containers, what you see is something like a structure diagram. It’s possible to look at structure in different ways. It’s also possible to structure a single document in different ways. Figure 3.5 shows the memo you examined earlier in figure 3.2.

Fig. 3.5 The structure of this memo could actually be looked at in different ways.

For example, the memo could be structured as any one of the structure diagrams shown in figure 3.6.

Fig. 3.6 This figure shows four different ways the same memo could be structured.

As you can see, there are several ways to look at the memo example. Basically, you can gauge how complex a document’s structure is by how deep its hierarchy goes. If elements are nested 15 levels deep within each other, you either have a very complex document, or you should restructure it. The rule of thumb is to structure it as simply as your collection of documents will allow. You don’t want to build in any more complexity than what already exists. If anything, you want to simplify your documents by making their structures accessible.

Example 1 is pretty good, except for the <INTRO> element. It doesn’t really need to be there, and it introduces complexity that could be avoided. This means that for every collection of <DATE>, <SUBJECT>, <TO>, <FROM>, and <SALUTATION> tags, you have to add an extra set of <INTRO> tags. Why do that here? It doesn’t really buy you anything.
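
The cost of the wrapper is easy to see in the markup itself. The two fragments below are a rough sketch with invented memo content. Only the element names come from the discussion above, and their order is assumed.

```sgml
<!-- SALUTATION is longer than the default 8-character name limit
     (NAMELEN), so the SGML declaration would need to raise it -->

<!-- Fragment A: header fields wrapped in INTRO, as in Example 1 -->
<intro>
<date>12 May</date>
<subject>Staff meeting</subject>
<to>All staff</to>
<from>Office manager</from>
<salutation>Dear colleagues,</salutation>
</intro>

<!-- Fragment B: the same fields without the wrapper -->
<date>12 May</date>
<subject>Staff meeting</subject>
<to>All staff</to>
<from>Office manager</from>
<salutation>Dear colleagues,</salutation>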

Example 2 has a different problem. It’s too flat. There’s not enough hierarchy there. What that means is the paragraph can only appear after the salutation and before the closing. What if you wanted two or more paragraphs? It would be easier if you had a <BODY> element that included one or more <PARA> elements.
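
In DTD terms, the difference might look like the two alternative content models sketched below. The order of the header elements and the closing element name are assumptions, since only the paragraph's position is described above. A real DTD would contain just one of the two memo declarations, and names longer than eight characters, such as salutation, assume that the SGML declaration raises the default NAMELEN limit.

```sgml
<!-- Alternative 1, flat as in Example 2:
     exactly one paragraph between salutation and closing -->
<!ELEMENT memo - - (date, subject, to, from, salutation, para, closing)>

<!-- Alternative 2, with a BODY container:
     the body holds one or more paragraphs -->
<!ELEMENT memo - - (date, subject, to, from, salutation, body, closing)>
<!ELEMENT body - - (para+)>
```

The plus sign after para is the occurrence indicator for "one or more," so the second version accepts a memo of any length without changing the top-level structure.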

Example 3 is my personal preference. You don’t have any more hierarchy than you need. Everything that needs to be there is there, and the structure doesn’t add any more complexity to the document.

Example 4 has far too much complexity for the document. Even a two-line memo would need open and close tags for <FRONT>, <BODY>, and <END>, and the <FRONT> element would have to include <DATE>, <MIDDLE FRONT>, and <LATER FRONT>. The <TO> and <FROM> elements would have to be contained in the <MIDDLE FRONT> element, and <LATER FRONT> would have to include <SUBJ> and <SALUTATION>. That’s very complex, and you’re not even out of the first part of the two-line document.
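
Written out as declarations, the front part of Example 4 might look roughly like this. It is a sketch based only on the description above. SGML names cannot contain spaces, so midfront and latfront are hypothetical stand-ins for the two inner containers.

```sgml
<!-- Top level: three containers before any content appears -->
<!ELEMENT memo     - - (front, body, end)>

<!-- FRONT holds the date plus two more containers -->
<!ELEMENT front    - - (date, midfront, latfront)>
<!ELEMENT midfront - - (to, from)>
<!ELEMENT latfront - - (subj, salutation)>

<!-- Declarations for BODY and END are left out because their contents
     are not described here. The name salutation also exceeds the default
     8-character NAMELEN, which the SGML declaration would have to raise. -->
```

Four declarations are needed just to reach the five header fields, and each one adds a pair of tags that every memo has to carry.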

You want to find the golden mean when it comes to designing document structure. You don’t want any more than is necessary, but you want enough for your needs. If you have too many structural elements, they complicate your markup and cost you more time with no added benefit. If you have too few structural elements, it’s like trying to move your household without enough boxes in which to pack everything. With SGML, you want to be able to move your document’s content smoothly, just like when the best professional movers pack up your household belongings and move them across the country for systematic unpacking in your new home. Moving documents is easier than moving people any day!
