

Part III
Content Modeling: Developing the DTD

10  Following General SGML Declaration Syntax
11  Using DTD Components
12  Formatting the DTD for Humans
13  Evaluating DTDs and Using Parsers
14  Following Good SGML Practice

Chapter 10
Following General SGML Declaration Syntax

This chapter explores the basics of SGML and of DTD structure (or architecture). First, you’ll look at the general syntax of the language (remember, SGML = Standard Generalized Markup Language).

Specific sections explore the use of the key components of SGML in the DTD:

  General SGML declaration syntax
  The DOCTYPE declaration
  Use of comments
  ELEMENTS: declaration and usage
  ATTRIBUTES: declaration and usage
  ENTITIES: declaration and usage

By the time you finish this chapter, you will have gained a perspective on DTDs that will enable you to read and understand them. Those of you from the HTML world may gain new knowledge of, and a new perspective on, the underlying architecture that you have already been using for some time.

Publicly Available DTDs May Be Appropriate

Most authors writing in SGML have little need to delve into the myriad of details involved in creating Document Type Definitions (DTDs). In many cases, you are presented with DTDs to use in writing tasks, much as you might receive a combination of writing style guides, outlines, and word processor stylesheets when writing in a traditional word processor-based environment.

In the structured authoring environment of SGML, you are presented with similar guides. These, most likely, include a DTD containing the structural rules of the document type that you’re authoring, instructions on using the DTD, and perhaps an output specification that corresponds to the output formatting of the document. If the document has multiple forms of delivery (such as hardcopy and electronic), you might receive several output specifications.

The DTD that you receive can be designed for a specific delivery medium, such as the HTML 2.0 DTD, which is designed for delivery on the World Wide Web. It can be industry specific, such as a CALS DTD (defined according to the MIL-M-28001 standard) for the defense industry, or it can be a company-specific DTD designed for a specific class of corporate documents. Table 10.1 lists some of the standard DTDs currently available.

Table 10.1 Industry-Specific Document Type Definitions

Industry                                  DTD Type

Airline/Aviation                          ATA 100
Defense                                   CALS (MIL-M-28001)
Publishing                                AAP (ISO 12083-1994)
Historical and New Scholarly Materials    TEI (Text Encoding Initiative)
Internet/World Wide Web                   HTML
Electronics/Semiconductors                Pinnacles and SEED

Why Get Involved with DTD Syntax?

Because standard DTDs are publicly available, you might find yourself asking, “Why should I get involved with all the details of DTDs and DTD syntax?” After all, with today’s tools and the range of existing DTDs, you may never have to write one. Most people writing documents in HTML for use on the World Wide Web haven’t the foggiest notion of which DTD they are using when they create documents, home pages, and so on.

Yet even if you never anticipate writing your own DTD, knowing the rules and syntax of how they are constructed can prove to be very useful later. Just as a homeowner might take a building construction course to understand the basics of home construction, you might want to understand some of the mechanics of DTDs.

SGML Declaration Syntax (or, “What Are All Those Angle Brackets Anyway?”)

The syntax for SGML is fairly straightforward, as it is for many computer languages. It is also fairly rigid in its requirements for specific things in a specific order. This rigidity aids tremendously in allowing the automated processing of SGML-encoded documents. Because computers are abysmal at handling ambiguity, the rules of SGML syntax permit ambiguity only in specific, predefined ways in specific, predefined places.

The rules for SGML declaration syntax are fairly simple (although they may seem a little arcane at first glance). The following illustrates the basic components of an SGML declaration:
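A minimal sketch of that general shape, assuming the reference concrete syntax and using the placeholder labels from the discussion that follows. DECLARE, NAME, and contents/values are labels rather than SGML keywords, so the first line is a schematic and would not parse. The element name memo and its content model are hypothetical.

```sgml
<!-- Schematic only: DECLARE, NAME, and contents/values are placeholders -->
<!DECLARE NAME contents/values >

<!-- A concrete instance of the same shape -->
<!ELEMENT memo - - (to, from, body) >
```

In the concrete line, the opening <! is the MDO, ELEMENT fills the DECLARE slot, memo is the NAME, everything up to the final angle bracket is the contents/values area, and the closing > is the MDC.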

As you examine the above declaration, note that a markup declaration has an explicit beginning and an explicit end. The Markup Declaration Open (MDO) marks the beginning of the declaration. Because a declaration can span more than one line, you also need to mark where it ends, hence the Markup Declaration Close (MDC).


• See “The SGML Declaration and the DOCTYPE Declaration,” p. 48

Now that you’ve specified the bounds of the declaration (using the MDO and MDC), let’s examine the actual contents. The region identified by DECLARE is where you put the term that identifies the type of object that you are declaring. (Although you can declare a number of objects, the ones that you will declare most often are elements, attributes, and entities.)

NAME is the area where you assign a name to the object that you are declaring. Once you declare an object, the properties defined in its declaration apply to the named object wherever it is referenced.

The specific content or value of the named object is then defined in the area indicated by contents/values. Note that this area varies considerably depending upon which type of object you are declaring.
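To see how the contents/values area changes with the kind of object, here is a hedged sketch of the three declarations you are likely to meet most often. The names note, status, and company are hypothetical, and the two hyphens after the element name assume a DTD that uses omitted-tag minimization.

```sgml
<!-- ELEMENT: the contents are a content model -->
<!ELEMENT note    - - (#PCDATA) >

<!-- ATTLIST: the contents are attribute names, allowed values, and defaults;
     a declaration may span several lines before the closing ">" -->
<!ATTLIST note
          status  (draft | final)  draft >

<!-- ENTITY: the contents are the replacement text -->
<!ENTITY  company "Example Corp." >
```

All three follow the same overall pattern of MDO, declaration type, name, contents, and MDC; only the middle portion differs.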

Last (and maybe least) is the Markup Declaration Close (MDC). This specifies the end of the declaration.


Tip:  
Errors in DTD parsing are often caused by a missing MDC. When debugging DTD parsing errors, be on the lookout for that missing angle bracket!
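As a hypothetical illustration of this tip (the element names are invented), the first declaration in the broken pair below is missing its MDC. A parser may keep reading into the next line before it notices the problem, so the line it flags is not necessarily the line that needs the fix.

```sgml
<!-- Broken: no closing ">" on the first declaration -->
<!ELEMENT title - - (#PCDATA)
<!ELEMENT para  - - (#PCDATA) >

<!-- Fixed: each declaration ends with its own MDC -->
<!ELEMENT title - - (#PCDATA) >
<!ELEMENT para  - - (#PCDATA) >
```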

If this sounds pretty abstract and obscure, don’t worry. It will start to make more sense as you delve into the specifics of various declarations.

