Special Edition Using SGML:SGML Terminology

Connectors and Occurrence Indicators

Connectors and occurrence indicators are the symbols that tell the parser how many times elements may appear in a document and in what order. For more information, please refer to Chapter 10, “Following General SGML Declaration Syntax.” Table 3.4 summarizes each symbol and its meaning.

**Table 3.4** Connectors and Occurrence Indicators

| Symbol (Connectors) | Meaning |
|---|---|
| , | Sequence (means “followed by”) |
| & | and (all items must occur but they can be in any order) |
| \| | or (only one of the alternatives may be used) |

| Symbol (Occurrence Indicators) | Meaning |
|---|---|
| ? | Optional (only zero or one may appear) |
| * | Optional and repeatable (zero or more may appear) |
| + | Required and repeatable (one or more must appear) |
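As a rough illustration of the three connectors in Table 3.4, the following declarations use one connector each. The element names (name, address, contact, and their children) are hypothetical and were chosen only for this sketch.

```sgml
<!-- Hypothetical element names, used only to illustrate each connector -->
<!ELEMENT name    - - (first, last)>     <!-- sequence: first, followed by last -->
<!ELEMENT address - - (street & city)>   <!-- and: both must occur, in either order -->
<!ELEMENT contact - - (phone | email)>   <!-- or: exactly one of the two -->

<!ELEMENT first   - - (#PCDATA)>
<!ELEMENT last    - - (#PCDATA)>
<!ELEMENT street  - - (#PCDATA)>
<!ELEMENT city    - - (#PCDATA)>
<!ELEMENT phone   - - (#PCDATA)>
<!ELEMENT email   - - (#PCDATA)>
```

Under these assumed declarations, an address may list the city before the street, but a contact that contains both a phone and an email would be rejected.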

You’ll see these connectors and occurrence indicators again and again. DTDs are full of them. Of course, if you’re lucky enough to be working with an installation where the DTDs are already developed, you won’t need to learn these. But in that case, you probably won’t have to worry much about learning SGML either.

Occurrence indicators are handy because you want to enforce the right amount of structure in your document types. This golden mean of flexibility is tricky to build into a DTD unless you can tell a processing system to make a “roll call” for each type of element in every model group. Some document structures require great flexibility and will use the optional and repeatable indicator (*). Paragraphs within a <BODY> structure would probably be required and repeatable (+): you need at least one paragraph, but you want to allow as many as necessary. That gives the <PARA> element a lot of flexibility. Other elements might appear once or not at all. For example, a memo needs at most one date, and perhaps none. So you might give the <DATE> element the optional occurrence indicator (?). In an SGML “roll call,” not only must every required structure be present and accounted for, but any structure that is present and not accounted for in the DTD gets “thrown out!”
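The memo scenario just described might be declared along the following lines. This is only a sketch: the memo element and the #PCDATA content of date and para are assumptions made for illustration, not declarations taken from a real DTD.

```sgml
<!-- Hypothetical memo DTD fragment -->
<!ELEMENT memo - - (date?, body)>   <!-- date is optional: zero or one -->
<!ELEMENT date - - (#PCDATA)>
<!ELEMENT body - - (para+)>         <!-- at least one para, as many as needed -->
<!ELEMENT para - - (#PCDATA)>
```

With these declarations, a memo with no date is valid, a memo with two dates is not, and a body with no paragraphs fails the “roll call.” Changing `para+` to `para*` would also allow an empty body.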

Groups in a Declaration

Declarations usually fall into groups in a DTD. Especially with large SGML installations, DTDs can become rather lengthy. This length calls for some sort of order in which to place the declarations in the DTD.

Comments will usually set off different groups of declarations of the DTD. See Chapter 12, “Formatting the DTD for Humans,” for more about this. The basic idea is that whenever someone else has to go in and try to read the DTD, it would be nice if it all made sense to that person. These groups of declarations are different from groups in a single declaration. Within a single declaration, you normally group names with the connectors mentioned above. This keeps the naming easier to understand in a DTD.

Note:
You should keep in mind the different uses of the word group in DTDs. One type of group refers to the three types of groups in declarations: model groups, name groups, and name token groups.
The other use of the word group refers to the formatting of the overall DTD. It’s good practice to organize related declarations into groups. This improves clarity and makes the DTD easier for newcomers to read.
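To make the second sense of the word concrete, here is one way a DTD might set off groups of declarations with comments. The section titles and element names are hypothetical; any consistent commenting convention would serve the same purpose.

```sgml
<!-- ===== Document structure (hypothetical group) ===== -->
<!ELEMENT report  - - (title, section+)>
<!ELEMENT section - - (title, para+)>

<!-- ===== Text elements (hypothetical group) ===== -->
<!ELEMENT title   - - (#PCDATA)>
<!ELEMENT para    - - (#PCDATA)>
```

The comment lines have no effect on parsing. They exist only so that a human reader can find related declarations quickly.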

Using groups in a declaration improves the clarity of that declaration. Excluding Data Tag and Data Tag Template groups (you probably don’t have to worry about those), there are three types of groups in DTD declarations:

• Model groups

• Name groups

• Name token groups

Model Groups. Model groups are those collections of elements in parentheses that you’ve seen earlier. They look like this:

   <!ELEMENT table - - (title, ((tgroup+, tnote?)|graphic))>

They are simply groups of models used in the declaration. So tgroup would be one model, tnote would be another, graphic would be another, and title would be still another. Their purpose is to define the contents of whatever is being declared.
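Reading the table content model as a title followed by either one or more tgroup elements with an optional tnote, or a single graphic, a conforming document instance might look like the sketch below. The text content and the internal structure of tgroup are placeholders, since the declarations for those elements are not shown here.

```sgml
<!-- First alternative: title, then tgroup+ and an optional tnote -->
<table>
<title>Sample table</title>
<tgroup>...</tgroup>
<tgroup>...</tgroup>
<tnote>...</tnote>
</table>

<!-- Second alternative: title, then a graphic -->
<table>
<title>Sample table</title>
<graphic>...</graphic>
</table>
```

A table that contained both a tgroup and a graphic would not match either alternative, because the two are joined by the or connector.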

Name Groups. A name group is actually part of a model group. The name group is the list of names within the parentheses. For example:

   ((tgroup+, tnote?)|graphic)

consists of the name group tgroup, tnote and the name group graphic. Each grouping within a pair of parentheses is one name group. Name groups have to have the same connector type. They can consist of one or more name token groups.

Name Token Groups. The individual names themselves are called tokens. The key here is that only one type of connector appears within a name token group. The example above shows tgroup and tnote in the same name token group. graphic is not in the same name token group. Here’s another example:

   <!ELEMENT strentry o o (para|nlist|blist|note|caut|warn|txteqn)+>

The name token group consists of all the names between the opening parenthesis before para and the closing parenthesis after txteqn, because they are all within the same parentheses and are all joined by the same connector.
