

Generalized Electronic Markup

Generalized markup tries to make machine- and software-specific markup transportable. Originally, typesetters used it to prepare documents that could be typeset on different brands of typesetting machines. Documents prepared for one model of typesetter often could not be printed on another model from the same manufacturer. Compatibility is a relatively recent phenomenon.

This type of markup tried to pick up where specific markup left off. A common language was needed so that documents could be moved from one typesetting or publishing system to another. In the late 1960s, Charles F. Goldfarb led a team at IBM that created GML (Generalized Markup Language), the distant predecessor of SGML, while the GCA (Graphic Communications Association) pursued its own generic coding effort. Dr. Goldfarb’s work with generalized markup is foundational in the development of SGML.


Note:  
Dr. Goldfarb is still active today. His well-known The SGML Handbook is a must for your SGML reference library. Several other books and numerous papers are available in bookstores and on the Internet. See Appendix B, “Finding Sources for SGML Know-How.”

Generalized markup provides machine-readable style and format markup that is not specific to any one machine. It separates content, format, and structure without violating their integrity. It imposes a general structure on documents, and it handles all documents according to that structure.

There are many kinds of generalized markup language. Some have been around for a long time, whereas others are experimental. LaTeX, for example, is a generic macro package built on TeX, another document preparation system (see fig. 1.3). These forms of markup work as long as the formatting mechanism recognizes the formatting codes and understands the document structures that it encounters.


Fig. 1.3  This example of the TeX generalized markup language shows some features it has in common with SGML.


Tip:  
Generalized markup is not portable unless it adheres to a standard set of rules. If it is not standard, the recipients of a document will not know what to do with it.

Standard Generalized Markup—What SGML Does

SGML goes beyond generalized markup by standardizing it. SGML is a standard set of rules for defining document types by their structures and for marking them up so that machines can recognize and process documents by those structures. For example, when you define a book chapter like this one by its standard structures (such as titles, headings, notes, and so on), a machine can recognize those structures by standardized tagging schemes and build the chapter from them.
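As a minimal sketch of what such a definition could look like, the declarations below describe a chapter in terms of a title, paragraphs, notes, and sections. The element names are assumptions made for illustration and are not taken from any published DTD.

```sgml
<!-- Hypothetical element names, for illustration only -->
<!-- A chapter is a title, then any mix of paragraphs and notes, then sections -->
<!ELEMENT chapter - - (title, (para | note)*, section*)>
<!-- A section is a heading followed by at least one paragraph or note -->
<!ELEMENT section - - (heading, (para | note)+)>
<!-- These elements hold plain character data -->
<!ELEMENT (title | heading | para | note) - - (#PCDATA)>
```

Each declaration names an element and states what it may contain, which is what lets a processing system recognize a heading or a note wherever one appears.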


Note:  
SGML is the mother of all markup languages. Using its rules, you can create any number of smaller markup languages according to your individual document needs. HTML is one of SGML’s offspring and is, therefore, an SGML “application.” That is because HTML is defined by a single DTD written according to SGML’s rules.

Each markup language that SGML engenders must be consistent within itself. It must also be consistent with all the other documents that have ever been marked up in SGML. It must be consistent with ISO 8879, the international standard for SGML.

Figure 1.4 shows an SGML document instance. It has the following components:

  The overall card
  Address
  Return address
  Graphic image
  Salutation
  Body of message with paragraphs
  Signature


Fig. 1.4  This SGML document instance appears complex, but its structure is simple.
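As a rough illustration, and not a reproduction of figure 1.4, a postcard instance with the components listed above might be tagged along these lines. All tag names, the `file` attribute, and the sample text are assumptions.

```sgml
<!-- Hypothetical tag names and sample text; the figure's own markup may differ -->
<postcard>
  <address>Pat Reader, 12 Elm Street, Springfield</address>
  <retaddr>Chris Writer, 34 Oak Avenue, Shelbyville</retaddr>
  <!-- The graphic is an empty element that points at an image by name -->
  <graphic file="beach">
  <greeting>Dear Pat,</greeting>
  <body>
    <para>The weather is wonderful.</para>
    <para>Wish you were here.</para>
  </body>
  <signoff>Chris</signoff>
</postcard>
```

The names are kept to eight characters or fewer so that they fit the default name length of SGML's reference concrete syntax.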


• See Chapter 24, “Understanding and Using Output Specifications,” p. 407


Note:  
In SGML, you are not greatly concerned with how a document looks. Once you capture the structure of a document, you can present it with many different appearances. SGML uses output specifications to make it look the way you want.

Further, you’ll notice that the graphic is represented by a scheme of encoded characters. SGML also allows you to call upon external processing systems that can translate these characters into the intended image.
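One way SGML hands non-SGML data to an external processor is through notation declarations. The sketch below takes a slightly different route from the figure: instead of embedding encoded characters, it keeps the image in a separate file and names the notation that identifies how to interpret it. The notation name, file name, entity name, and attribute name are all assumptions.

```sgml
<!-- Hypothetical names throughout; "tiff" and beach.tif are illustrative only -->
<!-- Name a data format that an external program knows how to handle -->
<!NOTATION tiff SYSTEM "TIFF image viewer">
<!-- Tie an external file to that notation as non-SGML data -->
<!ENTITY beach SYSTEM "beach.tif" NDATA tiff>
<!-- An empty element whose attribute refers to the entity by name -->
<!ELEMENT graphic - O EMPTY>
<!ATTLIST graphic file ENTITY #REQUIRED>
```

The parser does not interpret the image itself. It only reports the entity and its notation, and the application decides which program to invoke.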


You could have just as easily structured the postcard in figure 1.4 into another configuration. For example:

  Side 1
  Side 2

Or this:

  Image on side 1
  Line 1 on side 2
  Line 2 on side 2
  Line 3 on side 2
  Line 4 on side 2

Neither approach is recommended, for reasons you will learn later. SGML, however, supports many approaches for structuring your documents. Some are better than others.

SGML enables you to define document types by their common structures. Each new document type definition will require its own markup language. These markup languages become the collection of tags that tell any SGML processing system how to structurally build the document type for which the markup language was designed. Documents must be marked up consistently according to that markup language or else the SGML processing system will not be able to build them correctly.
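A document type definition for the postcard components listed earlier might look something like the following sketch. The element names and the choice to make the graphic optional are assumptions, not the book's own DTD.

```sgml
<!-- Hypothetical postcard DTD; element names are illustrative -->
<!-- The content model fixes which parts appear and in what order -->
<!ELEMENT postcard - - (address, retaddr, graphic?, greeting, body, signoff)>
<!ELEMENT body     - - (para+)>
<!ELEMENT (address | retaddr | greeting | para | signoff) - - (#PCDATA)>
<!ELEMENT graphic  - O EMPTY>
<!ATTLIST graphic  file ENTITY #REQUIRED>
```

Under these declarations, a postcard that omits the greeting or places the signoff before the body does not match the content model, and a validating parser would report it as an error.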

How SGML Maintains the Integrity of Content and Structure

SGML maintains the integrity of document content and structure by defining all the document types and their structural elements and by defining the relationships among them. It accomplishes this through document type definitions (DTDs) and by marking up specific document instances according to the rules in their DTDs.

If you have browsed the World Wide Web, you have seen how HTML—itself a markup language with its own DTD—preserves the integrity of document content and structure. SGML enables you to develop your own DTD, like the HTML DTD or even better, to include the document structures you want.

Individual Document Markup

Each SGML document is marked up according to SGML standards. Among the types of markup that you find in SGML documents are the SGML declaration and a reference to the document’s DTD. The DTD contains all the information needed to tell an SGML processing system how to build that type of document. The document must be identified as a particular document type. Therefore, it must be associated with its applicable DTD. The basic parts of an SGML document are:

  The SGML declaration
  The reference to its DTD
  The data content of the document
  The individual tag markup

Figure 1.5 shows a marked-up document.


Fig. 1.5  An individual instance of an SGML document contains an SGML declaration, a reference to its DTD, data, and tag markup.
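The four parts can be picked out in a small sketch such as the one below. The document type `memo`, the file name `memo.dtd`, and the tag names are hypothetical, and the SGML declaration is left out because it is lengthy and many systems supply a default one.

```sgml
<!-- 1. SGML declaration: omitted in this sketch; many systems supply a default -->
<!-- 2. Reference to the DTD (memo.dtd is a hypothetical file name) -->
<!DOCTYPE memo SYSTEM "memo.dtd">
<!-- 3 and 4. The data content, wrapped in tag markup -->
<memo>
<to>Pat Reader</to>
<from>Chris Writer</from>
<subject>Postcard proofs</subject>
<para>The proofs are ready for review.</para>
</memo>
```

The name after `DOCTYPE` identifies the document type and must match the outermost element of the instance.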

