Special Edition Using SGML: Handling Specialized Content and Delivery

External Processing of Equations

TeX and LaTeX are popular typesetting languages that math and scientific publishers frequently use for handling equations. These languages require an application outside of your SGML application to process their code. An SGML document calls up the external application and makes a file or encoded symbols available for processing by that application.

You could implement this approach with a notation attribute:

     <!ELEMENT equation   - - RCDATA    --equation-->
     <!ATTLIST equation  type NOTATION (TeX|LaTeX|PS|mathcad) TeX >

The notation declarations could be as follows:

     <!NOTATION TeX SYSTEM "c:\apps\tex\pctex.exe" >
     <!NOTATION LaTeX SYSTEM "c:\apps\latex\wtex.exe" >
     <!NOTATION PS SYSTEM "c:\apps\psview.exe" >
     <!NOTATION mathcad SYSTEM "c:\apps\mathcad\mc.exe" >
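
With declarations like these in place, a document instance selects the external processor through the type attribute. The fragment below is only a sketch; the equation source inside each element is an illustrative assumption, not taken from a real document:

     <equation type="TeX">{a + b \over 2c}</equation>
     <equation type="LaTeX">\frac{a + b}{2c}</equation>
     <equation>E = mc^2</equation>

The third equation omits the attribute and so falls back to the declared default, TeX. Because the element is declared RCDATA, start tags are not recognized inside it, so a stray < in the equation source is not mistaken for markup. Entity references are still recognized, however, so an & followed directly by a letter would need care.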

Of course, these notations won’t work on anyone else’s machine, so you probably want to make public notations where you can, or confine your external processors to publicly available software.
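
One way to sketch a public notation is to replace the machine-specific path with a public identifier and let each site map that identifier to whatever processor is installed locally. The identifier below is one that appears in some widely distributed public DTDs; treat it as an assumption and check it against your own catalog before relying on it:

     <!-- Assumed public identifier; verify against your catalog -->
     <!NOTATION TeX PUBLIC "+//ISBN 0-201-13448-9::Knuth//NOTATION The TeXbook//EN" >

The declaration now names the notation rather than a program on one machine, which is what makes the document portable.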

If you’re doing heavy equation documents and you have a limited distribution for the documents, you might just pick a specific math program that’s available to your audience. Mathcad has found some popularity among scientists, for example. Because you don’t intend your documents to be widely distributed anyway, you won’t miss the large audience who can’t access your documents because they lack the special software/hardware they need. Anyone who wants to view your documents will be required to use Mathcad, or whatever software package you decide upon.

It’s basically the same story with TeX. Your documents depend on your readers having the necessary external applications to view the equations. If they don’t have the software, they don’t get to use the equations. That’s the big drawback of using external applications to view equations. The advantage is that they make equations look good on the display and paper.

Perhaps, one day, popular HTML browser developers will finally build in support for the math and equation features of the proposed HTML 3.0 revision. So far, most have favored flash over substance, supporting nonstandard extensions such as the <BLINK> element and the BACKGROUND attribute while leaving the <MATH> element unsupported.

Equation Structures in the DTD

Perhaps the most advantageous of these three approaches is to build the math structures into your DTD. Not every machine has access to TeX and other typesetting languages, and the whole point of SGML is to make a document transportable. Meeting that goal favors the DTD route.

Many DTDs exist in the public domain for handling math. You can benefit by checking AAP, CALS, ISO 12083, ISO/IEC TR 9573-11:1992E, and the Euromath DTD. Even the HTML 3.0 DTD has some math structures in it that might help. (Unfortunately, the latest browsers don’t support HTML 3.0 math structures yet.)

Note:
If you’re publishing scientific documents that are filled with complex equations but have a very limited audience, you might consider using some proprietary processor system—as mentioned above—instead of SGML. One major reason for using SGML in the first place is to allow wide accessibility to your documents. If you don’t need this advantage, and you’re not deriving the benefit anyway, you might just consider sending Mathcad files back and forth.

The first general challenge you encounter with math DTDs is that they require many external entity sets, pulled in through parameter entities, to supply the symbols you’ll need. You’d better have all those entity sets locally on your system, especially if you’re in doubt as to which set contains which symbols. Check to see which public entity sets came with your SGML application. Then check the SGML Archive for further math entities.
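
As a rough sketch of what pulling in those symbols looks like, a DTD declares each entity set as a parameter entity and then references it. The three sets shown are among the public entity sets defined alongside ISO 8879; which ones your DTD actually needs, and whether your catalog resolves them, are assumptions to verify locally:

     <!-- Relations such as less-than-or-equal variants -->
     <!ENTITY % ISOamsr PUBLIC "ISO 8879:1986//ENTITIES Added Math Symbols: Relations//EN" >
     %ISOamsr;
     <!-- Greek letters used as math symbols -->
     <!ENTITY % ISOgrk3 PUBLIC "ISO 8879:1986//ENTITIES Greek Symbols//EN" >
     %ISOgrk3;
     <!-- General technical symbols -->
     <!ENTITY % ISOtech PUBLIC "ISO 8879:1986//ENTITIES General Technical//EN" >
     %ISOtech;

Once the sets are declared, an equation can refer to symbols by name, for example &alpha; or &infin;, instead of relying on a platform-specific character encoding.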

The second general challenge is that your math DTD must provide sufficient tags or substructures to break your equations into manageable chunks. Some complex equations might expand to 10 times their normal length when represented with SGML tags. It’s easy to get lost in all those text strings unless you can include key “separating tags” to divide parts of the equation from each other. You might have an <OVER> element that separates the numerator from the denominator in an equation, or even an <SDIFF> element with substructures for simple differentiation. In other words, your DTD can contain an algebra section for algebraic equation structures, a calculus section for integration and differentiation equation structures, a symbolic logic section for symbolic logic structures, and so on. The point is to provide sufficient substructures in your DTD so that the parts of the equation are easy to follow.
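
To make the idea of a separating tag concrete, here is a minimal sketch of a fraction structure. The element names formula, fraction, and term are hypothetical and chosen only for illustration; over follows the <OVER> element mentioned above, and none of this is drawn from a particular public math DTD:

     <!-- Hypothetical declarations for a simple fraction -->
     <!ELEMENT formula  - - (#PCDATA | fraction)* >
     <!ELEMENT fraction - - (term, over, term) >
     <!ELEMENT term     - - (#PCDATA | fraction)* >
     <!ELEMENT over     - O EMPTY >

     <!-- Instance: x equals (a + b) over 2c -->
     <formula>x = <fraction><term>a + b</term><over><term>2c</term></fraction></formula>

The empty over element marks the boundary between numerator and denominator, so both a human reader and a formatting program can find the split without parsing the text of the equation. Because term allows a nested fraction, the same structure can describe compound fractions.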

Your formatting program also can take advantage of math substructures. Math substructures can tell your formatting program where to break two or more lines of the equation, to place one part of the equation over the other, or to place one equation segment in front of the equal sign and one after the equal sign.

Scientific documents can introduce you to another challenging type of content: footnotes and endnotes.

Linking Revisited

Chapter 9, “Extending Document Architecture,” looked at one popular way of handling links—multimedia links—but SGML offers many different types of links. Sometimes, circumstances dictate that links be formatted a certain way, as in the case of footnotes and endnotes in a document. This chapter examines links of these types.

• See “Adding Hypertext Links to a Document,” p. 160
