

In this case, SGMLS cannot find the document’s DTD. Check the DOCTYPE declaration at the top of the SGML file to make sure that you specified the DTD correctly. The easiest way to specify it is with a SYSTEM identifier: type the DTD’s file name, enclosed in quotes. Remember that this requires you to give the actual file path of the DTD file, and that SGMLS treats the directory that contains it as the root directory for that path. To avoid path problems, simply put SGMLS and your file in the same directory.


• See “The DOCTYPE Declaration,” p. 177

You can also specify the DTD by putting it in a separate file and listing it on the command line between the SGML declaration and the file that contains the document.

The most complicated way to specify the DTD is in the document file with a PUBLIC identifier. You then must follow the instructions in the manual page to create a catalog file that resolves public identifiers into file names, and tells the program where to find the catalog file at runtime. This is the most foolproof way if you are modifying files and file names and using them in different directories.
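To make the idea of a catalog concrete, here is a minimal sketch in Python of what resolving a public identifier amounts to. It assumes catalog entries of the form PUBLIC "public identifier" "file name", as in the SGML Open catalog format; the exact syntax your version of the program expects is described in its manual page. The public identifiers and file names are hypothetical.

```python
import re

# Matches one catalog entry: PUBLIC "public identifier" "file name".
# The entry syntax is an assumption; check your parser's manual page.
ENTRY = re.compile(r'PUBLIC\s+"([^"]*)"\s+"([^"]*)"')

def parse_catalog(text):
    """Return a dict that maps public identifiers to file names."""
    return {pubid: filename for pubid, filename in ENTRY.findall(text)}

def resolve(catalog, public_id):
    """Look up a public identifier; return None when it has no entry."""
    return catalog.get(public_id)

# Hypothetical catalog contents, for illustration only.
SAMPLE_CATALOG = '''
PUBLIC "-//Example//DTD Memo//EN" "dtds/memo.dtd"
PUBLIC "-//Example//DTD Report//EN" "dtds/report.dtd"
'''

catalog = parse_catalog(SAMPLE_CATALOG)
print(resolve(catalog, "-//Example//DTD Memo//EN"))  # prints dtds/memo.dtd
```

Because the document refers to the DTD only by its public identifier, moving or renaming the DTD file means changing one catalog entry rather than every document.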

SGMLS also has a facility for mapping public identifiers directly into file names. This method is complex. Because of its dependence on UNIX environment variables, it does not work on the Mac. Even on UNIX, users regularly experience problems when they use it for the first time.

Cryptic Error Messages. The error messages that you get from SGMLS and other validators use specialized terminology defined in the SGML standard, so they can be difficult to understand if you do not know that terminology. The first thing to look for is the line number. Keep in mind, however, that SGML error messages often do not appear where the offending tag or missing quotation mark occurs. Instead, they appear at a later point, when the SGML parser finally sees something that makes the error clear. Look at the reported line in your document, and read backward to find the point that caused the error.
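Reading backward from the reported line can be partly automated. The following Python sketch is a rough heuristic, not a feature of SGMLS: it starts at the reported line and walks toward the top of the file, looking for the nearest line that contains an odd number of double quotes, which is a common sign of a missing closing quote. The function name and the sample document are hypothetical.

```python
def find_suspect_line(lines, reported_line):
    """Scan backward from the reported line (1-based) and return the number
    of the nearest line with an odd count of double quotes, or None."""
    start = min(reported_line, len(lines))
    for number in range(start, 0, -1):
        if lines[number - 1].count('"') % 2 == 1:
            return number
    return None

# Hypothetical document: the closing quote is missing on line 2,
# but a parser might only complain a few lines later.
document = [
    '<MEMO>',
    '<TO ADDRESS="alice@example.org>Alice</TO>',
    '<BODY>',
    '<P>See the attached file.</P>',
    '</BODY>',
]
print(find_suspect_line(document, 4))  # prints 2
```

This catches only one kind of mistake. Errors such as a missing end tag still require you to read the markup yourself.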

You can use the switches on the SGMLS command line to make error messages more or less wordy. For simple validation, the most useful switch is the suppress output switch, or -s. The text of the document is normally printed with open and close tag indicators as part of the error stream. This is because many people process this output, which is called the ESIS. If you want to see only the error messages, use the -s switch. For example:

     -s mysgml.decl alice.sgml

SGMLS: ftp://ftp.stg.brown.edu/pub/sgml
This program can also be found at other SGML FTP sites.


• See Appendix B, “Sources for SGML Know-How,” p. 565

Scripting Languages

Scripting languages are general programming languages with good support for string processing and pattern matching. They are tailored for writing file conversion programs and for automating system management tasks. Conversion to and from SGML is a typical file conversion task, so you might find these languages useful if you have some exposure to programming and are willing to learn a general tool. When a conversion is too complex for global changes in an editor, these languages are the next thing to try. They provide the full power of programming in a form that is easy to apply to conversion problems.
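As an illustration of the kind of conversion that pattern matching handles well, here is a minimal sketch that turns plain text into tagged text. It is written in Python rather than in Perl or TCL, but the approach is the same in any of these languages. The element names P and EMPH, the *asterisk* convention for emphasis, and the assumption that your DTD declares the amp and lt entities are all hypothetical.

```python
import re

def text_to_sgml(text):
    """Convert blank-line-separated paragraphs with *emphasis* to tagged text."""
    # Escape characters that a parser would otherwise read as markup.
    text = text.replace("&", "&amp;").replace("<", "&lt;")
    # Paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    tagged = []
    for para in paragraphs:
        # *word* becomes <EMPH>word</EMPH>
        para = re.sub(r"\*([^*\n]+)\*", r"<EMPH>\1</EMPH>", para)
        tagged.append("<P>" + para + "</P>")
    return "\n".join(tagged)

sample = "Alice was *very* tired.\n\nShe sat by her sister & yawned."
print(text_to_sgml(sample))
```

The result still needs a DOCTYPE declaration and an enclosing document element, and it should be run through a validating parser before you rely on it.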

You should invest the effort of writing conversion scripts only if you have a very large document, many similar documents to convert, or a conversion that will be repeated many times—for example, a file that is converted whenever it is updated. Writing conversion scripts is often more enjoyable than making changes manually in an editor. However, it is usually not more productive unless you are dealing with a book-sized document or with extremely complex document structures.

You are likely to encounter and use two scripting languages in SGML conversion tasks: Perl and TCL.


• See “The World of Perl,” p. 491

Perl is currently the most popular language for scripting and file conversion. Despite its obscure syntax, Perl’s integrated pattern matching makes it easy to use for text transformations. It can take a while to learn because it has many built-in features for specific tasks.

TCL can be used for the same tasks as Perl. However, its syntax is much simpler, and it has fewer built-in features. Therefore, it’s easier to learn, but it lacks many of the convenient features of Perl. You are more likely to encounter TCL in one of the specialized SGML tools that use it than you are to use it for writing conversion scripts from scratch.

There is a wide choice of free SGML-specific software packages available based on scripting tools. Several of them are mentioned briefly here. Because Perl and TCL run on the Mac, these packages also run on the Mac, even though many were designed for UNIX.

SGMLSPL is a Perl script shell that works with SGMLS. It parses the output of the SGMLS parser and separates its parts. That way, the script can easily process the output. On the Mac, this requires you first to run SGMLS and save the output in a file. Then you must run the Perl program on the resulting output file.
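To show what separating the parts of the parser output involves, here is a minimal Python sketch. It is not SGMLSPL and does not reproduce its interface. It assumes the line-oriented SGMLS output format in which the first character of each line gives its type: "(" for a start tag, ")" for an end tag, "-" for data, "A" for an attribute of the next start tag, and "C" for a conforming document. Check the SGMLS manual page for the full list. The sample lines are hypothetical.

```python
import re

_ESCAPE = re.compile(r"\\(n|\\)")

def unescape(data):
    # Decode backslash-n (record end) and a doubled backslash; other
    # escapes are left as they are in this sketch.
    return _ESCAPE.sub(lambda m: "\n" if m.group(1) == "n" else "\\", data)

def parse_esis(lines):
    """Turn parser output lines into (kind, value, attributes) tuples."""
    events = []
    attrs = {}
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        code, rest = line[0], line[1:]
        if code == "A":
            # Attribute lines come before the start tag they belong to.
            parts = rest.split(" ", 2)
            attrs[parts[0]] = parts[2] if len(parts) > 2 else None
        elif code == "(":
            events.append(("start", rest, attrs))
            attrs = {}
        elif code == ")":
            events.append(("end", rest, None))
        elif code == "-":
            events.append(("data", unescape(rest), None))
        elif code == "C":
            events.append(("conforming", "", None))
        # Other line types are ignored in this sketch.
    return events

# Hypothetical parser output, as it might be saved in a file.
sample = ["AID CDATA m1", "(MEMO", "(TO", "-Alice", ")TO", ")MEMO", "C"]
for event in parse_esis(sample):
    print(event)
```

In the two-step workflow described above, you would pass the lines of the saved output file to parse_esis instead of the sample list.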

The SGMLPERL package contains an SGML parser written in Perl that calls user-specified scripts. This offers the advantage of an integrated, one-step conversion solution. Because a complete SGML parser is difficult to write, the package’s parser expects error-free input: you must first validate the file with a conforming SGML parser.

Another possibility is to use the free TCLYasp conversion tool. It’s a general SGML parser, YASP, integrated with the TCL scripting language. You can use it to create one-step conversion scripts. TCLYasp is similar to a tool called CoST; both integrate an SGML parser with TCL (CoST uses sgmls, TCLYasp uses YASP). CoST has not been ported to the Mac yet, however. TCLYasp is designed around a simple sequential model of parsing. It is simple to learn, but it is better for straightforward conversions than for complicated structural transformations.
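A simple sequential model means that the conversion sees the document as a stream of events in document order and reacts to each one as it arrives. The following Python sketch illustrates that model in general terms; it does not use TCLYasp or CoST, and the element names and output strings are hypothetical.

```python
# Hypothetical mapping from element names to the text emitted when
# the element starts and ends.
START_TEXT = {"TO": "To: "}
END_TEXT = {"TO": "\n", "P": "\n\n"}

def convert(events):
    """Walk the events once, in document order, and build the output."""
    out = []
    for kind, value in events:
        if kind == "start":
            out.append(START_TEXT.get(value, ""))
        elif kind == "end":
            out.append(END_TEXT.get(value, ""))
        elif kind == "data":
            out.append(value)
    return "".join(out)

# Hypothetical event stream for a short memo.
events = [
    ("start", "MEMO"), ("start", "TO"), ("data", "Alice"), ("end", "TO"),
    ("start", "P"), ("data", "Lunch at noon."), ("end", "P"), ("end", "MEMO"),
]
print(convert(events), end="")
```

Each event is handled once and then forgotten, which is why this style suits straightforward conversions. A transformation that reorders elements or depends on content that appears later in the document would have to buffer events first.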

SGMLSPL: http://aix1.uottawa.ca/~dmeggins/sgmlspl/sgmlspl.html
CoST: http://www.art.com/cost and ftp://ftp.crl.com/users/je/jenglish
TCLYasp: ftp://ftp.stg.brown.edu/pub/sgml


• These programs can be found at other SGML FTP sites. For URLs, see Appendix B, “Sources for SGML Know-How,” p. 565

