

Tools Required and How To Combine Them

Putting SGML data onto the Web works best if you coordinate several steps. The first is to get the data into SGML somehow. Once it’s there, you may need a way to get it into some final form, either by converting to HTML, printing it, or feeding it to something that can process it directly (like an SGML-aware Web browser). Finally, you’ll likely need some tools for managing the data: tools that notice when documents change, check for broken links, and so on. This is doubly important on the Web, because a document may be linked to from many places that the document’s owner doesn’t even know about.
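As a rough illustration of the management tools mentioned above, the following Python sketch flags documents whose content has changed since the last check and reports links whose targets are missing. The function names and the dictionary-based inputs are assumptions made for this example, not part of any particular product. Note that it can only check links inside the set of documents you manage; links from places you don't know about are out of its reach.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Return a SHA-256 digest of a document's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def changed_documents(previous: dict[str, str], current: dict[str, str]) -> list[str]:
    """Return names of documents that are new or whose content differs.

    `previous` maps document names to digests recorded at the last check;
    `current` maps document names to their present content.
    """
    return sorted(
        name
        for name, text in current.items()
        if previous.get(name) != fingerprint(text)
    )


def broken_links(links: dict[str, list[str]], available: set[str]) -> list[tuple[str, str]]:
    """Return (source, target) pairs whose target is not in `available`."""
    return sorted(
        (source, target)
        for source, targets in links.items()
        for target in targets
        if target not in available
    )
```

A real tool would also have to extract the links from each document and store the digests between runs; both steps are left out here to keep the sketch short.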

A lot of the tools center on the server: what it stores and what it must send to clients. If the server stores SGML, you need to create the SGML somehow, and then either use some protocol for sending it to SGML-aware clients or convert it to something your clients can handle. If the server stores only HTML, conversion moves to an earlier point in the process. In that case, you can do everything offline (that is, create the HTML any way you can and just put it on any Web server).
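The server-side decision described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `choose_delivery` and its return labels are made up for the example, the client is assumed to announce what it can handle in an HTTP Accept header, and quality values in that header are ignored.

```python
# Media types assumed to identify SGML-aware clients in this sketch.
SGML_TYPES = {"text/sgml", "application/sgml"}


def choose_delivery(stored_format: str, accept_header: str) -> str:
    """Decide how a server might deliver a stored document.

    Returns "send-html", "send-sgml", or "convert-to-html".
    """
    if stored_format not in ("sgml", "html"):
        raise ValueError(f"unknown stored format: {stored_format!r}")
    if stored_format == "html":
        # Conversion already happened offline; nothing to decide.
        return "send-html"
    # Strip parameters such as ";q=0.8" and compare bare media types.
    accepted = {
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    }
    if accepted & SGML_TYPES:
        return "send-sgml"
    return "convert-to-html"
```

For example, `choose_delivery("sgml", "text/html, */*")` returns `"convert-to-html"`, because the client has not said it accepts SGML.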

There is one other complication if you choose the offline conversion process: all of a sudden, you have to maintain two separate versions of the same document. Because of this, managing updates gets harder. For example, you may write an SGML document using your favorite DTD, then convert it to HTML and put the HTML form on your Web site. This immediately raises a question: which form is the “real” document? If the document never changes this may not be a big problem, but most documents do get changed or updated sooner or later. When that happens, you have to decide whether to update the original SGML and regenerate the HTML from it, change the HTML and throw the original SGML away, or edit the SGML and the HTML separately and try to keep them in sync. If you’re using the SGML form for other purposes too, like print or CD-ROM publishing, the only practical choice may be to edit the SGML and regenerate the HTML (which is probably the best choice anyway). However you organize the process, it still involves extra work, and it’s very easy to end up with files that no longer quite match.
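If you treat the SGML as the “real” document, one simple safeguard is to check which HTML files are missing or older than their SGML sources, and regenerate only those. The Python sketch below shows the idea. The `.sgm` and `.html` extensions, the one-to-one file naming, and the function names are assumptions for the example; the conversion step itself is not shown.

```python
from pathlib import Path


def needs_regeneration(source: Path, derived: Path) -> bool:
    """Return True if `derived` is missing or older than `source`."""
    if not derived.exists():
        return True
    return derived.stat().st_mtime < source.stat().st_mtime


def stale_html(sgml_dir: Path, html_dir: Path) -> list[Path]:
    """List SGML sources whose HTML counterpart is missing or out of date."""
    return sorted(
        src
        for src in sgml_dir.glob("*.sgm")
        if needs_regeneration(src, html_dir / (src.stem + ".html"))
    )
```

Comparing modification times only catches the case where the SGML was edited after the HTML was generated. It will not notice someone editing the HTML by hand, which is one more reason to pick a single form to edit.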

Getting into SGML

Getting into SGML is something this book has talked about a lot; you can author in SGML or convert data from something else. Authoring systems (such as SoftQuad Author/Editor, ArborText Adept, and Grif) give you a straightforward interface where you can create, move, edit, and delete elements and text intuitively. Many also validate your documents, so you can be sure there are no SGML syntax errors. SGML authoring systems have another advantage: they typically write out documents without a lot of minimization, so you automatically avoid the more difficult SGML details that not all software fully supports.


• For more information on SGML converters and other utilities, see Chapter 28, “Other Tools and Environments,” p. 489

Converters come in two flavors: built-in (like MS Word’s and WordPerfect’s SGML import/export features) or standalone (like DynaTag, FastTAG, the SGML Hammer, OmniMark, Perl, and so on). The built-in kind have recently added validation features, and so are becoming much more like native SGML authoring systems. Standalone programs involve a tradeoff: they can handle a wider range of input, even raw scanned OCR files, but they usually don’t guarantee that the result will be valid.


Caution:  
It’s nearly impossible to guarantee valid SGML output when you have no control over the input (for example, if your data is coming from a scanner and OCR). Think about what would happen if someone tried to take a poem and “convert” it to SGML using a DTD meant for software manuals. Many of the required elements, like version numbers and an index of commands, just wouldn’t be there. You could set up a converter to create those elements, but there wouldn’t be useful data to put into them. In the end, a converter can only do so much.
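The poem example can be made concrete with a small Python sketch that lists which required elements never appear in a converted document. This is not a validating parser: it only looks for start tags with a regular expression, and the element names (`version`, `cmdindex`, and so on) are hypothetical stand-ins for whatever a software-manual DTD would actually require.

```python
import re

# Matches the name in a start tag such as <title> or <LINE n="1">.
# End tags (</title>) and declarations (<!DOCTYPE ...>) do not match.
START_TAG = re.compile(r"<([A-Za-z][A-Za-z0-9.-]*)")


def element_names(text: str) -> set[str]:
    """Collect the names of all start tags, lowercased for comparison."""
    return {name.lower() for name in START_TAG.findall(text)}


def missing_required(text: str, required: set[str]) -> list[str]:
    """Return the required element names that never occur in `text`."""
    return sorted(required - element_names(text))
```

Running `missing_required` on a tagged poem with `{"title", "version", "cmdindex"}` as the required set would report `cmdindex` and `version` as missing. A converter could insert empty elements with those names to satisfy the DTD, but as the caution says, it would have nothing useful to put in them.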

