Once you have SGML, you're in good shape to convert to almost anything else, and HTML is an easy target. In the simplest case, you just do a bunch of global changes to rename tags from the names in your DTD to the names in HTML; for elements that occur in all kinds of documents this works just fine: lists, paragraphs, headings, and so on.
Caution:
The first time people do this, they usually miss a few cases. Here are a few to watch out for if you're using software that doesn't really know SGML:
- Remember to allow for attributes in start-tags; you can't count on the > being right after the tag name.
- Be very careful if you use minimization. A tag could have either pointy bracket missing or the element type missing, or the tag might be missing completely if you use OMITTAG (in that case, it might be a little hard to write a global change to catch it!).
- If you're changing between two different DTDs, be careful that the element types you change to are allowed in all the places they're used. Otherwise, the parser will either report an error (which makes the problem easy to find) or quietly recover by closing elements until it finds one where the new element is allowed (which can make it a lot more difficult to find).
- If you're converting HTML that you haven't run through an SGML parser, watch out for URLs that aren't quoted.
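As a rough illustration of the first caution, here is a minimal Python sketch of a global tag rename that tolerates attributes in start-tags. The element names in `TAG_MAP` are hypothetical, invented for this example rather than taken from any particular DTD. The sketch assumes fully normalized markup: it does nothing sensible with minimized or omitted tags, and it does not protect a `<` that appears inside an attribute value.

```python
import re

# Hypothetical mapping from element types in an imagined source DTD
# to HTML element names. Adjust to match your own DTD.
TAG_MAP = {"para": "p", "title": "h1", "itemlist": "ul", "item": "li"}

# Match "<" or "</" followed by a tag name, but only when the name is
# followed by whitespace or ">". The lookahead keeps any attributes in
# place and stops "item" from matching the start of "itemlist".
TAG_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9.-]*)(?=[\s>])")


def rename_tags(text, tag_map=TAG_MAP):
    """Rename element types per tag_map, leaving attributes untouched."""

    def swap(match):
        slash, name = match.group(1), match.group(2)
        # Element names are looked up case-insensitively here;
        # names not in the map are left as they are.
        return "<" + slash + tag_map.get(name.lower(), name)

    return TAG_RE.sub(swap, text)


sample = '<itemlist><item id="a1">First</item></itemlist>'
converted = rename_tags(sample)
print(converted)
```

Because the pattern only rewrites the tag name, anything after it in the start-tag passes through unchanged. For documents that use minimization, running the text through a real SGML parser first is the safer route.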
Usually when you convert from SGML to HTML, you end up throwing distinctions away; for example, three different element types might all end up tagged simply as italics (<I>). This makes it very important to think of the SGML form as the "real" document, and to keep it around for later, when you may want to do a slightly different conversion on it.
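To see why the conversion cannot simply be undone, consider this small Python sketch. The source element names (`emph`, `foreign`, `booktitle`) are hypothetical examples of distinctions an SGML DTD might make; the point is only that several of them collapse onto one HTML tag.

```python
from collections import defaultdict

# Hypothetical forward mapping: several source element types share one target.
TO_HTML = {"emph": "i", "foreign": "i", "booktitle": "i", "para": "p"}


def reverse_map(forward):
    """Group source element types by the HTML tag they are converted to."""
    back = defaultdict(list)
    for source, target in forward.items():
        back[target].append(source)
    return dict(back)


# HTML tags that more than one source element type maps onto.
# Given only the HTML, there is no way to tell which source type was meant.
AMBIGUOUS = {
    html: sources
    for html, sources in reverse_map(TO_HTML).items()
    if len(sources) > 1
}
print(AMBIGUOUS)
```

Any tag that shows up in `AMBIGUOUS` marks information that exists only in the SGML original.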
See the previous chapter, "Practicalities of Working with SGML on the Web," for more details on tools for converting SGML to HTML. In particular, some Web servers do it on the fly, which is a big advantage for data management and overall flexibility. There are too many standalone tools to mention (Perl is one of the most popular and portable); many useful tools are discussed at www.undergrad.math.uwaterloo.ca (for example, /~papresco/private/calibre/sgml/tools/sgml2html.html).
As time goes on, the need to translate will shrink (it might eventually go away completely). It's not that much more difficult to program a browser that can accept any tags at all than one that can accept only HTML. The main addition is the need to access and read some kind of stylesheet that says what to do with each given tag. Panorama and DynaText have already proven that this approach works, and further solutions will continue to appear.
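The idea of a stylesheet that "says what to do with each given tag" can be sketched as a simple lookup table. The Python below is only an illustration: the element names, property names, and table layout are assumptions made up for this example, and do not reflect the actual stylesheet formats used by Panorama or DynaText.

```python
# Hypothetical stylesheet: element type -> display properties.
STYLESHEET = {
    "title": {"display": "block", "font_style": "bold"},
    "para": {"display": "block", "font_style": "normal"},
    "emph": {"display": "inline", "font_style": "italic"},
}

# Fallback used for any element type the stylesheet does not mention,
# so that a browser can still show documents with unfamiliar tags.
DEFAULT_STYLE = {"display": "inline", "font_style": "normal"}


def style_for(element_type, stylesheet=STYLESHEET):
    """Return the display properties for an element type."""
    specific = stylesheet.get(element_type.lower(), {})
    # Start from the defaults and let the stylesheet entry override them.
    return {**DEFAULT_STYLE, **specific}


print(style_for("emph"))
print(style_for("procedure"))
```

A browser built this way never needs a fixed tag set; supporting a new DTD means supplying a new table rather than new code.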
Note:
C. M. Sperberg-McQueen and Robert F. Goldstein wrote a wonderful paper on the potential for extending Web clients this way, with the imposing title "HTML to the Max: A Manifesto for Adding SGML Intelligence to the World-Wide Web." You can get it from www.ncsa.uiuc.edu in the file SDG/IT94/Proceedings/Autools/sperberg-mcqueen/serberg.html.
In addition to Web delivery, you may want to provide printed output. Web browsers can do some level of draft printing on demand, but most are quite limited. For example, most, if not all, HTML-based Web browsers will not number your printed pages well, much less give you flexible control over page headers and footers, newspaper-style multi-column layouts, complicated tables, footnotes, and so on. Because of this, going through HTML as a way to print SGML isn't very effective.
SGML authoring systems can do quite nice printing, so if you have your data in one of them, you may be in fine shape. However, right now the most sophisticated print formatting tools don't directly accept SGML (some, like PageMaker, can read some limited SGML-like tagging). If you need high-end printing and typesetting capabilities, you will need to move the SGML data into a special paper-production system.
Remember that it's much easier to convert SGML to other forms than to convert other forms to SGML. That puts you in a strong position if you have SGML. You can probably get to any typesetting system you want without too much pain. Once you do that, your data will be in a system that book production specialists already know. They don't have to adopt or learn something new, and they can focus all their attention on getting you the best-looking result (of course, "they" might be you in many cases).
Here are some of the formatting capabilities that (if you need them) could force you to move to a specialized solution:
For this level of features, you're best off moving your SGML into typesetting software such as QuarkXPress, PageMaker, TeX, or something similar. These tools are focused on doing one specific thing well, and so will do a better job at it than more general tools (which also have to devote effort to intuitive editing, search and retrieval, and so on).