The best way to convert from RTF is to use an SGML converter that can read RTF files. There are several, including three Word for Windows add-ons, EBT's Dynatag, and WordPerfect's SGML module, but they all run under Windows. You can also try an RTF-to-HTML converter, such as the public domain rtf2html, SoftQuad's HoTMetaL Pro, or ClarisWorks. These programs start the process of tagging your document. You then must convert the HTML tags to their SGML equivalents and insert any other tags that do not have HTML equivalents.
If that is not possible, try to convert the RTF yourself, using the regular expression feature of an ASCII editor or a scripting language. In either case, it is important to have RTF that is as clean and consistent as possible. The document that is to be converted from RTF should be authored with consistent styles, which should be designed to handle interparagraph spacing, indentation, and other formatting features, so that extra returns and tabs are not inserted. You can also try inline text styles (such as bold, italic, and underline) to indicate phrase-level elements. For example, you might use italic to indicate only foreign words and underline to indicate technical terms.
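As a rough illustration of the regular expression approach, the following Python sketch turns italic and underlined runs into phrase-level tags. It assumes the RTF writes each run as a simple, non-nested group such as {\i word} or {\ul word}; real RTF often toggles formatting in other ways. The element names foreign and term, and the function tag_inline, are hypothetical and stand in for whatever your DTD defines.

```python
import re

# Hypothetical mapping from RTF character-format control words
# to SGML element names; substitute the names your DTD defines.
INLINE_MAP = {"i": "foreign", "ul": "term"}

# Matches a simple, non-nested group such as {\i zeitgeist}.
# Assumption: the word processor writes inline styles this way.
GROUP_RE = re.compile(r"\{\\(i|ul) ([^{}\\]*)\}")


def tag_inline(rtf_text):
    """Replace italic and underline groups with SGML start and end tags."""
    def repl(match):
        name = INLINE_MAP[match.group(1)]
        return "<%s>%s</%s>" % (name, match.group(2), name)
    return GROUP_RE.sub(repl, rtf_text)


if __name__ == "__main__":
    print(tag_inline(r"The {\i zeitgeist} of {\ul SGML} tagging"))
```

This only works if the styles were applied consistently: an italic run used for emphasis rather than for a foreign word would be tagged incorrectly.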
If you do not have access to other software or are working with many authors, a word processing program provides a good way to get text into electronic form and on its way to being tagged in SGML.
Caution:
Not all RTF is created equal. Certain word processors that can output RTF do not preserve style information. Without styles, this system does not work.
Suppose, for example, that you are using Word. The steps are:
{\s1 \f22 \sbasedon222\snext1 FT;}
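A style definition like the one above can be read mechanically. The following Python sketch is one possible way to collect style numbers and names from such entries and then look up the style applied to a paragraph. It assumes every entry has the same shape as the example (a style number, a run of control words, then the style name and a semicolon); the function names parse_styles and style_of are illustrative only.

```python
import re

# Matches an entry shaped like {\s1 \f22 \sbasedon222\snext1 FT;}
# group 1 is the style number, group 2 is the style name.
STYLE_RE = re.compile(r"\{\\s(\d+)(?:\s*\\[a-z]+-?\d*)*\s+([^;{}]+);\}")

# Matches the \sN control word that applies a style to a paragraph.
PARA_STYLE_RE = re.compile(r"\\s(\d+)")


def parse_styles(rtf_text):
    """Return a dict mapping style numbers to style names."""
    return {int(m.group(1)): m.group(2).strip()
            for m in STYLE_RE.finditer(rtf_text)}


def style_of(paragraph, styles):
    """Return the name of the style applied to a paragraph, or None."""
    m = PARA_STYLE_RE.search(paragraph)
    return styles.get(int(m.group(1))) if m else None


if __name__ == "__main__":
    sheet = r"{\stylesheet{\s1 \f22 \sbasedon222\snext1 FT;}}"
    styles = parse_styles(sheet)
    print(styles)
    print(style_of(r"\pard\plain \s1 \f22 Some text", styles))
```

Once each paragraph can be matched to a style name, a lookup table from style names to SGML elements does the rest of the tagging.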
This kind of conversion is generally much easier than converting from another format into SGML. Both data formats are defined, and the DTD documents maintain the structure of the source document. With a valid SGML document, you rarely have to guess what a document creator might have intended by putting a list item inside a chapter heading. The best way to take advantage of the regularity of structure in an SGML document is to use an SGML parser to process it, and then perform the conversion based on the parser's output.
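To make the parser-based approach concrete, here is a minimal Python sketch that consumes parser output and reports each piece of text together with the elements that enclose it. It assumes the parser writes one event per line in the ESIS style used by nsgmls: a line starting with ( opens an element, ) closes it, and - carries character data. Other line types are ignored. The function name text_with_context and the element names in the sample are hypothetical.

```python
def text_with_context(esis_lines):
    """Pair each run of character data with the path of open elements."""
    stack = []   # names of the currently open elements, outermost first
    found = []
    for line in esis_lines:
        if not line:
            continue
        kind, rest = line[0], line[1:]
        if kind == "(":
            stack.append(rest)
        elif kind == ")":
            # A valid document closes elements in strict reverse order.
            if not stack or stack[-1] != rest:
                raise ValueError("unexpected end of element: " + rest)
            stack.pop()
        elif kind == "-":
            found.append(("/".join(stack), rest))
        # Attribute lines and other event types are skipped in this sketch.
    return found


if __name__ == "__main__":
    sample = ["(CHAPTER", "(TITLE", "-Tools", ")TITLE",
              "(PARA", "-Some text", ")PARA", ")CHAPTER", "C"]
    for path, text in text_with_context(sample):
        print(path, "->", text)
```

Because the parser has already validated the document, the conversion code can rely on the element stack instead of guessing at context.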
You can often convert SGML by using an editor rather than a specialized tool. This technique is not generally suitable for complex DTDs with deeply nested structures. In such DTDs, the output created by a tag often depends on the tags surrounding it. This kind of dependency is impossible to handle in an editor unless the relevant tags are adjacent in the file.
You can use a straight Perl or Tcl script for simple DTDs when an editor cannot do the job. To handle simple context dependencies, keep track of a global state as the script processes the document. This approach is possible as long as the context is strictly limited. Attempting to track the interaction of many global states generally leads to a programming nightmare that is hard to maintain and prone to obscure failures.
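The idea of tracking a single global state can be sketched as follows, in Python rather than Perl or Tcl. The script converts list items to plain text, and the output for an item depends on which kind of list it is in. The tags OL, UL, and ITEM are a hypothetical DTD invented for this illustration, and the sketch assumes lists are never nested.

```python
import re

# A start tag, an end tag, or a run of text between tags.
TOKEN_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>|([^<]+)")


def convert(sgml):
    """Convert flat OL/UL lists to plain text using one state variable."""
    list_kind = None   # the global state: "OL", "UL", or None
    counter = 0        # item number within the current ordered list
    out = []
    for m in TOKEN_RE.finditer(sgml):
        closing, name, text = m.group(1), m.group(2), m.group(3)
        if text is not None:
            out.append(text)
            continue
        name = name.upper()
        if name in ("OL", "UL"):
            # Entering a list sets the state; leaving it clears the state.
            list_kind = None if closing else name
            counter = 0
        elif name == "ITEM":
            if closing:
                out.append("\n")
            elif list_kind == "OL":
                counter += 1
                out.append("%d. " % counter)
            else:
                out.append("* ")
    return "".join(out)


if __name__ == "__main__":
    print(convert("<OL><ITEM>one</ITEM><ITEM>two</ITEM></OL>"
                  "<UL><ITEM>three</ITEM></UL>"))
```

A nested list would overwrite list_kind and reset the counter, which is exactly the point at which this technique stops being adequate and a parser-based approach becomes the better choice.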
This chapter examined a variety of tools and strategies that you can use for editing, viewing, and converting SGML documents on a Macintosh. While commercial offerings exist for all tasks but conversion, they can be expensive for small projects. However, there is a lot of good public domain software to fill in the gaps. If you can take advantage of a mixed environment, you can use the Mac for editing and printing and do the conversion on another platform, such as Windows or UNIX. If, however, you intend to use only the Macintosh, you can still, with a little ingenuity, put together a powerful suite of tools and complete your project successfully.