

Structured Authoring

Structured authoring in SGML normally requires an SGML authoring tool. Using such a tool helps you avoid the inconsistency of document structure that makes document conversion a challenge. In SGML, you must design the structure of a document type before you can create an instance of that document type. Whether you create an instance of a document type by conversion or with an authoring tool, you’re basically filling in structural pigeonholes with specific document content.

When you use an SGML authoring tool, it’s impossible to create document content that falls outside the structural pigeonholes. Before you can enter content, you must select the correct pigeonhole from the elements specified in the DTD for that document type. This guarantees that the SGML documents you create with the tool parse properly. Figure 15.3 shows an example of a powerful SGML structured authoring tool.


Fig. 15.3  With ArborText’s ADEPT*Editor, you can insert different types of document content only at specific locations.
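To make the pigeonhole idea concrete, here is a minimal sketch of the check that a structured authoring tool performs before it lets you insert an element. The "memo" document type and its content models are hypothetical, and the sketch ignores element order and occurrence indicators, which a real DTD parser would also enforce.

```python
# Hypothetical content models for a tiny "memo" document type.
# A real tool would read these from the DTD; here they are hard-coded.
CONTENT_MODELS = {
    "memo": ["to", "from", "subject", "body"],
    "body": ["para", "list"],
    "list": ["item"],
    "item": ["para"],
}


def allowed_children(parent):
    """Return the element names that may be inserted inside `parent`."""
    return CONTENT_MODELS.get(parent, [])


def can_insert(parent, child):
    """Return True if `child` is a legal pigeonhole inside `parent`."""
    return child in allowed_children(parent)


print(allowed_children("body"))
print(can_insert("body", "item"))
```

An editor built this way would offer only the names returned by `allowed_children` when the cursor is inside a given element, which is why content outside the pigeonholes cannot be entered.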

Fortunately, some structured authoring tools have become affordable for individual authors who create documents outside an industrial organization. In the past, it was common to pay $3000 or more for such a tool. Now there are alternatives for well under $1000. That is still expensive for a home user, but it is attainable, and there are even more cost-effective alternatives. One class of tools consists of extensions to existing word processors that, in effect, turn them into SGML authoring tools. This type of tool has gained favor in HTML authoring, but SGML extensions to MS Word, for example, have been around for only a few years. Other programs convert Word or other word processing files into SGML documents. Strictly speaking, they are conversion tools, but they simulate the structured authoring found in dedicated SGML authoring tools.


• See “SGML Authoring Tools,” p. 434

• See “SoftQuad Author/Editor 3.1,” p. 469

• See “The World of Perl,” p. 491


These add-on tools turn Word and other word processors into effective SGML authoring tools. They force you to create documents only with strictly defined stylesheets, which become the basis for converting the documents into instances of an SGML DTD. You must create the stylesheets and templates according to the DTD for the document that you create. Then you must map the styles in the template to the elements in the DTD so that the conversion utility can convert the word processor output into an SGML document instance.

Document Conversion and Its Tools

Suppose you find something that you just have to have in SGML. The Internet makes this especially likely: you might find public domain documents at an FTP site that would be great to have on an SGML Web site. Likewise, you might find an article in the newspaper that you want to scan and put on your Web site. In cases like these, you must convert the documents from their original file type into SGML.


Caution:  
Before you publish copyrighted material on your Web site, be sure that you are not using that material illegally. Newspaper articles may be used only within the limitations specified in their copyright statement.

There are three types of conversion tools:

  Tools that convert word processing formats into SGML
  Tools that convert a common intermediate file type into SGML
  Tools that convert specialized file types into SGML

Word Processing Conversion Tools

Most of these tools require you to author in a word processor, such as Word, and then to save the file with an SGM extension. Add-on tools of this kind, such as Near & Far Author, are not true structured SGML authoring tools. Essentially, you create a Word file and then make it into an SGML file.


• See “WordPerfect SGML Edition,” “SGML Author for Word,” and “Near & Far Author,” pp. 434, 440, 445

One popular tool is Microsoft’s SGML Author, which is reviewed in Chapter 26, “Tools for the PC: Authoring, Viewing, and Utilities.” Essentially, you first create a DTD. Then you create a Word file template with a consistent style and relate the styles in the template to the elements in your DTD. SGML Author creates a marked-up SGML file that passes muster with most parsers.

The steps are:

1.  Do the document analysis, and create the DTD in a text editor or use an existing DTD.
2.  Create the stylesheet template in the word processing program according to the elements in the DTD.
3.  Use SGML Author to map the styles in the template to the elements in the DTD.
4.  Create your document strictly according to the stylesheet template that you just created.
5.  Save your document as an SGML document.

You now have one or more DTDs, one or more stylesheet templates, and one or more document instances that are all SGML documents. This is the basic scenario for hybrid converters using popular word processing programs.
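The steps above can be sketched in a few lines of code. This is only an illustration of the idea behind a style-to-element mapping, not how SGML Author or any other product works internally. The style names, element names, and the `convert` function are all hypothetical, and the sketch produces a flat instance with no nested structure.

```python
# Hypothetical mapping from word processor style names to DTD element names.
STYLE_TO_ELEMENT = {
    "Heading 1": "title",
    "Body Text": "para",
}


def escape_sgml(text):
    """Escape the characters that would otherwise begin markup."""
    return text.replace("&", "&amp;").replace("<", "&lt;")


def convert(paragraphs, root="doc"):
    """Turn (style, text) pairs into a flat SGML document instance."""
    lines = [f"<{root}>"]
    for style, text in paragraphs:
        if style not in STYLE_TO_ELEMENT:
            # A style with no matching element cannot be converted.
            raise ValueError(f"style has no DTD element: {style!r}")
        element = STYLE_TO_ELEMENT[style]
        lines.append(f"<{element}>{escape_sgml(text)}</{element}>")
    lines.append(f"</{root}>")
    return "\n".join(lines)


sample = [("Heading 1", "Q&A"), ("Body Text", "Use x < y.")]
print(convert(sample))
```

Because every paragraph must carry a mapped style, a document written strictly to the template converts cleanly, while a paragraph in an unmapped style stops the conversion.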

These tools usually meet the basic needs of home users or builders of small installations. By extending this scenario, you can even have many people all creating SGML documents according to accepted stylesheet templates. If you can enforce the discipline of those stylesheets, you can make this process work for a larger SGML installation.

Converting documents in this way involves risks:

  Even when you use stylesheets, it’s possible to introduce structure into the document that will not be accepted or recognized when the document is parsed.
  You might not understand the whole process of creating SGML documents without additional help.
  It’s sometimes awkward to introduce substructures, such as attribute entities, into the document from within the authoring environment itself.
  Authors do not have to remain focused on document structure, as they would with a structured authoring program; they can tinker with formatting instead of working with stylesheets and output specifications.
  These tools are sometimes weak when it comes to handling output specifications and SGML stylesheets.
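One way to reduce the first and fourth risks is to check a draft against the approved template before you convert it. The following is a rough sketch under stated assumptions: the style names are hypothetical, and each paragraph is represented as a dictionary with a `direct_formatting` flag that an export step would have to supply.

```python
# Hypothetical set of styles defined by the approved template.
APPROVED_STYLES = {"Heading 1", "Body Text"}


def lint(paragraphs):
    """Return a list of (paragraph number, problem) tuples.

    Each paragraph is a dict with a "style" name and an optional
    "direct_formatting" flag that is True when the author applied
    manual formatting on top of the style.
    """
    problems = []
    for number, para in enumerate(paragraphs, start=1):
        if para["style"] not in APPROVED_STYLES:
            problems.append((number, "unapproved style: " + para["style"]))
        if para.get("direct_formatting", False):
            problems.append((number, "manual formatting overrides the style"))
    return problems


draft = [
    {"style": "Heading 1"},
    {"style": "My Fancy Quote"},
    {"style": "Body Text", "direct_formatting": True},
]
for number, problem in lint(draft):
    print(number, problem)
```

A check like this reports every problem in one pass, so authors can fix their drafts before the conversion step rejects them.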

