Manual markup is fine if you don't have many documents and don't suffer from time constraints or boredom. Manually inserting tags into a long document with a complex structure can get tedious. This approach, however, offers the greatest flexibility for markup.
SGML files are simple ASCII text files with an *.SGML or *.SGM file extension. You can view them with any text viewer, but an SGML processing system is required to interpret the tags and build the document those tags define.
Tip:
Use a simple text editor that saves the file without formatting. If you save the file in a proprietary format, such as a WordPerfect or Word file, you limit yourself. Eventually, you will have to resave the file as an SGML file anyway. In other words, don't use a tool that embeds special codes.
If flexibility is a high priority for you and you don't have many complex documents, manual markup is an option. The useful thing about inserting tags manually is that you can put them anywhere. That does not mean that your document instance will parse, but you can run a parser separately to check it. You also do not have to buy expensive SGML software.
There are trade-offs, of course. Because you can put tags wherever you want, you can make more errors than you would with an SGML structured authoring tool. For example, if your DTD calls for a <FIGURE> element to be used only within a <PARA> element, the structured authoring tool will not let you stick it in a <HEAD1> element. A manual text editor doesn't care where you put any tags. Although the structured authoring tool can be frustrating when you author a document, it ensures that the document will parse according to your DTD. A manual text editor gives no such assurance.
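The kind of check a structured authoring tool performs can be approximated for hand-tagged text. The following is a minimal sketch, not a substitute for a validating SGML parser: the ALLOWED_PARENTS table is a hypothetical stand-in for the containment rules of a DTD, and the code assumes every element carries explicit start and end tags (no tag omission).

```python
import re

# Hypothetical containment rules standing in for a DTD: each listed element
# maps to the set of parent elements it may appear in.
ALLOWED_PARENTS = {
    "FIGURE": {"PARA"},
}

# Matches a start tag or an end tag and captures the element name.
TAG_PATTERN = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>")


def find_misplaced_elements(text, allowed_parents=ALLOWED_PARENTS):
    """Return (element, parent) pairs that break the containment rules.

    Assumes explicit start and end tags throughout, so a simple stack
    is enough to track the current parent element.
    """
    stack = []
    problems = []
    for match in TAG_PATTERN.finditer(text):
        closing, name = match.group(1), match.group(2).upper()
        if closing:
            # Leave the element only if it is the one currently open.
            if stack and stack[-1] == name:
                stack.pop()
            continue
        parent = stack[-1] if stack else None
        if name in allowed_parents and parent not in allowed_parents[name]:
            problems.append((name, parent))
        stack.append(name)
    return problems


good = "<PARA>See the diagram. <FIGURE>pump.gif</FIGURE></PARA>"
bad = "<HEAD1>Overview <FIGURE>pump.gif</FIGURE></HEAD1>"
print(find_misplaced_elements(good))  # no problems reported
print(find_misplaced_elements(bad))   # FIGURE reported inside HEAD1
```

A check like this catches only the rules you list by hand. A real parser reads the rules from the DTD itself, which is why parsing the finished document remains necessary.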
The flexibility that comes from manual markup is not always good. Because it's easy to mark up small HTML documents manually with non-SGML element structures, you can run into a variety of problems. Certain browsers, for example, support non-standard extensions to the HTML DTD. These extensions cause problems on the World Wide Web, such as:
Note:
The short-circuit referred to above happens because some browser developers want their product to be the most popular, so they make theirs better than the standard browser by anticipating improvements to the HTML DTD. But by supporting non-standard extensions, they encourage their customers to develop Web pages that not everyone can read. Their customers are sharing attractive graphics, but those graphics can be viewed by fewer people because they're non-standard. A better approach is to extend the standard, even though this does take time.
When you use a text editor to insert tags manually, you open the door to this haphazard approach to document creation. You should always remember to validate your documents.
SGML, as the international standard, enables you to load DTDs as needed without violating the HTML 2.0 standard. Therefore, you can use the Netscape extensions, as well as many others, without compromising the HTML DTD. Until Panorama is as popular as Netscape, however, you must be disciplined when you mark up an HTML document manually. Figure 15.1 shows a non-standard <BLINK> element.
Fig. 15.1 You can create this <BLINK> element using a text editor, even though it does not parse.
There are many kinds of document conversion tools. This approach to creating SGML documents is useful when you want to upgrade HTML documents to other types of DTDs. Automatic tagging approaches such as this one enable you to convert documents to a neutral file type first, and then to SGML.
See Avalanche/Interleaf: FastTAG, p. 495
Suppose, for example, that you have many documents in a proprietary word processing format called WordWiz. No SGML conversion tool exists for that file format. You must first convert the documents to a neutral markup scheme, such as RTF, and then convert them to SGML. Figure 15.2 shows an example of this chapter converted into RTF and then into HTML.
Fig. 15.2 The program RTF2HTML converts RTF documents into HTML documents.
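One way to picture the second step, from a neutral format to SGML, is as a mapping from paragraph styles to elements. The sketch below is an illustration under stated assumptions, not how RTF2HTML or any other named tool works: the style names, the element names, and the (style, text) input format are all hypothetical, and the escaping assumes the target DTD declares the amp, lt, and gt entities, as the HTML DTD does.

```python
# Hypothetical translation scheme: paragraph style name -> SGML element name.
STYLE_TO_ELEMENT = {
    "Heading 1": "HEAD1",
    "Normal": "PARA",
}


def escape_sgml(text):
    """Escape characters that would otherwise be read as markup.

    Assumes the DTD declares the amp, lt, and gt entities.
    """
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def to_sgml(paragraphs, scheme=STYLE_TO_ELEMENT):
    """Wrap each (style, text) pair in the element its style maps to.

    Raises KeyError for a style the scheme does not cover, so gaps in
    the translation scheme surface immediately instead of silently.
    """
    lines = []
    for style, text in paragraphs:
        element = scheme[style]
        lines.append("<%s>%s</%s>" % (element, escape_sgml(text), element))
    return "\n".join(lines)


paragraphs = [
    ("Heading 1", "Pump Maintenance"),
    ("Normal", "Check seals & valves."),
]
print(to_sgml(paragraphs))
```

Failing on an unknown style is a deliberate choice here: a paragraph the scheme cannot place is a sign that the source documents are not yet consistent enough to convert automatically.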
Before you seriously consider automated document conversion, you must have a consistent document structure. You'll need to have a specific translation scheme for each type of document. To create any SGML document type, you need document analysis. Suppose, for example, that some of your memos have a return address paragraph and others do not. Your DTD for the memo document type calls for a return address. To convert your memos into SGML, you must first add a return address to every instance that does not already have one. This is because when the SGML parser goes to each instance of a memo, it expects to see the return address as specified in the DTD. If it does not find one, the document fails to parse.
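The memo example can be sketched as a quick pre-check that lists which instances would fail for lack of a required element. This is only a rough stand-in for what the parser reports: the RETURNADDR element name and the sample memos are hypothetical, and the check looks for a start tag without validating anything else.

```python
import re

# Hypothetical element name; a real memo DTD would define its own.
REQUIRED_ELEMENT = "RETURNADDR"


def memos_missing_element(memos, element=REQUIRED_ELEMENT):
    """Return the names of memos that contain no start tag for `element`."""
    start_tag = re.compile(r"<%s[\s>]" % re.escape(element), re.IGNORECASE)
    return [name for name, text in memos.items() if not start_tag.search(text)]


memos = {
    "memo-a": "<MEMO><RETURNADDR>12 Main St.</RETURNADDR><BODY>Hello</BODY></MEMO>",
    "memo-b": "<MEMO><BODY>No address here</BODY></MEMO>",
}
print(memos_missing_element(memos))  # lists memo-b only
```

Running a check like this across the legacy set before conversion shows how many instances need a return address added by hand.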
In short, document analysis is crucial, even when documents are already created. You must look at your legacy documents and go through the document analysis steps discussed in Part II, Document Analysis. These steps are:
See Document Analysis, p. 97
You must review these steps for document conversion to SGML. You probably need to convert documents gathered from elsewhere in your SGML environment as well.