

This is Smalltalk syntax. Other languages have their own syntax for creating a new class, but any object-oriented language needs the same information: the name of the new class (stored here in the variable docElementName); the existing class from which we’re deriving the new class (in this case SGMLDocElement, the abstract class we saw in the Booch-style diagram); and the attribute names, or in Smalltalk vocabulary, the “instance variable names,” which identify the particular pieces of information that we’re keeping track of for this class. For the HTML A element that identifies the beginning and end of links, these would include HREF and ID, that is, its attributes as declared in the DTD. (classVariableNames and poolDictionaries offer further options in new class creation, but the earlier example passes them empty strings because they are not used here.)
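For readers who don't use Smalltalk, the same three pieces of information (class name, parent class, attribute names) can be sketched in Python. This is only an analogue, not STSGML's code: the define_element_class function and the attribute_names field are hypothetical names invented for the illustration, and the SGMLDocElement class here is a bare stand-in for the abstract class described above.

```python
class SGMLDocElement:
    """Stand-in for the abstract parent of every element class (assumed shape)."""

    attribute_names = ()  # filled in by each generated subclass

    def __init__(self, **attributes):
        unknown = set(attributes) - set(self.attribute_names)
        if unknown:
            raise TypeError(f"unknown attributes: {sorted(unknown)}")
        # Every declared attribute becomes an instance variable;
        # attributes that were not supplied default to None.
        for name in self.attribute_names:
            setattr(self, name, attributes.get(name))


def define_element_class(doc_element_name, attribute_names):
    """Create a new subclass of SGMLDocElement at run time.

    Hypothetical helper: it takes the class name and the attribute names,
    much as they would be read from one element declaration in a DTD.
    """
    return type(
        doc_element_name,                             # name of the new class
        (SGMLDocElement,),                            # class we derive it from
        {"attribute_names": tuple(attribute_names)},  # its "instance variable names"
    )


# The HTML A element, with the two attributes mentioned in the text.
A = define_element_class("A", ["HREF", "ID"])
link = A(HREF="http://www.example.com/")
```

Here link.HREF holds the URL and link.ID is None because no ID was supplied. The point is the same as in the Smalltalk version: the class did not exist until the program created it from data.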

As STSGML reads in the DTD and defines a new class for each of its element declarations, it displays its progress to the end user with the two progress indicators shown in figures 31.4 and 31.5.


Fig. 31.4  STSGML reading in a DTD.


Fig. 31.5  STSGML defining new classes.

Once the new classes are defined, STSGML is ready to create objects of these classes, and it reads in the specified document. It actually passes the document to the SGMLS parser, which checks for errors and writes the document out in a format known as ESIS (Element Structure Information Set), which identifies element structure with nested parentheses. STSGML reads in the ESIS file and, for each document element it finds, creates a new object of the corresponding class, using syntax similar to that shown earlier for creating an object of the STSGMLWindow class. If the parentheses of the ESIS file show that one element is inside another, the outer one is defined as having the inner one as one of its components. For example, a chapter element within a book is identified as one of the parts making up that book element.
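The step from ESIS file to nested objects can be sketched roughly as follows, again in Python rather than STSGML's Smalltalk. The sketch assumes the line-oriented output of the SGMLS parser, in which a line beginning with "(" opens an element, ")" closes it, "-" carries character data, and "A" gives an attribute of the next element to open. The Element class and the parse_esis function are hypothetical names, and for brevity a single generic class stands in for the per-element classes that STSGML generates.

```python
class Element:
    """Generic document element (hypothetical; STSGML uses one class per element type)."""

    def __init__(self, name, attributes=None):
        self.name = name
        self.attributes = attributes or {}
        self.children = []  # nested Element objects and text strings, in order


def parse_esis(lines):
    """Build a tree of Element objects from ESIS lines and return the root."""
    root = None
    stack = []    # elements that are currently open, outermost first
    pending = {}  # attributes waiting for the next "(" line
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        code, rest = line[0], line[1:]
        if code == "A":
            # e.g. "AHREF CDATA intro.html" or "AID IMPLIED"
            name, _, value = rest.partition(" ")
            kind, _, data = value.partition(" ")
            pending[name] = None if kind == "IMPLIED" else data
        elif code == "(":
            element = Element(rest, pending)
            pending = {}
            if stack:
                # The enclosing element gains this one as a component.
                stack[-1].children.append(element)
            else:
                root = element
            stack.append(element)
        elif code == ")":
            stack.pop()
        elif code == "-" and stack:
            # Character data; escape sequences are left undecoded in this sketch.
            stack[-1].children.append(rest)
        # Other line types (such as the final "C") are ignored here.
    return root


sample = """\
(BOOK
(CHAPTER
AHREF CDATA intro.html
(A
-Introduction
)A
)CHAPTER
)BOOK
C
"""
book = parse_esis(sample.splitlines())
```

After this runs, book is the BOOK element, its only child is the CHAPTER element, and that chapter in turn contains the A element with its HREF attribute and its text.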

Once the new document object is defined with all of its component objects, it’s sitting in memory waiting to be manipulated like any other Smalltalk object. The Plain Text Version to File and Formatted Version to File menu choices make it possible to save this document object as either a simple text file with no tags or as a file that has formatting codes (defined using the Format menu) embedded within it. The Format menu lets you define and edit format strings for each element on the fly, and it also lets you read in a file of format strings; sample files make it possible to save TEI and HTML documents in RTF. Once the document exists as a Smalltalk object, conversion to some other arbitrary format takes less than a page and a half of code, because Smalltalk makes manipulation of its objects so easy.
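To see why such a conversion needs so little code, consider this Python sketch of the two kinds of output. It is not STSGML's implementation: the Element class, the two functions, and above all the layout of the format table (a pair of strings to emit before and after each element) are assumptions made for the illustration. The actual syntax of STSGML's format strings is not shown in this chapter.

```python
class Element:
    """Minimal document element: a name plus child elements and text strings."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def plain_text(node):
    """Return the document's text with no tags or formatting codes."""
    if isinstance(node, str):
        return node
    return "".join(plain_text(child) for child in node.children)


def formatted_text(node, formats):
    """Return the text with each element wrapped in its formatting codes.

    formats maps an element name to a (before, after) pair of strings;
    elements without an entry are written with no codes at all.
    """
    if isinstance(node, str):
        return node
    before, after = formats.get(node.name, ("", ""))
    body = "".join(formatted_text(child, formats) for child in node.children)
    return before + body + after


doc = Element("P", ["See the ", Element("EM", ["SGML"]), " standard."])

# A tiny RTF-flavored format table (hypothetical layout).
rtf_like = {
    "P": ("", "\\par "),
    "EM": ("{\\i ", "}"),
}

plain = plain_text(doc)
formatted = formatted_text(doc, rtf_like)
```

Both functions are a single recursive walk over the object tree. Changing the target format means changing only the table, which is the role the Format menu's format strings play in STSGML.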

Object-Oriented Technology and the Future of SGML Development

To simplify the discussion, this chapter has been limited to object-oriented systems that process static SGML documents, so that you don’t have to worry about state transitions and other issues raised by multi-user editing. The chapter has also not mentioned a serious limitation of the STSGML program: its inability to handle documents that can’t fit into memory all at once. A serious SGML application should handle gigabytes of data.

What can current developments in object-oriented systems offer toward solving these problems, and what other benefits might they provide?

Concurrency

Once you add the most basic features to an SGML system to make it an object-oriented system, other properties of an object-oriented system can be added as needed. Candidates for these additional attributes are the kind of thing that separates one object-oriented methodology from another. For example, Booch describes a possible concurrency attribute for classes that allows simultaneous multi-user editing. Since many large SGML systems involve simultaneous authoring and editing by multiple users, features to allow proper concurrent use of data would clearly be useful.

SGML and Object-Oriented Databases

The way object-oriented developers describe it, you turn an object-oriented system into an object-oriented database by adding one feature, or rather, one attribute: persistence. Persistence makes it possible for created objects to exist after you’ve stopped using the program, so that they’re available for you when you start up the program again.
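The idea of persistence can be illustrated with a small Python sketch in which a document object is written to disk and later rebuilt. This is a hand-rolled stand-in rather than an object-oriented database: a real OODBMS would store and retrieve objects transparently, whereas here the to_record, from_record, save_document, and load_document names are hypothetical helpers and the JSON file is just a convenient storage format.

```python
import json
import os
import tempfile


class Element:
    """Document element with attributes and ordered children (elements or text)."""

    def __init__(self, name, attributes=None, children=None):
        self.name = name
        self.attributes = attributes or {}
        self.children = children or []

    def to_record(self):
        """Convert this element and everything below it to plain data."""
        return {
            "name": self.name,
            "attributes": self.attributes,
            "children": [
                child if isinstance(child, str) else child.to_record()
                for child in self.children
            ],
        }

    @classmethod
    def from_record(cls, record):
        """Rebuild an element tree from the data produced by to_record."""
        children = [
            child if isinstance(child, str) else cls.from_record(child)
            for child in record["children"]
        ]
        return cls(record["name"], record["attributes"], children)


def save_document(root, path):
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(root.to_record(), handle)


def load_document(path):
    with open(path, encoding="utf-8") as handle:
        return Element.from_record(json.load(handle))


book = Element("BOOK", {"ID": "b1"}, [Element("TITLE", {}, ["Introduction"])])

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "book.json")
    save_document(book, path)       # one "session" stores the object
    restored = load_document(path)  # a later "session" gets it back
```

The restored object has the same structure as the original but is a separate object, as it would be after restarting the program. Everything written by hand here is what a persistence attribute in an object-oriented system is meant to take off the programmer's shoulders.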

This is another potential class property that would play an obvious role in an object-oriented SGML system, because documents and their elements clearly persist beyond the execution of the programs that create them. A lot of people use relational database managers as the engine behind SGML databases, but this isn’t a very good fit. They do so only because relational database management software is currently so much more mature than OODBMS software.

Several aspects of SGML make it hard to squeeze a typical document into tables:

  The variable size of text elements
  The sharing of subelements
  The frequent need to traverse hierarchical relationships
  The increasing use of SGML in multimedia applications

These characteristics make object-oriented database management systems a much better fit for SGML than relational systems.

The better fit of OODBMSs to SGML, along with the lack of existing implementations, indicates an area where a lot of work remains to be done. An obvious extension of the STSGML system would be to use a tool such as GemStone or Versant to add persistence to STSGML’s Smalltalk objects.

