

SGML Entity Management

Section 4.123 of the ISO 8879 SGML standard defines an entity manager as “a program (or portion of a program or a combination of programs), such as a file system or symbol table, that can maintain and provide access to multiple entities.”

Essentially, its job is to map logical references to entity references, so that an application dealing with a document in terms of its logical structure can still manipulate actual document instances. Goldfarb takes great pains to distinguish between the abstract, or logical, structure of a document and its entity structure, or the organization of its resource storage. He does this to keep the specification of logical document structure independent of any particular system. (For example, a single document might be made of multiple files, and multiple documents can be stored in a single file, but this should be irrelevant to the DTD’s specification of the document’s logical structure, especially when you consider operating systems that don’t store information in “files,” such as MVS or OS/400.) The issue of entity management is, therefore, outside the scope of a discussion of how to use a document’s logical structure to automate the creation of an object-oriented system, yet it still lurks close by.
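To make the separation between logical structure and storage concrete, here is a minimal sketch of an entity manager in Python. It is not taken from ISO 8879 or from any existing SGML tool; the class name EntityManager, its register and resolve methods, and the entity names chap1 and chap2 are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict


class EntityManager:
    """Illustrative only: resolves entity names to their stored content."""

    def __init__(self) -> None:
        # Each entity name maps to a zero-argument loader, so the storage
        # behind it could be a file, a database record, or a plain string.
        self._loaders: Dict[str, Callable[[], str]] = {}

    def register(self, name: str, loader: Callable[[], str]) -> None:
        """Declare an entity and say how its content is fetched."""
        self._loaders[name] = loader

    def resolve(self, name: str) -> str:
        """Return the content of a declared entity."""
        if name not in self._loaders:
            raise LookupError(f"undeclared entity: {name}")
        return self._loaders[name]()


# Hypothetical entities; in-memory strings stand in for real storage.
manager = EntityManager()
manager.register("chap1", lambda: "<chapter>First chapter</chapter>")
manager.register("chap2", lambda: "<chapter>Second chapter</chapter>")

# The application asks for entities by name and never sees where they live.
document = "".join(manager.resolve(name) for name in ("chap1", "chap2"))
```

Because each entity is registered with a loader rather than a file name, swapping an in-memory string for a file or a database record changes only the call to register, not the code that assembles the document.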

A good place to start research on an object-oriented approach towards the organization of system resources would be the field of operating systems, where object-oriented technology has already played a role in the development of NextStep, OS/400, and other commercially available operating systems.

Entity managers are currently not that common because so many documents have an entity structure simple enough to require very little management. As more organizations use SGML for applications such as multimedia, which require more complicated entity structures, the need for powerful entity managers will increase, and entity management promises to be a growing area of SGML software.

The Future of SGML Application Developers

Whether using an object-oriented approach or not, it’s exciting to think of the tremendous possibilities of SGML application development in the future. The vast majority of work in SGML up to this point has been the development of the most efficient possible data structures for terabytes of text and other data, and now you have all this data sitting in these efficient data structures waiting to be manipulated. Meanwhile, the possibilities of publishing with new kinds of media and interfaces continually expand.

You can complain about the lack of tools to manipulate this data, or you can start creating these tools and manipulating the data yourself. Much of the object-oriented philosophy, when put into practice, has resulted in tools that make tool development easier. Eventually, someone will come out with a self-contained, graphical user interface object-oriented database equivalent to something like Microsoft’s Access or FoxPro or Borland’s Paradox or dBase. An interface allowing such a product to read native SGML files will mean amazing opportunities for SGML application development. Developing this interface would mean working out and automating all the things that are discussed in this chapter. I’ve heard of people who are doing it, but in a very proprietary, in-house manner. I don’t know when someone’s going to do a more open systems, publicly available version, but it makes too much sense not to happen. When it happens, it’s going to be great.

From Here…

This chapter demonstrated how the many concepts and key terms common to SGML and object-oriented development can help a developer take advantage of object-oriented technology when creating applications that use SGML data. This lets you take advantage of the many benefits that the object-oriented approach offers, such as faster development time and less worry about low-level technical details. As an example, the chapter described the creation and use of STSGML, a sample program for turning an SGML document into a Smalltalk object that could easily be output as plain text, RTF, or other formatted versions.

For more information, refer to the following:

  Part II, “Document Analysis,” further describes the issues involved in document analysis.
  Chapter 28, “Other Tools and Environments,” has more information about the SGMLS parser and its updated version, NSGMLS.
  Chapter 30, “Understanding the Information Revolution: The New Paradigm,” gives further background on the Text Encoding Initiative.
  Appendix B, “Finding Sources for SGML Know-How,” lists books where you can find out more about object-oriented technology.

