

Object Documents

XML’s nested structure bears a strong resemblance to the hierarchies of data that appear in object-oriented programming’s data structures. XML and object-oriented programming are a good fit, because both systems can store datasets within datasets within datasets. Storing the information contained in object structures has often been difficult, because the linear and tabular file types most commonly used for documents are a poor fit for this kind of hierarchical structure. This section of the chapter won’t produce any specific DTDs, because they will vary radically from program to program. Instead, we’ll examine general principles and a simple example that may help developers plan XML file types that mirror data structures.

Our example is an object that contains several subobjects: the Appearance object of a very simple program. This object and its properties are loaded when the program starts and saved when the program finishes, more or less like a preferences file. It keeps track of settings for menus, toolbars, and the main window’s height and width. Users who leave the program can reasonably expect to come back and find things looking much as they did when they left. As Figure 9.1 shows, the Appearance object is really a container for other objects, which themselves contain additional objects and properties.


Figure 9.1  The Appearance object and its components.

The code for these objects is written into the main program, and our XML need not concern itself with any of the methods. XML can, however, provide a means for storing the property information that reflects the hierarchical structure of the objects. Implementing this structure will require adding some methods to each of the objects that can connect to a parser to read in their properties when the program starts and write out their properties as XML when the program ends.

There are several ways to structure the XML. All objects could be represented by elements, their immediate properties could be stored as attributes, and the objects they contained stored as subelements. This would produce XML that looks like

  <APPEARANCE>
  <!--Other elements -->
  <TOOLBARS>
  <TOOLBAR NAME="MAIN" VISIBLE="Yes" LOCATION="1"/>
  <TOOLBAR NAME="CONNECTOR" VISIBLE="No" LOCATION="2"/>
  </TOOLBARS>
  <WINDOW HEIGHT="450" WIDTH="400"/>
  </APPEARANCE>
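
The attribute-based layout can be sketched in Python with the standard library's xml.etree.ElementTree module. This is only an illustration under stated assumptions: the class names (Toolbar, Window, Appearance) and the method names (to_element, from_element) are hypothetical stand-ins for the program's own objects, and the menu settings are left out. Each object writes its immediate properties as attributes and appends the objects it contains as subelements.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Toolbar:  # hypothetical class, for illustration only
    name: str
    visible: bool
    location: int

    def to_element(self) -> ET.Element:
        # Immediate properties become attributes; XML stores them as strings.
        return ET.Element("TOOLBAR", {
            "NAME": self.name,
            "VISIBLE": "Yes" if self.visible else "No",
            "LOCATION": str(self.location),
        })

    @classmethod
    def from_element(cls, elem: ET.Element) -> "Toolbar":
        return cls(elem.get("NAME"), elem.get("VISIBLE") == "Yes",
                   int(elem.get("LOCATION")))


@dataclass
class Window:  # hypothetical class, for illustration only
    height: int
    width: int

    def to_element(self) -> ET.Element:
        return ET.Element("WINDOW", {"HEIGHT": str(self.height),
                                     "WIDTH": str(self.width)})

    @classmethod
    def from_element(cls, elem: ET.Element) -> "Window":
        return cls(int(elem.get("HEIGHT")), int(elem.get("WIDTH")))


@dataclass
class Appearance:  # container object: holds other objects, not plain values
    toolbars: list
    window: Window

    def to_element(self) -> ET.Element:
        root = ET.Element("APPEARANCE")
        bars = ET.SubElement(root, "TOOLBARS")
        for bar in self.toolbars:
            bars.append(bar.to_element())  # contained objects become subelements
        root.append(self.window.to_element())
        return root

    @classmethod
    def from_element(cls, root: ET.Element) -> "Appearance":
        bars = [Toolbar.from_element(e) for e in root.findall("TOOLBARS/TOOLBAR")]
        return cls(bars, Window.from_element(root.find("WINDOW")))


# Save on exit, reload on startup.
appearance = Appearance(
    [Toolbar("MAIN", True, 1), Toolbar("CONNECTOR", False, 2)],
    Window(450, 400),
)
xml_text = ET.tostring(appearance.to_element(), encoding="unicode")
restored = Appearance.from_element(ET.fromstring(xml_text))
```

Because every value passes through a string on the way out, each from_element method has to convert it back (int, or a Yes/No comparison) on the way in.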

Alternatively, we could give each property its own element instead of storing it as an attribute. In this way, all properties would be treated the same way, regardless of whether the contents of those properties are values or other objects. The preceding code would now look like

<APPEARANCE>
<!--Other elements -->
<TOOLBARS>
<TOOLBAR><NAME>MAIN</NAME><VISIBLE>Yes</VISIBLE><LOCATION>1</LOCATION></TOOLBAR>
<TOOLBAR><NAME>CONNECTOR</NAME><VISIBLE>No</VISIBLE><LOCATION>2</LOCATION></TOOLBAR>
</TOOLBARS>
<WINDOW> <HEIGHT>450</HEIGHT>   <WIDTH>400</WIDTH></WINDOW>
</APPEARANCE>
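
One appeal of the element-only layout is that a single routine can write any object, since values and contained objects are handled the same way. The following Python sketch shows the idea with xml.etree.ElementTree. The to_element function and the dictionary layout are assumptions made for this example, not part of any standard library for this task.

```python
import xml.etree.ElementTree as ET


def to_element(tag, value):
    """Build one element per property (hypothetical helper for illustration)."""
    elem = ET.Element(tag)
    if isinstance(value, dict):
        # A contained object: each of its properties gets its own child element.
        for name, child in value.items():
            elem.append(to_element(name, child))
    elif isinstance(value, list):
        # A collection of objects, given as (tag, value) pairs.
        for item_tag, item in value:
            elem.append(to_element(item_tag, item))
    else:
        # A plain value is stored as the element's text.
        elem.text = str(value)
    return elem


appearance = {
    "TOOLBARS": [
        ("TOOLBAR", {"NAME": "MAIN", "VISIBLE": "Yes", "LOCATION": 1}),
        ("TOOLBAR", {"NAME": "CONNECTOR", "VISIBLE": "No", "LOCATION": 2}),
    ],
    "WINDOW": {"HEIGHT": 450, "WIDTH": 400},
}

root = to_element("APPEARANCE", appearance)
xml_text = ET.tostring(root, encoding="unicode")
```

The price of this uniformity is the extra markup: every property costs an opening and a closing tag.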

Programmers’ preferences will probably rule in these situations, at least until standard libraries arise to handle this coding. Both approaches will find their defenders. (I lean toward avoiding attributes, even though the element-only approach tends to generate more verbose markup. Element-only documents are more readable, but less efficient.)

Most programmers probably will not go to the trouble of validating these files. Even though a DTD might be useful for documenting the structures used by this code, and perhaps useful for debugging, a well-written parser/output program should be able to create well-formed code by itself. Another significant issue that gets in the way of creating a DTD for these files is that the sequence in which the objects write out their properties may vary from time to time, producing documents that are perfectly acceptable to the program using them but unacceptable to a strictly written DTD.
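
The ordering problem is easy to sidestep in the reading code itself. In this Python sketch (the function names read_window and is_well_formed are hypothetical), properties are looked up by element name rather than by position, so two documents that differ only in the order of their properties load identically. A DTD content model such as (HEIGHT, WIDTH) would accept only the first of them.

```python
import xml.etree.ElementTree as ET


def read_window(xml_text):
    """Return (height, width), finding each property by name, not position."""
    window = ET.fromstring(xml_text)  # raises ET.ParseError if not well-formed
    return int(window.findtext("HEIGHT")), int(window.findtext("WIDTH"))


def is_well_formed(xml_text):
    """Check well-formedness only; no DTD or validation is involved."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


height_first = "<WINDOW><HEIGHT>450</HEIGHT><WIDTH>400</WIDTH></WINDOW>"
width_first = "<WINDOW><WIDTH>400</WIDTH><HEIGHT>450</HEIGHT></WINDOW>"

# Both orderings produce the same settings.
same_result = read_window(height_first) == read_window(width_first)
```

A well-formedness check of this kind still catches the most common output bugs, such as a missing closing tag, without requiring a DTD.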

Metastructures—Emerging Standards Using XML

Even before XML has been finalized, a number of proposed standards that would use it have appeared. Microsoft’s Channel Definition Format (CDF), Netscape’s Meta Content Framework (MCF), Marimba and Microsoft’s Open Software Description Format (OSD), and webMethods’ Web Interface Definition Language (WIDL) are among the earliest proposals. The W3C is also developing a standard framework that can include many of these systems: the Resource Description Framework (RDF). Apart from a love of acronyms, these proposals share an interest in making the Web a more automated place.

Channel Definition Format

CDF is the first XML-based standard to receive anything resembling widespread use. Microsoft submitted the proposal to the W3C in March 1997, but it looks like CDF will probably remain a Microsoft-only standard. CDF provides a standard set of tags for defining push content channels. Channels automate the flow of data from Web server to Web browser, providing the browser with a schedule for downloading new content from the channel’s server and labeling that content with a button and some brief descriptions. CDF is based on a DTD; a channel file written to that DTD contains information pointing the browser to the source of the content, descriptive information (e.g., author, logo, abstract, and copyright), and a schedule for regular downloads. When users want to visit the channel (using the button bar shown in Figure 9.2), the information is already loaded for them, which avoids waiting for downloads and makes it easy to reference Web information offline.


Figure 9.2  The channel bar in action in Internet Explorer 4.0.

Channel content is still (as of Internet Explorer 4.0) in HTML, not XML. XML just provides a framework that allows the browser to find and describe the content. The schedule can have odd effects on computers that use dial-up connections; since most schedules are designed to download data at off-peak times (midnight to 4 A.M.), Internet Explorer 4.0 users may wake up in the middle of the night to the cheerful sound of their modem dialing out to their Internet Service Provider.
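
To make the division of labor concrete, here is a rough Python sketch that reads a small channel file with xml.etree.ElementTree and pulls out the download window and the HTML pages it points to. The element and attribute names (CHANNEL, TITLE, ABSTRACT, SCHEDULE, INTERVALTIME, EARLIESTTIME, LATESTTIME, ITEM, HREF) are given as I understand them from the CDF submission and should be checked against that note; the example.com URLs and the channel itself are invented for illustration.

```python
import xml.etree.ElementTree as ET

# A hypothetical channel file; the URLs are placeholders.
CDF_TEXT = """
<CHANNEL HREF="http://www.example.com/channel/index.html">
  <TITLE>Example Channel</TITLE>
  <ABSTRACT>A made-up channel used only for illustration.</ABSTRACT>
  <SCHEDULE>
    <INTERVALTIME DAY="1"/>
    <EARLIESTTIME HOUR="0"/>
    <LATESTTIME HOUR="4"/>
  </SCHEDULE>
  <ITEM HREF="http://www.example.com/channel/news.html">
    <TITLE>News</TITLE>
  </ITEM>
</CHANNEL>
""".strip()

channel = ET.fromstring(CDF_TEXT)

# The schedule tells the browser when it may fetch new content.
schedule = channel.find("SCHEDULE")
earliest_hour = int(schedule.find("EARLIESTTIME").get("HOUR"))
latest_hour = int(schedule.find("LATESTTIME").get("HOUR"))

# The items are pointers to ordinary HTML pages, not XML content.
title = channel.findtext("TITLE")
pages = [item.get("HREF") for item in channel.findall("ITEM")]
```

Nothing in the file carries the content itself: the XML only describes the channel, says when to download, and lists where the HTML lives.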

CDF information is available from several sources. The submission to the W3C, which includes a full description of the DTD, is at http://www.w3.org/TR/NOTE-CDFsubmit.html. Microsoft has white papers and other information available through its Site Builder (http://www.microsoft.com/sitebuilder/) and Internet Explorer (http://www.microsoft.com/ie) Web sites.

