XML: A Primer:XML and the Future: Site Architectures

Despite the lack of support for XML in the browser window, XML is still useful for a number of tasks in Internet Explorer 4.0. Microsoft has presented several demonstrations (available at http://www.microsoft.com/standards/xml/xmlparse.htm#demos) that use the XML Data Source Object and scripting to create formatted HTML documents based on XML documents. Even though the slow speed and complexity of this process limit its applicability to small projects, Microsoft has taken some first steps toward making XML available in the browser. Figure 12.2 shows one example of the kinds of pages made possible by this combination of scripting and parsing: a weather page that presents forecasts for cities chosen by the user.

Figure 12.2 XML weather forecasting, scripted into HTML.

In addition to the C++ and Java parsers, Microsoft offers an XML data source object (written in Java) that can be used with Internet Explorer’s data binding extensions to HTML. This approach is generally useful only when the XML data represents a table or other similarly structured data set, because the original implementations of data binding were targeted at relational databases and their weaker cousins, delimited text files. While the XML data source object may provide enough power for some simple applications, it lacks the flexibility of the parsers. (For more information on the Microsoft XML DSO, visit http://www.microsoft.com/standards/xml/dso/xmldso.htm.)
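To see why table-shaped XML suits a data binding approach, it may help to look at how directly such a document flattens into rows. The following is only a minimal sketch in Python, not the Microsoft XML DSO itself; the element names (forecasts, city, name, high, low) and the function xml_to_rows are hypothetical and chosen purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: these element names are illustrative only.
XML_SOURCE = """
<forecasts>
  <city><name>Ithaca</name><high>41</high><low>28</low></city>
  <city><name>Boston</name><high>45</high><low>33</low></city>
</forecasts>
"""


def xml_to_rows(xml_text):
    """Flatten each child of the root element into one row (field name -> text)."""
    root = ET.fromstring(xml_text)
    rows = []
    for record in root:
        # Each grandchild element becomes one column of the row.
        rows.append({field.tag: (field.text or "").strip() for field in record})
    return rows


rows = xml_to_rows(XML_SOURCE)
for row in rows:
    print(row)
```

A document with mixed content or deeply nested elements would not reduce to rows this cleanly, which is the limitation described above.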

Even though these transitional tools are useful, they suffer from certain limitations. First of all, even though they are written in Java, successful implementation of these tools is currently possible only with Microsoft’s Internet Explorer 4.0 because the supporting technologies (data binding and dynamic HTML) are not yet generally accepted standards. Even if Netscape comes out with similar tools, the odds are excellent that they will have similar limitations. It might be possible to use the same XML files with both browsers, but the surrounding programming that presents the documents will need to be customized for each browser.

Secondly, these tools treat XML documents in an extremely limited context. Although XML documents can now be addressed through an object model, that object model is completely distinct from the document object model the browser uses for HTML, and connecting the two models requires significant scripting effort. This scripting bridge stands in the way of using XML with style sheets (i.e., creating a wide range of viewable XML documents). It also makes using XML to build dynamic interfaces difficult because the document needs to be converted to HTML first and then manipulated again through script. The programming overhead for these translations (even if they work the same in Netscape and Microsoft products, an unlikely scenario) is still dramatic, requiring custom implementations of all but the simplest XML-to-HTML translations.
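The custom translation work described here can be sketched roughly as follows. This is an illustration in Python rather than the browser scripting the Microsoft demonstrations use, and the document vocabulary (forecasts, city, outlook, high) and the function forecast_to_html are assumptions made for the example. Note that the translation is tied to one vocabulary: a different XML document type would need its own version.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical weather document; the vocabulary is invented for this sketch.
FORECAST_XML = """<forecasts>
  <city name="Ithaca"><outlook>Snow &amp; wind</outlook><high>31</high></city>
  <city name="Boston"><outlook>Rain</outlook><high>45</high></city>
</forecasts>"""


def forecast_to_html(xml_text):
    """Translate the hypothetical forecast vocabulary into an HTML table."""
    root = ET.fromstring(xml_text)
    parts = ["<table>", "<tr><th>City</th><th>Outlook</th><th>High</th></tr>"]
    for city in root.findall("city"):
        # Every field must be pulled out and re-escaped by hand.
        name = escape(city.get("name", ""))
        outlook = escape(city.findtext("outlook", default=""))
        high = escape(city.findtext("high", default=""))
        parts.append(f"<tr><td>{name}</td><td>{outlook}</td><td>{high}</td></tr>")
    parts.append("</table>")
    return "\n".join(parts)


html_page = forecast_to_html(FORECAST_XML)
print(html_page)
```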

This translation also imposes unacceptable performance limitations on XML. Users have become accustomed to quick processing of HTML files. Loading a shell document, loading the XML document(s), parsing the document(s), and using a script to present the XML in HTML takes time, even on my 150 MHz Pentium. (On my 75 MHz Pentium, I can watch the scripts add the HTML almost step-by-step.) Although Java performance is hardly ideal, the performance of interpreted scripting languages is even poorer.

Finally, this implementation blocks the use of many of XML’s finest features, notably linking. Using HTML as the interface precludes the application of any links beyond simple in-line links, and even those must be converted so that they can bring up shell HTML files that will in turn load and process the XML. Implementing multitargeted, multidirectional links in this scenario is probably not possible, unless a processor on the server takes over the task of reading in XML documents and spitting out HTML documents that are rough equivalents. Links that have more than one target will still cause some dramatic problems.
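A server-side processor of the kind mentioned above might approximate a multitarget link by flattening it into a list of ordinary HTML anchors. The sketch below is only a rough illustration in Python: the multilink and target elements are hypothetical markup rather than XML-Linking syntax, and the shell.html?doc=... query-string convention is an assumption about how a shell page might be told which XML file to load.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical link markup, NOT actual XML-Linking syntax.
LINK_XML = """<multilink title="Weather sources">
  <target href="radar.xml" title="Radar"/>
  <target href="forecast.xml" title="Forecast"/>
</multilink>"""


def flatten_link(xml_text, shell="shell.html"):
    """Reduce one multitarget link to a list of simple in-line HTML links."""
    link = ET.fromstring(xml_text)
    items = []
    for target in link.findall("target"):
        # Each target is rewritten to point at a shell page (assumed convention).
        href = escape(f"{shell}?doc={target.get('href', '')}", quote=True)
        label = escape(target.get("title", ""))
        items.append(f'<li><a href="{href}">{label}</a></li>')
    heading = escape(link.get("title", ""))
    return f"<p>{heading}</p>\n<ul>\n" + "\n".join(items) + "\n</ul>"


flat = flatten_link(LINK_XML)
print(flat)
```

The result is only a rough equivalent: one link with two targets has become two unrelated one-way links, and any relationship between the targets is lost.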

This limited architecture may have some promise as an interim solution, however. Microsoft has already applied it to the channels shown in Chapter 9, so many users are processing XML without even realizing it. Simple applications like CDF will continue to use these parsing applications without direct presentation to the screen because they were designed to operate this way from the start. Files that contain only metadata don’t need direct presentation, and the issues discussed previously may or may not be applicable. Still, these metadata architectures are only one tiny part of XML’s potential.

XML in the Browser: Implications

Whether or not the browser undergoes the dramatic transition I suggested in the previous chapter, the advent of XML document presentation in the browser will likely change the underlying architectures of many sites. The XML syntax itself and the XML-Linking specification will drive these changes in architecture and design, although in different ways. XML syntax promises to bring a Web where content, formatting, and script are separated from each other more distinctly than they have been in HTML, while XML-Linking will distinguish itself by providing richer interfaces and a chunk model for document retrieval.

As we saw at the very beginning of the book, XML’s development has been based on a firm belief in the separation of markup and formatting information. The early XML community also continues the SGML tradition of the separation of document content from processing. These two philosophical motivations have concrete implications for the future structuring of Web sites.

The move to separate content from formatting has already gained some impetus in the HTML world from the appearance of the Cascading Style Sheets recommendation and the key role it plays in both Netscape’s and Microsoft’s implementations of dynamic HTML features. The most appealing aspect of Cascading Style Sheets to many designers is its ability to centralize style information, avoiding much of the repetitive work needed to create and update HTML pages. Style sheets are an automated version of the graphic designer’s spec book, providing a smooth path for formatting to flow into documents without constant hand tweaking. CSS’s advantages are also becoming clear to programmers: it provides a basic structure for formatting that can be applied to elements without as much concern for what kind of element the target of an operation is.
