

Native HTML Documents as SGML Resources

Even though HTML is not full-featured enough to satisfy everyone, it is still a powerful markup language. It’s more than adequate for the vast majority of Web authors. The World Wide Web owes its success, in large measure, to the simplicity of HTML. HTML was never intended to be the be-all and end-all. The founders of the Web knew that its simplicity would involve trade-offs.

The Bandwidth Dilemma. One important controversy concerns the gap between HTML’s original purpose and the ways it is being adapted today.

When the Web became popular in 1993, the bandwidth available over the Internet appeared to be a limitation. The people who started the Web understood the demands placed on the Internet’s resources. Their documents, therefore, didn’t occupy great amounts of bandwidth.

The old saying that one picture is worth a thousand words has received a new twist on the Internet: “One picture may be worth a thousand words, but it’s also worth 100 times the bandwidth.” The point is that pictures take vastly more resources to transmit than text. In the time that it takes a Web server to transmit a one-megabyte picture, it could transmit on the order of 100 times as much content in textual form (assuming that the picture really is worth 1,000 words). That’s why some Web “purists” complain about the heavy graphics on so many Web pages. Although the speed of a typical modem has improved since 1993, the effective speed of the servers themselves may have declined, because much more is demanded of Web servers and hardware today: heavy graphics and multimedia resources are now routinely transferred over networks.
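The comparison above can be checked with a little arithmetic. The following is a minimal sketch, not a measurement: the six bytes per word (about five letters plus a space in plain ASCII) is an assumption made for illustration, and the function name `text_equivalents` is hypothetical.

```python
# Back-of-the-envelope comparison; every constant here is an illustrative
# assumption, not a figure taken from a real server.
BYTES_PER_WORD = 6          # assumed: ~5 letters plus a space, plain ASCII
WORDS_PER_PICTURE = 1000    # "one picture is worth a thousand words"
PICTURE_BYTES = 1_000_000   # the one-megabyte picture from the text


def text_equivalents(picture_bytes: int,
                     words: int = WORDS_PER_PICTURE,
                     bytes_per_word: int = BYTES_PER_WORD) -> float:
    """Return how many `words`-long plain-text passages fit in `picture_bytes`."""
    passage_bytes = words * bytes_per_word
    return picture_bytes / passage_bytes


if __name__ == "__main__":
    ratio = text_equivalents(PICTURE_BYTES)
    print(f"One picture costs about as much as {ratio:.0f} thousand-word passages")
```

Under these assumptions the ratio comes out at roughly 167, the same order of magnitude as the hundredfold figure in the saying. A different assumed word length or picture size shifts the number, but not the conclusion.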


Note:  
Strictly speaking, bandwidth is the capacity of a network connection: the amount of data it can carry in a given time. In this discussion, the term is used more loosely for the share of that capacity that a file consumes while it is being transferred. A bigger file ties up the connection for longer, whereas a smaller file ties it up for less time. Hence, smaller files require less bandwidth than bigger files. They also require less attention from the network’s servers than do bigger files.

Bandwidth is measured in bytes or kilobytes per second (or per minute). The longer a server must spend on a single file, the fewer files it can handle for other network clients. A network whose transfers consume more bandwidth on average is therefore less efficient, as measured by the number of network clients served, than one whose transfers consume less.
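To make the relationship between file size and clients served concrete, here is a small sketch. It assumes a single link that serves one file at a time and ignores protocol overhead; the function names and the 3,600 bytes-per-second rate (a 28.8 kbit/s modem) are illustrative assumptions.

```python
def transfer_seconds(file_bytes: int, link_bytes_per_second: float) -> float:
    """Time to send one file if it has the whole link to itself."""
    if file_bytes <= 0 or link_bytes_per_second <= 0:
        raise ValueError("file size and link rate must be positive")
    return file_bytes / link_bytes_per_second


def files_per_minute(file_bytes: int, link_bytes_per_second: float) -> float:
    """Upper bound on how many files of this size the link carries in 60 s."""
    return 60.0 / transfer_seconds(file_bytes, link_bytes_per_second)


if __name__ == "__main__":
    MODEM_RATE = 3600.0  # assumed: 28.8 kbit/s expressed in bytes per second
    for label, size in (("text page", 6_000), ("picture", 1_000_000)):
        print(label,
              f"{transfer_seconds(size, MODEM_RATE):.1f} s each,",
              f"{files_per_minute(size, MODEM_RATE):.2f} per minute")
```

The larger the file, the fewer requests the same link can satisfy in a minute, which is the sense in which bigger files make a network less efficient.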


Loading up documents with multimedia content in proprietary and specialized formats defeats the purpose of SGML, whose goal is to enable document interchange among people irrespective of their software or hardware. The fault does not lie with SGML. The problem is that people have not agreed on a single common standard for handling graphics. In fact, there are too many standards, most of which are not readily available on all hardware platforms. For example, GIF (the proprietary Graphics Interchange Format popularized by CompuServe) works well as a standard graphics file format for PC users, but it is of little use to someone on a SPARCstation who doesn’t have a UNIX-compatible viewer that handles GIF. Incompatibility is a bigger problem on the Web than many people imagine.

Standardization versus Innovation. Another hotbed of controversy is the trade-off between standardization and innovation. The question is this: if everyone follows standards religiously, how can anyone ever introduce anything new and wonderful into the mix by way of innovation and experimentation?

Consider Netscape, for example. This powerful Web browser supports extensions that are not a legal part of HTML, which enables creators of Web documents to add special features that Netscape can exploit. This strategy has made Netscape an extremely successful provider of Web software and server resources. Many Web pages now have a statement at the bottom that reads “Powered by Netscape.” The special non-HTML features, such as the <BLINK> element, usually don’t bother non-Netscape browsers, which simply ignore tags they don’t recognize, but only Netscape can interpret them. When you use Netscape, the documents look attractive; in other browsers, the special features are lost.


Tip:  
Even though adding special non-HTML enhancements to your documents means they are no longer valid HTML document instances, and therefore no longer valid SGML, you can sometimes still get them to parse. People often go to great lengths to make semi-HTML documents parse as valid SGML document instances. The best policy is to stick with valid HTML. It’s a habit that will keep you safe and on good terms with Webmasters and Web clients.

Because these documents are customized for Netscape, they are no longer valid SGML document instances. They do not parse as instances of the HTML DTD. The result is a large body of documents on the Web that are not valid SGML documents. They sacrifice transportability and universality for innovative features.
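One rough way to spot such extensions is to scan a document for element names that the standard doesn’t define. The sketch below is not SGML validation: it uses Python’s standard `html.parser` module rather than an SGML parser, and `KNOWN_ELEMENTS` is an illustrative allow-list modeled on HTML 2.0 element names, not the DTD itself. The names `ExtensionFinder` and `find_extensions` are hypothetical.

```python
from html.parser import HTMLParser

# Illustrative allow-list modeled on HTML 2.0 element names. It is an
# assumption for this sketch and no substitute for parsing against the DTD.
KNOWN_ELEMENTS = frozenset({
    "html", "head", "title", "base", "isindex", "link", "meta", "nextid",
    "body", "h1", "h2", "h3", "h4", "h5", "h6", "p", "pre", "blockquote",
    "address", "ul", "ol", "li", "dl", "dt", "dd", "dir", "menu", "a",
    "img", "br", "hr", "em", "strong", "code", "samp", "kbd", "var",
    "cite", "tt", "b", "i", "form", "input", "select", "option",
    "textarea", "listing", "xmp", "plaintext",
})


class ExtensionFinder(HTMLParser):
    """Collect start tags whose names are not in KNOWN_ELEMENTS."""

    def __init__(self) -> None:
        super().__init__()
        self.unknown = []  # list of (tag name, line number) pairs

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <BLINK> arrives as "blink".
        if tag not in KNOWN_ELEMENTS:
            line, _column = self.getpos()
            self.unknown.append((tag, line))


def find_extensions(markup: str) -> list:
    """Return (tag, line) pairs for every unrecognized start tag."""
    finder = ExtensionFinder()
    finder.feed(markup)
    finder.close()
    return finder.unknown


if __name__ == "__main__":
    sample = "<HTML><BODY>\n<P>Hi <BLINK>new</BLINK></P>\n</BODY></HTML>"
    for tag, line in find_extensions(sample):
        print(f"line {line}: <{tag}> is not in the allow-list")
```

A check like this only flags unfamiliar element names. It says nothing about whether the recognized elements are nested or ordered as the DTD requires, which is what a validating SGML parser would verify.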

No one has the best answer to the dilemma of balancing innovation and standardization. Throughout the history of the computer age, sharing has been sacrificed whenever the marketplace has been flooded with proprietary technology. People and companies that can’t upgrade to that technology are left out in the cold because their hardware or software is no longer compatible. This approach promises profits to the innovators of new technology, but it causes incompatibility problems for people unwilling or unable to purchase the latest technology. Both standardization and cooperative innovation are necessary for technology to advance. Competition must be tempered by cooperation.

Keep in mind that your creative decisions affect not only yourself, but also every Web server that must transmit your documents and every Web client whose users click the appropriate link. The world is interconnected. When you make a document public, you must keep it standardized and accessible. If you customize a document to make it more innovative and creative, it’s no longer fully public.

