

Part V
SGML and the World Wide Web

17  How HTML Relates to SGML
18  SGML’s Emergence on the World Wide Web
19  Should You Upgrade to SGML?
20  Practicalities of Working with SGML on the Web
21  Integrating SGML and HTML Environments

Chapter 17
How HTML Relates to SGML

If you have read straight through from the beginning of the book, you can probably write this chapter yourself. It answers some basic questions, just in case you skipped a few chapters.

In this chapter, you learn:

  How HTML is related to SGML
  What SGML includes that HTML does not
  How SGML can make Web sites more flexible
  Whether SGML will make HTML obsolete

How SGML and HTML Are Related

As you have probably gathered by now, HTML is one SGML application: a single DTD written according to the SGML standard. SGML is the parent, and HTML is the child.
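One visible sign of this relationship is the document type declaration at the top of an HTML page, which names the DTD the page claims to follow. The following is a minimal sketch in Python, using only the standard library's html.parser module. The sample page and the class name DoctypeFinder are made up for illustration; the public identifier shown is the one used for HTML 2.0.

```python
from html.parser import HTMLParser


class DoctypeFinder(HTMLParser):
    """Record the document type declaration, if the document has one."""

    def __init__(self):
        super().__init__()
        self.doctype = None

    def handle_decl(self, decl):
        # Called for declarations such as <!DOCTYPE ...>;
        # decl is the text between "<!" and ">".
        if decl.upper().startswith("DOCTYPE"):
            self.doctype = decl


# A small sample page (hypothetical, for illustration only).
SAMPLE_PAGE = """<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html>
<head><title>Sample</title></head>
<body><p>Hello, Web.</p></body>
</html>"""

finder = DoctypeFinder()
finder.feed(SAMPLE_PAGE)
finder.close()
print(finder.doctype)
```

The declaration is ordinary SGML syntax. An SGML system reads it the same way it reads the declaration for any other DTD, which is what it means to say that HTML is one SGML application.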

When the founders of the World Wide Web wanted to create a markup language, they chose SGML for a number of reasons. The most important was that they needed a reliable, standard file type that would be compatible with most of the existing applications and protocols of the Internet, such as SMTP mail, FTP, Gopher, WAIS, and Usenet news. SGML offered the potential to build such a Web application.

Why SGML?

HTML is a chip off the old block, so it helps to understand why SGML fit the bill in the first place. That way, you can see what HTML inherited from SGML.

The World Wide Web is only a few years old. The European Laboratory for Particle Physics (CERN) introduced a new set of protocols in 1989. Although the protocols were originally designed to help physics research groups share their data more easily, they had far-reaching implications. Other organizations adopted the CERN protocols, including a new group called the W3 Consortium. These organizations still pool resources to keep improving the World Wide Web standards.

The designers at CERN originally required a proven standard that:

  Was compatible with existing Internet applications and protocols
  Could take an object-oriented approach to binary files, handing them off to external applications
  Could be modified easily and grow as technology improved
  Was independent of hardware, software, or any specific environment
  Offered universal access

SGML fulfills every requirement, so it was chosen to build the application that solved the problems the people at CERN were having.


• See “SGML and the ISO and CALS Standards for Data,” p. 10

SGML has a long history, so it is a proven standard. It handles diverse and challenging types of documents. Many huge corporations have selected it for their applications, and it serves them well. Governments have used SGML under challenging conditions, and it has met their needs. CALS is one prominent example.

The capability to encompass existing applications meant that a new protocol was required. It had to be a hypertext protocol because the CERN physicists knew that many kinds of data and documents would be needed. Their documents had to be highly modular. The new protocol would have to absorb SMTP, FTP, and NNTP, as well as other protocols. This is the type of application building that SGML supports. Nothing does hypertext as well as SGML.


• See “The Ways of Organizing Knowledge,” p. 500

• See “How Modular Information Drives the Information Revolution,” p. 513

• See “Object-Oriented Technology and the Future of SGML Development,” p. 550


“Object-oriented” is a buzzword today, but it is actually part of the information revolution. Older Internet applications, such as Telnet, NNTP, and FTP, are not object-oriented enough to be compatible with one another. They can’t deal with enough instances of different documents in a coherent way. SGML’s beauty is that it is generalized, standardized, and completely transportable. It does not depend on any one environment. It does not get in the way of a document, no matter how odd that document is. HTML inherited part of this capability from SGML. As HTML matures, it will probably become more flexible so that it can deal with even more types of documents.


• See “Simplifying DTD Maintenance,” p. 195

HTML, as the new markup tool, needed easy maintenance. SGML provides many easy ways to keep documents and document exchange tidy. As you learned in Chapter 11, “Using DTD Components,” there are many ways to get fancy with DTDs to help simplify maintenance. DTDs are powerful, and the applications that you build with them can be comprehensive and powerful. HTML is just such a DTD. Version 3.0 of that DTD is under review, and new extensions and features are being suggested all the time. The DTD was the perfect solution to maintaining the new Internet tool.
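To see why a DTD keeps maintenance simple, it helps to picture the DTD as a table of rules that every document is checked against. The sketch below is in Python and uses only the standard library. The content models in CONTENT_MODEL are a hypothetical, heavily simplified stand-in (they are not the real HTML DTD), and the names NestingChecker, check, and the note element are invented for this illustration.

```python
from html.parser import HTMLParser

# Hypothetical, heavily simplified content models: each element maps to the
# set of child elements it may contain. This is NOT the real HTML DTD.
CONTENT_MODEL = {
    "html": {"head", "body"},
    "head": {"title"},
    "title": set(),
    "body": {"h1", "p"},
    "h1": set(),
    "p": {"em"},
    "em": set(),
}


class NestingChecker(HTMLParser):
    """Collect elements that appear where the content model forbids them."""

    def __init__(self, model):
        super().__init__()
        self.model = model
        self.open_elements = []  # stack of currently open elements
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.model:
            self.errors.append(f"<{tag}> is not declared")
        elif self.open_elements:
            parent = self.open_elements[-1]
            if tag not in self.model.get(parent, set()):
                self.errors.append(f"<{tag}> is not allowed inside <{parent}>")
        self.open_elements.append(tag)

    def handle_endtag(self, tag):
        # Close back to the matching open element, if there is one.
        if tag in self.open_elements:
            while self.open_elements.pop() != tag:
                pass


def check(document, model=CONTENT_MODEL):
    """Return a list of rule violations found in the document."""
    checker = NestingChecker(model)
    checker.feed(document)
    checker.close()
    return checker.errors


page = "<html><body><note>Draft</note></body></html>"
print(check(page))  # the note element is not declared yet

# Extending the language means changing the rules in one place.
extended = dict(CONTENT_MODEL, note=set())
extended["body"] = CONTENT_MODEL["body"] | {"note"}
print(check(page, extended))
```

The first check reports that the note element is not declared; after the model is extended, the same page passes. Documents and checking code stay untouched, and only the declarations change, which is the maintenance advantage the DTD approach offers. A real SGML parser handles far more than this sketch does, including attributes, entities, and tag omission.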

With so many computers hooking up to the Internet and so many organizations involved in projects about how to communicate over the Internet, the new tool needed to be independent. It needed to be free from any one type of software, hardware, language, culture, or discipline. It needed to be universal, in the sense that any computer user could use it. Nothing is as broad-based and portable as SGML. HTML pages appear in many different languages, run on all kinds of hardware, and are available through any type of HTML browsing software. When full-blown SGML browsing tools become as widespread as HTML browsing tools, you will be able to view even more types of files and applications.

HTML keeps borrowing capabilities from SGML to add to its own. Vendors already support HTML extensions that have not even been officially released yet.


Note:  
Several browser companies add support to their browsers for features that are not officially a part of HTML yet. One example is Microsoft’s Internet Explorer, which adds some SGML capabilities. When the new HTML version 3.0 is accepted, Internet Explorer will already support many of those features.

This early support also helps the people who design the standard. They receive some real-life input on what works well, and it shows them what the market likes. Most browsers experienced success in the shareware market before they were offered commercially.


