Table of Contents


Introduction

Within the last ten-year period, global superpowers have dissolved, wars have started, ended, and started again, entire national economies have almost disappeared, balances of power have shifted, and earthquakes, famines, and floods have changed the face of the planet. During the same period of time, SGML has been the international standard for document interchange. While the physical and political tumult above cannot rightly be ascribed to SGML, some of the information revolution can be.

SGML, hypertext, and object-oriented technology have all been in the eye of the information hurricane in the last ten years. While you still see the occasional doctor’s office whose receptionist uses a DOS program that doesn’t support a mouse, most machines today use the “point and click” type of interface that makes computers friendly and desirable enough to pay $2000 or more for. They play music and movies from CD-ROMs and they even show TV programs and play the radio. But computers are learning to communicate better not only with people, but also with one another.

SGML is now in the eye of the storm because, recently, computers have begun talking together on a wide scale. The explosion of the Internet is the first massive venture into electronic document sharing by people all over the planet. SGML is what enables all these diverse people, with their changing languages, merging cultures, and exploding information revolutions, to communicate via computers that actually understand one another despite all the upheaval. When combined with hypertext and the familiar point and click computer interfaces, SGML allows document sharing on the scale of the World Wide Web, electronic tax return filing for a nation, and perhaps even document sharing between NASA space shuttles and ground control.

Ironically, up until now SGML had remained largely unknown to the public. The public had seen the talking computers and the virtual reality games that take you head-to-head against mortal enemies in a blood-curdling fight for survival, but most people still thought SGML was some sort of medical disease like HIV, or perhaps a rock band. Now that HTML has introduced the idea of document interchange to the world and shown that it’s possible and pleasurable, SGML has finally attracted some attention. Finally, the secret life of SGML is being made public.

SGML: The Mysterious Secret

Over the years, SGML has acquired the aura of hidden magic and forbidden mystery; only the “initiated” could participate in its rites. While it’s hard to say how this thought got started, it’s easy to clear the fog. SGML is not a mystery. It’s a meta-language. Nothing more.


Note:  
“Meta” means “later in time” or “having a higher stage of development.” In one sense, it means “behind” in the sense of “the force behind the scenes.” So a meta-language is a language used to create other languages.

SGML is a meta-language because you use it to create and develop other languages—in this case, other markup languages.

A markup language is a tagging system that lets you maintain a document’s structure so it can be dismantled and moved electronically, and then reassembled in a new location. Because many documents require far more than letters and numbers to express their content, this is a large task. But that’s a job for markup languages.

SGML is a “super” markup language. It’s the mother of all markup languages. It’s the parent of markup languages like the ones that the IRS and even NASA use. It’s also the parent of HTML.
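
To make the idea of a meta-language a little more concrete, here is a minimal sketch of SGML being used to define a tiny markup language and then a document written in that language. The "memo" document type and its element names (memo, to, from, body, para) are invented purely for illustration; they are assumptions, not part of any standard or of any application discussed in this book.

```sgml
<!DOCTYPE memo [
  <!-- Hypothetical memo language, declared with SGML element declarations -->
  <!ELEMENT memo - - (to, from, body)>
  <!ELEMENT to   - - (#PCDATA)>
  <!ELEMENT from - - (#PCDATA)>
  <!ELEMENT body - - (para+)>
  <!ELEMENT para - - (#PCDATA)>
]>
<memo>
<to>Production staff</to>
<from>Editorial</from>
<body>
<para>The structure travels with the text.</para>
<para>Each tag names a part of the memo, not its appearance.</para>
</body>
</memo>
```

The declarations between the square brackets are the new language; everything after them is a document marked up in it. A different set of declarations would give you a different markup language, which is all that "meta-language" really means here.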


Perhaps SGML’s popularity within the Department of Defense and academia has lent a stiffness and formality to the early writings about the subject. Perhaps there have just been few resources for anyone without a Ph.D. and a tolerance for turgid prose to learn SGML. And perhaps, too, the possibilities for using SGML in the real world have been a little remote to imagine until enough people actually went out and did something useful with it.

SGML was adopted by the International Organization for Standardization in 1986 (ISO 8879), and there have been lots of successful projects with it since then. Many of them affect how you live your life today. Ever hear of the World Wide Web? HTML is a chip off the ol’ block of SGML. But there are plenty of other real-world successes, too: the Text Encoding Initiative, the Davenport Group, the Oxford Text Initiative, SGML Open, and the list goes on and on. But unless you’ve been reading scholarly publications, you’ve probably never heard about these accomplishments.

The fact is that SGML is and has been a silent force behind the scenes. It may not be clear how electricity works, but that doesn’t stop you from reaching for the light switch. SGML has been just as vital to publishing as electricity has been to the lay person. Many thousands of public documents are routinely stored in SGML for instant access or retrieval. Many CD-ROM publishers depend on SGML to organize their content and make it usable. Most of the text you read in newspapers resides electronically in SGML before, and long after, you read it in hard copy. Many of the resources your banker, accountant, or specialist of whatever kind uses to advise you in your daily affairs probably exist, at least partially, in SGML. Just as computers were suddenly “all around us” ten years ago, so SGML is too. Only you don’t see it; you just see the benefit, much as with electricity.

Documents and Their Objects

Much of what makes this convenience possible is structure, something you might not have thought much about. But as you learn about SGML, rest assured that will change. As you study document analysis, you’ll soon never be able to look at a document the same way again. Unconsciously, you’ll be asking yourself, “How would I define this phone book as a document type?” and, “What would you call an element for tagging footnotes?” Small features in text that you overlooked before, like superscripts or circumflexes, will suddenly fascinate you. This is because you’re finally seeing what SGML people have seen for a long time: structural objects of a document.

There will be a lot of realizations that come with experience, like how much structure is enough in a document, and other realizations that come more quickly, like simpler is probably better for now. Through it all, you’ll be dealing with structural components of documents (objects) and distinguishing them from the content of a document. The realization I would like to leave with you in this introduction is that structure, content, and format are actually separate concerns that you’ll need to tell apart.

When you prepare documents to move around the planet with the click of a mouse, you need to be clear about the difference between “the nut and the shell” so to speak. Document content is like the nut and format is like the shell, but without structure you couldn’t recognize it as a document at all…or a nut! What lets you tell a walnut from a cashew is actually the structure of the content within the shell. So it is with an SGML document. The content is what you read, the structure is how it lets itself be recognized, and the format is how it looks while you’re reading it. When you pay attention to the structure, the document can get to the other side of the planet and be recognizable with the click of a mouse.
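
As a rough illustration of that three-way split, here is one way you might answer the phone book question posed earlier in this section. The "phonebook" document type, its element names, and the sample listings are all made up for this sketch; treat it as one possible answer under those assumptions, not a recommended design.

```sgml
<!DOCTYPE phonebook [
  <!-- Structure: a hypothetical document type, invented for illustration -->
  <!ELEMENT phonebook - - (listing+)>
  <!ELEMENT listing   - - (name, phone)>
  <!ELEMENT name      - - (#PCDATA)>
  <!ELEMENT phone     - - (#PCDATA)>
]>
<phonebook>
<listing><name>Ada Example</name><phone>555-0100</phone></listing>
<listing><name>Ben Sample</name><phone>555-0101</phone></listing>
</phonebook>
```

The names and numbers are the content. The declarations and tags are the structure: they say that a phone book is a series of listings and that each listing has a name followed by a phone number. Notice what is missing: nothing here mentions typefaces, columns, or page size. That is the format, and it is left to whatever system eventually prints or displays the listings.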

What This Book Is

This book is intended to be a comprehensive reference for developing practical and effective SGML applications. Following its instructions, you should be able to build an application as robust and useful as HTML. While this book does not intend to promote Windows-based machines over other machines, you’ll see many screen shots of Windows, relatively few Macintosh displays, and no UNIX displays. That’s simply a reflection of the marketplace and the prevalence of Windows-based PCs. But that’s probably also good for you because, statistically speaking, the odds are good that your machine is a Windows machine, too.

Who should use this book? Anyone with an interest in SGML. The formatting conventions are useful for a veteran SGML user to find handy reference information, and the SGML newbie will appreciate the thorough explanations and familiar tone. There are clear graphics and useful discussions conveniently organized for users of SGML at every level.

The chances are high that you are studying SGML because of your success with HTML. If so, you’ll find this book to your liking because it talks about the migration of SGML to the online world. But if you’re not coming from the HTML world, that’s fine, too, because there are ample discussions from the business and publishing perspectives. SGML was never intended for a single type of enterprise or system. On the contrary, the intention with SGML has always been: have document, will travel.

Here’s a brief overview of the book’s content with a short description of each part:

  Part I, “SGML Development: Essential Ideas, Terms, and Technology,” covers the important background ideas and terminology for studying the development of markup applications.
  Part II, “Document Analysis,” introduces the reader to the process of building an SGML application by helping him or her understand document types and their architecture—how they’re dismantled and assembled.
  Part III, “Content Modeling: Developing the DTD,” takes the SGML student through the process of declarations, design, and validation for the Document Type Definition (DTD).
  Part IV, “Markup Strategies,” covers special concerns with marking up documents, like output specifications, and transforming documents from one SGML document type to another.
  Part V, “SGML and the World Wide Web,” helps HTML users appreciate their SGML heritage and consider the larger possibilities it can offer to their Web sites.
  Part VI, “Learning from the Pros,” provides special insight into challenging areas of SGML development that you’re likely to encounter at some point.
  Part VII, “SGML Tools and Their Uses,” addresses the expanding subject of authoring, development, and production tools for SGML; tools for the Mac, the PC, and UNIX are surveyed.
  Part VIII, “Becoming an Electronic Publisher,” undertakes the task of analyzing the past of electronic publishing and tracing its destination somewhere in the future.
  Part IX, “Appendixes,” contains two appendixes:
    Appendix A, “The SGML CD-ROM,” covers the contents of the CD-ROM that comes with this book; you’ll find lots of handy tools to help you become productive right away.
    Appendix B, “Finding Sources for SGML Know-How,” helps you locate further help beyond the scope of this book for your ongoing SGML involvement.

What This Book Is Not

This book does not provide more than the essentials for using SGML in practical day-to-day applications. The intention behind this book is to give someone who needs to develop an SGML application the means to complete that task.


• See “Internet Resources,” p. 569

You won’t find in-depth discussions of HyTime or multilingual transformations because most people who would buy this book probably wouldn’t be interested in that level of depth. This book is intended to be an “everyman’s” guide, not a professor’s handbook.

Conventions Used in This Book

Certain conventions are used in Special Edition Using SGML to help you absorb the ideas easily.


Tip:  
Tips suggest easier or alternate methods of executing a procedure or approaching a task.

Text that is part of SGML markup in a document’s content will look like this: <TAG>content</TAG>. This type of text will appear both in the body of the text (like you see here) and in the figures and samples of markup you see throughout the book. These tags, as well as attributes, appear in both upper- and lowercase, since SGML’s default settings do not treat element and attribute names as case-sensitive. Entity names are the exception: by default they are case-sensitive, so type them exactly as shown.

New terms are introduced in italic type and text you type appears in boldface. World Wide Web URLs (essentially document addresses) are also presented in boldface.


Note:  
This paragraph format indicates additional information that might help you avoid problems, or that might be considered when using the described features.


Caution:  
This paragraph format warns you of hazardous procedures.


• See “Section Title,” p. xx

• See “Section Title,” p. xx


Special Edition Using SGML uses cross references so you can quickly find related information in the book. These are listed by section or chapter title and page number for convenience. Right-facing triangles point you to related information in later parts of the book. Left-facing triangles point you to information in previous chapters.

Throughout the book, you’ll also see the On the CD icon (shown beside this paragraph) in the margins. Where you see this icon, the text is discussing software or a document that’s included on the CD accompanying this book.

Bombs Away!

This book is designed to lead you from an SGML beginner’s level to an intermediate level. All you need to have is curiosity about the subject and an open mind. This book will take care of the rest.

As the subject matter progresses, you will most likely be impressed with the enormous possibilities SGML gives to prospective publishers and authors. So as you find yourself daydreaming about what you can do with these powerful techniques and tools, don’t be alarmed. I can tell you from experience, no matter how excited you are today, you’ll probably be even more excited about it tomorrow!

