

Logical Structures

After pages and pages of preliminaries, it’s time to finally create some elements, attributes, and document structures. The tools covered previously are necessary to this work, but element and attribute declarations are the core of XML. Well-formed documents can be useful for certain situations, but the element structures they use exist only in the document and in the mind of the designer. DTDs are an opportunity for developers to make their vision concrete, creating specifications for documents and not just documents. A well-written set of elements and attributes will make it easy for programs to extract useful information as well as present it beautifully. Even though the other parts of the XML specification may assist in this task, the main work of XML is creating document structures using elements and attributes.

Elements

Before we discuss elements any further, we need to look at two related concepts: parent and child elements. HTML developers are accustomed to using elements without much concern for context, with the significant exceptions of list and table elements. Understanding context is a critical prerequisite to building a DTD that works efficiently. In XML, the parent element provides the context for its children, and each child in turn provides context for the elements nested inside it. For example, in the structure

  <SECTION>
         <PARAGRAPH>
                 <SENTENCE>
                 </SENTENCE>
         </PARAGRAPH>
  </SECTION>

the SECTION element is the parent element of the PARAGRAPH element, which in turn is the parent element of the SENTENCE element. Similarly, the SENTENCE element is the child element of the PARAGRAPH element, which itself is the child of the SECTION element. XML also includes a document entity, which provides a root from which the markup tree structures can grow. If SECTION were the first element in a document, it would be a child of the document entity. (There isn’t any special way to define a document entity; it just provides parsers with a place to start.)
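The parent and child relationships described above can be checked programmatically. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper name parent_map is made up for this illustration, and keying the result by tag name only works because each tag appears once in this small example.

```python
import xml.etree.ElementTree as ET

# The SECTION / PARAGRAPH / SENTENCE structure from the text.
DOC = """<SECTION>
  <PARAGRAPH>
    <SENTENCE></SENTENCE>
  </PARAGRAPH>
</SECTION>"""


def parent_map(root):
    """Map each element's tag to its parent's tag (None for the root element).

    Hypothetical helper for illustration; assumes every tag name is unique.
    """
    parents = {root.tag: None}
    for parent in root.iter():      # visit every element in the tree
        for child in parent:        # iterate over its direct children only
            parents[child.tag] = parent.tag
    return parents


root = ET.fromstring(DOC)
relations = parent_map(root)
for tag, parent in relations.items():
    print(f"{tag}: parent is {parent}")
```

ElementTree exposes the root element rather than the document entity itself, which is why SECTION is reported here as having no parent.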

As we saw earlier in the chapter, creating elements is very simple. The element declaration syntax is

  <!ELEMENT name content>

The name of an element must follow the same rules as the name of an entity: it must begin with a letter, underscore, or colon, and may otherwise contain only letters, digits, periods, hyphens, underscores, or colons. Either the name or the content may be supplied through a parameter entity reference. The content of an element can be one of four types: a mixed-content declaration, a list of elements, the keyword EMPTY, or the keyword ANY. ANY is the simplest declaration, announcing that this element can contain all kinds of data and markup:

  <!ELEMENT BOXOSTUFF ANY>

Using this declaration, all BOXOSTUFF elements will allow any kind of element or data to be included in their content. A document that used the BOXOSTUFF element declared previously could look like:

  <BOXOSTUFF><DARKSPACE>emptiness</DARKSPACE>more
  junk</BOXOSTUFF>

The DARKSPACE element would need to be declared elsewhere in the DTD; beyond that, BOXOSTUFF imposes no rules on its contents.
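To see how a parser reads these declarations, the sketch below places them in an internal DTD subset and lets Python's standard xml.parsers.expat module report each one. The DARKSPACE declaration with (#PCDATA) content is an assumption added so the example is complete, and the names declared and on_element_decl are made up for the illustration. Expat is a non-validating parser, so it only reports the declarations; it does not enforce them.

```python
import xml.parsers.expat as expat

# The BOXOSTUFF document from the text, with its declarations supplied in an
# internal DTD subset. The DARKSPACE declaration is assumed for illustration.
DOC = """<?xml version="1.0"?>
<!DOCTYPE BOXOSTUFF [
  <!ELEMENT BOXOSTUFF ANY>
  <!ELEMENT DARKSPACE (#PCDATA)>
]>
<BOXOSTUFF><DARKSPACE>emptiness</DARKSPACE>more
junk</BOXOSTUFF>"""

declared = {}  # element name -> content type code reported by expat


def on_element_decl(name, model):
    # model is a tuple; its first item is the content type code
    # (compare against the XML_CTYPE_* constants in expat.model).
    declared[name] = model[0]


parser = expat.ParserCreate()
parser.ElementDeclHandler = on_element_decl
parser.Parse(DOC, True)  # True marks this as the final chunk of input

if declared["BOXOSTUFF"] == expat.model.XML_CTYPE_ANY:
    print("BOXOSTUFF accepts any declared element or character data")
```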

Although using ANY is perfectly acceptable XML, I strongly recommend that developers try to be more specific about document structure. Because HTML and some other forms of markup were primarily used for formatting, tags that effectively used ANY were necessary. Forbidding the use of bold text in a paragraph would undoubtedly cause an uproar. XML changes all that. By providing developers an opportunity to create document structures, XML promises to help create more intelligent documents. A large part of that intelligence is the kind of error checking XML can provide when given a fully developed DTD, complete with rules defining which elements can go where. Try to restrict the use of ANY to early DTD development, replacing it with a more complete content specification as quickly as possible.

The EMPTY keyword is ANY’s polar opposite. Instead of allowing any content, it allows no content. Elements defined with EMPTY content may have attributes but do not permit information to be stored between their beginning and end tags. The element declaration for an empty element is concise:

  <!ELEMENT EMPTYSPACE EMPTY>

Using this empty tag in a document requires even less room:

  <EMPTYSPACE/>

Most of the time EMPTY elements will be written as only an empty element tag ending with a /> (<BR/>, for example). Typical beginning and end tags are permitted, but no elements or data may come between the tags.
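Both ways of writing an empty element look the same to a parser. The following is a small sketch, assuming Python's standard xml.etree.ElementTree module; the helper is_empty is a made-up name for this illustration and checks only the parsed content, not any DTD.

```python
import xml.etree.ElementTree as ET

# The two permitted spellings of an empty element.
short_form = ET.fromstring("<EMPTYSPACE/>")
long_form = ET.fromstring("<EMPTYSPACE></EMPTYSPACE>")


def is_empty(element):
    """True when the element has no child elements and no character data."""
    return len(element) == 0 and not element.text


print(is_empty(short_form), is_empty(long_form))
```

An element with anything between its tags, such as `<EMPTYSPACE>x</EMPTYSPACE>`, fails this check, which mirrors what a validating parser would reject for an element declared EMPTY.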

