Our example for external entities will simply combine two lists of general entities for use in a single DTD. (Using parameter entities to nest more complex DTDs will be covered in later chapters.) It is always a good idea to include comments with external entity declarations: the URLs in SYSTEM identifiers, and even the more complete information in PUBLIC identifiers, are often cryptic. Our first entity file, companies.pen, includes the following:
<!ENTITY GLW "Corning Incorporated">
<!ENTITY IBM "International Business Machines">
<!ENTITY T "American Telephone and Telegraph">
Our second file, states.pen, includes the following:
<!ENTITY NC "North Carolina">
<!ENTITY ND "North Dakota">
<!ENTITY NJ "New Jersey">
<!ENTITY NM "New Mexico">
<!ENTITY NY "New York">
Our DTD, although simple, is also stored in an external file:
<!--The following entity connects to a list of companies using stock ticker symbols as entity references. -->
<!ENTITY % companies SYSTEM "http://127.0.0.1/companies.pen">
<!--The following entity connects to a list of states using postal abbreviations as entity references. -->
<!ENTITY % states SYSTEM "http://127.0.0.1/states.pen">
<!ELEMENT PARAMEXAMPLE (DOCUMENT)>
<!ELEMENT DOCUMENT (#PCDATA)>
%companies;
%states;
The sample XML file that uses these references reads as:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE PARAMEXAMPLE SYSTEM "http://127.0.0.1/penex.dtd">
<PARAMEXAMPLE>
<DOCUMENT>The company &GLW; is headquartered in &NY;, as is &IBM;. &T; is headquartered in &NJ;.</DOCUMENT>
</PARAMEXAMPLE>
Parsing this should yield the following results:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE PARAMEXAMPLE SYSTEM "http://127.0.0.1/penex.dtd">
<PARAMEXAMPLE>
<DOCUMENT>The company Corning Incorporated is headquartered in New York, as is International Business Machines. American Telephone and Telegraph is headquartered in New Jersey.</DOCUMENT>
</PARAMEXAMPLE>
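The expansion can be tried out without a web server. The following Python sketch is only an illustration, not part of the example files: it assumes the parameter entities are declared with the SYSTEM keyword so that they point at the external files, it keeps the three files in an in-memory dictionary (the name FILES and the function expand_entities are our own inventions), and it uses the expat bindings in Python's standard library with parameter entity parsing switched on.

```python
import xml.parsers.expat as expat

# In-memory stand-ins for the files served from http://127.0.0.1/ (an assumption
# made so the sketch runs without a server).
FILES = {
    "http://127.0.0.1/companies.pen":
        '<!ENTITY GLW "Corning Incorporated">\n'
        '<!ENTITY IBM "International Business Machines">\n'
        '<!ENTITY T "American Telephone and Telegraph">\n',
    "http://127.0.0.1/states.pen":
        '<!ENTITY NC "North Carolina">\n'
        '<!ENTITY ND "North Dakota">\n'
        '<!ENTITY NJ "New Jersey">\n'
        '<!ENTITY NM "New Mexico">\n'
        '<!ENTITY NY "New York">\n',
    "http://127.0.0.1/penex.dtd":
        '<!ENTITY % companies SYSTEM "http://127.0.0.1/companies.pen">\n'
        '<!ENTITY % states SYSTEM "http://127.0.0.1/states.pen">\n'
        '<!ELEMENT DOCUMENT (#PCDATA)>\n'
        '%companies;\n'
        '%states;\n',
}

DOCUMENT = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<!DOCTYPE PARAMEXAMPLE SYSTEM "http://127.0.0.1/penex.dtd">\n'
    '<PARAMEXAMPLE><DOCUMENT>The company &GLW; is headquartered in &NY;, '
    'as is &IBM;. &T; is headquartered in &NJ;.</DOCUMENT></PARAMEXAMPLE>'
)


def expand_entities(document, files):
    """Return the character data of `document` with all entities expanded."""
    text = []
    root = expat.ParserCreate()
    # Ask expat to read the external DTD subset and external parameter entities.
    root.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_ALWAYS)
    root.CharacterDataHandler = text.append
    open_parsers = [root]  # the last item is the parser currently reading

    def load_external(context, base, system_id, public_id):
        # Child parsers inherit this handler, so nested references
        # (the DTD pulling in the .pen files) arrive here as well.
        child = open_parsers[-1].ExternalEntityParserCreate(context)
        open_parsers.append(child)
        try:
            child.Parse(files[system_id].encode("utf-8"), True)
        finally:
            open_parsers.pop()
        return 1  # a non-zero value tells expat the entity was handled

    root.ExternalEntityRefHandler = load_external
    root.Parse(document.encode("utf-8"), True)
    return "".join(text)


print(expand_entities(DOCUMENT, FILES))
```

Note that expat is a nonvalidating parser: it reads the declarations in order to expand the entities, but it does not check the document against the element declarations.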
As we'll see in later chapters, parameter entities can be a very useful tool for simplifying complex markup and managing multiple DTDs.
A notation declaration announces that data from an outside (non-XML) source is needed in the document, and it helps to pass processing to an application other than the parser. Notation declarations are sometimes used in combination with processing instructions to provide a means of handling nontextual information within a document. The notation declaration tells the processor what kind of information there is; the processing instruction announces what process should be used to handle it. Notation names can also be used as attribute values.
The syntax for notation declarations is similar to the document type declaration:
<!NOTATION Name ExternalID>
A notation declaration might read:
<!NOTATION eps SYSTEM "epsview.exe">
The parser does nothing to check the information at the location specified; it just passes the address on to the processing application. If the processing application can handle the information, that's wonderful. If it can't, it doesn't matter to the parser. The SYSTEM keyword is normally followed by a reference to an application that can present the data, but the processing application is definitely not required to use that application. (If a Macintosh or UNIX user were reading this file, a Windows executable wouldn't help them much anyway.) Notations that the processing application cannot understand may be errors, but they aren't XML errors. The parser will continue its work without announcing an error. The application, of course, may announce its own errors.
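To see how little the parser does with a notation, consider the following minimal Python sketch, which uses the expat bindings in the standard library. The FIGURES document and the collect_notations function are hypothetical names made up for this illustration; the declaration is placed in an internal DTD subset only to keep the sketch self-contained.

```python
import xml.parsers.expat as expat

# A hypothetical document that carries the notation declaration in its
# internal DTD subset.
DOCUMENT = (
    '<!DOCTYPE FIGURES [\n'
    '  <!NOTATION eps SYSTEM "epsview.exe">\n'
    ']>\n'
    '<FIGURES/>'
)


def collect_notations(document):
    """Map each declared notation name to its system identifier."""
    notations = {}

    def on_notation(name, base, system_id, public_id):
        # The parser reports the identifier as written; it never looks
        # for, opens, or runs anything called epsview.exe.
        notations[name] = system_id

    parser = expat.ParserCreate()
    parser.NotationDeclHandler = on_notation
    parser.Parse(document, True)
    return notations


print(collect_notations(DOCUMENT))
```

The handler receives the name and the identifier and nothing more. Deciding what to do with them, if anything, is left entirely to the application.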
Developers who need to test different structures while keeping track of alternatives may want to use the IGNORE and INCLUDE marked sections in DTDs. (In SGML, these also work in documents, but XML has banished them to the DTD.) IGNORE and INCLUDE let developers turn portions of a DTD on and off. IGNORE and INCLUDE are particularly useful for developers who are combining several DTDs and need to limit the side effects of multiple files colliding, or for developers who need to create a single core DTD with optional subsets. IGNORE and INCLUDE sections may be nested inside other IGNORE and INCLUDE sections, but, like elements, their beginnings and ends may not overlap.
The syntax for IGNORE and INCLUDE resembles that of CDATA:
<![IGNORE[ declarations ]]>
<![INCLUDE[ declarations ]]>
Neither IGNORE nor INCLUDE may appear in the middle of a declaration; both must address a single declaration or a set of declarations. For example,
<![IGNORE[<!ELEMENT YUCK (#PCDATA)>]]>
<![INCLUDE[<!ELEMENT HOORAY (#PCDATA)>]]>
would keep the YUCK element from being parsed and would allow the HOORAY element to be parsed normally. Applied in this way, IGNORE seems like a handy way to edit out useless parts of a DTD, and INCLUDE seems to be just plain useless. Parameter entities give INCLUDE and IGNORE the power they need to be meaningful additions to the XML vocabulary. Instead of using INCLUDE and IGNORE directly to change code throughout a DTD, developers can use parameter entities to make all those changes in one place. This makes INCLUDE and IGNORE far more convenient and occasionally even necessary. The following example provides a simple demonstration:
<!ENTITY % invoice "IGNORE">
<!ENTITY % receipt "INCLUDE">
<![%invoice; [
<!ENTITY notice "Please remit the following payment within thirty days.">
]]>
<![%receipt; [
<!ENTITY notice "Thank you for your prompt payment. The sums below have been collected and recorded.">
]]>
<!ENTITY address "555 Twelvetwelve Lane">
Depending on the values assigned to invoice and receipt, the general entity notice will provide either the voice of a bill collector or a grateful vendor. To change the output, just switch the values of the two entities. The value of the address entity, on the other hand, will be the same in either case. Similar markup could have continued throughout the DTD, with parts inappropriate for receipts being struck. Switching the DTD over to receipts would require editing only two lines of the file rather than demanding a search-and-replace of the entire document. In the next chapters, we'll explore more uses of this limited but powerful tool.
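The switch can be tried out with a short Python sketch built on the expat bindings in the standard library. Everything outside the DTD text is an assumption made for illustration: the LETTER document, the file name letter.dtd, and the render function are hypothetical, and the DTD is supplied from a string rather than fetched from a real location. Because conditional sections belong in the external DTD, the sketch hands the DTD to the parser as an external subset.

```python
import xml.parsers.expat as expat

# The DTD from the example, with the two switch values left as placeholders.
DTD_TEMPLATE = (
    '<!ENTITY % invoice "{invoice}">\n'
    '<!ENTITY % receipt "{receipt}">\n'
    '<![%invoice; [\n'
    '<!ENTITY notice "Please remit the following payment within thirty days.">\n'
    ']]>\n'
    '<![%receipt; [\n'
    '<!ENTITY notice "Thank you for your prompt payment. '
    'The sums below have been collected and recorded.">\n'
    ']]>\n'
    '<!ENTITY address "555 Twelvetwelve Lane">\n'
)

# A hypothetical document that uses both general entities.
DOCUMENT = (
    '<!DOCTYPE LETTER SYSTEM "letter.dtd">'
    '<LETTER>&notice; &address;</LETTER>'
)


def render(invoice, receipt):
    """Expand the document with the two switches set to INCLUDE or IGNORE."""
    dtd = DTD_TEMPLATE.format(invoice=invoice, receipt=receipt)
    text = []
    parser = expat.ParserCreate()
    # Read the external DTD subset, where conditional sections are allowed.
    parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_ALWAYS)
    parser.CharacterDataHandler = text.append

    def load_dtd(context, base, system_id, public_id):
        # Supply the DTD text from memory instead of fetching letter.dtd.
        child = parser.ExternalEntityParserCreate(context)
        child.Parse(dtd.encode("utf-8"), True)
        return 1  # a non-zero value tells expat the entity was handled

    parser.ExternalEntityRefHandler = load_dtd
    parser.Parse(DOCUMENT.encode("utf-8"), True)
    return "".join(text)


print(render("IGNORE", "INCLUDE"))   # the receipt wording
print(render("INCLUDE", "IGNORE"))   # the invoice wording
```

Only the two arguments change between the calls; the address is declared outside both conditional sections and so appears either way.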