

Glossary

application
Either a program that does something (formats, sorts, imports, etc.) with XML or a set of markup tags created with XML. HTML, for example, is an application of SGML, definable with an SGML DTD.
attribute
A source of additional information about an element. Attribute values may be fixed in the DTD or listed as name-value pairs (name="value") in the start-tag of an element.
Cascading Style Sheets (CSS)
A standard that provides formatting control over elements using information contained in <STYLE> tags and STYLE attributes. Less powerful than XSL, it nonetheless looks like it has a bright short-term future as the only style mechanism already recommended by the W3C and (partially) implemented in major browsers.
Channel Definition Format (CDF)
An XML-based “push” standard that describes documents containing URL information along with descriptions, icons, and information on when the material should be automatically retrieved.
character data (CDATA)
Information in a document that should not be parsed at all. This allows the use of the markup characters &, <, and > within the text, even though no elements or entities may appear in the section. CDATA declarations may appear in attributes, and CDATA-marked sections may appear in documents.
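As a minimal sketch of a CDATA-marked section, assuming Python's standard-library xml.etree.ElementTree parser (the NOTE element and its text are invented for illustration), the markup characters inside the section come back as ordinary text:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the CDATA-marked section holds raw markup characters.
source = "<NOTE><![CDATA[if (a < b && b > c) { <B>not a tag</B> }]]></NOTE>"

note = ET.fromstring(source)

# The parser returns the section as plain text; <B> never becomes an element.
cdata_text = note.text
child_count = len(note)

print(cdata_text)
print(child_count)
```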
child element
An element nested inside another element. In <FIRST><SECOND/></FIRST>, the SECOND element is the child element of the FIRST element.
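The nesting in this example can be inspected with a short sketch, assuming Python's standard-library xml.etree.ElementTree parser (one parser among many, chosen here only for illustration):

```python
import xml.etree.ElementTree as ET

# The glossary's own example: SECOND is nested inside FIRST.
first = ET.fromstring("<FIRST><SECOND/></FIRST>")

# Iterating over an element yields its direct child elements.
children = [child.tag for child in first]

print(first.tag, children)
```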
chunk
A portion of a document identified by an XPointer. A chunk may refer to one element and all its content (including subelements), a group of elements, or even a selection based on content.
document
A “textual object.” In HTML, documents (or “pages”) were single files containing HTML. In XML, documents may contain content from several files or chunks and should include markup structures that make them valid or well-formed.
document object model (DOM)
A means of addressing elements and attributes in a document from a processing application or script. The W3C has a Document Object Model Working Group that is developing a standard model for HTML and XML documents.
document type declaration
In valid documents, the declaration that connects a document to its document type definition. The declaration may connect to an external file or include the definition within itself.
document type definition (DTD)
A set of rules for document construction that lies at the heart of all SGML development and all valid XML document construction. Processing applications and authoring tools rely on DTDs to inform them of the parts required by a particular document type. A document with a DTD may be validated against the definition.
Document Style Semantics and Specification Language (DSSSL)
A transformation and style language for the processing and formatting of valid SGML documents.
element
The fundamental logical unit of an XML document. All content in XML documents must be contained within elements.
empty element
An element that has no textual content. An empty element may be indicated by a start-tag and end-tag placed next to each other (<EMPTY></EMPTY>) or by a start-tag that ends with /> (<EMPTY/>). Empty elements may contain attributes only.
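A small sketch, assuming Python's standard-library xml.etree.ElementTree parser, suggests that the two syntaxes are equivalent once parsed. The kind attribute is a hypothetical addition for illustration.

```python
import xml.etree.ElementTree as ET

# The same empty element written in both syntaxes, with an illustrative attribute.
long_form = ET.fromstring('<EMPTY kind="spacer"></EMPTY>')
short_form = ET.fromstring('<EMPTY kind="spacer"/>')

def summarize(element):
    """Return (tag, attributes, text, number of children) for comparison."""
    return (element.tag, dict(element.attrib), element.text, len(element))

# Both forms should produce the same parsed result: no text, no children.
same = summarize(long_form) == summarize(short_form)

print(summarize(long_form))
print(same)
```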
end-tag
A tag that closes an element. An end-tag follows the syntax </Name>, where Name matches the element name declared in the start-tag.
entity
A reference to other data that often acts as an abbreviation or a shortcut. By declaring entities, developers can avoid entering the same information in a document or DTD repetitively.
extended link
A link that contains locator elements rather than a simple HREF attribute to identify the targets of the link.
extended link group
A group of documents whose contents are analyzed for links to help establish two-way links without requiring their declaration in every document.
Extensible Markup Language (XML)
A standard under development by the W3C that provides a much simpler set of rules for markup than SGML, while offering considerably more flexibility than HTML.
Extensible Style Language (XSL)
A style sheet standard submitted by Microsoft, ArborText, and Inso Corporation to the W3C. XSL allows developers to specify formatting far more precisely than Cascading Style Sheets permit. XSL seems promising, but is not yet a W3C working draft or recommendation.
external DTD subset
The portion of a document type definition that is stored outside of the document. External DTDs are convenient for storing document type definitions that will be used by multiple documents, allowing them to share a centrally managed definition.
general entity
An entity for use in document content. When used in documents, the name of a general entity must be preceded by an ampersand (&) and followed by a semicolon (;).
Generalized Markup Language (GML)
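As an illustration, a general entity can be declared in an internal DTD subset and then referenced in content. This is a minimal sketch assuming Python's standard-library xml.etree.ElementTree parser; the MEMO element and the company entity are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the internal DTD subset declares one general entity.
source = """<!DOCTYPE MEMO [
  <!ENTITY company "Example Widgets">
]>
<MEMO>Welcome to &company;. &company; thanks you.</MEMO>"""

memo = ET.fromstring(source)

# Each &company; reference is replaced by the declared text during parsing.
expanded = memo.text

print(expanded)
```

Entity expansion of this kind is best limited to documents from trusted sources, since deeply nested entity declarations can expand to very large amounts of text.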
The predecessor to SGML, developed in 1969 by IBM in efforts led by Charles Goldfarb. GML originated the use of <, >, and / for markup and is still in use for document applications.
Hypertext Markup Language (HTML)
The most popular markup language in use today, HTML is an application of SGML. HTML is one of the foundations of web development, providing formatting and basic structures to documents for presentation via browser applications.
Hypermedia/Time-based Structuring Language (HyTime)
A set of multimedia and linking extensions to SGML, formalized as ISO/IEC 10744-1992. HyTime is one of the foundations for XML-LINK.
Hypertext Transfer Protocol (HTTP)
The protocol that governs communications between clients and servers on the World Wide Web. HTTP allows clients to send requests to servers, which reply with an appropriate document or an error message.
in-line link
A link in which the element making the linking declarations is itself a part of the link.
instance
The actual use of an element or document type in a document, as opposed to its definition. An instance may also refer to an entire document; a document may be an instance of a DTD if it can be validated under that DTD.
internal DTD subset
The portion of a document type definition that appears inside the document to which it applies. Internal DTD subsets can be hard to manage, but provide developers an easy way to test out new features or develop DTDs without disrupting other documents.
ISO
The International Organization for Standardization (the acronym is derived from its French name), which sets industrial standards relating to everything from character sets to quality processes to SGML.
markup
Structural information stored in the same file as the content. Traditionally, structural information is separated from the content and isolated in elements (defined with tags) and entities.
markup declaration
The contents of document type declarations, which are used to define the elements, attributes, entities, and notations. They specify the kinds of markup that will be legal in a given document.
Meta-Content Framework (MCF)
A standard developed by Apple and continued by Netscape that represents metadata as a multidimensional space for user navigation.
name
A name must begin with a letter or underscore and may include letters, digits, hyphens, underscores, and full stops. (Full stops in Latin character sets are periods.)
name characters
Letters, digits, hyphens, underscores, and full stops. (Full stops in Latin character sets are periods.)
name token
Any string composed of name characters.
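The three definitions above can be approximated with regular expressions. The sketch below, in Python, follows only the rules as stated in this glossary and assumes ASCII letters and digits; real XML names permit many more characters, so the pattern and function names here are illustrative rather than a conformance check.

```python
import re

# ASCII-only approximation (an assumption): letters, digits, hyphens,
# underscores, and full stops count as name characters.
NAME_TOKEN_PATTERN = re.compile(r"[A-Za-z0-9_.\-]+")

# A name additionally has to start with a letter or an underscore.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")

def is_name_token(candidate):
    """True if the whole string consists of name characters."""
    return NAME_TOKEN_PATTERN.fullmatch(candidate) is not None

def is_name(candidate):
    """True if the string is a name token that starts with a letter or underscore."""
    return NAME_PATTERN.fullmatch(candidate) is not None

for sample in ["chapter-1.title", "_id", "1chapter", "two words"]:
    print(sample, is_name(sample), is_name_token(sample))
```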
notation
An XML structure that identifies the type of content contained by an element and suggests a viewer to present it.
out-of-line link
A link in which the element defining the link is not itself a member of the set of targets defined by the link. Out-of-line links allow developers to declare links separately from the content of the document; out-of-line links may even appear in separate files.
parameter entity
An entity used to represent information within the context of a document type definition. Parameter entities may be used to link the content of additional DTD files to a DTD, or as an abbreviation for frequently repeated declarations. Parameter entities are distinguished from general entities by their use of a percent sign (%) rather than an ampersand (&).
parent element
An element in which another element is nested. In <FIRST><SECOND/></FIRST>, the FIRST element is the parent element of the SECOND element.
parsed character data (#PCDATA)
Parsed character data is text that will be examined by the parser for entities and markup. Parsed character data should not contain any &, <, or > characters; these need to be represented by the &amp;, &lt;, and &gt; entities, respectively.
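The substitution described here can be done by hand or with a helper. As a minimal sketch, Python's standard library offers escape and unescape in xml.sax.saxutils; the sample string is invented for illustration.

```python
from xml.sax.saxutils import escape, unescape

# Hypothetical text containing all three markup characters.
raw = "if a < b & b > c"

# escape() replaces &, <, and > with the corresponding entity references.
escaped = escape(raw)

# unescape() reverses the substitution.
restored = unescape(escaped)

print(escaped)
print(restored)
```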
parser
An application that converts a serial stream of markup (an XML file, for example) into an output structure accessible by a program. Parsers may perform validation or well-formedness checking on the markup as they process it.
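As a rough sketch of well-formedness checking, the Python function below assumes the standard-library xml.etree.ElementTree parser, which reports malformed markup by raising ParseError. It does not validate against a DTD. The function name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup):
    """Return True if the parser accepts the markup, False if it raises ParseError."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True

# Properly nested elements are accepted.
nested_ok = is_well_formed("<A><B></B></A>")

# Overlapping elements are rejected: B is closed after its parent A.
overlapping_ok = is_well_formed("<A><B></A></B>")

print(nested_ok, overlapping_ok)
```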
processing application
An application that takes the output generated by a parser (it may include a parser, or be a parser itself) and does something with it. That something may include presentation, calculation, or anything else that seems appropriate.
processing instruction
Directions that allow XML authors to send instructions directly to a processing application that may be outside the native capacities of XML. A processing instruction is differentiated from normal element markup by question marks after the opening < and before the closing > (i.e., <?instruction?>). The XML declaration is itself a processing instruction.
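A processing application can read such instructions from the parsed document. The sketch below assumes Python's standard-library xml.dom.minidom; the render instruction and its mode setting are hypothetical and carry no standard meaning.

```python
from xml.dom import minidom

# Hypothetical document: one processing instruction ahead of the root element.
source = '<?render mode="draft"?><DOC/>'

document = minidom.parseString(source)

# Collect (target, data) for every processing instruction at the top level.
instructions = [
    (node.target, node.data)
    for node in document.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]

print(instructions)
```

The target names the application the instruction is meant for, and the data is passed along uninterpreted by the parser.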
prolog
The opening part of a document, containing the XML declaration and any document type declarations or markup declarations needed to process the document.
recursion
A programming technique in which a function may call itself. Recursive programming is especially well-suited to parsing nested markup structures.
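A short sketch of the idea in Python, assuming the standard-library xml.etree.ElementTree parser: the outline function (a hypothetical name) calls itself once for each child element, so the call depth mirrors the nesting depth of the markup.

```python
import xml.etree.ElementTree as ET

def outline(element, depth=0):
    """Return a list of indented tag names, one line per element."""
    lines = ["  " * depth + element.tag]
    for child in element:
        # Recursive call: each child is handled the same way, one level deeper.
        lines.extend(outline(child, depth + 1))
    return lines

# Hypothetical document used only to exercise the function.
tree = ET.fromstring("<BOOK><CHAPTER><TITLE/><PARA/></CHAPTER><INDEX/></BOOK>")
result = outline(tree)

print("\n".join(result))
```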
root element
The first element in a document. The root element is not contained by any other elements and forms the base of the tree structure created by parsing the nested elements.
simple link
A link that includes its target locator in an HREF attribute.
Standard Generalized Markup Language (SGML)
The parent language of HTML and XML. SGML provides a complex set of rules for defining document structures. HTML uses structures defined under that set of rules, whereas XML provides a subset of the rules for defining document structures. SGML is formally standardized as ISO 8879:1986, although a series of later amendments have continued its development.
start-tag
The opening tag that begins an element. The general syntax for a start-tag is <Name attributes>, where Name is the name of the element being defined, and attributes is a set of name-value pairs. All start-tags in XML must either have end-tags or use the empty element syntax, <Name attributes/>.
style sheet
A formatting description for a document. Style sheets may be stored in separate files from the documents they describe.
Unicode
A standard for international character encoding. Unicode supports characters that are 2 bytes wide rather than the 1 byte currently supported by most systems, allowing it to include 65,536 characters rather than the 256 available to 1-byte systems. Visit http://www.unicode.org for more information.
valid
A document is valid if it conforms to a declared document type definition (DTD) and meets the conditions for well-formedness. All elements, attributes, and entities must be declared in the DTD, and all data types must match their definition’s requirements.
W3C
The World Wide Web Consortium, the standards body responsible for many of the standards key to the functionality of the World Wide Web, including HTML, XML, HTTP, and Cascading Style Sheets. The W3C site includes the latest public versions of their standards as well as other information about the web and standards processes. Visit http://www.w3.org for more information.
well-formed
A well-formed document may or may not have a DTD. Well-formed documents must begin with an XML declaration and contain properly nested and marked-up elements.
XML
See Extensible Markup Language (XML).
XML declaration
The processing instruction at the top of an XML document. It begins with <?XML, includes a version identifier, required markup declaration, and encoding identifier, and closes with ?>. (The XML declaration may be case-sensitive at some point; the standard at present is unclear on this issue.)
XPointer
A reference to a chunk of a document. XPointers use a syntax derived from the Text Encoding Initiative, modified to take into account the needs of the HTTP protocol for encoding URLs.
XSL
See Extensible Style Language (XSL).

