XML (Extensible Markup Language) is one of the hottest technologies for Web development. It is based on SGML, and while it has the advantage of being less complicated to use than SGML, it is both more complex and more flexible than HTML. XML offers users the following advantages: it makes networked information easier to find, categorize, and customize (for example, XML allows on-line booksellers to use tags such as "price" and "number of pages" for searching and categorising purposes); it lets users create documents that look and function exactly the same way in any browser; it allows Web pages to be updated without re-sending all data, thereby saving on bandwidth; and support for XML is being built into products from Microsoft, Netscape, Adobe, DataChannel, and WebMethods. This text is the technical reference for Web and application programmers and developers. After a concise overview of the purpose and scope of XML and its principles, readers find a complete and in-depth annotated specification guide that includes sample applications. This comprehensive reference guide is designed for experienced Web developers and programmers and goes beyond comprehensive coverage of the XML specification to offer: namespaces, a recent W3C draft critical for large-scale, distributed applications; Tiny SML, a subset of XML used for special applications; databases and object-oriented models, including object inheritance and architectural forms.
The synopsis may refer to a different edition of this title.
Ian Graham is Vice President in charge of Research and Development at Groveware, Inc. He is also Senior Instructional Technology Specialist with the University of Toronto Centre for Academic Technology, where he designs and prototypes applications for networks and the Internet. Liam Quin is a member of the W3C Standards Committee for XML.
Chapter 2: Declaring Markup: The Document Type Declaration
Concepts Covered: Document type declaration, element type declarations, content models, attribute list declarations, string attribute types, enumerated list attribute types, internal entities, general entities, entity references, standalone documents.
The previous chapter examined the structure of element-level markup, that is, the markup that organizes the text of a document to give it structure and meaning. However, provided the basic markup rules were followed, there was no specified syntax for the nesting or placement of these elements, or even their names. Consequently, documents such as the one listed in Figure 1.1 are essentially free-form, and can have entirely arbitrary elements, attributes, and hierarchical structures.
Often, however, there are specific rules that an author (or XML application designer) may wish to impose on the structure of a document - for example, that only certain named elements can be used, and that these elements must be "nested" in certain ways (e.g., a menu element must contain a desc and a price). Indeed, this is an important aspect of markup, since intelligent processing of marked-up data requires that there be some defined structure, and some common way for defining structural rules, and verifying that those rules are obeyed.
To impose such requirements, XML needs a mechanism for defining syntax rules for a document or class of documents. XML supports such a mechanism via a markup component called the document type declaration. This chapter introduces this concept, while the next several go into the details of how it works. Indeed, most of the hard parts of XML are associated with understanding how the document type declaration (and its contents) works and affects the processing of the document element and its content.
2.1 The Document Type Declaration
Figure 2.1 is a reworked version of Figure 1.1, including a single new component - a document type declaration. A document type declaration defines syntax rules for the elements and attributes of a document, and also defines entity and notation declarations. This chapter looks only at declarations governing document syntax; entity and notation declarations are looked at in later chapters.
2.1.1 The Document Type Declaration Internal Subset
As mentioned in Chapter 1, a document type declaration is part of the document prolog, and must appear in front of the first element of the document. It must also appear after the XML declaration. A document type declaration begins with the string:
<!DOCTYPE string [
where string (a Name [5]) is the internal identifier for the declaration, and essentially names the document element type to which the declaration applies. Consequently, the name specified in the declaration must match the name of the root element of the document. Note, in Figure 2.1, how these two names match (menu).
The content of the document type declaration (the markup declarations between the strings <!DOCTYPE string [ and ]>) is called the internal document type declaration subset, or internal DTD subset, or often just the internal subset. It is called "internal" because it is actually a part of, or internal to, the document entity being processed. Not surprisingly, there is also a thing called the "external subset," which we will discuss in Chapter 5 of Part 1.
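As a minimal sketch of the name-matching rule, the following Python fragment uses the standard library's xml.dom.minidom to compare the name given in the document type declaration with the name of the root element. The sample document is a hypothetical stand-in (it is not the book's Figure 2.1), and the helper name doctype_matches_root is an assumption made for illustration. Because minidom is built on a non-validating parser, it does not enforce the match itself, so the comparison is made explicitly.

```python
from xml.dom import minidom

# Hypothetical sample document (not the book's Figure 2.1).
SAMPLE = """<?xml version="1.0"?>
<!DOCTYPE menu [
  <!ENTITY resto "Liam's Chowder House and Grill">
  <!ELEMENT menu (rname)>
  <!ELEMENT rname (#PCDATA)>
]>
<menu><rname>&resto;</rname></menu>"""


def doctype_matches_root(xml_text):
    """Return True if the DOCTYPE name equals the root element's name."""
    doc = minidom.parseString(xml_text)
    if doc.doctype is None:
        return False  # the document has no document type declaration
    return doc.doctype.name == doc.documentElement.tagName


if __name__ == "__main__":
    # Compare the declared name with the actual root element name.
    print(doctype_matches_root(SAMPLE))
```

Changing the declared name (for instance to a made-up name such as carte) while leaving the root element as menu makes the helper return False, even though the document still parses.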
A document type declaration contains markup declarations: declarations that define the allowed grammar for the use of elements and attributes, and that define reusable component parts for the XML document (i.e., entities). There are four types of markup declarations: element type declarations, attribute-list declarations, entity declarations, and notation declarations. This example looks only at element type, attribute-list, and some simple entity declarations. Entity declarations are also discussed in more detail in Chapters 3, 4, and 6 of Part 1, while notation declarations are discussed in Chapter 6.
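The four kinds of markup declaration can be observed with a small sketch built on Python's xml.parsers.expat module, which offers one callback per kind. The sample internal subset below is hypothetical (the lang attribute and the gif notation are invented for illustration), and count_declarations is an assumed helper name.

```python
from collections import Counter
from xml.parsers import expat

# Hypothetical internal subset containing all four kinds of declaration.
SAMPLE = """<!DOCTYPE menu [
  <!ENTITY resto "Liam's Chowder House and Grill">
  <!ELEMENT menu (rname)>
  <!ELEMENT rname (#PCDATA)>
  <!ATTLIST menu lang CDATA #IMPLIED>
  <!NOTATION gif SYSTEM "image/gif">
]>
<menu><rname>&resto;</rname></menu>"""


def count_declarations(xml_text):
    """Count the markup declarations of each kind reported by the parser."""
    counts = Counter()
    parser = expat.ParserCreate()

    def counter_for(kind):
        # Each handler ignores its arguments and only tallies the call.
        def handler(*args):
            counts[kind] += 1
        return handler

    parser.ElementDeclHandler = counter_for("element type")
    parser.AttlistDeclHandler = counter_for("attribute-list")
    parser.EntityDeclHandler = counter_for("entity")
    parser.NotationDeclHandler = counter_for("notation")
    parser.Parse(xml_text, True)
    return dict(counts)


if __name__ == "__main__":
    for kind, n in sorted(count_declarations(SAMPLE).items()):
        print(kind, n)
```

Note that the attribute-list handler fires once per declared attribute, so an attribute-list declaration naming several attributes would be counted several times.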
2.1.2 Internal General Entity Declaration
The first line in the document type declaration
<!ENTITY resto "Liam's Chowder House and Grill" >
declares an entity named resto to be equivalent to the string inside the double quotes (you could equally well use single quotes). This particular entity is an internal entity, because the actual content of the entity is given in the declaration. By contrast, external entities (discussed in Part 1, Chapter 4) are those whose content is external to the document, for example in a file. The value for an internal entity is given by the quoted string present in the declaration. Note that this value cannot contain the quotation character used to delimit the value (here the double quote "), or the characters & and % when they are not part of a character or entity reference (the % is used to reference parameter entities).
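The quoting rules can be tried out with a short sketch using Python's xml.etree.ElementTree. The helper names rname_text and is_rejected, and all entity values other than the one from the text, are assumptions made for illustration; the behaviour shown is that of the expat parser bundled with Python, not a statement about every XML processor.

```python
import xml.etree.ElementTree as ET


def rname_text(entity_decl):
    """Wrap one entity declaration in a tiny document and return the
    text of the rname element after parsing."""
    doc = (
        "<!DOCTYPE menu [" + entity_decl + "]>"
        "<menu><rname>&resto;</rname></menu>"
    )
    return ET.fromstring(doc).findtext("rname")


def is_rejected(entity_decl):
    """Return True if the parser refuses the declaration."""
    try:
        rname_text(entity_decl)
    except ET.ParseError:
        return True
    return False


# Double quotes as delimiter: the value may contain a single quote.
DOUBLE = '''<!ENTITY resto "Liam's Chowder House and Grill">'''
# Single quotes as delimiter: the value may contain double quotes.
SINGLE = """<!ENTITY resto 'The "original" Chowder House'>"""
# A bare & that is not part of a reference is not allowed.
BARE_AMP = '<!ENTITY resto "Chowder & Grill">'
# Written as a reference to the predefined entity amp, it is accepted.
ESCAPED = '<!ENTITY resto "Chowder &amp; Grill">'

if __name__ == "__main__":
    for decl in (DOUBLE, SINGLE, BARE_AMP, ESCAPED):
        print(decl, "rejected" if is_rejected(decl) else rname_text(decl))
```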
This type of entity is also called a parsed entity, as it contains parsable XML markup and character data. By definition, all parsed entities must contain good (i.e., well-formed) XML, and all internal entities are parsed entities.
This form of entity declaration (i.e., beginning with <!ENTITY resto ... ) defines what is called a general entity. General entities are used, or referenced, using the notation &resto; where resto is the entity name. Such references can appear in a variety of places. This particular entity is used in two places: within the rname element of the document, and within the attribute-list declaration for the body element.
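The two kinds of use just described can be sketched with Python's xml.etree.ElementTree. The document below is a hypothetical reconstruction, not the book's Figure 2.1: in particular, the attribute name title, its default value, and the body text are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: &resto; is referenced once in element content
# and once in the default value of an attribute declared for body.
SAMPLE = """<!DOCTYPE menu [
  <!ENTITY resto "Liam's Chowder House and Grill">
  <!ELEMENT menu (rname, body)>
  <!ELEMENT rname (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
  <!ATTLIST body title CDATA "Menu for &resto;">
]>
<menu><rname>&resto;</rname><body>Soups and stews</body></menu>"""

root = ET.fromstring(SAMPLE)

# Reference inside element content.
rname = root.findtext("rname")

# Reference inside an attribute default; the body start-tag gives no
# title attribute, so the parser supplies the declared default.
title = root.find("body").get("title")

if __name__ == "__main__":
    print(rname)
    print(title)
```

In this sketch the entity is declared before the attribute-list declaration that refers to it, mirroring the order described in the text, where the entity declaration is the first line of the document type declaration.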
Note that an entity name must be a name, as defined by production rule [5].
2.1.3 General Entity References and Replacement Text
As mentioned above, internal general entities can be referenced within a document by means of a general entity reference. Such entity references take the form &ent-name; where ent-name is the name of the entity in question. The actual content of an internal entity (that is, the text present in the definition of the entity) is called the literal entity value. Thus the literal value for the entity resto is the string:
Liam's Chowder House and Grill
When entity references &resto; are expanded (i.e., when an XML Processor parses the document), they will be replaced by this text.
Formally, the text that replaces an entity reference is called the replacement text. In the preceding case, the replacement text and the literal entity value were the same - but this is not always so. The literal entity value and the replacement text may be different because entities can themselves contain entity references - for example, the resto entity might have been defined by:
<!ENTITY resto "Liam's Chowder House &stuff; and Grill" >
where the literal entity value contains the entity reference &stuff;. It is now less clear what to use as replacement text for &resto;: should it be replaced by the value inside the double quotes, or by the value inside double quotes after replacing &stuff; by the contents of that reference?
We will actually not answer that question here; we need to introduce some other concepts (namely: character references, parameter entities, and external/internal entities) before we can properly explain the rules for constructing replacement text. We will do so in the next few chapters.
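Without anticipating those rules, the end result that one particular parser hands to an application can at least be observed. The sketch below uses Python's xml.etree.ElementTree; the text does not say what &stuff; stands for, so the value "Seafood" is purely an assumption for illustration. It shows only the final character data, not how the replacement text is formally constructed.

```python
import xml.etree.ElementTree as ET

# Hypothetical declarations: the value of the stuff entity is invented.
SAMPLE = """<!DOCTYPE menu [
  <!ENTITY stuff "Seafood">
  <!ENTITY resto "Liam's Chowder House &stuff; and Grill">
]>
<menu><rname>&resto;</rname></menu>"""

# The character data the application receives for the rname element.
expanded = ET.fromstring(SAMPLE).findtext("rname")

if __name__ == "__main__":
    print(expanded)
```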