The SGML declaration is the mandatory first part of every SGML document. In the special case of XML, the SGML declaration is fixed and must not be specified, but it exists nonetheless, and conforming SGML parsers must start by reading and evaluating an SGML declaration for every validation.
The SGML declaration determines the lexical rules for the SGML document; that is, it specifies which characters are permissible, which characters are control characters, which optional features of SGML are used and which are forbidden, and so on.
The SGML declaration itself can only be written in one particular syntax, the Reference Concrete Syntax. The Document Type Definition (DTD) that follows may be written either in the Reference Concrete Syntax (if the declaration specifies SCOPE INSTANCE) or in the syntax that is declared in the SGML declaration (if it specifies SCOPE DOCUMENT). The parser must therefore know the lexical rules, i.e. the SGML declaration, before it attempts to parse the document.
The SGML declaration is hardly ever of real concern, and it is usually included implicitly. This document describes how this is done in detail.
If no SGML declaration is provided at all, then the parser must, according to the standard, assume the default SGML declaration (which is part of the standard). This is normally very impractical because the default declaration imposes some very limiting constraints.
The first proper way to supply the SGML declaration (apart from falling back on the default) is to leave it out of the document file altogether and call the parser with two separate filenames, first the declaration and then the document. The parser will then simply read the two files one after the other, as if they were a single document.
The second way is to link a specific SGML declaration permanently to the SGML document (as opposed to specifying it at each parser call). This again can be done in a number of different ways.
To start with, one can include the entire SGML declaration:
<!SGML "ISO 8879:1986 (WWW)" -- SGML declaration here... -- >
<!DOCTYPE doctype [ -- DTD here... -- ]>
<doctype> ... document instance ... </doctype>
Alternatively, the actual declaration (starting with "ISO 8879:1986 (WWW)") may reside in a separate file. In that case, the SGML declaration gets a name (like any other meta-declaration, though the name does not really matter here) and the filename is passed on as a system identifier:
<!SGML name SYSTEM "filename" >
<!DOCTYPE doctype [ -- DTD here... -- ]>
<doctype> ... document instance ... </doctype>
Finally, publicly identified SGML declarations can be referred to via their public identifier as usual, and a catalog entry will have to point to the actual file. An explicit filename may optionally be specified as well, again as usual:
<!SGML name PUBLIC "public_identifier"
"system_identifier" >
<!DOCTYPE doctype [ -- DTD here... -- ]>
<doctype> ... document instance ... </doctype>
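The three in-document forms shown above can be told apart by looking at what follows the <!SGML markup. The following is only a rough sketch, not a parser: the function name declaration_form is hypothetical, it assumes upper-case keywords and the default delimiters as used in the examples, and it ignores comments and other subtleties of real SGML.

```python
import re

# Matches the opening of an SGML declaration as written in the examples above.
_SGML_OPEN = re.compile(r'\s*<!SGML\s+')
# Matches "name SYSTEM" or "name PUBLIC" (a reference to an external declaration).
_REFERENCE = re.compile(r'(\S+)\s+(SYSTEM|PUBLIC)\b')


def declaration_form(text):
    """Return 'inline', 'system', 'public', 'unknown' or 'none' (hypothetical helper)."""
    opening = _SGML_OPEN.match(text)
    if opening is None:
        # The document does not start with an SGML declaration at all.
        return "none"
    rest = text[opening.end():]
    if rest.startswith('"'):
        # A quoted minimum literal follows directly: the declaration is inline.
        return "inline"
    reference = _REFERENCE.match(rest)
    if reference is not None:
        return reference.group(2).lower()
    return "unknown"


if __name__ == "__main__":
    print(declaration_form('<!SGML "ISO 8879:1986 (WWW)" -- ... -- >'))
    print(declaration_form('<!SGML name SYSTEM "filename" >'))
    print(declaration_form('<!SGML name PUBLIC "public_identifier" "system_identifier" >'))
```

A document for which this returns "none" is a candidate for the parser-call or catalog mechanisms.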
When working with publicly standardized document types, most meta declarations make reference to a public identifier only. These identifiers are then resolved into system identifiers (file names) by an SGML catalog file. Catalog files can also be used to specify the SGML declaration. There are two ways of doing this:
To specify an SGML declaration that gets used when no other declaration is present, use the following line in the catalog:
SGMLDECL "filename"
This will in effect replace the standard SGML declaration for all documents that provide no other declaration.
Secondly, one can specifically link one SGML declaration to each document type public identifier with the following catalog entry:
DTDDECL "DTD_public_identifier" "filename"
This way, every document that provides no SGML declaration of its own and starts with <!DOCTYPE doctype PUBLIC "DTD_public_identifier" [...] > will have the SGML declaration from filename prepended.
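To make the lookup concrete, here is a minimal sketch of how a tool might pick a declaration file from the two catalog entries. It is not how any particular parser is implemented. The function names, the one-entry-per-line layout, and the double-quoted, upper-case entries are all assumptions. So is the order of preference (a matching DTDDECL entry before the SGMLDECL entry), which should be checked against the parser actually in use.

```python
import re

# One catalog entry per line, double-quoted arguments (simplifying assumption).
_ENTRY = re.compile(r'^\s*(SGMLDECL|DTDDECL)\s+"([^"]*)"(?:\s+"([^"]*)")?\s*$')


def parse_catalog(text):
    """Collect SGMLDECL and DTDDECL entries; all other lines are ignored."""
    default_decl = None   # filename from SGMLDECL, if any
    per_dtd = {}          # DTD public identifier -> filename from DTDDECL
    for line in text.splitlines():
        entry = _ENTRY.match(line)
        if entry is None:
            continue
        keyword, first, second = entry.groups()
        if keyword == "SGMLDECL":
            default_decl = first
        elif second is not None:
            per_dtd[first] = second
    return default_decl, per_dtd


def choose_declaration(has_own_declaration, dtd_public_id, default_decl, per_dtd):
    """Return the declaration file to prepend, or None if nothing is prepended."""
    if has_own_declaration:
        # The document carries its own <!SGML ...>, so the catalog is not consulted.
        return None
    if dtd_public_id in per_dtd:
        # Assumed preference: the DTD-specific entry wins over the general one.
        return per_dtd[dtd_public_id]
    # May be None: the parser then falls back on the standard's default declaration.
    return default_decl


if __name__ == "__main__":
    catalog = (
        'SGMLDECL "default.dcl"\n'
        'DTDDECL "DTD_public_identifier" "filename"\n'
    )
    default_decl, per_dtd = parse_catalog(catalog)
    print(choose_declaration(False, "DTD_public_identifier", default_decl, per_dtd))
```

The file name default.dcl is a placeholder chosen for this example only.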
Attention:
Files that contain SGML declarations for inclusion via catalogs, as described in this section, must contain the declaration inside the <!SGML ... > markup! (The same holds for files that are passed alongside the document in the parser call, as described above.) That is to say, the complete SGML document must contain precisely one <!SGML ... > declaration in total.
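The "precisely one" rule lends itself to a quick sanity check before calling the parser. The sketch below is a naive text search, not a parse: the function name count_sgml_declarations is hypothetical, and an occurrence of <!SGML inside a comment or literal would be counted as well.

```python
def count_sgml_declarations(*parts):
    """Count occurrences of the '<!SGML' markup across all pieces of a document."""
    return sum(part.count("<!SGML") for part in parts)


if __name__ == "__main__":
    declaration_file = '<!SGML "ISO 8879:1986 (WWW)" -- SGML declaration here... -- >\n'
    document_file = (
        '<!DOCTYPE doctype [ -- DTD here... -- ]>\n'
        '<doctype> ... document instance ... </doctype>\n'
    )
    # Declaration file followed by document file, as in a two-file parser call.
    total = count_sgml_declarations(declaration_file, document_file)
    print("ok" if total == 1 else "expected exactly one <!SGML ...>, found %d" % total)
```

A count of zero for a declaration file indicates that the file holds only the bare declaration body without the surrounding markup, which is the mistake the note above warns against.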
The first characters of the SGML declaration proper (after the <!SGML markup) must be the minimum literal, whose minimum data must be one of the following:
"ISO 8879:1986"
"ISO 8879:1986 (ENR)"
"ISO 8879:1986 (WWW)"
The first line specifies that the document follows the original SGML standard, the middle line refers to Annex J of the standard, and the last line refers to Annex K of the standard, the Web SGML Adaptations. Annex K supersedes the two earlier variants and should be used exclusively. It allows finer tuning of the syntax, as required, for instance, for XML’s compact form of empty elements (<element/>).
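As an illustration of the three permitted values, the following sketch reads the minimum literal from the start of a declaration and reports which variant it names. The function name is hypothetical. The sketch assumes a double-quoted literal written exactly as listed above and does no whitespace or case normalization.

```python
import re

# The three values listed above, mapped to what the text says about them.
KNOWN_MINIMUM_LITERALS = {
    "ISO 8879:1986": "original standard",
    "ISO 8879:1986 (ENR)": "Annex J",
    "ISO 8879:1986 (WWW)": "Annex K (Web SGML Adaptations)",
}

# Captures the first double-quoted literal directly after the <!SGML markup.
_MINIMUM_LITERAL = re.compile(r'\s*<!SGML\s+"([^"]*)"')


def describe_minimum_literal(declaration_text):
    """Return a description of the variant, or None if it is missing or not recognized."""
    found = _MINIMUM_LITERAL.match(declaration_text)
    if found is None:
        return None
    return KNOWN_MINIMUM_LITERALS.get(found.group(1))


if __name__ == "__main__":
    print(describe_minimum_literal('<!SGML "ISO 8879:1986 (WWW)" -- ... -- >'))
```

A declaration that refers to an external file (the SYSTEM or PUBLIC form) has no literal in this position, so the function returns None for it. The check would then have to be applied to the referenced file instead.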
<!SGML HTML3.2 PUBLIC "+//IDN W3C.ORG//SD HTML Version 3.2//EN">