The GLOSS-xml system, part 3

One of the objectives of GLOSS is that almost all well-formed XML should be authorable in GLOSS. This page covers some of the less-used features of XML, such as processing instructions.

1 Entity references

We have seen how characters like & and < are automatically escaped by GLOSS when entered as part of text. Sometimes you genuinely want a particular escaped form, such as &amp; or &#38; or &#x26;. These character (or entity) references can be produced in GLOSS simply by typing the reference outside text mode. Thus

fruit-combinations
  combination [blackberry]&amp;[apple]
  combination [lemon]&#38;[lime]
  combination [strawberries]&#x26;[blackpepper]

generates

<fruit-combinations>
  <combination>blackberry&amp;apple</combination>
  <combination>lemon&#38;lime</combination>
  <combination>strawberries&#x26;blackpepper</combination>
</fruit-combinations>

as you'd hope.
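
To see that the three forms are interchangeable to an XML consumer, the generated output can be parsed with any XML parser. The following is a minimal sketch using Python's standard library, not part of GLOSS itself; the function name combination_texts is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Output copied from the example above.
GENERATED = """<fruit-combinations>
  <combination>blackberry&amp;apple</combination>
  <combination>lemon&#38;lime</combination>
  <combination>strawberries&#x26;blackpepper</combination>
</fruit-combinations>"""


def combination_texts(xml_text):
    """Return the parsed text of every <combination> element."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.findall("combination")]


if __name__ == "__main__":
    # Each reference is resolved to a plain ampersand by the parser.
    for text in combination_texts(GENERATED):
        print(text)
```

The choice between the three forms therefore only matters to tools or people reading the raw XML text.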

2 Processing instructions

These are generated just like comments with text content, but with a "pseudo element name" ?target rather than !. Thus

?myapp [text for my app but note that ?> is
       illegal in XML PI's and will be escaped by gloss]

generates

<?myapp text for my app but note that ?&gt; is
       illegal in XML PI's and will be escaped by gloss?>

and

?latex [\\!]

generates

<?latex \!?>

which might insert a negative thin space (\!) in a LaTeX output document produced by applications that recognise the ?latex instruction. This can be quite a useful device.
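
As an illustration of how a consuming application might pick out such instructions, here is a minimal sketch using Python's standard xml.dom.minidom module. It is not part of GLOSS; the <doc> wrapper element and the function name find_instructions are assumptions made only for this example.

```python
from xml.dom import Node, minidom

# Hypothetical document containing the two kinds of instruction shown above.
SAMPLE = "<doc>a<?latex \\!?>b<?myapp text for my app?></doc>"


def find_instructions(xml_text, target):
    """Return the data of every processing instruction with the given target."""
    document = minidom.parseString(xml_text)
    found = []

    def walk(node):
        for child in node.childNodes:
            is_pi = child.nodeType == Node.PROCESSING_INSTRUCTION_NODE
            if is_pi and child.target == target:
                found.append(child.data)
            walk(child)

    walk(document)
    return found


if __name__ == "__main__":
    print(find_instructions(SAMPLE, "latex"))
    print(find_instructions(SAMPLE, "myapp"))
```

An application that does not recognise a target simply ignores the instruction, which is what makes the device safe to use in shared documents.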

3 XML declarations

XML declarations, of the form <?xml version="1.0" encoding="UTF-8"?>, could be generated as processing instructions using

?xml [version="1.0" encoding="UTF-8"]

as the first line of the text input to GLOSS. This may be the only option if GLOSS does not support the character encoding you are planning to use. (In this scenario you would generate a document in UTF-8 whose XML declaration names a different encoding, and then recode it to that encoding with another program.) However, it is recommended that, where possible, XML declarations are generated in a different way, via parameters in the MV file.

The relevant parameters are as follows (where "xml:" is an abbreviation for "{http://gloss.bham.ac.uk/mv/xml/xml}" which might be defined with <mv:declare-prefix>).

xml:decl
Whether to include an XML declaration. Valid values are "include" or "omit".
xml:version
The version number of XML, e.g. 1.0 or 1.1 (without quote marks).
xml:encoding
The encoding to be used. The output file will be generated in this encoding, and the list of valid values is system dependent. UTF-8 and UTF-16 are supported.
xml:standalone
The value for standalone: no or yes.
xml:bom
Whether to emit a BOM (byte order mark). This is mandatory for some encodings and optional for others. Sadly, it is in practice necessary for clients that don't read or act on the 'encoding="..."' part of the declaration. MS-IE is one such client: it does not support XML in this regard, and a BOM is required to ensure that this browser correctly identifies the encoding.

The file <glossdir>/lib/decl-xml.mv is a variation on the standard <glossdir>/lib/xml.mv that generates files in UTF-8 with the xml declaration <?xml version="1.0" encoding="UTF-8"?>.
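
To make the combined effect of the declaration and the BOM concrete, the following minimal Python sketch builds the bytes of a UTF-8 document with and without a BOM. It only illustrates the byte layout described above and is not how GLOSS itself is implemented; the function name utf8_document is hypothetical.

```python
import codecs


def utf8_document(body, with_bom):
    """Return body as UTF-8 bytes, preceded by an XML declaration.

    If with_bom is true, the three-byte UTF-8 byte order mark is placed
    before the declaration, so it is the very first thing in the file.
    """
    declaration = '<?xml version="1.0" encoding="UTF-8"?>\n'
    data = (declaration + body).encode("utf-8")
    return codecs.BOM_UTF8 + data if with_bom else data


if __name__ == "__main__":
    print(utf8_document("<edible/>", with_bom=False)[:8])
    print(utf8_document("<edible/>", with_bom=True)[:8])
```

A client that ignores the encoding named in the declaration can still identify the encoding from those leading bytes.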

4 DOCTYPE declarations

The standard GLOSS-xml MVs allow the pseudo-element !DOCTYPE to be present where a DOCTYPE declaration is allowed. This pseudo-element takes attributes of the form @system[URL] and @public[PUBLIC-ID] and its content is gloss-encoded DTD content. (See elsewhere for a discussion of GLOSS and DTDs.) Thus

!DOCTYPE 
  @system[http://my.url.com/mydtd]
  @public[MY PUBLIC DOCUMENT ID]
edible fruits

will generate

<!DOCTYPE edible
    PUBLIC "MY PUBLIC DOCUMENT ID"
    "http://my.url.com/mydtd" >
<edible><fruits/></edible>

Note that the root element name is found automatically.
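
The relationship between the declaration and the root element can be checked with an ordinary XML parser. The sketch below uses Python's standard xml.dom.minidom on the output shown above; it is an illustration only, and the function name doctype_summary is an assumption of this example.

```python
from xml.dom import minidom

# Output copied from the example above.
GENERATED = """<!DOCTYPE edible
    PUBLIC "MY PUBLIC DOCUMENT ID"
    "http://my.url.com/mydtd" >
<edible><fruits/></edible>"""


def doctype_summary(xml_text):
    """Return the DOCTYPE name, identifiers and root element name."""
    document = minidom.parseString(xml_text)
    doctype = document.doctype
    return {
        "declared": doctype.name,
        "public": doctype.publicId,
        "system": doctype.systemId,
        "root": document.documentElement.tagName,
    }


if __name__ == "__main__":
    for key, value in doctype_summary(GENERATED).items():
        print(key, "=", value)
```

The name in the declaration and the name of the root element agree, as XML validity requires.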

It is possible to redefine the xml:doctype mode so that a different DOCTYPE is generated automatically. The html and xhtml MVs do this, for example.

5 Document fragments and DTDs

You may need to use GLOSS to generate a document fragment rather than a document. A document fragment is an XML-like document that can be included in an XML document using one of the standard XML include mechanisms. So, for example, a document fragment may contain more than one element at top level. To generate these, just use <glossdir>/lib/docfrag.mv.

For example,

apple
  latin[Malus pumila]
  french[pomme]
pear
  latin[Pyrus communis]
  french[poire]

produces

<apple><latin>Malus pumila</latin><french>pomme</french></apple>
<pear><latin>Pyrus communis</latin><french>poire</french></pear>

Similarly, GLOSS may be used to generate DTDs with <glossdir>/lib/dtd.mv. The GLOSS encoding and syntax for DTDs is similar to that for GLOSS itself, except that it makes heavy use of pseudo-elements starting with !. This syntax will be discussed elsewhere.

This page is copyright. Web page design and creation by GLOSS.