DTD for standard GLOSS tokens

This software is part of the gloss system. Author: Richard Kaye, May 2006, copyright reserved. Licence: This software may be used under the conditions of the latest version of the GPL. No warranty.

Namespace prefix

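Here we set things up so that token elements can carry either a namespace prefix or none, as required. The default is to use the prefix tok: with the namespace 'http://gloss.bham.ac.uk/xmlns/tokens'. Define %tokens.prefixed; as IGNORE to remove the prefix, or define %tokens.nsprefix; as your preferred prefix. Because the first declaration of a parameter entity is the one that binds, any such override must be declared before this DTD is read; for the same reason, the unprefixed fallbacks appear after the conditional section.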

<!ENTITY % tokens.nsprefix "tok">
<!ENTITY % tokens.prefixed "INCLUDE">
<!ENTITY % tokens.namespace "'http://gloss.bham.ac.uk/xmlns/tokens'">

<![%tokens.prefixed;[
  <!ENTITY % tokens.nsattr "xmlns:%tokens.nsprefix;">
  <!ENTITY % tokens.prefix "%tokens.nsprefix;:">
]]>
<!ENTITY % tokens.nsattr "xmlns">
<!ENTITY % tokens.prefix "">
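
As a sketch of how these switches might be used, a document can declare its overrides in the internal subset before pulling in the DTD, since the internal subset is read first. The file name tokens.dtd and the parameter entity name tokens.dtd are assumptions made for illustration; substitute the location of your own copy.

```xml
<?xml version="1.0"?>
<!DOCTYPE int [
  <!-- Declared before the DTD is read, so this wins over the default INCLUDE. -->
  <!ENTITY % tokens.prefixed "IGNORE">
  <!-- Hypothetical location of this DTD. -->
  <!ENTITY % tokens.dtd SYSTEM "tokens.dtd">
  %tokens.dtd;
]>
<!-- With the prefix removed, the element is plain int and the attribute is xmlns. -->
<int xmlns="http://gloss.bham.ac.uk/xmlns/tokens">42</int>
```

To keep a prefix but change it, one would instead declare <!ENTITY % tokens.nsprefix "t"> in the same place and write t:int in the document.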

Common attributes

The common attributes of a token give its depth, line number and column number; each of these may be omitted. We also allow an xmlns attribute. Thus a token often has the format:

<type
  xmlns[:prefix]="namespace"
  d="depth"
  l="line number"
  c="column number"
/>

Here is the DTD code for this.

<!ENTITY % tokens.common.attr "
   %tokens.nsattr; CDATA %tokens.namespace;
   d CDATA #IMPLIED
   l CDATA #IMPLIED
   c CDATA #IMPLIED
">

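For illustration, here are two token fragments written with the default prefix. The depth, line and column values are invented for the example. Since d, l and c are all #IMPLIED and the namespace attribute has a default value, a parser that reads the DTD should accept the second, bare form as well.

```xml
<!-- All common attributes given explicitly (values are illustrative). -->
<tok:int xmlns:tok="http://gloss.bham.ac.uk/xmlns/tokens" d="0" l="12" c="5">42</tok:int>

<!-- Position attributes omitted; xmlns:tok is supplied by the DTD default. -->
<tok:eos/>
```
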
Name attribute

Where required, we use the attribute name="full-name" to give the name of a token.

<!ENTITY % tokens.name.attr "
   name CDATA #IMPLIED
">

Token types

We now define the token types. This DTD is deliberately as permissive as possible. However, the content of any element declared ANY should resolve to a string of text giving the value of the token, and the value of the name attribute should be a syntactically valid name. With that in mind, the following list is almost self-explanatory.

An attr (attribute) token, with syntax @name:

<!ENTITY % tokens.attr "%tokens.prefix;attr">
<!ELEMENT %tokens.attr; ANY >
<!ATTLIST %tokens.attr; %tokens.common.attr; %tokens.name.attr; >
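
To make the entity machinery concrete, this is what the declarations above amount to once the parameter entities are expanded with the default settings (prefix tok, with %tokens.prefixed; set to INCLUDE). It is a hand expansion for reading purposes, not additional DTD code.

```xml
<!ELEMENT tok:attr ANY >
<!ATTLIST tok:attr
   xmlns:tok CDATA 'http://gloss.bham.ac.uk/xmlns/tokens'
   d CDATA #IMPLIED
   l CDATA #IMPLIED
   c CDATA #IMPLIED
   name CDATA #IMPLIED
>
```

The remaining token types expand in the same way, differing only in the element name, the content model (ANY or EMPTY) and whether the name attribute is included.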

A b64 (base 64 data) token, encoded as =data:

<!ENTITY % tokens.b64 "%tokens.prefix;b64">
<!ELEMENT %tokens.b64; ANY >
<!ATTLIST %tokens.b64; %tokens.common.attr; >

A char (character) token, with syntax 'c' or '\e':

<!ENTITY % tokens.char "%tokens.prefix;char">
<!ELEMENT %tokens.char; ANY >
<!ATTLIST %tokens.char; %tokens.common.attr; >

A cref (character reference) token, with syntax &#[0-9]+; or &#x[0-9a-f]+;:

<!ENTITY % tokens.cref "%tokens.prefix;cref">
<!ELEMENT %tokens.cref; EMPTY >
<!ATTLIST %tokens.cref; %tokens.common.attr; %tokens.name.attr; >

An elt (element) token, with syntax name:

<!ENTITY % tokens.elt "%tokens.prefix;elt">
<!ELEMENT %tokens.elt; ANY >
<!ATTLIST %tokens.elt; %tokens.common.attr; %tokens.name.attr; >

An eos (end of stream) token, representing the end of input:

<!ENTITY % tokens.eos "%tokens.prefix;eos">
<!ELEMENT %tokens.eos; EMPTY >
<!ATTLIST %tokens.eos; %tokens.common.attr; >

An eref (entity reference) token, with syntax &name;:

<!ENTITY % tokens.eref "%tokens.prefix;eref">
<!ELEMENT %tokens.eref; EMPTY >
<!ATTLIST %tokens.eref; %tokens.common.attr; %tokens.name.attr; >
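
As a hedged illustration, an entity reference such as &amp; in the input might be represented as follows. We assume here that the name attribute carries the bare entity name, without the surrounding & and ;, which this page does not state explicitly. Because the element is declared EMPTY, it cannot have any content.

```xml
<!-- Assumed encoding of the input text &amp; (name holds the bare entity name). -->
<tok:eref xmlns:tok="http://gloss.bham.ac.uk/xmlns/tokens" name="amp"/>
```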

An fp (floating point number) token, with the usual syntax, with or without an exponent E:

<!ENTITY % tokens.fp "%tokens.prefix;fp">
<!ELEMENT %tokens.fp; ANY >
<!ATTLIST %tokens.fp; %tokens.common.attr; >

A hex (hexadecimal integer) token, starting 0x:

<!ENTITY % tokens.hex "%tokens.prefix;hex">
<!ELEMENT %tokens.hex; ANY >
<!ATTLIST %tokens.hex; %tokens.common.attr; >

An int (signed integer) token, e.g. -3453545435435535353533:

<!ENTITY % tokens.int "%tokens.prefix;int">
<!ELEMENT %tokens.int; ANY >
<!ATTLIST %tokens.int; %tokens.common.attr; >

A label token, at the beginning of a line and preceding a TAB:

<!ENTITY % tokens.label "%tokens.prefix;label">
<!ELEMENT %tokens.label; ANY >
<!ATTLIST %tokens.label; %tokens.common.attr; %tokens.name.attr; >

The null token, i.e., the token corresponding to Java "null":

<!ENTITY % tokens.null-token "%tokens.prefix;null-token">
<!ELEMENT %tokens.null-token; EMPTY >
<!ATTLIST %tokens.null-token; %tokens.common.attr; >

(Note: null is not available in the accept attribute of an mv mode.)

A pdef (parameter definition) token, with syntax ${name}="value":

<!ENTITY % tokens.pdef "%tokens.prefix;pdef">
<!ELEMENT %tokens.pdef; ANY >
<!ATTLIST %tokens.pdef; %tokens.common.attr; %tokens.name.attr; >

A pelt (pseudo element) token, with syntax !name:

<!ENTITY % tokens.pelt "%tokens.prefix;pelt">
<!ELEMENT %tokens.pelt; ANY >
<!ATTLIST %tokens.pelt; %tokens.common.attr; %tokens.name.attr; >

A pi (processing instruction) token, with syntax ?name:

<!ENTITY % tokens.pi "%tokens.prefix;pi">
<!ELEMENT %tokens.pi; ANY >
<!ATTLIST %tokens.pi; %tokens.common.attr; %tokens.name.attr; >

A pref (parameter reference) token, with syntax ${name} or $c:

<!ENTITY % tokens.pref "%tokens.prefix;pref">
<!ELEMENT %tokens.pref; ANY >
<!ATTLIST %tokens.pref; %tokens.common.attr; %tokens.name.attr; >

A punc (punctuation) token, with user-defined syntax:

<!ENTITY % tokens.punc "%tokens.prefix;punc">
<!ELEMENT %tokens.punc; ANY >
<!ATTLIST %tokens.punc; %tokens.common.attr; >

A str (string) token, with syntax "value":

<!ENTITY % tokens.str "%tokens.prefix;str">
<!ELEMENT %tokens.str; ANY >
<!ATTLIST %tokens.str; %tokens.common.attr; >

A uc (arbitrary Unicode character) token, c:

<!ENTITY % tokens.uc "%tokens.prefix;uc">
<!ELEMENT %tokens.uc; ANY >
<!ATTLIST %tokens.uc; %tokens.common.attr; >

A uri token, with syntax ~value, where value has a restricted character set:

<!ENTITY % tokens.uri "%tokens.prefix;uri">
<!ELEMENT %tokens.uri; ANY >
<!ATTLIST %tokens.uri; %tokens.common.attr; >
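
The declarations on this page cover only the individual token elements, so a document holding several tokens needs a container of its own. The following sketch adds one in the internal subset. The element name stream, the file name tokens.dtd and the particular sequence of tokens are all assumptions made for illustration, not part of the GLOSS token DTD.

```xml
<?xml version="1.0"?>
<!DOCTYPE stream [
  <!-- Hypothetical location of this DTD. -->
  <!ENTITY % tokens.dtd SYSTEM "tokens.dtd">
  %tokens.dtd;
  <!-- Hypothetical wrapper element; the token DTD does not declare it. -->
  <!ELEMENT stream ANY>
]>
<stream>
  <tok:int xmlns:tok="http://gloss.bham.ac.uk/xmlns/tokens" l="1" c="1">42</tok:int>
  <tok:fp xmlns:tok="http://gloss.bham.ac.uk/xmlns/tokens" l="1" c="4">1.5E3</tok:fp>
  <tok:eos xmlns:tok="http://gloss.bham.ac.uk/xmlns/tokens"/>
</stream>
```

The xmlns:tok attribute is written out on each token so that the prefix is bound even for a parser that does not read the DTD and therefore never sees the attribute default.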
