GLOSS converts plain text into XML. The plain text may be written specially as input for GLOSS, or may be an existing legacy document. Normally the output from GLOSS is processed further, for example by XSLT stylesheets.
The main stages carried out by the computer in glossing a text are as follows:
GLOSS has its own built-in tokenizer that recognises certain kinds of tokens, including XML names, numbers, base64 sections, individual characters and strings. The kinds of tokens recognised are context dependent, so the tokenizer will often work differently at different stages of the process, depending on which mode the system is in.
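The idea of a context-dependent tokenizer can be sketched in a few lines of Python. This is only an illustration, not GLOSS's actual implementation: the mode names (`text`, `numeric`), the token kinds, the regular expressions and the function `next_token` are all assumptions made up for the example.

```python
import re

# Hypothetical token patterns for two hypothetical modes.
# The real GLOSS token kinds and mode names may differ.
TOKEN_PATTERNS = {
    "text": [
        ("name", re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")),
        ("number", re.compile(r"\d+(?:\.\d+)?")),
        ("char", re.compile(r".", re.DOTALL)),
    ],
    "numeric": [
        ("number", re.compile(r"\d+(?:\.\d+)?")),
        ("char", re.compile(r".", re.DOTALL)),
    ],
}


def next_token(text, pos, mode):
    """Return (kind, lexeme, new_pos) for the first pattern of `mode` that matches at `pos`."""
    for kind, pattern in TOKEN_PATTERNS[mode]:
        match = pattern.match(text, pos)
        if match:
            return kind, match.group(0), match.end()
    raise ValueError(f"no token at position {pos}")
```

With these assumed patterns, the same input `x1` is read as a single name token in the `text` mode, but only as the individual character `x` in the `numeric` mode, because that mode does not look for names at all.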
Amongst the token types GLOSS uses are individual characters and user-defined punctuation combinations, so in principle a very wide variety of text files can be parsed by the system. Not every transformation will be feasible to write or to operate, however. GLOSS was designed with certain specific transformations in mind. But if what you want can't be done in GLOSS, there may be other solutions, including writing your own tokenizer, writing a text preprocessor, or using XSLT or some other such transformation afterwards.
The mode's actions, when it receives a token it can handle, typically involve adding data to the XML DOM representation of the XML being built up in memory. Quite complex data constructs can be added from a single token. The processing of one token may also involve entering other modes and scanning one or more subtokens. There are a number of ways a mode may complete its set of actions, as described elsewhere, including a return command and/or placing a limit on the number of tokens allowed.
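As a rough illustration of a mode consuming tokens and building up a tree in memory, here is a minimal Python sketch using the standard `xml.etree.ElementTree` module. It does not use GLOSS syntax: the function `run_mode`, the handler functions, the `"return"` signal and the element names (`word`, `quantity`, `value`) are all hypothetical, chosen only to show a token limit and a return action ending a mode.

```python
import xml.etree.ElementTree as ET


def run_mode(tokens, parent, handlers, max_tokens=None):
    """Consume (kind, lexeme) tokens, letting each handler add nodes under `parent`.

    The mode ends when the tokens run out, when a token has no handler,
    when a handler returns the string "return", or once `max_tokens`
    tokens have been consumed. Returns the number of tokens consumed.
    """
    consumed = 0
    while consumed < len(tokens):
        if max_tokens is not None and consumed >= max_tokens:
            break
        kind, lexeme = tokens[consumed]
        handler = handlers.get(kind)
        if handler is None:
            break  # a token this mode cannot handle is left for the caller
        consumed += 1
        if handler(parent, lexeme) == "return":
            break
    return consumed


def add_word(parent, lexeme):
    # One token adds one simple element.
    ET.SubElement(parent, "word").text = lexeme


def add_quantity(parent, lexeme):
    # One token adds a small nested structure.
    quantity = ET.SubElement(parent, "quantity")
    ET.SubElement(quantity, "value").text = lexeme


def stop(parent, lexeme):
    # Hypothetical stand-in for a return command.
    return "return"


HANDLERS = {"name": add_word, "number": add_quantity, "end": stop}
```

Calling `run_mode` with the token list `[("name", "alpha"), ("number", "3"), ("end", ";"), ("name", "beta")]` stops at the `end` token, leaving `beta` for whichever mode called it; passing `max_tokens=1` would stop after `alpha` instead.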
The XML DOM representing the XML dataset in memory is stored using a special XML vocabulary designed to represent the required syntactic details of a textual XML document in XML itself. (This is necessary because in-memory XML datasets on their own do not uniquely determine any particular textual representation.) The GLOSS printer class recognises the special XML representation tags and transforms them into a textual representation in the intended way.
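To see why such a vocabulary helps, note that the same character data could be printed either as escaped text or as a CDATA section, and an ordinary in-memory tree does not record which was intended. The Python sketch below uses an entirely invented vocabulary (`element`, `attr`, `text`, `cdata`, `comment`) and an invented function `print_repr`; the real GLOSS representation tags and printer class are not shown in this overview and will differ.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def print_repr(node):
    """Render one node of the hypothetical representation vocabulary as XML text."""
    if node.tag == "text":
        return escape(node.text or "")  # ordinary escaped character data
    if node.tag == "cdata":
        return "<![CDATA[" + (node.text or "") + "]]>"
    if node.tag == "comment":
        return "<!--" + (node.text or "") + "-->"
    if node.tag == "element":
        name = node.get("name")
        attrs = "".join(
            " " + child.get("name") + "=" + quoteattr(child.get("value", ""))
            for child in node
            if child.tag == "attr"
        )
        body = "".join(print_repr(child) for child in node if child.tag != "attr")
        if body:
            return f"<{name}{attrs}>{body}</{name}>"
        return f"<{name}{attrs}/>"
    raise ValueError(f"unknown representation tag: {node.tag}")


# A document described in the hypothetical vocabulary.
SOURCE = (
    '<element name="p"><attr name="id" value="a1"/>'
    "<text>x &lt; y</text><cdata>a &lt; b</cdata></element>"
)
```

Here `print_repr(ET.fromstring(SOURCE))` writes the first piece of text with an escaped `&lt;` and the second inside a CDATA section, a distinction that is carried only by the choice of representation tag.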
Before the XML dataset in memory is printed, it can be processed further with XSLT. This happens in the standard HTML processing, for example, to resolve cross-references and generate a menu bar. XSLT allows several transformations that would be impossible in GLOSS alone: GLOSS's job is to generate XML in as straightforward a way as possible for further processing with standard XML tools.
This page is copyright. Web page design and creation by GLOSS.