This page lists work that remains to be done. Some of these tasks are larger than others, and some are more interesting than others.
A difficult and as yet unsolved problem is how the user's experience of GLOSS might be improved by using an appropriate XML DTD or schema for the target document type. One possibility is to write a program that converts a schema or DTD to a customized MV vocabulary. Another is to expand the vocabulary itself to include schema-type information to aid the user and editor. A third is a GLOSS schema language for GLOSS-XML syntax that defines suitable input formats and provides editing hints to the user.
This is a huge area, and it is not obvious whether it is necessary or a good idea.
There are a number of smaller issues to do with properly adopting 32-bit Unicode characters in GLOSS and tightening the code. By and large GLOSS does handle 32-bit Unicode properly, but this area is poorly tested and will need careful review. One area where work is required is redefining the syntax in the tokenizer (etc.) for element names, parameter names and prefixes, and for GLOSS element names, GLOSS prefixes, etc.
Normalization of Unicode characters is required by the standards and is not currently done.
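As an illustration of what normalization involves, here is a minimal Python sketch using the standard unicodedata module. It assumes NFC is the target form, which this page does not state, and the function names normalize_text and is_normalized are made up for the example; GLOSS itself has no such functions at present.

```python
import unicodedata

def normalize_text(text, form="NFC"):
    """Return text in the given Unicode normalization form.

    NFC is an assumption here; the required form is not stated above.
    """
    return unicodedata.normalize(form, text)

def is_normalized(text, form="NFC"):
    """Report whether text is already in the given normalization form."""
    return unicodedata.normalize(form, text) == text

# "e" followed by a combining acute accent, versus the single
# precomposed character U+00E9. Both display the same.
decomposed = "e\u0301"
composed = normalize_text(decomposed)
print(len(decomposed), len(composed))
```

Without a step like this, two names that look identical to the user can compare as different strings, which matters for element names and prefixes.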
Validation reports currently use line numbers from the generated XML file, not line numbers in the GLOSS sources. It should be possible to fix this using processing instructions that record source positions, so that GLOSS's validator can identify the position in the source code.
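The processing-instruction idea could work roughly as follows. This is only a sketch in Python using the standard xml.sax module: the instruction name gloss-line, its content (a bare line number) and the names SourceLineTracker and map_lines are all hypothetical, since no such instruction is defined by GLOSS yet.

```python
import xml.sax

class SourceLineTracker(xml.sax.ContentHandler):
    """Pair each element with the most recent <?gloss-line N?> seen.

    The 'gloss-line' processing instruction is a hypothetical marker
    that the GLOSS translator would write into its XML output.
    """

    def __init__(self):
        super().__init__()
        self.locator = None
        self.source_line = None   # last source line announced by a PI
        self.elements = []        # (name, xml_line, source_line)

    def setDocumentLocator(self, locator):
        self.locator = locator

    def processingInstruction(self, target, data):
        if target == "gloss-line":
            self.source_line = int(data.strip())

    def startElement(self, name, attrs):
        # Line in the XML file as reported by the parser, if available.
        xml_line = self.locator.getLineNumber() if self.locator else None
        self.elements.append((name, xml_line, self.source_line))

def map_lines(xml_text):
    """Return (element name, XML line, source line) for every element."""
    handler = SourceLineTracker()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.elements

sample = "<doc><?gloss-line 3?><a/>\n<?gloss-line 7?><b/></doc>"
for name, xml_line, source_line in map_lines(sample):
    print(name, xml_line, source_line)
```

A validator built this way could report the recorded source line in place of, or alongside, the XML line. Elements that appear before any marker have no source line, so the translator would need to emit a marker early in the output.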
Support for xsl:space and whitespace processing has been added. This needs to be checked, confirmed and finalized. Whitespace seems to be a perennial problem with XML processing, and any new ideas that might tidy things up better would be welcome.
The relationship between similar xr: and mv: tags must be explored further. This is nearly complete as the definitions and DTDs are being finalized.
These are core MVs, and are likely to be used a lot. They need documenting thoroughly and reviewing for hooks, usefulness, etc. More development is needed to make them more robust and more useful.
The current tokenizer needs some attention. In particular, it doesn't backtrack properly. Some work and rewriting is needed.
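For reference, the usual way to get reliable backtracking is to save the input position before a speculative match and restore it if any part of the match fails. The Python sketch below shows that pattern in isolation. It is not the GLOSS tokenizer, and the Scanner class and its methods are invented for the example.

```python
import re

class Scanner:
    """A tiny scanner that can undo a failed multi-token match."""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def match(self, pattern):
        """Match one regular expression at the current position."""
        m = re.compile(pattern).match(self.text, self.pos)
        if m is None:
            return None
        self.pos = m.end()
        return m.group(0)

    def attempt(self, *patterns):
        """Match all patterns in sequence, or consume nothing at all."""
        saved = self.pos
        tokens = []
        for pattern in patterns:
            token = self.match(pattern)
            if token is None:
                self.pos = saved   # backtrack to where we started
                return None
            tokens.append(token)
        return tokens

scanner = Scanner("abc:def")
# The first alternative fails after consuming "abc", so the scanner
# must rewind before the second alternative is tried.
first = scanner.attempt(r"[a-z]+", r"=")
second = scanner.attempt(r"[a-z]+", r":", r"[a-z]+")
print(first, second)
```

Keeping the save and restore in one place, as attempt does here, makes it harder for an individual token rule to forget to rewind.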
The current tokenizer also seems rather rigid, and may not be flexible enough for more general text-to-XML applications. Should we build in the possibility of recognizing other tokens? If so (a big if), then how? A reserved format accept="$..." is available. (Is this documented?)
The system that provides modularity via hooks and namespaces is interesting and quite different from normal notions of modularity via objects. There are interesting theoretical questions concerning this. Practical problems include defining an interface for a module that is well defined and gives a stable, dependable basis for subsequent modules that build on it.
This page is copyright. Web page design and creation by GLOSS.