The IMesh Toolkit
Annotation Service Design: Introduction
Origins of work
The following design arose from work undertaken in the IMesh Toolkit project to identify specific tools that might be useful to developers of subject gateways. In this context, it was decided to investigate the usefulness and functionality of annotations in a variety of contexts and for a variety of users.
In focusing upon a design for a specific service as a means of identifying the likely design implications, it was resolved to take as a starting point a service such as the RDN onto which an annotation service could be layered. The intention was to produce a design that had as little impact as possible upon the existing service. In effect, this meant that the annotation system, in its basic design, took no more than the URL of a resource description as its starting point.
In this way it was anticipated that the system could be adopted by a number of subject gateways, since the annotation system did not depend unduly upon other system-specific data.
To facilitate the use of the design, a glossary of terms used in this collection of documents and drawings was produced.
Main objective of annotation system
The principal objective of the proposed system was to give users of a resource retrieval system such as the RDN the capacity to comment upon the usefulness and other virtues or faults of resources as represented in the resource description or summary. In other words, the annotation system would make it possible to create and view a series of comments associated with each resource description.
As is often the case with such annotations, the design anticipated their being stored on and retrieved from a database separate from the base document, in this case, a resource description. This database would store other data relevant to the annotation such as its author, date of creation, etc. in addition to the most vital data, i.e. the identifier of the resource description being commented upon.
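The record described above, held in a database separate from the resource description itself, could be sketched as follows. The field names and the example URL are illustrative assumptions, not part of the original design.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Annotation:
    """One comment on a resource description, stored separately from it."""
    resource_description_id: str   # the vital datum: URL of the description commented upon
    author: str                    # registered user who wrote the annotation
    text: str                      # the comment itself
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

# Example: an annotation keyed only by the description's URL, so no other
# gateway-specific data is required.
note = Annotation(
    resource_description_id="http://example.org/gateway/record/123",
    author="reader@example.ac.uk",
    text="The summary omits the site's extensive image archive.",
)
```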
The basic design would allow registered users of the RDN service to view all annotations on a given resource description, to create and edit new ones, and to delete annotations of their own creation.
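The permission rules just described (any registered user may view and create; only an annotation's own author may edit or delete it) can be sketched as a single check. The function name and action strings are assumptions for illustration.

```python
def allowed(action: str, user: str, author: str = "") -> bool:
    """Permissions for registered users of the annotation system:
    anyone registered may view or create; edit and delete are
    restricted to the annotation's own author."""
    if action in ("view", "create"):
        return True
    if action in ("edit", "delete"):
        return user == author
    return False
```

For example, `allowed("delete", "bob", author="alice")` is false: only "alice" may remove her own annotation.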
For an approach to the design of the basic annotation system, see below. For a broader view of both pre-digital and digital concepts of annotation, with a view to how annotation tools might be used in the subject-gateway environment, see the Ariadne article.
Extension to functionality - automoderator
An interesting addition to the main objective above was the subject of a feasibility exercise: the inclusion in the system of a moderation facility for all annotations. It was included in the design when it was realised that a considerable amount of the design could be re-used to create a process that checks all incoming and edited annotations automatically, i.e. in a human moderator's absence.
The rationale for this extension, the automoderator, was that, properly configured, the process would allow the overwhelming majority of uncontentious annotations to pass immediately onto the annotation database without awaiting the attention of a human moderator. The aim of such an auto-moderator is diametrically opposite to the way manual moderation systems operate. Though rarely a stated intention, manual systems suffer from the drawback that all incoming annotations, however acceptable, are withheld from the user populace until the moderator is able to access the moderation system. The advantage of that method is that no annotation can slip onto the system unchecked and thus possibly in contravention of the annotation system operator's policy. A major drawback, however, is that the overwhelming majority of annotations (in a small community of users, possibly over 95%) may wait days before becoming available, despite their complete acceptability.
The purpose of the auto-moderator extension would be to achieve a fitting compromise: appropriate annotations would become viewable immediately, while any annotation that, on testing, satisfies criteria identifying elements of language or author provenance that threaten the system operator's acceptable use policy (for example, the use of abusive or obscene language) would be withheld for human judgement.
A principal feature of the automoderator's design is the high degree of configurability afforded the system moderator, who can add to the criteria and even determine the degree to which any item on the criteria list is considered a threat.
The role of the automoderator is to test and evaluate a variety of data that can indicate a threat, i.e. an annotation that should not be automatically stored for user viewing. A simple threat, as stated above, might be inappropriate language. A watchlist of unacceptable words and phrases can be built up with experience. Each item can be individually rated by the moderator to show the degree to which it threatens the AUP. Certain items can, in themselves, trigger automatic refusal to pass the annotation for viewing.
The entire system operates on the notion of tolerance: it will tolerate annotations that contain no threats, i.e. no elements that cause the annotation to threaten the AUP, and pass them for user viewing. As the system tests an annotation, it keeps a running total of the threats detected and their attendant values. When this total, or threat value, reaches a pre-determined threshold, the threat threshold, the annotation is flagged as unviewable and marked as an unacceptable threat for the Moderator's subsequent attention. All values can be raised or lowered by the Moderator at any point. The auto-moderator can also be made less tolerant of incoming annotations by raising the initial priority value. Normally set to 0, raising it to, say, 10 would mean that the auto-moderator reaches the threat threshold more quickly than with the default setting. This permits the moderator to impose a more cautious regime without raising the individual values of each item on the criteria list.
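The running-total scheme just described can be sketched as below: watchlist items carry moderator-set weights, certain trigger items refuse the annotation outright, and the initial priority value sets the baseline. The watchlist contents, weights, and threshold here are invented for illustration.

```python
# Watchlist: term -> threat weight set by the Moderator.
# A weight of None marks a trigger item that causes automatic refusal.
WATCHLIST = {
    "some_obscenity": None,   # trigger item: instant refusal
    "rubbish": 5,
    "total waste": 8,
}
THREAT_THRESHOLD = 10
INITIAL_PRIORITY = 0          # raise (e.g. to 10) for a more cautious regime

def assess(text: str, initial_priority: int = INITIAL_PRIORITY):
    """Return (viewable, threat_value) for an incoming annotation."""
    threat_value = initial_priority
    lowered = text.lower()
    for term, weight in WATCHLIST.items():
        if term in lowered:
            if weight is None:            # trigger item: refuse outright
                return False, THREAT_THRESHOLD
            threat_value += weight        # accumulate the running total
    # Below the threat threshold the annotation passes for viewing;
    # at or above it, it is withheld for the Moderator's attention.
    return threat_value < THREAT_THRESHOLD, threat_value
```

Note how the initial priority works: with the defaults, an annotation containing only "rubbish" scores 5 and passes, but with the priority raised to 10 the same annotation scores 15 and is withheld, without any individual weight having changed.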
The system can also assess the suitability of the annotator by testing his/her email address. The Moderator can maintain a list of the most likely domains for the system's annotators; if an address does not conform to these domain criteria, a figure is added to the threat value, though usually not a large one, except where a given address is explicitly excluded. These tests of provenance serve to decrease the degree of tolerance without necessarily excluding the annotation automatically. If, however, tests for content match items on the Watchlist of unacceptable vocabulary, a properly configured system should push the threat value over the threat threshold and the annotation will fail the acceptability test. The more completely the adverse criteria are entered onto the system, the greater the likelihood that the moderator, on logging on, will confirm the automoderator's decision. In a well-configured system there should be only a small number of cases where an annotation just fails the acceptability test but then turns out to be innocent.
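The provenance test might look like the sketch below: an unfamiliar domain adds a small figure to the threat value, while an explicitly excluded address adds enough to fail any annotation. The domain list, addresses, and weights are assumptions for illustration.

```python
EXPECTED_DOMAINS = {"ac.uk", "example.org"}    # Moderator's list of likely domains
EXCLUDED_ADDRESSES = {"banned@spam.example"}   # explicitly excluded authors

def provenance_threat(email: str) -> int:
    """Threat-value contribution from the annotator's email address."""
    if email in EXCLUDED_ADDRESSES:
        return 100    # explicit exclusion: effectively automatic failure
    domain = email.rsplit("@", 1)[-1].lower()
    if not any(domain == d or domain.endswith("." + d) for d in EXPECTED_DOMAINS):
        return 2      # unfamiliar domain: reduce tolerance, but do not exclude
    return 0
```

The small weight for an unfamiliar domain reflects the text above: provenance alone lowers tolerance but should rarely fail an annotation by itself.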
The tolerance of the automoderator can also be adjusted so that, in a large-volume system, annotations with uncomplimentary star-ratings, i.e. below average, are automatically loaded with a threat value, since they represent the most likely threats. This is achieved by switching on the relevant filter in the Set Moderation Values function.
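The star-rating filter could be sketched as one more contribution to the threat value, active only when the Moderator has switched it on in the Set Moderation Values function. The weight and the average rating used here are illustrative assumptions.

```python
def rating_threat(stars: int, filter_on: bool = True, average: int = 3) -> int:
    """Extra threat value for below-average star-ratings, applied only
    when the below-average filter is switched on."""
    if filter_on and stars < average:
        return 3    # illustrative weight for an uncomplimentary rating
    return 0
```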
Thus, by the configuration and application of a series of criteria concerning provenance and content, the automoderator can permit the overwhelming majority of submitted annotations to pass immediately to general viewing. See the screen prototyping exercise of the auto-moderator, which gives an interactive view of its operations.
This final section provides a brief summary of how the following design pages are organised. The initial diagrams show the domain model and the overall use case model for the entire system. (Use cases are at the centre of the UML design approach: they provide a blow-by-blow analysis of how users, or in UML terminology "actors" (usually humans), interact with the system and how it responds.) Read on ...