
The IMesh Toolkit

Z39.50

Overall Purpose

Z39.50 is a search and retrieval protocol, currently at version 3 (Z39.50-1995, a compatible superset of the 1992 version) [1]. It is maintained by the Library of Congress and can operate over TCP/IP. It follows a client-server model in which search and retrieval are separate operations: once a session is established with the database server, the server answers a query with a set of results, which the client can then retrieve as a whole or as a subset.
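The separation of search from retrieval can be pictured with a small in-memory sketch. This is a toy model only, not a Z39.50 implementation: the class ToyServer, its method names and the sample records are all hypothetical, and no network session is involved.

```python
from dataclasses import dataclass, field


@dataclass
class ToyServer:
    """In-memory stand-in for a database server (hypothetical, not a real API)."""
    records: list[str]
    result_sets: dict[str, list[str]] = field(default_factory=dict)

    def search(self, set_name: str, term: str) -> int:
        """Run the query, keep the hits on the server, return only the hit count."""
        hits = [r for r in self.records if term.lower() in r.lower()]
        self.result_sets[set_name] = hits
        return len(hits)

    def present(self, set_name: str, start: int, count: int) -> list[str]:
        """Return up to `count` records from a named result set, from 1-based `start`."""
        hits = self.result_sets[set_name]
        return hits[start - 1 : start - 1 + count]


# Invented sample data for illustration.
server = ToyServer(records=[
    "Metadata basics",
    "RDF and metadata",
    "Directory services",
    "Metadata exchange",
])

hit_count = server.search("default", "metadata")      # search: no records returned yet
first_two = server.present("default", start=1, count=2)  # retrieval: a subset of the set
```

The point of the design is that the search step reports only how many records matched; the client decides afterwards how much of the result set to fetch.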

Brief Overview of Functionality

Operations included in version 3 are as follows:
Search supports a number of query syntaxes, and a single query can initiate a simultaneous search across multiple databases within a server. New attribute sets may be defined with less replication, and new data types for attribute values are defined (in version 2 only numeric values are allowed). Furthermore, an attribute set definition may now list alternative sets of evaluation rules (for example, whether the server is allowed to substitute an attribute that it thinks is more appropriate), and the query may select one of the alternatives. The enhanced bib-1 attribute set definition exploits this new feature [1].
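As a rough illustration of how attributes qualify search terms, the sketch below builds a two-term Boolean query as a string. The textual notation used is the Prefix Query Format popularised by the YAZ toolkit, which is an assumption made here for readability and is not discussed in the sources cited in this review. The Use attribute values 4 (title) and 1003 (author) are taken from the bib-1 attribute set; the Term class and the and_query helper are hypothetical.

```python
from dataclasses import dataclass

# bib-1 Use attribute values (attribute type 1). Only two are listed here;
# consult the bib-1 definition before relying on any others.
BIB1_USE = {"title": 4, "author": 1003}


@dataclass(frozen=True)
class Term:
    use: str   # key into BIB1_USE
    text: str  # the search term itself

    def to_pqf(self) -> str:
        """Render the term with its Use attribute in prefix notation."""
        return f'@attr 1={BIB1_USE[self.use]} "{self.text}"'


def and_query(left: Term, right: Term) -> str:
    """Combine two terms with a Boolean AND; the operator precedes its operands."""
    return f"@and {left.to_pqf()} {right.to_pqf()}"


query = and_query(Term("title", "metadata"), Term("author", "lynch"))
```

Here the query asks for records whose title contains "metadata" and whose author is "lynch"; the attribute set, not the term, carries the meaning of each access point.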

Present retrieves whole or partial sets of records in various formats, including SGML/XML and a variety of MARC formats, and can also present selected fields of the records obtained. Z39.50 also supports client authentication, as well as the sorting of one or more result sets across multiple fields. Sort allows the client to request that a particular result set be ordered on a specific key; to obtain the three most recently published documents, for example, the client requests a sort on "date of publication", descending. If the server supports the sort service (and also supports sorting on the requested key, in this case "date of publication"), then following the Sort the client may retrieve the first three records, and they will be the three most recent [2]. Also new in the 1995 version is Scan, which is used to scan terms in a list or index [1]. Additionally, this version provides a Browse facility that matches all words deemed close to a given term.
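The sort-then-retrieve example can be sketched as follows. This is an in-memory illustration under stated assumptions, not protocol code: the record fields, the sample data, the SUPPORTED_SORT_KEYS set and the sort_result_set function are all hypothetical, and publication dates are reduced to years for brevity.

```python
from operator import itemgetter

# Invented result set for illustration.
result_set = [
    {"title": "Directory services", "date": 1996},
    {"title": "Metadata exchange", "date": 2000},
    {"title": "Subject gateways", "date": 1998},
    {"title": "RDF queries", "date": 2001},
    {"title": "Annotation", "date": 1999},
]

# A server need not support sorting on every key; this set is an assumption.
SUPPORTED_SORT_KEYS = {"date", "title"}


def sort_result_set(records, key, descending=False):
    """Return the records ordered on `key`, or fail if the key is unsupported."""
    if key not in SUPPORTED_SORT_KEYS:
        raise ValueError(f"sorting on {key!r} is not supported by this server")
    return sorted(records, key=itemgetter(key), reverse=descending)


# Sort on date of publication, descending, then retrieve the first three records.
sorted_set = sort_result_set(result_set, "date", descending=True)
three_most_recent = [record["title"] for record in sorted_set[:3]]
```

Because the ordering is done on the server side of this model, the client only ever needs to fetch the three records it wants.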

Finally there is the Explain facility. This allows a client to retrieve details of the server implementation: general features (description, contact information, hours of operation, restrictions, usage cost, etc.), databases available for searching, indexes, attribute sets, attribute details, schemas, record syntaxes, sort capabilities and extended services. The server maintains Explain information in a special database that may be accessed by the client using the Z39.50 search and retrieval facilities [1].

Deployment

There is still interest in, and curiosity about, the relationship between Z39.50 and the XML Query. There has been discussion of whether to incorporate it as a Z39.50 query type, and it has been suggested that this could benefit both sides: there is no "protocol" to wrap around the XML Query, so Z39.50 could be useful to XQL, and having XQL as a Z39.50 query type could prove useful to the Z39.50 community. For this to make sense, however, the data model for XQL has to be compatible with that of Z39.50, and it is not yet entirely clear what this implies [3]. Furthermore, work has begun on an experimental encoding of the holdings schema in XML, and whilst GRS will not be deprecated, XML will come into wider use, for instance in the schema for server information for distributed discovery. The Z39.50 Implementors Group has discussed what it should be doing about the XML and RDF query work going on in W3C [4].

In examining the adoption of a search protocol, particularly in the bibliographic domain, it may be argued that Z39.50 at least merits consideration because of its high level of usage and acceptance within the bibliographic community. Moreover, it is an accepted standard with few competitors. As a communication protocol it places relatively few restrictions on underlying databases, most of which are capable of some sort of Z39.50 interface. Because of its modular nature, it is thought to be able to cope with improvements and extensions in services. For example, Z39.50-1995 introduces the concept of a "negotiation record". The client may include a negotiation record within the initialization message to propose that some condition be in effect for the session (for example, the use of a particular language and one or more character sets). The server may respond by indicating whether the proposal is accepted, or by making a counter-proposal. The negotiation record is an application of the new extensibility feature [1].
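The propose, accept or counter-propose exchange described above might be modelled as in the sketch below. It is a simplified illustration and does not reflect the actual encoding of negotiation records in the standard: the NegotiationRecord class, the respond function and the fallback policy (offer the first language the server lists) are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NegotiationRecord:
    language: str
    charsets: tuple[str, ...]


def respond(proposal, server_languages, server_charsets):
    """Return ("accepted", record) if the server can honour the proposal,
    otherwise ("counter-proposal", record) describing what it can offer."""
    # Character sets that both client and server support, in the client's order.
    shared = tuple(c for c in proposal.charsets if c in server_charsets)
    if proposal.language in server_languages and shared:
        return "accepted", NegotiationRecord(proposal.language, shared)
    # Assumed policy: keep whatever part of the proposal is usable and
    # substitute the server's own first-listed values for the rest.
    language = (proposal.language if proposal.language in server_languages
                else server_languages[0])
    return "counter-proposal", NegotiationRecord(language, shared or tuple(server_charsets))


# Invented capabilities for illustration.
languages, charsets = ["en", "fr"], ["ISO-8859-1"]

status, agreed = respond(NegotiationRecord("fr", ("UTF-8", "ISO-8859-1")), languages, charsets)
status2, offer = respond(NegotiationRecord("de", ("UTF-8",)), languages, charsets)
```

In the first exchange the server accepts French with the one character set both sides share; in the second it cannot meet either condition and offers its own language and character set instead.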

Z39.50 is a large and complex standard. Widely used in bibliographic and digital library applications, Z39.50 represents a significant body of experience relating to bibliographic data retrieval and the specification of interoperable semantics for community-specific metadata attribute sets [5]. It has the advantage of accepting new or differing query syntaxes, formats and field semantics thanks to the aforementioned modularity. However, by the same token, its size and complexity have tended to encourage the implementation of only subsets of the standard on the one hand, and limited and local versions on the other. To combat the consequent threat of a loss of interoperability, a variety of profiles have sprung up to define the fields and features that each implementation supports. Such profiles are in effect agreements between communities of services and users as to what exactly each implementation defines; GILS (Government Information Locator Service) and the Bath Profile are two examples of such profiles.

The size and complexity of Z39.50 make for a very comprehensive protocol, which gives a depth to the search and retrieval operations that some might regard as largely unparalleled. Not only does it enjoy the status of a full standard, but it is widely accepted in the library community and appears to be growing in acceptance. However, that same complexity has discouraged the creation of many advanced open-source implementations for the research community and, equally, may have discouraged other implementers from ever experimenting with it, causing them to investigate and experiment with lightweight "equivalents" in order to save time or resources. It has been argued that Z39.50 has also been, to some extent, a victim of its own success, or at least of its promise. Recent versions of the standard are highly extensible, and the consensus process of standards development has made it hospitable to an ever-growing set of new communities and requirements. As this process of extension has proceeded, it has become ever less clear what the appropriate scope and boundaries of the protocol should be, and what expectations one should have of practical interoperability among implementations of the standard [6].

Related Standards

LDAP

The Lightweight Directory Access Protocol evolved to meet the need for a less bulky and resource-consuming alternative to the X.500 Directory Access Protocol. It can run directly on top of TCP/IP and employs simpler encoding than X.500 DAP. It could be argued that interest in the protocol has waned since the appearance of more powerful PCs, but this would be an over-simplification, for LDAP has regained a degree of acceptance and some users report significant activity with it. In its purest form, an LDAP scenario greatly resembles Whois++ in its generation of referrals to the likeliest servers for the user. LDAP is employed by the ISAAC Project based at the University of Wisconsin-Madison.

Whois++

Based on Whois, a rather restricted white pages directory service, Whois++ is a lightweight extension offering cross-searching over a distributed network of databases including multiple gateways. It is designed to function as a simple lookup service but with a degree of flexibility that avoids imposing constraints upon developers.

In its evolution beyond the original Whois, this protocol has acquired more advanced search processes enhanced by the addition of global and local constraints and the use of Boolean operators. Further options include languages other than English, additional character sets and, most importantly, the use of structured data to make searching more effective. The structured data is in the form of an information template which is central to the Whois++ operation.

Its close relationship to the Common Indexing Protocol (CIP) is a major strength of Whois++. Indeed, CIP was embedded in version 1 of Whois++ and came to be abstracted from it in subsequent versions. Consequently there is a particularly close mapping between the two, whereas other pre-existing protocols may require more work to collaborate with CIP. Equally, if the Whois++ handle is replaced by the DSI (Dataset Identifier), the original Whois++ mesh traversal algorithm can operate unchanged with CIP.

Relevance to IMesh context

In the context of the IMesh Toolkit, we may wish to promote the connection with the Bath Profile if we are intending to make recommendations as to conformance. The Bath Group, authors of the profile, state that conformance to this profile's specifications will improve international or extranational search and retrieval among library catalogues, union catalogues, and other electronic resource discovery services worldwide. The Bath Profile will evolve as the environment and the standard change, and it is intended to facilitate global resource sharing [7]. The structure of the profile is modular; it supports the future specification for separate but compatible functional requirements involving a range of applications useful to librarians and library users. The profile is structured into functional areas that group similar functional requirements, Z39.50 specifications, and levels of conformance, these areas being:
a) Basic Bibliographic Search & Retrieval, with Primary Focus on Library Catalogues
b) Bibliographic Holdings Search & Retrieval
c) Cross-Domain Search & Retrieval
Other functional areas may be defined in future releases of this profile such as a functional area for union catalogue updating and a functional area for item order and document delivery [8].

Where Z39.50 has come in for criticism for its lack of approachability, the Bath Profile has been seen by some as a useful adjunct: whilst important in its own right, it has received even more attention because of its high level of semantic exposition, being a very readable and informative document that might be read even by people not necessarily planning to implement it. This is seen by such commentators as an important lesson for the Z39.50 community, namely that it should be producing more readable and informative documents [3].

References

[1] Z39.50 Maintenance Agency (Library of Congress), Information Retrieval (Z39.50-1995): Application Service Definition and Protocol Specification
http://lcweb.loc.gov/z3950/agency/markup/01.html

[2] ZIG Commentaries: summary of the Sort Service
http://lcweb.loc.gov/z3950/agency/wisdom/sort.html

[3] ZIG Meeting Report (July 2000, Leuven, Belgium)
http://lcweb.loc.gov/z3950/agency/zig/meetings/leuven/report.html

[4] ZIG WG Session Meeting Report on Z39.50 and the Web, January 20, 2000 at San Antonio ZIG Meeting
http://lcweb.loc.gov/z3950/agency/zig/meetings/texas/zweb-report.html
and
ZIG Plenary Meeting Report, San Antonio Public Library, January 21, 2000
http://lcweb.loc.gov/z3950/agency/zig/meetings/texas/minutes.html

[5] Mozilla RDF / Z39.50 Integration Project : Z39.50 Background, August 1999
http://www.mozilla.org/rdf/doc/z3950.html

[6] The Z39.50 Information Retrieval Standard Part I: A Strategic View of Its Past, Present and Future, Clifford A. Lynch, Director, Library Automation Office of the President, University of California, Oakland, D-Lib Magazine, April 1997
http://www.dlib.org/dlib/april97/04lynch.html#intro

[7] The Bath Profile Maintenance Agency
http://www.nlc-bnc.ca/bath/bath-e.htm

[8] The Bath Profile: An International Z39.50 Specification for Library Applications and Resource Discovery, Release 1.1, Internationally Registered Profile Developed by The Bath Group, released June 2000
http://www.ukoln.ac.uk/interop-focus/bath/current/
