The IMesh Toolkit
Z39.50
Overall Purpose
Z39.50 is a search and retrieval protocol, currently at version 3
(Z39.50-1995 is a compatible superset of the 1992 version) [1].
It is maintained by the Library of Congress and is capable of
operating over TCP/IP. It follows a client-server model in which
search and retrieval are separate steps in the protocol. Once a
session has been established with the database server, the server
responds to a query with a set of results, which the client can
then retrieve as a whole or as a subset.
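The separation of search from retrieval can be illustrated with a
minimal in-memory sketch. It is not a Z39.50 implementation: the
class ToyServer, its method names and the sample records are
hypothetical, and no network encoding is modelled. It shows only
that a search leaves a named result set on the server and returns a
hit count, and that records are fetched afterwards in whatever range
the client asks for (positions are counted from 1 in this sketch).

```python
class ToyServer:
    """Hypothetical stand-in for a Z39.50 server; all names are illustrative."""

    def __init__(self, records):
        self.records = list(records)
        self.result_sets = {}  # result-set name -> list of matching records

    def search(self, term, result_set_name="default"):
        """Run the query, keep the hits server-side, return only the hit count."""
        hits = [r for r in self.records if term.lower() in r["title"].lower()]
        self.result_sets[result_set_name] = hits
        return len(hits)

    def present(self, start, count, result_set_name="default"):
        """Return up to `count` records starting at 1-based position `start`."""
        hits = self.result_sets[result_set_name]
        return hits[start - 1:start - 1 + count]


# Invented sample data.
server = ToyServer([
    {"title": "Digital library metadata"},
    {"title": "Library catalogues on the Web"},
    {"title": "Introduction to SGML"},
    {"title": "The networked library"},
])

hit_count = server.search("library")          # search step: no records returned yet
first_two = server.present(start=1, count=2)  # retrieval step: a subset of the set
print(hit_count, [r["title"] for r in first_two])
```

The client could equally ask for the whole set by passing the hit
count as the number of records to present.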
Brief Overview of Functionality
Operations included in version 3 are as
follows:
Search, within which there are a number of query syntaxes; a single
query can initiate a simultaneous search across multiple
databases within a server. Moreover, new attribute sets may be
defined with less replication. New data types for attribute
values are also defined (in version 2 only numeric values were
allowed). Furthermore, an attribute set definition may now list
alternative sets of evaluation rules (for example, whether the
server is allowed to substitute an attribute that it thinks is
more appropriate), and the query may select one of the
alternatives. The enhanced bib-1 attribute set definition
exploits this new feature [1].
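How an attribute set qualifies a search term can be sketched as
follows. Each term carries (type, value) pairs whose meaning is
fixed by the attribute set definition. The sketch assumes the
commonly cited bib-1 Use attribute values (attribute type 1, with 4
for title and 1003 for author); the Term class and the helper
functions are hypothetical, and the protocol itself encodes queries
on the wire rather than as Python objects.

```python
from dataclasses import dataclass

# Assumed subset of the bib-1 Use attributes (attribute type 1).
USE = 1
BIB1_USE = {"title": 4, "author": 1003}


@dataclass(frozen=True)
class Term:
    """A search term plus its attribute list; a hypothetical helper type."""
    value: str
    attributes: tuple  # ((attribute_type, attribute_value), ...)


def term(access_point, value):
    """Qualify `value` with the bib-1 Use attribute for `access_point`."""
    return Term(value, ((USE, BIB1_USE[access_point]),))


def and_(left, right):
    """Combine two operands with a Boolean AND, represented as a nested tuple."""
    return ("and", left, right)


# "author is Lynch AND title contains Z39.50", expressed with attributes.
query = and_(term("author", "Lynch"), term("title", "Z39.50"))
print(query)
```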
Present retrieves either whole or partial sets of
records in various formats, including SGML/XML and a variety of
MARC formats, and can also present selected fields of the
records obtained. Z39.50 also supports client authentication as
well as the sorting of one or more result sets across multiple
fields. Sort allows the client to request that a particular
result set be ordered on a specific key, which makes it possible,
for example, to obtain the three most recently published
documents. Given the key "date of publication", descending, the
records can be retrieved in publication date order. If the server
supports the Sort service (and also supports sorting on the
requested key, in this case "date of publication"), then following
the Sort the client may retrieve the first three records, and
they will be the three most recent [2]. Also new to the 1995
version is Scan, which is used to scan terms in a list or index [1].
Additionally, this version provides a Browse facility that matches
all words deemed close to a given term.
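The "three most recent documents" case described for Sort can be
sketched in a few lines. This is an illustration under stated
assumptions rather than the protocol itself: the function name, the
set of supported keys and the sample records are all invented here.

```python
# Keys this toy server is able to sort on (an assumption for the sketch).
SUPPORTED_SORT_KEYS = {"date_of_publication"}


def sort_result_set(result_set, key, descending=False):
    """Reorder a result set on `key`, refusing keys the server cannot sort on."""
    if key not in SUPPORTED_SORT_KEYS:
        raise ValueError(f"sorting on {key!r} is not supported")
    return sorted(result_set, key=lambda record: record[key], reverse=descending)


# Invented sample records standing in for a result set from an earlier search.
result_set = [
    {"title": "Report A", "date_of_publication": 1994},
    {"title": "Report B", "date_of_publication": 1999},
    {"title": "Report C", "date_of_publication": 1997},
    {"title": "Report D", "date_of_publication": 2000},
    {"title": "Report E", "date_of_publication": 1992},
]

sorted_set = sort_result_set(result_set, "date_of_publication", descending=True)
three_most_recent = sorted_set[:3]  # the client retrieves the first three records
print([r["title"] for r in three_most_recent])
```

The check against the supported keys mirrors the condition in the
text: the outcome is only guaranteed where the server supports both
the Sort service and the requested key.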
Finally there is the Explain facility. This allows a client to
retrieve details of the server implementation: general features
(description, contact information, hours of operation,
restrictions, usage cost, etc.), databases available for
searching, indexes, attribute sets, attribute details, schemas,
record syntaxes, sort capabilities and extended services. The
server maintains Explain information in a special database that
may be accessed by the client using the Z39.50 search and
retrieval facilities [1].
Deployment
There is still interest in, and curiosity about,
the relationship between Z39.50 and XML Query. There has been
discussion of whether to incorporate it as a Z39.50 query type,
and it has been suggested that this could be useful in both
directions: there is no "protocol" to wrap around XML Query, so
Z39.50 could be useful to XQL, and having XQL as a Z39.50 query
type could prove useful to the Z39.50 community. But in order for
this to make sense, the data model for XQL has to be compatible
with that of Z39.50. It is not yet entirely clear what this
implies [3]. Furthermore, work has begun on an experimental
encoding of the holdings schema in XML and, whilst GRS will not be
deprecated, XML will be brought into wider use, for instance in
the schema for server information for distributed discovery. The
Z39.50 Implementors Group has discussed what it should be doing
about the XML and RDF query work going on in W3C [4].
In examining the adoption of a search protocol, particularly
in the context of the bibliographic domain, it may be argued that
Z39.50 at least merits consideration because of its high level of
usage and acceptance within the bibliographic community. Moreover,
it is an accepted standard with few competitors.
As a communication protocol it places relatively few restrictions
on underlying databases, most of which are capable of supporting
some sort of Z39.50 interface. Because of its modular nature, it is
thought to be able to cope with improvements and extensions in
services. For example, Z39.50-1995 introduces the concept of a
"negotiation record". The client may include a
negotiation record within the initialization message to propose
that some condition be in effect for the session (for example,
the use of a particular language and one or more character sets).
The server may respond by indicating whether the proposal is
accepted, or by making a counter-proposal. The negotiation record
is an application of the new extensibility feature [1].
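The propose, accept or counter-propose exchange can be sketched as
follows. The function, the condition names and the language and
character set values are hypothetical and chosen for illustration
only; the sketch also simplifies matters by proposing a single
character set rather than several.

```python
def negotiate(proposal, supported):
    """Server side: accept each proposed value if supported, else counter-propose.

    Returns a dict mapping each proposed condition to a pair of
    ("accepted", value) or ("counter-proposal", value).
    """
    response = {}
    for condition, wanted in proposal.items():
        if wanted in supported[condition]:
            response[condition] = ("accepted", wanted)
        else:
            # Offer the server's first (preferred) value instead.
            response[condition] = ("counter-proposal", supported[condition][0])
    return response


# Invented capabilities and proposal.
server_supports = {
    "language": ["eng", "fre"],
    "character_set": ["UTF-8", "ISO-8859-1"],
}
client_proposal = {"language": "ger", "character_set": "UTF-8"}

response = negotiate(client_proposal, server_supports)
print(response)
```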
Z39.50 is a large and complex standard. Widely used in
bibliographic and digital library applications, Z39.50 represents
a significant body of experience relating to bibliographic data
retrieval and the specification of interoperable semantics for
community-specific metadata attribute sets [5]. It has the
advantage of accepting new or differing query syntaxes, formats
and field semantics thanks to the aforementioned modularity.
However, by the same token, its size and complexity have tended to
encourage the implementation of only subsets of the standard on
the one hand, and limited and local versions on the other. To
combat the consequent threat of a loss of interoperability, a
variety of profiles have sprung up to define the fields and
features that each implementation supports. Such profiles are in
effect agreements between communities of services and users as to
what exactly each implementation defines; GILS (Government
Information Locator Service) and the Bath Profile are two
examples of such profiles.
The size and complexity of Z39.50 make for a very
comprehensive protocol, giving a depth to search and
retrieval operations that some might regard as largely
unparalleled. Not only does it enjoy the status of a full
standard, but it is widely accepted in the library community and
appears to be growing in acceptance. However, that same complexity
has discouraged the creation of advanced open source
implementations for the research community and, equally,
may have discouraged other implementers from ever experimenting
with it, leading them instead to investigate lightweight
"equivalents" in order to save time or resources. It has
been argued that Z39.50 has also been, to some extent, a victim
of its own success, or at least of its promise. Recent versions of the
standard are highly extensible, and the consensus process of
standards development has made it hospitable to an ever-growing
set of new communities and requirements. As this process of
extension has proceeded, it has become ever less clear what the
appropriate scope and boundaries of the protocol should be, and
what expectations one should have of practical interoperability
among implementations of the standard [6].
Related Standards
LDAP
The Lightweight Directory Access Protocol evolved to meet the
need for a less bulky and resource-consuming alternative to the
X.500 Directory Access Protocol. It can run directly on top of
TCP/IP and employs simpler encoding than X.500 DAP. It could be
argued that interest in the protocol has waned since the
appearance of more powerful PCs, but this would be an
over-simplification, for LDAP has regained a degree of acceptance
and some users report significant activity with it. In its purest
form, an LDAP scenario greatly resembles Whois++ in its
generation of referrals to the likeliest servers for the user. LDAP
is employed by the ISAAC Project based at the University of
Wisconsin-Madison.
Whois++
Based on Whois, a rather restricted white pages directory
service, Whois++ is a lightweight extension offering
cross-searching over a distributed network of databases including
multiple gateways. It is designed to function as a simple lookup
service but with a degree of flexibility that avoids imposing
constraints upon developers.
In its evolution beyond the original Whois, this protocol has
acquired more advanced search processes enhanced by the addition
of global and local constraints and the use of Boolean operators.
Further options include languages other than English, additional
character sets and, most importantly, the use of structured data
to make searching more effective. The structured data is in the
form of an information template which is central to the Whois++
operation.
Its close relationship to the Common Indexing Protocol (CIP) is a
major strength of Whois++. Indeed, CIP was embedded in version 1
of Whois++ and came to be abstracted from it in subsequent
versions. Consequently the mapping between the two is particularly
close, whereas other pre-existing protocols may require more work to
operate with CIP. Equally, if the Whois++ handle is
replaced by the DSI (Dataset Identifier), the original
Whois++ mesh traversal algorithm can operate unchanged with
CIP.
Relevance to IMesh context
In the context of the IMesh Toolkit, we may
wish to promote the connection with the Bath Profile if we are
intending to make recommendations as to conformance. The Bath
Group, authors of the profile, state that conformance to this
profile's specifications will improve international or
extranational search and retrieval among library catalogues,
union catalogues, and other electronic resource discovery
services worldwide. The Bath Profile will evolve as the
environment and the standard change, and it is intended to
facilitate global resource sharing [7]. The structure of the
profile is modular; it supports the future specification for
separate but compatible functional requirements involving a range
of applications useful to librarians and library users. The
profile is structured into functional areas that group similar
functional requirements, Z39.50 specifications, and levels of
conformance, these areas being:
a) Basic Bibliographic Search & Retrieval, with Primary Focus
on Library Catalogues
b) Bibliographic Holdings Search & Retrieval
c) Cross-Domain Search & Retrieval
Other functional areas may be defined in future releases of this
profile such as a functional area for union catalogue updating
and a functional area for item order and document delivery [8].
Where Z39.50 has come in for criticism for its lack of
approachability, the Bath Profile has been seen by some as a
useful adjunct: whilst important in its own right, it has
received even more attention because of its high level of
semantic exposition, being a very readable and informative
document that might be read even by people not necessarily
planning to implement it. This is seen by such commentators as
an important lesson for the Z39.50 community, namely that it
should be producing more readable and informative documents
[3].
References
[1] Z39.50 Maintenance Agency (Library of
Congress), Information Retrieval (Z39.50-1995): Application
Service Definition and Protocol Specification
http://lcweb.loc.gov/z3950/agency/markup/01.html
[2] ZIG Commentaries: summary of the Sort Service
http://lcweb.loc.gov/z3950/agency/wisdom/sort.html
[3] ZIG Meeting Report (July 2000, Leuven, Belgium)
http://lcweb.loc.gov/z3950/agency/zig/meetings/leuven/report.html
[4] ZIG WG Session Meeting Report on Z39.50 and the Web,
January 20, 2000 at San Antonio ZIG Meeting
http://lcweb.loc.gov/z3950/agency/zig/meetings/texas/zweb-report.html
and
ZIG Plenary Meeting Report, San Antonio Public Library, January
21, 2000
http://lcweb.loc.gov/z3950/agency/zig/meetings/texas/minutes.html
[5] Mozilla RDF / Z39.50 Integration Project : Z39.50
Background, August 1999
http://www.mozilla.org/rdf/doc/z3950.html
[6] The Z39.50 Information Retrieval Standard Part I: A
Strategic View of Its Past, Present and Future, Clifford A.
Lynch, Director, Library Automation Office of the President,
University of California, Oakland, D-Lib Magazine, April 1997
http://www.dlib.org/dlib/april97/04lynch.html#intro
[7] The Bath Profile Maintenance Agency
http://www.nlc-bnc.ca/bath/bath-e.htm
[8] The Bath Profile: An International Z39.50 Specification
for Library Applications and Resource Discovery, Release 1.1,
Internationally Registered Profile developed by The Bath Group,
released June 2000
http://www.ukoln.ac.uk/interop-focus/bath/current/