The IMesh Toolkit

Open Archives Initiative (OAI)

Overall Purpose

This standard originated within the Open Archives Initiative [1]. The specifications were initially intended for eprint materials (author self-archiving solutions), but are now seen as having wider applicability. The standard provides interoperability agreements for archiving materials so that they are accessible to mediator services. Two roles are defined. Data providers follow a set of mechanisms (conventions) to make their information externally available. Service providers are the mediator services: they provide higher layers of functionality and can access information within archives that follow the conventions. These services may combine and process information from individual archives and then offer increased functionality to support discovery, presentation and analysis of data originating from compliant archives [3].

Brief Overview of Functionality

As described on the home page [1], OAI is an effort still in transition, increasing its knowledge of, and responding to feedback from, the adopter communities. A technical meeting held in September 2000 identified areas of the original Santa Fe Convention [2] that would be retained and others that were expected to need modification. A revised technical agreement should be finalised and made available in early 2001, together with documentation and tools developed in parallel. A report on the Santa Fe Convention in D-Lib Magazine [2] summarised the agreements reached then as follows:

"The mechanisms for establishing this interoperability ... are three-fold:
  • The definition of a set of simple metadata elements -- the Open Archives Metadata Set (OAMS) -- for the sole purpose of enabling coarse granularity document discovery among archives;
  • The agreement to use a common syntax, XML, for representing and transporting both the OAMS and archive-specific metadata sets;
  • The definition of a common protocol -- the Open Archives Dienst Subset -- to enable extraction of OAMS and archive-specific metadata from participating archives."
A fuller overview of the technical components of the Santa Fe Convention follows.

Open Archives Metadata Set
The Open Archives Metadata Set (OAMS) is a collection of nine metadata elements intended to facilitate coarse granularity resource discovery among the records in distributed and dissimilar archives. The semantics of this set have purposely been kept simple in the interest of easy creation and widest applicability. There is no provision for qualification or extension of the nine elements. The expectation is that individual archives will maintain metadata with more expressive semantics and the Open Archives Dienst Subset provides the mechanism for retrieval of this richer metadata.

Open Archives Dienst Subset
The Open Archives Dienst Subset is a set of protocol requests that are delivered via HTTP. This protocol is a subset of the full Dienst protocol. The protocol requests in the subset provide the following functionality:
  • List the full identifiers for records stored in an archive. An optional argument permits the client to specify that the list should only include records added after a specific date. Another optional argument allows the client to specify that the records should be accompanied by the metadata associated with the identifier.
  • Return the metadata for a specific record in a requested format.
  • Return the list of metadata formats supported by an archive.
  • Return the list of metadata formats available for a specific record.
  • Return the structure of the partitions by which an archive is organized.
  • All responses to these requests are formatted in XML.
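The requests listed above can be pictured with a toy in-memory archive. This is only a sketch under stated assumptions: the function names, the record identifiers, the `oams` and `rich` format labels and the XML layout are all hypothetical and are not the Dienst verb names or response formats. It covers listing identifiers (with the optional date filter), returning metadata for one record, and listing formats per archive and per record; the partition request is left out.

```python
from datetime import date
import xml.etree.ElementTree as ET

# Hypothetical archive: identifier -> (date added, {format name: metadata fields}).
ARCHIVE = {
    "archive:0001": (date(2000, 3, 1), {"oams": {"title": "First paper"}}),
    "archive:0002": (date(2000, 9, 15), {
        "oams": {"title": "Second paper"},
        "rich": {"title": "Second paper", "pages": "12"},
    }),
}

def list_identifiers(added_after=None):
    """List full identifiers, optionally only those added after a given date."""
    return sorted(
        identifier
        for identifier, (added, _) in ARCHIVE.items()
        if added_after is None or added > added_after
    )

def formats_for_archive():
    """Return every metadata format held anywhere in the archive."""
    return sorted({fmt for _, formats in ARCHIVE.values() for fmt in formats})

def formats_for_record(identifier):
    """Return the metadata formats available for one record."""
    return sorted(ARCHIVE[identifier][1])

def get_metadata_xml(identifier, fmt):
    """Return one record's metadata in the requested format, serialized as XML."""
    root = ET.Element("record", {"identifier": identifier, "format": fmt})
    for name, value in ARCHIVE[identifier][1][fmt].items():
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")
```

A harvester would typically call `list_identifiers(added_after=...)` with the date of its previous visit, then fetch each new record with `get_metadata_xml`.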
As stated earlier, the Santa Fe agreement has been revised. The technical meeting set out with the following intentions [5]:
  • The following fundamental principles are to be maintained:
    • Open, harvestable archives
    • Data provider and service provider model
    • Managed archives
  • The following abstract principles are to be maintained:
    • Metadata harvesting
    • OAI namespace
    • Acceptable use
    • Registration of data providers, service providers, metadata formats
  • Extensions and changes would be considered on the following concepts:
    • The Open Archives Initiative is concerned not only with eprints but with scholarly data archives in general
    • Definition of a record in an archive
    • Shared metadata set and parallel metadata set
  • Technical implementation of all the mentioned abstract principles was to be reconsidered in the meeting because of the extension in scope and the implementation experiences.
Examples of the changes agreed upon following the technical meeting are [5]:
  • The record in an archive is to be a metadata record that describes the full content and can contain an entry point to it.
  • The Dublin Core element set was selected as the common metadata set. Community-specific harvestable metadata sets would be developed in conjunction with the various communities by issuing a call for proposals.
  • An independent OAI protocol was developed for transfer of metadata between data providers and service providers. The protocol contains a small set of service requests which (together with their parameters) are encoded into standard HTTP URIs. Responses to the requests are XML documents which allow for simple implementation and use of CGI scripts or similar technology for data providers while service providers can exploit XML tools which become available.
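The request and response pattern in the last point can be sketched as follows. The source does not name the requests or their parameters; the `verb`, `ListIdentifiers` and `metadataPrefix=oai_dc` names below are borrowed from the OAI protocol as it was later published and are used here as assumptions. The base URL and the sample response are made up and much simpler than a real response, which carries namespaces and further detail.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

BASE_URL = "http://archive.example.org/oai"  # hypothetical data provider address

def build_request(verb, **params):
    """Encode a service request and its parameters into an HTTP URI."""
    query = urlencode({"verb": verb, **params})
    return f"{BASE_URL}?{query}"

def extract_identifiers(xml_text):
    """Pull record identifiers out of an XML response document."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter("identifier")]

# Simplified, invented response used only to exercise the parser.
SAMPLE_RESPONSE = """<ListIdentifiers>
  <identifier>oai:example:0001</identifier>
  <identifier>oai:example:0002</identifier>
</ListIdentifiers>"""

url = build_request("ListIdentifiers", metadataPrefix="oai_dc")
identifiers = extract_identifiers(SAMPLE_RESPONSE)
```

Because the request is an ordinary URI, a data provider can answer it with a CGI script, and a service provider needs nothing beyond an HTTP client and an XML parser.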

Deployment

Status
OAI is a relatively young specification: the first agreement was reached only in October 1999 [2]. A major revision of the original convention agreements is underway, and version 2 of the technical agreements is expected to be published in January 2001. In this respect, the "openarchives" webpage carries the following advice [1]: "Therefore, it is advised not to start an implementation of the current Santa Fe Convention specifications without contacting openarchives@openarchives.org"

Software - The eprints software [6]
This software is developed at the University of Southampton as part of the Open Citation Project, a DLI2 International Digital Libraries Project funded by the Joint Information Systems Committee (JISC) of the Higher Education Funding Councils, in collaboration with the National Science Foundation. It originated within CogPrints and is used for running the CogPrints Cognitive Sciences Eprint Archive, a JISC-funded Open Archive for research literature.

The software, which can be freely downloaded, allows institutions to create an archive containing metadata (formats can be configured) that is accessible to both authors and readers. Archives can be set up in a distributed manner, whether discipline-based or institution-based. Material is submitted via the WWW, and submission can be moderated. Open Archives created using eprints.org software and registered as Open Archives Data Providers can be harvested into one global "virtual archive" by an Open Archives Service Provider such as the Cross Archive Searching Service.

Future Development
  • On November 1st 2000, an alpha-test of the new specification was launched. Beta specifications are expected in December 2000, while the publication of the new specifications is scheduled for January 2001.
  • OAI Open Day for the US will be on January 23rd 2001, Washington DC. Central to this meeting is the public dissemination of the new OAI specifications in the US.
  • OAI Open Day for Europe, 26th February 2001, Berlin. Central to this meeting is the public dissemination of the new OAI specifications in Europe.
  • Workshop on the Open Archives Initiative and Peer Review journals in Europe, Geneva, 22-24th March 2001.

Related Standards

The Dienst Protocol

This is the original protocol from which the Open Archives protocol is derived. The Open Archives protocol is used for transferring requests for information and returning the results of those requests.

XML [UKOLN XML review]

XML is the format used for encoding messages sent during communications between data providers and service providers.

Relevance to IMesh context

The IMesh Toolkit project may wish to provide tools for making records available via the OAI protocol. This might require transforming metadata records into an OAI-compliant format. Since OAI is still evolving and tools are still being developed, our contribution could also help to shape OAI development. On the other hand, because OAI is in transition, trying to fit in with this initiative carries the risk of aiming at a moving target.
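As an illustration of the kind of transformation mentioned above, the sketch below maps a subject gateway record onto Dublin Core elements, the element set chosen as the common metadata set. The local field names, the mapping table and the wrapping `metadata` element are hypothetical; only the Dublin Core namespace URI and element names are standard, and the exact record format that OAI will require is not yet settled.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)

# Hypothetical mapping from local gateway field names to Dublin Core elements.
FIELD_MAP = {
    "name": "title",
    "keywords": "subject",
    "summary": "description",
    "url": "identifier",
}

def to_dublin_core(record):
    """Map a local record (a dict) to an XML element holding Dublin Core fields."""
    root = ET.Element("metadata")
    for local_field, dc_element in FIELD_MAP.items():
        values = record.get(local_field, [])
        if isinstance(values, str):
            values = [values]  # treat a single value as a one-item list
        for value in values:
            ET.SubElement(root, f"{{{DC_NS}}}{dc_element}").text = value
    return root

record = {
    "name": "Example resource",
    "keywords": ["metadata", "gateways"],
    "url": "http://www.example.org/",
}
dc_xml = ET.tostring(to_dublin_core(record), encoding="unicode")
```

Keeping the mapping in a table rather than in code means it can be revised as the OAI specification changes, which limits the cost of the moving-target problem noted above.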

References

[1] Open Archives Initiative Homepage
http://www.openarchives.org/
[2] The Santa Fe Convention report in D-Lib Magazine, February 2000
http://www.dlib.org/dlib/february00/vandesompel-oai/02vandesompel-oai.html
[3] The Open Archives Dienst Subset, Jim Davis, David Fielding, Carl Lagoze, Richard Marisa, May 2000
http://www.cs.cornell.edu/cdlrg/dienst/protocols/OpenArchivesDienst.htm
[4] Dienst Protocol Specification, Jim Davis, David Fielding, Carl Lagoze, Richard Marisa, May 2000
http://www.cs.cornell.edu/cdlrg/dienst/protocols/DienstProtocol.htm
[5] Report on Open Archives Initiative Technical Committee Meeting, Ithaca NY, 7-8 September 2000
http://www.openarchives.org/oai-tech-cornell/cornell_report.pdf
[6] The eprints software
http://www.eprints.org/
