The IMesh Toolkit
Open Archives Initiative (OAI)
Overall Purpose
This standard originated within the Open
Archives Initiative [1]. The specifications were initially intended
for eprint materials (author self-archiving solutions), but are
now seen as having wider applicability. The standard provides
interoperability agreements for archiving materials so that they
are accessible to mediator services. It distinguishes two roles:
Data Providers - These are given a set of mechanisms (conventions)
to follow to make their information externally available.
Service Providers - These mediator services provide
higher layers of functionality, and can access information within
archives that follow the convention. These services may combine
and process information from individual archives and then offer
increased functionality to support discovery, presentation and
analysis of data originating from compliant archives. [3]
Brief Overview of Functionality
As described on its home page [1], OAI is an
effort still in transition, increasing its knowledge of, and
responding to feedback from, the adopter communities. A
technical meeting held in September 2000 identified areas of the
original Santa Fe convention [2] that would be retained, whereas
others were expected to need modification. A revised technical
agreement should be finalised and made available in early 2001,
together with documentation and tools developed in parallel. A
report on the Santa Fe Convention in D-Lib Magazine [2] summarised
the agreements reached then as follows:
"The mechanisms for establishing this
interoperability ... are three-fold:
- The definition of a set of simple metadata elements -- the
Open Archives Metadata Set (OAMS) -- for the sole purpose of
enabling coarse granularity document discovery among
archives;
- The agreement to use a common syntax, XML, for representing
and transporting both the OAMS and archive-specific metadata
sets;
- The definition of a common protocol -- the Open Archives
Dienst Subset -- to enable extraction of OAMS and
archive-specific metadata from participating archives."
In more detail, the technical components of the Santa Fe
Convention are:
Open Archives Metadata Set
The Open Archives Metadata Set (OAMS) is a collection of nine
metadata elements intended to facilitate coarse granularity
resource discovery among the records in distributed and
dissimilar archives. The semantics of this set have purposely
been kept simple in the interest of easy creation and widest
applicability. There is no provision for qualification or
extension of the nine elements. The expectation is that
individual archives will maintain metadata with more expressive
semantics and the Open Archives Dienst Subset provides the
mechanism for retrieval of this richer metadata.
Open Archives Dienst Subset
The Open Archives Dienst Subset is a set of protocol requests
that are delivered via HTTP. This protocol is a subset of the
full Dienst protocol. The protocol requests in the subset provide
the following functionality:
- List the full identifiers for records stored in an archive.
An optional argument permits the client to specify that the list
should only include records added after a specific date. Another
optional argument allows the client to specify that the records
should be accompanied by the metadata associated with the
identifier.
- Return the metadata for a specific record in a requested
format.
- Return the list of metadata formats supported by an
archive.
- Return the list of metadata formats available for a specific
record.
- Return the structure of the partitions by which an archive is
organized.
- All responses to these requests are formatted in XML.
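The requests listed above can be sketched as operations on a small in-memory archive. This is an illustrative Python model only: the class, method and parameter names (ToyArchive, list_identifiers, added_after, with_format and so on) and the "toy:" identifiers are assumptions made for this example and are not taken from the Dienst specification, which defines its own request syntax over HTTP and returns XML rather than Python values.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional


@dataclass
class Record:
    """One archive record: full identifier, date added, and its metadata
    held as one XML string per metadata format name."""
    identifier: str
    added: date
    metadata: Dict[str, str] = field(default_factory=dict)


class ToyArchive:
    """Hypothetical in-memory stand-in for a data provider."""

    def __init__(self, records: List[Record], partitions: Dict[str, list]):
        self._records = {r.identifier: r for r in records}
        self._partitions = partitions

    def list_identifiers(self, added_after: Optional[date] = None,
                         with_format: Optional[str] = None):
        """List full identifiers, optionally only for records added after
        a date, and optionally paired with metadata in one format."""
        selected = [r for r in self._records.values()
                    if added_after is None or r.added > added_after]
        selected.sort(key=lambda r: r.identifier)
        if with_format is None:
            return [r.identifier for r in selected]
        return [(r.identifier, r.metadata.get(with_format)) for r in selected]

    def get_metadata(self, identifier: str, fmt: str) -> str:
        """Return the metadata for one record in the requested format."""
        return self._records[identifier].metadata[fmt]

    def list_formats(self, identifier: Optional[str] = None) -> List[str]:
        """List formats for one record, or for the archive as a whole."""
        if identifier is not None:
            return sorted(self._records[identifier].metadata)
        formats = set()
        for record in self._records.values():
            formats.update(record.metadata)
        return sorted(formats)

    def list_partitions(self) -> Dict[str, list]:
        """Return the partition structure of the archive."""
        return self._partitions


# Example data; identifiers, formats and partitions are made up.
archive = ToyArchive(
    records=[
        Record("toy:0001", date(2000, 3, 1), {"oams": "<oams/>"}),
        Record("toy:0002", date(2000, 9, 15),
               {"oams": "<oams/>", "local": "<rec/>"}),
    ],
    partitions={"physics": ["optics"], "cognition": []},
)

print(archive.list_identifiers(added_after=date(2000, 6, 1)))
print(archive.list_formats("toy:0002"))
```

A service provider would typically call the date-filtered listing repeatedly, so that each harvest only fetches records added since the previous one.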
As stated earlier, the Santa Fe agreement is being revised. The
revision was intended to do the following [5]:
- The following fundamental principles are to be maintained:
  - Open, harvestable archives
  - Data provider and service provider model
  - Managed archives
- The following abstract principles are to be maintained:
  - Metadata harvesting
  - OAI namespace
  - Acceptable use
  - Registration of data providers, service providers, metadata
    formats
- Extensions and changes would be considered on the following
  concepts:
  - Open Archives Initiative not concerned only with eprints but
    with scholarly data-archives in general
  - Definition of a record in an archive
  - Shared metadata set and parallel metadata set
- Technical implementation of all the mentioned abstract
  principles was to be reconsidered in the meeting because of the
  extension in scope and the implementation experiences.
Examples of the changes agreed upon following the technical
meeting are [5]:
- The record in an archive is to be a metadata record
that describes the full content and can contain an entry point
to it
- The Dublin Core element set was selected as the common
metadata set and community-specific harvestable metadata sets
would be developed in conjunction with the various communities by
issuing a call for proposals
- An independent OAI protocol was developed for transfer of
metadata between data providers and service providers. The
protocol contains a small set of service requests which (together
with their parameters) are encoded into standard HTTP URIs.
Responses to the requests are XML documents. This allows for
simple implementation using CGI scripts or similar technology
on the data provider side, while service providers can exploit
XML tools as they become available.
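The request and response pattern described in the last point can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the OAI protocol itself: the base URL, the "verb" query parameter, the "list-identifiers" request name, the "since" parameter and the element names in the sample response are all invented for this example, since the revised specification had not been published at the time of writing.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def build_request(base_url: str, verb: str, **params: str) -> str:
    """Encode a service request and its parameters into an HTTP URI.
    Carrying the request name in a 'verb' parameter is an assumption."""
    query = urlencode([("verb", verb)] + sorted(params.items()))
    return f"{base_url}?{query}"


# Hypothetical XML response; a real data provider defines its own layout.
SAMPLE_RESPONSE = """<response>
  <identifier>toy:0001</identifier>
  <identifier>toy:0002</identifier>
</response>"""


def parse_identifiers(xml_text: str) -> list:
    """Pull the record identifiers out of an XML response document."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter("identifier")]


url = build_request("http://archive.example.org/oai",
                    "list-identifiers", since="2000-11-01")
print(url)
# -> http://archive.example.org/oai?verb=list-identifiers&since=2000-11-01
print(parse_identifiers(SAMPLE_RESPONSE))
# -> ['toy:0001', 'toy:0002']
```

Because a request is just a URI and a response is just an XML document, a data provider can answer it from a simple CGI script, and a service provider needs only an HTTP client and an XML parser.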
Deployment
Status
OAI is a relatively young specification, the first agreement being
reached only in October 1999 [2]. A major revision of the
original convention agreements is underway and version 2 of the
technical agreements is being reformulated (expected publishing
date is January 2001). In this respect, the "openarchives"
webpage carries the following advice [1]: "Therefore, it is
advised not to start an implementation of the current Santa Fe
Convention specifications without contacting
openarchives@openarchives.org"
Software - The eprints software [6]
This software is developed at the University of Southampton as
part of the Open Citation Project, a DLI2 International Digital
Libraries Project funded by the Joint Information Systems
Committee (JISC) of the Higher Education Funding Councils, in
collaboration with the National Science Foundation. It originated
within CogPrints and is used for running the CogPrints Cognitive
Sciences Eprint Archive, a JISC-funded Open Archive for research
literature.
The software, which can be freely downloaded, allows
institutions to create an archive containing metadata (formats
can be configured) accessible to both authors and readers, in a
discipline-based, distributed, institution-based manner. It
enables submission of archive material via the WWW and submission
can be moderated. Open Archives created using eprints.org
software and registered as Open Archives Data Providers can be
harvested into one global "virtual archives" Open Archives
Service Provider such as the Cross Archive Searching Service.
Future Development
- On November 1st 2000, an alpha-test of the new specification
was launched. Beta specifications are expected in December 2000,
while the publication of the new specifications is scheduled for
January 2001.
- OAI Open Day for the US will be on January 23rd 2001,
Washington DC. Central to this meeting is the public
dissemination of the new OAI specifications in the US.
- OAI Open Day for Europe, 26th February 2001, Berlin. Central
to this meeting is the public dissemination of the new OAI
specifications in Europe.
- Workshop on the Open Archives Initiative and Peer Review
journals in Europe, Geneva, 22-24th March 2001.
Related Standards
The Dienst Protocol
This is the original protocol from which the Open Archives
protocol is derived. The latter is used for transferring requests
for information and returning the results of those requests.
XML [UKOLN XML review]
XML is the format used for encoding messages sent during
communications between data providers and service providers.
Relevance to IMesh context
The IMesh toolkit project may wish to
provide tools for making records available via the OAI protocol.
This might require transformation of metadata records into a
format compliant with OAI. Since OAI is still evolving and tools
are still being developed, our contribution could also have the
secondary effect of helping to shape OAI development. On the other
hand, because OAI is still in transition, trying to fit in with
this initiative carries the risk of aiming at a moving target.
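As a rough illustration of the metadata transformation mentioned above, the following Python sketch maps a local record onto Dublin Core elements and serialises it as XML. It is an assumption-laden example rather than an IMesh or OAI design: the local field names, the FIELD_MAP table and the "record" wrapper element are hypothetical, and the namespace used is the Dublin Core 1.1 element namespace, which the final OAI agreement may or may not adopt.

```python
import xml.etree.ElementTree as ET

# Dublin Core 1.1 element namespace; its use here is an assumption.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical mapping from local field names to Dublin Core elements.
FIELD_MAP = {
    "name": "title",
    "author": "creator",
    "keywords": "subject",
    "summary": "description",
    "url": "identifier",
}


def to_dublin_core(record: dict) -> str:
    """Serialise a local record as XML with one Dublin Core element per
    value. Local fields without a mapping are silently dropped."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")  # wrapper element name is made up
    for local_field, dc_element in FIELD_MAP.items():
        value = record.get(local_field)
        if value is None:
            continue
        # Repeatable fields (e.g. several keywords) become repeated elements.
        values = value if isinstance(value, list) else [value]
        for item in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{dc_element}")
            child.text = str(item)
    return ET.tostring(root, encoding="unicode")


local_record = {
    "name": "An example gateway record",
    "author": "A. N. Author",
    "keywords": ["metadata", "harvesting"],
    "url": "http://gateway.example.org/records/42",
    "internal_note": "not exported",
}
print(to_dublin_core(local_record))
```

Keeping the mapping in a single table means that, if the agreed metadata set changes as OAI evolves, only the table needs to be updated.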
References
[1] Open Archives Initiative Homepage
http://www.openarchives.org/
[2] The Santa Fe Convention report in D-Lib Magazine, February
2000
http://www.dlib.org/dlib/february00/vandesompel-oai/02vandesompel-oai.html
[3] The Open Archives Dienst Subset
Jim Davis, David Fielding, Carl Lagoze, Richard Marisa,
May 2000
http://www.cs.cornell.edu/cdlrg/dienst/protocols/OpenArchivesDienst.htm
[4] Dienst Protocol Specification
Jim Davis, David Fielding, Carl Lagoze, Richard Marisa,
May 2000
http://www.cs.cornell.edu/cdlrg/dienst/protocols/DienstProtocol.htm
[5] Report on Open Archives Initiative Technical Committee
Meeting, Ithaca NY, 7-8 September 2000
http://www.openarchives.org/oai-tech-cornell/cornell_report.pdf
[6] The eprints software
http://www.eprints.org/