
The IMesh Toolkit

Dublin Core (DC) Metadata Set

Overall Purpose

The Dublin Core Metadata Initiative has promoted a resource description format notable for offering users and authors a simplified set of elements, considerably more manageable than many other formats, together with a means of extending its usefulness through Dublin Core qualifiers. Dublin Core metadata has been specifically designed to support resource discovery, and there is broad agreement on the usefulness of its core set of 15 elements for that purpose. [1] Dublin Core can be stored and transferred in a variety of ways, including HTML, XML, RDF/XML and relational databases.

Brief Overview of Functionality

The simple or unqualified elements are each defined through a set of ten attributes taken from the ISO/IEC 11179 standard. They are structured as attribute-value pairs, with no mandated syntax, and can be expressed as HTML meta tags, XML or RDF/XML. The 15 elements can be used with or without the recommended DC Qualifiers [4]. It is frequently recommended that the values of these elements conform to a controlled vocabulary: for example, that Resource Type use the current draft of the Dublin Core Types (DCT1) [2], or that Coverage adopt the Getty Thesaurus of Geographic Names (TGN) [3].
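As an illustration of the attribute-value structure described above, the following minimal Python sketch renders a simple Dublin Core record as HTML meta tags. The function name to_html_meta and the sample record are hypothetical and are not taken from any Dublin Core specification or IMesh software; the "DC.Element" naming follows the dotted convention commonly used when embedding Dublin Core in HTML.

```python
from html import escape

# The 15 elements of the simple (unqualified) Dublin Core set.
DC_ELEMENTS = (
    "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
)


def to_html_meta(record):
    """Render (element, value) pairs as HTML <meta> tags, one per line.

    Hypothetical helper for illustration only.
    """
    lines = []
    for element, value in record:
        if element not in DC_ELEMENTS:
            raise ValueError(f"not a simple Dublin Core element: {element}")
        # Escape the value so quotes and angle brackets stay valid in HTML.
        content = escape(value, quote=True)
        lines.append(f'<meta name="DC.{element}" content="{content}">')
    return "\n".join(lines)


# Hypothetical record: every element is optional and may be repeated.
record = [
    ("Title", "Dublin Core (DC) Metadata Set"),
    ("Subject", "metadata"),
    ("Subject", "resource discovery"),
    ("Type", "Text"),
]
print(to_html_meta(record))
```

The record is kept as a list of pairs rather than a dictionary because an element such as Subject may legitimately occur more than once.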

To support interoperability, the DC-Usage Committee recently recommended a number of qualifiers that would permit applications to use encoding schemes and element refinements to sharpen the semantic precision of metadata; these qualifiers were published as a recommendation in July 2000 [4]. Encoding schemes are qualifiers that assist in the interpretation of the element value; they include controlled vocabularies and formal notations. Element refinements make the meaning of an element more specific without extending it: for example, 'Relation' has refinements such as "Is Part Of" and "Is Referenced By", and 'Date' can be refined by terms such as "Available" and "Modified". A refined element shares the meaning of the unqualified element, but with a more restricted scope. [5]

Note that with both encoding schemes and element refinements, it is essential that their definitions be publicly available and that a client be able to ignore a qualifier during a search and proceed. 'Ignore' here refers to the principle that some have called "dumbing-down": implementers need to ensure that even if a client fails to understand an element qualifier such as "DC.Date.Modified", which refers not to the date of creation but to the date of a change to the original version, it can still at least assign the date value to the simple element DC.Date. It is further suggested that adopting human-readable values where appropriate (such as named places under 'Coverage') provides a further benefit.
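The "dumbing-down" principle can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes qualified names are written in the dotted form used above ("DC.Date.Modified"), and the function names dumb_down and dumb_down_record are hypothetical rather than part of any Dublin Core tool.

```python
# Lower-cased names of the 15 simple Dublin Core elements.
SIMPLE_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def dumb_down(name):
    """Reduce a dotted name such as 'DC.Date.Modified' to 'DC.Date'.

    Anything after the element name is treated as a qualifier and dropped.
    """
    parts = name.split(".")
    if len(parts) < 2 or parts[0].upper() != "DC":
        raise ValueError(f"not a Dublin Core name: {name}")
    if parts[1].lower() not in SIMPLE_ELEMENTS:
        raise ValueError(f"unknown Dublin Core element: {parts[1]}")
    return "DC." + parts[1]


def dumb_down_record(pairs):
    """Apply dumb_down to every (name, value) pair, keeping values intact."""
    return [(dumb_down(name), value) for name, value in pairs]


# Hypothetical qualified record.
qualified = [
    ("DC.Date.Modified", "2000-07-11"),
    ("DC.Relation.IsPartOf", "http://example.org/collection"),
]
print(dumb_down_record(qualified))
```

A client that does not recognise "Modified" still recovers a usable DC.Date value; what it loses is only the knowledge that the date is a modification date.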

Deployment

A criticism that has been made of Dublin Core is that it is unduly open to variable interpretation by users [6]. A difficulty that has been identified [7] is the lack of clear boundaries in the 1:1 rule, which states that each discrete source should attract its own discrete metadata record. However, there is genuine doubt as to what exactly constitutes a discrete source. It is possible to maintain that a photograph, diagram or graph within an article constitutes a discrete source in itself. It follows from that argument that the choice of an article, for example, as the basic discrete source is purely arbitrary.

One of the strengths claimed for Dublin Core is its growing international standing [8]; as of November 1999, there were versions in over 20 languages. The Working Group on Dublin Core in Multiple Languages is coordinating efforts to link these versions in a distributed registry using the Resource Description Framework technology being developed by the W3C. [9] However, the major claim for Dublin Core is its simplicity. It is designed to be used by non-cataloguers as well as resource description experts, and is far more accessible to the former than, say, USMARC. Yet at the same time Dublin Core supports, through its qualifiers, enough extensibility and flexibility to accommodate the semantics of more complex description standards. It is anticipated that different communities will be able to use the DC elements for core descriptive information which will be usable across the Internet, whilst at the same time allowing domain-specific additions which make sense within a more limited arena [9]. Dublin Core provides, it is claimed, a widely understood set of descriptors which supports semantic interoperability across disciplines.

Yet by the same token, such extensibility, where it is possible to use local extensions to the core set and so stray from the defined DC qualifiers, is regarded by some as a weakness. On the other hand, it is regarded as a distinct benefit by those who would argue that imposing too many constraints discourages adoption by a wider range of the resource description community. In its survey of participating services, the Renardus Project [7] discovered that most of them wished to see not only the adoption of Dublin Core but also support for DC semantics and RDF/XML syntax. (It is worth noting that Dublin Core concepts are equally applicable to virtually any file format, as long as the metadata is in a form suitable for interpretation both by search engines and by human beings [9].) Whilst Dublin Core is simpler than some other element sets, it can be argued that it is generic enough to work across disciplines and provides a standard format onto which many services would be able to map their own formats [10]. It could be justly argued that Dublin Core, because of the increasing buy-in of many diverse communities, offers a convenient organizing point for "translating" between metadata systems [8].

Related Standards

Other resource description formats proliferate; to name a few: Encoded Archival Description (EAD), Government Information Locator Service (GILS), IAFA/whois++ Templates, LDAP Data Interchange Format (LDIF), MARC and its various forms, e.g. USMARC and UKMARC, Summary Object Interchange Format (SOIF) and Text Encoding Initiative (TEI) Independent Headers. The list is much longer, but it would be reasonable to argue that many of the formats listed above and elsewhere [11] are domain-specific, often having evolved from within a particular area of cataloguing to meet the needs of that specific domain. This invites debate about the scope of a resource description format in terms of what it is called upon to do; the appropriate complexity and granularity of its structure depend on the specificity of the environment or domain in which it must operate.

Relevance to IMesh context

The question to be posed about Dublin Core is therefore whether it is appropriate to the needs of the IMesh project. It would be reasonable to anticipate some of those requirements. The complexity of the chosen format will to some degree be determined by the context in which it will work, i.e. the description of resources on the web. Given the international context of IMesh, the format will in all likelihood have to be general enough to cover a variety of disciplines. It must also be flexible enough to allow participating organisations to map their own formats against Dublin Core in those circumstances where their format does not automatically comply.

However, drawbacks to the acceptance of Dublin Core include the likelihood that not all services will hold the qualifiers and so would face an incomplete reading of their records during searches. Alternatively, and just as unpopular, they would have to engage in a programme to adapt their records to rectify this, something that not all might be prepared to do. In the other camp, there will be some who would regard the refinements as they stand as insufficient. In the context of format conversion, the criticism has been made that simple DC is quite insufficient to produce a MARC record without the addition of DC qualifiers. In the other direction, conversion from MARC to simple Dublin Core cannot but result in the loss of data [12]. A further possible drawback is that Dublin Core imposes no rules for content even though it defines semantics. There is still ground to be covered with regard to cataloguing rules, as well as standardisation with regard to URIs, where it has been suggested that the W3C should initiate activity.
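The loss of data when converting from MARC to simple Dublin Core can be illustrated with a toy crosswalk in Python. The mapping table below is a deliberately tiny, hypothetical one chosen for illustration; it is not an official MARC to Dublin Core crosswalk, and the function name marc_to_simple_dc and the sample record are assumptions. The tags used (100 personal name main entry, 245 title statement, 300 physical description, 650 topical subject) are standard MARC field tags.

```python
# Hypothetical, deliberately incomplete crosswalk: MARC tag -> DC element.
CROSSWALK = {
    "100": "Creator",
    "245": "Title",
    "650": "Subject",
}


def marc_to_simple_dc(fields):
    """Convert (tag, value) pairs to simple DC pairs.

    Returns the converted pairs together with the tags that had no
    Dublin Core target in CROSSWALK and were therefore dropped.
    """
    dc_pairs = []
    dropped = []
    for tag, value in fields:
        element = CROSSWALK.get(tag)
        if element is None:
            dropped.append(tag)  # no target element: the data is lost
        else:
            dc_pairs.append((element, value))
    return dc_pairs, dropped


# Hypothetical MARC-like record, flattened to one value per field.
marc_record = [
    ("100", "Example, Author"),
    ("245", "An example title"),
    ("300", "xii, 200 p."),
    ("650", "Metadata"),
]
dc_pairs, dropped = marc_to_simple_dc(marc_record)
print(dc_pairs)
print("dropped tags:", dropped)
```

Even in this toy case the physical description has nowhere to go, and the subfield structure of each MARC field has already been flattened before conversion starts.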

There is an inherent tension within Dublin Core, which is pulled between criticism and praise. Critics regard it as over-simplified and lacking in refinements; supporters hold that it offers a manageable degree of complexity, sufficient for authors to describe their web resources across a variety of disciplines without recourse to over-complex, domain-specific resource description formats.

References

[1] Diffuse -- Metadata Interchange Standards : ISO 11179: Specification and Standardization of Data Elements
http://www.diffuse.org/meta.html

[2] Dublin Core Metadata Initiative: DCMI Type Vocabulary
http://purl.org/DC/documents/rec/dcmi-type-vocabulary-20000711.htm

[3] The Getty Thesaurus of Geographic Names:
http://shiva.pub.getty.edu/tgn_browser/

[4] Dublin Core Metadata Initiative: Dublin Core Qualifiers
http://purl.oclc.org/dc/documents/rec/dcmes-qualifiers-20000711.htm

[5] List discussion: Approval of initial Dublin Core Interoperability Qualifiers / mail message by Stu Weibel to dc-general, 17 April 2000
http://www.mailbase.ac.uk/lists/dc-general/2000-04/0010.html

Approval of initial Dublin Core Interoperability Qualifiers / mail message by Roy Tennant to dc-general, 25 April 2000
http://www.mailbase.ac.uk/lists/dc-general/2000-04/0012.html

Approval of initial Dublin Core Interoperability Qualifiers / mail message by Ray Denenberg, Library of Congress, to dc-general, 27 Apr 2000
http://www.mailbase.ac.uk/lists/dc-general/2000-04/0019.html

[6] Cross-domain Resource Discovery : Integrated Discovery and use of Textual, Numeric and Spatial Data
Ray R. Larson (University of California, Berkeley) and Paul B. Watry (University of Liverpool), January 1999
http://cheshire.lib.berkeley.edu/proposal.html

[7] Renardus Project Technical Standards Report, May 2000
http://nwi.dtv.dk/RENARDUS/D2.1/

[8] Metadata Summit : Meeting Report, Organized by the Research Libraries Group, California, July 1997
http://www.rlg.org/meta9707.html#introductory

[9] Dublin Core Metadata Initiative: Using Dublin Core, Diane Hillmann July 2000
http://purl.org/DC/documents/wd/usageguide-20000716.htm

[10] Metadata : Mapping between metadata formats, Edited by M.Day, UKOLN
http://www.ukoln.ac.uk/metadata/interoperability/

[11] Dempsey, L., Heery, R., Specification for resource description methods.
Part 1: A review of metadata: a survey of current resource description formats. (DESIRE D3.2 part 1)
http://www.surfnet.nl/innovatie/desire1/deliver/WP3/D32-1.html

[12] Dublin Core and metadata: a tutorial. Metadata Workshop, Luxembourg, December 1997. Andy Powell, Lorcan Dempsey, UKOLN
http://hosted.ukoln.ac.uk/ec/metadata-1997/tutorial/presentation/
