
The IMesh Toolkit

Using trans.pl version 0.1: a step-by-step guide


This page guides you through downloading trans.pl and using it to make transformations on OAI records.
  • Download the Perl script
    Download the file trans.pl
    (If the browser displays the file instead of prompting you to download it, right-click the link and choose Save Target As.)
    Place the script anywhere you can run Perl files.
    You will need the XML::Simple library, which you can obtain from CPAN.
  • Prepare your input files
    The input files need to conform to the OAI-defined schema for encoding Dublin Core (DC) in XML. You can consult the official version of the schema, or take a look at a schematic view of a record that conforms to the schema. If you are making your records available in an OAI repository, you should already support a version of your records in the required schema, as this is mandatory for repositories.
    The input file may contain one or more records. The records must be enclosed in a root <Records> element. For example, if your list of records is the result of an OAI ListRecords request, change the <ListRecords> opening and closing tags to <Records> tags, and delete any enclosing elements so that <Records> becomes the root element.
    Sample file with one record
    Sample file containing multiple records
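The re-rooting step described above can be sketched in a few lines of Python. This is only an illustration and is not part of trans.pl: the function name wrap_as_records and the sample response are hypothetical, and the sketch assumes that the response contains a single, unprefixed <ListRecords> element and that the records do not depend on namespace declarations made on the enclosing elements that are discarded.

```python
import re


def wrap_as_records(oai_response):
    """Return the records of a ListRecords response under a <Records> root.

    Hypothetical helper: everything outside <ListRecords>...</ListRecords>
    is dropped, and the ListRecords tags themselves become <Records> tags.
    """
    match = re.search(r"<ListRecords\b[^>]*>(.*?)</ListRecords>",
                      oai_response, re.DOTALL)
    if match is None:
        raise ValueError("no <ListRecords> element found")
    return "<Records>" + match.group(1) + "</Records>\n"


if __name__ == "__main__":
    # Made-up, heavily shortened response used only for demonstration.
    response = (
        "<response><responseDate>2002-01-01</responseDate>"
        "<ListRecords><record><metadata/></record></ListRecords></response>"
    )
    print(wrap_as_records(response))
```

Anything else that sits inside <ListRecords> in a real response is carried over unchanged, so it is worth checking the result by eye before running trans.pl on it.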
  • Define the configuration file
    The configuration file is made up of instruction lines, one instruction per line. Each instruction consists of the name of the element that you wish to modify, followed by a regular expression to be applied to the content of that element. For example, the instruction
    type s/Tutorials/LearningMaterialCourseware/i
    applies the regular expression s/Tutorials/LearningMaterialCourseware/i to all the type elements that are children of the dc child element of the metadata element of each record. In this example, the text Tutorials within the type element is changed to LearningMaterialCourseware. The trailing i makes the match case-insensitive.
    To apply a different regular expression, edit the file: change the element name to the element the expression should apply to, put the text that you want to change in the first portion of the expression, and put the new text in the second portion. For example, if you would like the type to be Educational, change the instruction to:
    type s/Tutorials/Educational/i
    For more about Perl regular expressions, see here, and see also a more general article on regular expressions in UNIX.
    Note that in this version only the type elements in the metadata portion of a record are affected. The content of <type> tags in other sections (e.g. children of the about element) is left unchanged.
    Sample configuration file
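To make the shape of an instruction line concrete, here is a minimal Python sketch that splits a line into its element name and its s/old/new/flags parts and applies the substitution to a piece of text. It illustrates what an instruction means and is not how trans.pl itself is implemented. The function names are hypothetical, only the i and g flags are handled, the replacement is treated as literal text (no $1-style references), and Python's regular expression syntax differs from Perl's in some details.

```python
import re


def parse_instruction(line):
    """Split 'element s/pattern/replacement/flags' into its four parts."""
    element, expression = line.split(None, 1)
    match = re.fullmatch(
        r"s/((?:[^/\\]|\\.)*)/((?:[^/\\]|\\.)*)/([a-z]*)",
        expression.strip(),
    )
    if match is None:
        raise ValueError("not an s/old/new/flags expression: " + expression)
    pattern, replacement, flags = match.groups()
    return element, pattern, replacement, flags


def apply_instruction(line, element_name, text):
    """Apply the instruction to text if it targets element_name."""
    element, pattern, replacement, flags = parse_instruction(line)
    if element != element_name:
        return text  # the instruction is for a different element
    re_flags = re.IGNORECASE if "i" in flags else 0
    count = 0 if "g" in flags else 1  # without g, replace the first match only
    # A function is used as the replacement so the new text is inserted literally.
    return re.sub(pattern, lambda _m: replacement, text,
                  count=count, flags=re_flags)


if __name__ == "__main__":
    instruction = "type s/Tutorials/LearningMaterialCourseware/i"
    print(apply_instruction(instruction, "type", "Tutorials"))
    print(apply_instruction(instruction, "title", "Tutorials"))
```

The second call leaves the text unchanged because the instruction names the type element, not title.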
  • Run the script
    trans.pl -i <inputfile> -o <outputfile> -c <configfile>
    where <inputfile> is the name of the file containing the XML, <outputfile> is the name of the new file that will contain the transformed XML, and <configfile> is the name of the configuration file in which the transformations are specified.
    Example: if your records are in a file called myInputRecords.xml, and you would like the output to be written to a file called myChangedRecords.xml, use the command:
    trans.pl -i ./myInputRecords.xml -o myChangedRecords.xml -c myConfig
    Note that there is no requirement for all the files to be in the same directory. The following would work just as well:
    OAI/scripts/trans.pl -i OAI/records/myInputRecords.xml -o OAI/newrecords/myChangedRecords.xml -c OAI/configfiles/myConfig
    assuming those are the correct paths to all the files.
    Remember that the input file must be given with an explicit path, as in ./myInputRecords.xml above, even if it is in the same directory as the Perl script.
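If you call the script from another program, the command line can be assembled as in the following Python sketch. The helper build_command is hypothetical. Running the script through the perl interpreter is an assumption about your installation. The sketch adds ./ to an input file name that has no directory part, in line with the reminder above.

```python
import os


def build_command(script, input_file, output_file, config_file):
    """Return an argument list for running trans.pl (hypothetical helper)."""
    # Give the input file an explicit directory part, e.g. "./records.xml".
    if not os.path.dirname(input_file):
        input_file = "./" + input_file
    return ["perl", script,
            "-i", input_file,
            "-o", output_file,
            "-c", config_file]


if __name__ == "__main__":
    command = build_command("trans.pl", "myInputRecords.xml",
                            "myChangedRecords.xml", "myConfig")
    print(" ".join(command))
```

The returned list can be handed to subprocess.run(command, check=True) from the Python standard library to execute the script.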
  • Example of output
    Applying the instructions in the sample configuration file to the sample input file with multiple records produces this output file.
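The overall effect of a transformation, including the rule that only elements in the metadata portion are changed, can be pictured with the following Python sketch. It is an illustration under stated assumptions and not the code of trans.pl: the function names and the sample document are made up, element names are compared by local name so that namespaced input is also handled, the match is always case-insensitive, and only the first match in each element is replaced.

```python
import re
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix, if any, from an element tag."""
    return tag.rsplit("}", 1)[-1]


def children(parent, name):
    """Return the direct children of parent whose local name is name."""
    return [child for child in parent if local_name(child.tag) == name]


def transform(xml_text, element_name, pattern, replacement):
    """Rewrite element_name only where it sits under record/metadata/dc."""
    root = ET.fromstring(xml_text)
    for record in children(root, "record"):
        for metadata in children(record, "metadata"):
            for dc in children(metadata, "dc"):
                for target in children(dc, element_name):
                    if target.text:
                        target.text = re.sub(
                            pattern, lambda _m: replacement, target.text,
                            count=1, flags=re.IGNORECASE)
    return ET.tostring(root, encoding="unicode")


# Made-up, simplified record used only for demonstration.
SAMPLE = (
    "<Records><record>"
    "<metadata><dc><title>Perl basics</title><type>Tutorials</type></dc></metadata>"
    "<about><type>Tutorials</type></about>"
    "</record></Records>"
)

if __name__ == "__main__":
    print(transform(SAMPLE, "type", "Tutorials", "Educational"))
```

In the result, the type inside metadata reads Educational, while the type inside about still reads Tutorials.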