The IMesh Toolkit
Using trans.pl version 0.1: a step-by-step guide
This page guides you through downloading trans.pl and using it to
make transformations on OAI records.
- Download the Perl script
Download the file trans.pl
(If the browser displays the file instead of prompting you to download it,
right-click the link and choose Save Target As.)
Place the script anywhere you can run Perl files.
You will need the XML::Simple library, which you can obtain from CPAN.
- Prepare your input files
The input files need to conform to the OAI-defined schema for encoding
DC in XML. You can consult the official version of the schema, or take
a look at a schematic view of a record that conforms to the schema.
If you are making your records available in an OAI repository, you should
already support a version of your records in the required schema, as this is
mandatory for repositories.
The input file may contain one or more records.
You must add a root <Records> element to the document.
For example, if your list of records is the result of an OAI ListRecords request, change the
<ListRecords> opening and closing tags to <Records> tags, and delete any enclosing elements so that <Records> becomes the root element.
Sample file with one record
Sample file containing multiple records
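The manual edit described above can also be scripted. The following is a minimal sketch in Python (not part of trans.pl), assuming the ListRecords response is available as a string. The function name wrap_records and the sample response are hypothetical. Any attributes on the <ListRecords> tag are carried over to <Records>; whether trans.pl needs or tolerates them is not confirmed here.

```python
import re
import xml.etree.ElementTree as ET

def wrap_records(oai_response):
    """Rename <ListRecords> to <Records> and drop everything outside it."""
    match = re.search(
        r"<ListRecords\b([^>]*)>(.*)</ListRecords>", oai_response, re.DOTALL
    )
    if match is None:
        raise ValueError("no <ListRecords> element found")
    attributes, body = match.groups()
    return "<Records" + attributes + ">" + body + "</Records>"

# Hypothetical, heavily simplified ListRecords response.
sample = """<?xml version="1.0"?>
<OAI-PMH>
  <responseDate>2002-06-01T12:00:00Z</responseDate>
  <ListRecords>
    <record><metadata><dc><type>Tutorials</type></dc></metadata></record>
  </ListRecords>
</OAI-PMH>"""

wrapped = wrap_records(sample)

# Parse the result to confirm it is well-formed and rooted at <Records>.
root = ET.fromstring(wrapped)
print(root.tag, len(root.findall("record")))
```

The text-based approach mirrors the hand edit; it does not validate the records against the schema.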
- Define the configuration file
The configuration file is made up of instruction lines. Each instruction is on
a new line and consists of the name of the element that you wish to modify, followed by a regular expression to be applied to the content of that element.
For example, the instruction
type s/Tutorials/LearningMaterialCourseware/i
applies the regular expression s/Tutorials/LearningMaterialCourseware/i
to all the type elements that are children of the dc child element of the
metadata element of each record. In this example, the text Tutorials
within the type element is changed to LearningMaterialCourseware.
To apply a different regular expression, edit the file and change the name
of the element to which the regular expression must be applied. Then edit
the expression so that the first portion contains the text that you want to
change and the second portion contains the new text. For example, if you would
like the type to be Educational, change the expression to:
type s/Tutorials/Educational/i
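To make the instruction format concrete, here is a rough Python sketch of how a line such as the ones above could be split into an element name and a substitution, and then applied to some text. It is an illustration only, not the code inside trans.pl: the function names are hypothetical, the replacement is treated as literal text, and only the i and g flags are interpreted. Perl and Python regular expression syntax agree for simple patterns like these but differ in places.

```python
import re

def parse_instruction(line):
    """Split 'element s/pattern/replacement/flags' into its four parts."""
    element, expression = line.split(None, 1)
    match = re.fullmatch(
        r"s/((?:[^/\\]|\\.)*)/((?:[^/\\]|\\.)*)/([a-z]*)", expression.strip()
    )
    if match is None:
        raise ValueError("not an s/// expression: %r" % expression)
    pattern, replacement, flags = match.groups()
    return element, pattern, replacement, flags

def apply_instruction(text, pattern, replacement, flags):
    """Apply the substitution to the text content of one element."""
    re_flags = re.IGNORECASE if "i" in flags else 0
    # As in Perl, only the first match is replaced unless the g flag is given.
    count = 0 if "g" in flags else 1
    # The lambda keeps the replacement literal (no backreference handling).
    return re.sub(pattern, lambda m: replacement, text, count=count, flags=re_flags)

element, pattern, replacement, flags = parse_instruction(
    "type s/Tutorials/LearningMaterialCourseware/i"
)
result = apply_instruction("Introductory tutorials", pattern, replacement, flags)
print(element, "->", result)  # type -> Introductory LearningMaterialCourseware
```

Because of the i flag, the lowercase "tutorials" in the sample text still matches.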
For more about Perl regular expressions see here, and a more general article
on regular expressions in UNIX.
Note that in this version only the type elements in the metadata portion will be affected.
The content of <type> tags within other sections (e.g. children of the about element) will not be affected.
Sample configuration file
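The scoping rule in the note above can be illustrated with a short Python sketch. It assumes a simplified record without XML namespaces, and the <rights> wrapper inside <about> is a hypothetical structure chosen only to show a second <type> element. It is not how trans.pl itself is implemented.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical record with a <type> in the metadata and another under <about>.
SAMPLE = """<Records>
  <record>
    <metadata><dc><type>Tutorials</type></dc></metadata>
    <about><rights><type>Tutorials</type></rights></about>
  </record>
</Records>"""

root = ET.fromstring(SAMPLE)

# Select only record/metadata/dc/type, leaving <type> elements elsewhere alone.
for node in root.findall("./record/metadata/dc/type"):
    node.text = re.sub(
        "Tutorials",
        "LearningMaterialCourseware",
        node.text or "",
        count=1,
        flags=re.IGNORECASE,
    )

print(ET.tostring(root, encoding="unicode"))
```

After the loop, the <type> under <metadata> holds the new value while the one under <about> still reads Tutorials.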
- Run the script
trans.pl -i <inputfile> -o <outputfile> -c <configfile>
where <inputfile> is the name of the file containing the XML,
<outputfile> is the name of the new file containing the transformed XML,
and <configfile> is the name of the configuration file where the
transformations are specified.
Example: if your records are in a file called myInputRecords.xml,
and you would like the output to be written to a file called
myChangedRecords.xml, use the command:
trans.pl -i ./myInputRecords.xml -o myChangedRecords.xml -c myConfig
Note that there is no requirement for all the files to be in the same directory.
The following would work just as well:
OAI/scripts/trans.pl -i OAI/records/myInputRecords.xml -o OAI/newrecords/myChangedRecords.xml -c OAI/configfiles/myConfig
assuming those are the correct paths to all the files.
Remember that the input file must be given with a path (for example
./myInputRecords.xml), even if it is in the same directory as the Perl script.
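If you run the script from another program, the command line can be assembled programmatically. The sketch below, in Python, builds the argument list using only the -i, -o and -c options shown above. The helper name trans_command is hypothetical, and invoking the script through the perl executable is an assumption. A bare input file name is given a leading "./", following the example command in this guide.

```python
import os
import shlex

def trans_command(input_file, output_file, config_file, script="trans.pl"):
    """Build the argument list for one trans.pl run (nothing is executed)."""
    # Give a bare input file name an explicit directory component.
    if not os.path.dirname(input_file):
        input_file = os.path.join(os.curdir, input_file)
    return ["perl", script, "-i", input_file, "-o", output_file, "-c", config_file]

cmd = trans_command("myInputRecords.xml", "myChangedRecords.xml", "myConfig")
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) to run the transformation.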
- Example of output
Applying the instructions in the sample configuration file
to the sample input file with multiple records produces this output file.