cyverse-de/oai-ore

Name: oai-ore

Owner: CyVerse Discovery Environment

Description: Library for generating OAI-ORE files.

Created: 2018-03-29 18:20:32.0

Updated: 2018-04-06 02:15:58.0

Pushed: 2018-04-06 02:15:57.0

Homepage: null

Size: 34

Language: Clojure

README

oai-ore

A Clojure library used to generate Open Archives Initiative Object Reuse and Exchange (OAI-ORE) files for the CyVerse Data Commons repository. The primary purpose of this library is to make it easier to generate OAI-ORE files from a set of URIs and attribute metadata stored in the CyVerse Data Store.

Usage

All examples assume that these commands have been executed in the REPL:

(require '[org.cyverse.oai-ore :as ore])
(require '[clojure.data.xml :refer :all])
(def agg-uri "http://foo.org")
(def arch-uri "http://foo.org/bar.xml")
(def file-uris ["http://foo.org/bar1.txt" "http://foo.org/bar2.txt"])

The suggested way to build an OAI-ORE file is to use the build-ore function. The result of this function is an instance of org.cyverse.oai-ore.Ore, which can then be converted to RDF/XML and serialized. The simplest such file is an empty archive:

(def empty-ore (ore/build-ore agg-uri arch-uri []))

Once you have the org.cyverse.oai-ore.Ore instance, you can convert it to RDF/XML by calling its to-rdf method:

(def rdf (ore/to-rdf empty-ore))

The result of calling this method is an instance of clojure.data.xml.Element, which can then be serialized using any serialization function available in org.clojure/data.xml. For example, you can use the following command to emit pretty-printed RDF/XML:

(print (indent-str rdf))
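
Because the result is an ordinary clojure.data.xml element, it can also be streamed to a file instead of being printed. The following is a minimal sketch and not part of oai-ore: `write-xml!` is a hypothetical helper, and `example-rdf` is a stand-in element so that the snippet runs without the library on the classpath. In practice you would pass the value bound to `rdf`. It assumes org.clojure/data.xml is available.

```clojure
(require '[clojure.data.xml :as xml]
         '[clojure.java.io :as io])

;; Stand-in for the element returned by (ore/to-rdf empty-ore), so that this
;; sketch does not depend on oai-ore itself.
(def example-rdf
  (xml/element :Description {:about "http://foo.org"}))

(defn write-xml!
  "Hypothetical helper: writes a clojure.data.xml element to `file` as UTF-8
  and returns `file`."
  [elem file]
  (with-open [w (io/writer file :encoding "UTF-8")]
    (xml/emit elem w))
  file)

;; Write to a temporary file and read the result back.
(def out-file (java.io.File/createTempFile "example-ore" ".xml"))
(write-xml! example-rdf out-file)
(println (slurp out-file))
```

`emit` writes compact XML; use `indent` in place of `emit` if the file should be pretty-printed.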

An empty OAI-ORE file is good for an example, but not very useful. To add aggregated entities, include their URIs in the third argument to build-ore:

(def populated-ore (ore/build-ore agg-uri arch-uri file-uris))

This library also supports several of the metadata attributes that are used when a data set is published to the CyVerse Data Commons repository. The following attributes are currently supported:

| CyVerse Attribute | ORE Element |
| --- | --- |
| datacite.title | dc:title |
| datacite.publisher | dc:publisher |
| datacite.creator | dc:creator |
| datacite.resourcetype | dc:type |
| contributorName | dc:contributor |
| Subject | dc:subject |
| Rights | dc:rights |
| Description | dc:description |
| Identifier | dc:identifier |
| geoLocationBox | dcterms:Box |
| geoLocationPlace | dcterms:Location |
| geoLocationPoint | dcterms:Point |

Any attribute that is associated with the data set that is not in this list is ignored. Similarly, any attribute that is in the list but either contains an empty value or is not associated with the data set is ignored:

(def attr-ore (ore/build-ore agg-uri arch-uri file-uris [{:attr "datacite.title" :value "The Title"}
                                                         {:attr "datacite.creator" :value "The Creator"}
                                                         {:attr "ignored.attribute" :value "Who Cares?"}
                                                         {:attr "Subject" :value ""}]))
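
The selection rule described above can be sketched in plain Clojure. This illustrates the rule as stated in the prose and is not the library's code: `attr->element` and `supported-attrs` are hypothetical names, and the library's actual handling of blank values may differ.

```clojure
(require '[clojure.string :as string])

;; Mapping copied from the table of supported attributes.
(def attr->element
  {"datacite.title"        "dc:title"
   "datacite.publisher"    "dc:publisher"
   "datacite.creator"      "dc:creator"
   "datacite.resourcetype" "dc:type"
   "contributorName"       "dc:contributor"
   "Subject"               "dc:subject"
   "Rights"                "dc:rights"
   "Description"           "dc:description"
   "Identifier"            "dc:identifier"
   "geoLocationBox"        "dcterms:Box"
   "geoLocationPlace"      "dcterms:Location"
   "geoLocationPoint"      "dcterms:Point"})

(defn supported-attrs
  "Hypothetical helper: keeps only attributes that appear in the table and
  have a non-blank value, pairing each with its ORE element name."
  [attrs]
  (for [{:keys [attr value]} attrs
        :let [elem (attr->element attr)]
        :when (and elem (not (string/blank? value)))]
    [elem value]))

(def example-attrs
  [{:attr "datacite.title" :value "The Title"}
   {:attr "datacite.creator" :value "The Creator"}
   {:attr "ignored.attribute" :value "Who Cares?"}
   {:attr "Subject" :value ""}])

(supported-attrs example-attrs)
;; => (["dc:title" "The Title"] ["dc:creator" "The Creator"])
```

Under this rule, "ignored.attribute" is dropped because it is not in the table, and "Subject" is dropped because its value is blank.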

Serializing the RDF using (print (indent-str (ore/to-rdf attr-ore))) should produce the following output:

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:dcterms="http://purl.org/dc/terms/"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ore="http://www.openarchives.org/ore/terms/"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description rdf:about="http://foo.org">
    <rdf:type rdf:resource="http://www.openarchives.org/ore/terms/Aggregation"/>
    <ore:aggregates rdf:resource="http://foo.org/bar1.txt"/>
    <ore:aggregates rdf:resource="http://foo.org/bar2.txt"/>
    <dc:title>The Title</dc:title>
    <dc:creator>The Creator</dc:creator>
    <dc:subject/>
  </rdf:Description>
  <rdf:Description rdf:about="http://foo.org/bar.xml">
    <rdf:type rdf:resource="http://www.openarchives.org/ore/terms/ResourceMap"/>
    <ore:describes rdf:resource="http://foo.org"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://foo.org/bar1.txt"/>
  <rdf:Description rdf:about="http://foo.org/bar2.txt"/>
</rdf:RDF>

Note: the namespace declarations have been reformatted for readability.

License

http://www.cyverse.org/license


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.