Downloading from the ORA repository
ORA is an Open Access platform, and we are committed to making as much of our content available to as many users as possible. We are happy to work with those who want to download all or a large sample of the ORA website. However, we do ask you to get in contact, so we can advise the best way to do this without causing any problems to the ORA website. Unexpected 'spidering' or 'webscraping' of our content may lead to us blocking your access to the service.
Please note that ORA is a repository platform and not a publisher. Although our content is free to download, we are not the copyright owners. By using this service you are agreeing to the ORA Terms of Use. Individual records and binary files may also have their own licences or terms that describe how that content can be reused.
A version of our metadata is available under a CC0 licence (see RIOXX Terms CC0, below).
OAI-PMH: a guide for harvesters
ORA supports and participates in the Open Archives Initiative (OAI). ORA is a registered OAI-PMH data provider. It provides metadata for all public records, and that metadata is updated as soon as a record is published or modified.
Base URL
The OAI-PMH endpoint uses OAI-PMH v2.0 and is available at the base URL https://ora.ox.ac.uk/oai2
Item = ORA record
Each record in ORA is modeled as an Item in the OAI-PMH interface. Only the most recent version of each record is exposed via this interface.
Datestamps
Every OAI-PMH metadata record has a datestamp associated with it, which is the last modification time of that record in the ORA public website.
Because the current ORA public website dates from April 2018, the OAI-PMH datestamp values do not correspond to the original submission or publication times of older records. They may not correspond for newer records either, because administrative and bibliographic updates also change the datestamp.
The earliest datestamp is given by the <earliestDatestamp> element of the Identify response.
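As a minimal sketch of reading that value, the Python below builds an Identify request URL and extracts <earliestDatestamp> from a response body. The function names are hypothetical, and the sample XML is an invented, trimmed response: its datestamp is a placeholder, not the value ORA actually returns. The request URL form follows the example URLs elsewhere in this guide.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Request URL form taken from the example URLs in this guide.
BASE_URL = "https://ora.ox.ac.uk/oai2/"
# Standard OAI-PMH 2.0 XML namespace.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def identify_url(base_url=BASE_URL):
    """Return the URL for an OAI-PMH Identify request (hypothetical helper)."""
    return base_url + "?" + urlencode({"verb": "Identify"})


def earliest_datestamp(identify_xml):
    """Return the text of <earliestDatestamp>, or None if it is absent."""
    root = ET.fromstring(identify_xml)
    element = root.find("oai:Identify/oai:earliestDatestamp", OAI_NS)
    if element is None or not element.text:
        return None
    return element.text.strip()


# Invented sample response; the datestamp is a placeholder value.
SAMPLE_IDENTIFY = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <Identify>
    <earliestDatestamp>2018-04-01T00:00:00Z</earliestDatestamp>
  </Identify>
</OAI-PMH>"""

print(identify_url())
print(earliest_datestamp(SAMPLE_IDENTIFY))
```

In a real harvester the XML would come from fetching identify_url() rather than from a hard-coded string.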
The OAI-PMH interface does not support selective harvesting based on publication date. The datestamps are designed to support incremental harvesting of updates on an ongoing basis. It is not possible to selectively harvest only, say, records published in February 2017.
Except for selective harvesting based on 'Type of work' (see the description of Sets below), the interface is designed to support copying and synchronization of a complete set of ORA metadata. To harvest metadata for all records, either make requests without a datestamp range (recommended), or make requests from the <earliestDatestamp> through to the present. Be aware that, because of bulk updates, some dates have very large numbers of updated records.
Once an initial harvest has been completed, the copy may be maintained by making incremental harvesting requests with the from date set to the date of last harvest (from is best taken from the last server response; don't set the until date).
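The incremental pattern described above might be sketched as follows. This is an illustration under assumptions, not a reference client: the helper names are hypothetical, the sample XML and its resumption token are invented, and the sketch assumes the from value is taken from the standard OAI-PMH <responseDate> element. Depending on the granularity the repository reports in its Identify response, that value may need truncating to YYYY-MM-DD before reuse.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

BASE_URL = "https://ora.ox.ac.uk/oai2/"
OAI = "{http://www.openarchives.org/OAI/2.0/}"  # OAI-PMH 2.0 namespace


def list_records_url(metadata_prefix="oai_dc", from_date=None,
                     resumption_token=None):
    """Build a ListRecords URL; never sets 'until', as advised above."""
    if resumption_token:
        # In OAI-PMH a resumptionToken is an exclusive argument.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date
    return BASE_URL + "?" + urlencode(params)


def read_page(xml_text):
    """Return (responseDate, resumption token or None) for one response."""
    root = ET.fromstring(xml_text)
    response_date = root.findtext(f"{OAI}responseDate")
    token = root.findtext(f"{OAI}ListRecords/{OAI}resumptionToken")
    token = token.strip() if token and token.strip() else None
    return response_date, token


# Invented sample page: one response with a further page still to fetch.
SAMPLE_PAGE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2024-01-15T10:30:00Z</responseDate>
  <ListRecords>
    <resumptionToken>abc123</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

response_date, token = read_page(SAMPLE_PAGE)
print(list_records_url(resumption_token=token))   # next page of this harvest
print(list_records_url(from_date="2024-01-15"))   # start of the next harvest
```

A harvester would store the responseDate of the first response in a run and use it as the from value for the following run, so that updates made during the harvest are not missed.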
Sets
ORA records are grouped into sets for selective harvesting, based on their 'Type of work' within the ORA system, e.g. 'thesis', 'dataset', 'journal article'. You may request a list of all the sets supported with the ListSets verb.
https://ora.ox.ac.uk/oai2/?verb=ListSets
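To illustrate how a set might be used once it is known, the sketch below parses a ListSets response and builds a ListRecords request restricted to one set. The helper names are hypothetical, and the sample XML is invented: the setSpec value "thesis" is a placeholder, so the real values should be read from the ListSets response itself.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

BASE_URL = "https://ora.ox.ac.uk/oai2/"
OAI = "{http://www.openarchives.org/OAI/2.0/}"  # OAI-PMH 2.0 namespace


def parse_sets(list_sets_xml):
    """Map each setSpec to its setName from a ListSets response."""
    root = ET.fromstring(list_sets_xml)
    return {
        s.findtext(f"{OAI}setSpec"): s.findtext(f"{OAI}setName")
        for s in root.iter(f"{OAI}set")
    }


def set_harvest_url(set_spec, metadata_prefix="oai_dc"):
    """Build a ListRecords URL limited to one set."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix,
              "set": set_spec}
    return BASE_URL + "?" + urlencode(params)


# Invented sample; the setSpec and setName are placeholders.
SAMPLE_SETS = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListSets>
    <set><setSpec>thesis</setSpec><setName>Thesis</setName></set>
  </ListSets>
</OAI-PMH>"""

for spec, name in parse_sets(SAMPLE_SETS).items():
    print(name, "->", set_harvest_url(spec))
```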
Update schedule
New records are made available immediately on publication.
Record deletion policy
The ORA OAI-PMH service does not maintain information about deletions. Records deleted from the ORA system are removed from the OAI-PMH service immediately, so no deleted-record entry is left for harvesters to see.
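Because deletions are not reported, a harvester that keeps a local copy would need to detect them itself. One possible approach, sketched below with hypothetical names and the example identifier from this guide plus one invented identifier, is to compare the identifiers held locally with those returned by a fresh full harvest.

```python
def find_deleted(local_ids, harvested_ids):
    """Return identifiers held locally that a full harvest no longer returns."""
    return sorted(set(local_ids) - set(harvested_ids))


# Identifiers stored from an earlier harvest (second one is invented).
local = [
    "oai:ora.ox.ac.uk:uuid:000d2073-9081-4a5b-b238-021cc7178e49",
    "oai:ora.ox.ac.uk:uuid:11111111-2222-3333-4444-555555555555",
]
# Identifiers seen in a fresh, complete harvest.
fresh = ["oai:ora.ox.ac.uk:uuid:000d2073-9081-4a5b-b238-021cc7178e49"]

print(find_deleted(local, fresh))
```

The comparison is only meaningful against a complete harvest. An incremental harvest returns changed records only, so a record missing from it has not necessarily been deleted.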
Service availability
If required, ORA performs scheduled maintenance activity on Tuesday mornings from 07:00 to 09:00 (UK time). This may result in the OAI-PMH service being unavailable for short periods.
Identifiers
Internal ORA identifiers (record identifiers) are in the form uuid:12345678-1234-1234-12345678abcd.
ORA OAI-PMH identifiers are in the format oai_scheme:repository_identifier:record_identifier, e.g. oai:ora.ox.ac.uk:uuid:000d2073-9081-4a5b-b238-021cc7178e49. This is a change from the previous ORA OAI-PMH endpoint, where identifiers did not have the OAI scheme or Repository Identifier prefixes.
Harvesters which used the previous endpoint can map identifiers by prefixing them with oai:ora.ox.ac.uk:.
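The mapping can be sketched as a pair of small helpers. The function names are hypothetical, and the prefix is the one shown in the example OAI-PMH identifier above (oai:ora.ox.ac.uk:).

```python
# Prefix taken from the example OAI-PMH identifier in this guide.
OAI_PREFIX = "oai:ora.ox.ac.uk:"


def to_oai_identifier(record_id):
    """Add the OAI scheme and repository prefix to an internal ORA identifier."""
    if record_id.startswith(OAI_PREFIX):
        return record_id  # already in the new format
    return OAI_PREFIX + record_id


def to_record_identifier(oai_id):
    """Strip the prefix to recover the internal ORA identifier."""
    if oai_id.startswith(OAI_PREFIX):
        return oai_id[len(OAI_PREFIX):]
    return oai_id


old_style = "uuid:000d2073-9081-4a5b-b238-021cc7178e49"
print(to_oai_identifier(old_style))
print(to_record_identifier(to_oai_identifier(old_style)))
```

Both helpers leave an identifier unchanged if it is already in the target form, so they are safe to run over a store that mixes old and new identifiers.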
Metadata formats and downstream targets
Metadata for each item (record) is available in several formats. Not all formats are supported for all records.
You may request a list of all the metadata formats supported with the ListMetadataFormats verb.
https://ora.ox.ac.uk/oai2/?verb=ListMetadataFormats
Format | Metadata Prefix | Restriction | Description |
---|---|---|---|
OAI DC | oai_dc | | OAI-PMH standard Dublin Core (DC). |
Datacite | datacite_dc | Datasets only | Customised DC format for the DataCite service. |
DART | dart_dc | Theses only | Customised DC format for the DART-Europe service. |
SOLO | solo_dc | | Customised DC format for the Oxford University SOLO service |
BASE | base_dc | | Customised DC format for the BASE service. |
OpenAIRE | oai_openaire | | Customised metadata format for the OpenAIRE Literature Repository Guidelines v4.0. |
EThOS | uketd_dc | Theses only | Extended DC format for the EThOS service. |
RIOXX Terms | rioxx_terms | | Metadata format for the RIOXX V2 Metadata application profile. This format has additions for deposit and record publication dates in support of UKRI and CORE recommendations. These updates use the RIOXX V3 Beta formats for these fields |
RIOXX Terms CC0 | rioxx_terms_cc0 | | The rioxx_terms_cc0 metadata format is released under a CC-0 licence. It is identical to the rioxx_terms metadata format, with the exception of abstracts/summary descriptions, which are not included. |
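Since not every format is available for every record, a harvester may want to check before requesting one. The sketch below relies on the optional identifier argument that the OAI-PMH specification defines for ListMetadataFormats. Whether ORA restricts the returned list per record in this way is an assumption, the helper names are hypothetical, and the sample XML is an invented, trimmed response.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

BASE_URL = "https://ora.ox.ac.uk/oai2/"
OAI = "{http://www.openarchives.org/OAI/2.0/}"  # OAI-PMH 2.0 namespace


def metadata_formats_url(identifier=None):
    """Build a ListMetadataFormats URL, optionally for a single record."""
    params = {"verb": "ListMetadataFormats"}
    if identifier:
        params["identifier"] = identifier
    return BASE_URL + "?" + urlencode(params)


def parse_prefixes(formats_xml):
    """Return the metadataPrefix values listed in a response, in order."""
    root = ET.fromstring(formats_xml)
    return [e.text.strip() for e in root.iter(f"{OAI}metadataPrefix") if e.text]


# Invented sample response listing two of the prefixes from the table above.
SAMPLE_FORMATS = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListMetadataFormats>
    <metadataFormat><metadataPrefix>oai_dc</metadataPrefix></metadataFormat>
    <metadataFormat><metadataPrefix>uketd_dc</metadataPrefix></metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>"""

prefixes = parse_prefixes(SAMPLE_FORMATS)
print(metadata_formats_url())
print("uketd_dc" in prefixes, "datacite_dc" in prefixes)
```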