diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..e43b0f9 --- /dev/null +++ b/.gitignore @@ -0,0 +1 @@ +.DS_Store diff --git a/LICENSE b/LICENSE new file mode 100644 index 0000000..5f19d88 --- /dev/null +++ b/LICENSE @@ -0,0 +1 @@ +This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA. \ No newline at end of file diff --git a/README.md b/README.md new file mode 100644 index 0000000..413cbc1 --- /dev/null +++ b/README.md @@ -0,0 +1,774 @@ +# Crossref REST API + + + +- [Crossref REST API](#crossref-rest-api) + - [Preamble](#preamble) + - [Meta](#meta) + - [API overview](#api-overview) + - [Result types](#result-types) + - [Resource components](#resource-components) + - [Parameters](#parameters) + - [Queries](#queries) + - [Field Queries](#field-queries) + - [Sorting](#sorting) + - [Facet counts](#facet-counts) + - [Filter names](#filter-names) + - [Result controls](#result-controls) + - [API versioning](#api-versioning) + - [Documentation history](#documentation-history) + + + + + +## Preamble + +The Crossref REST API is one of [a variety of tools and APIs](https://www.crossref.org/services/metadata-delivery/) that allow anybody to search and reuse our members' metadata in sophisticated ways. + + +## Meta + +### Frequency of indexing + +Records typically appear in the REST API within 20 minutes of their having been successfully deposited with Crossref. + +Summary information (e.g. counts, etc.) are processed in batch every 24 hours. + +### Learning about performance or availability problems + +We record and report service issues on our [status page](http://status.crossref.org). + +You might want to check this to see if we are already aware of a problem before you report it. 
+ +We also post notice of any ongoing performance problems with our services on our Twitter feeds at [CrossrefOrg](https://twitter.com/CrossrefOrg) and [CrossrefSupport](https://twitter.com/@CrossrefSupport). + +### Reporting performance or availability problems + +Report performance or availability problems at our [support site](https://support.crossref.org/hc/en-us). + +### Reporting bugs, requesting features + +Please report bugs with the API or the documentation on our [issue tracker](https://github.com/Crossref/rest-api-doc/issues). + +### Documentation License + +
This work is licensed under a Creative Commons Attribution 4.0 International License. + +### Metadata License + +Crossref asserts no claims of ownership to individual items of bibliographic metadata and associated Digital Object Identifiers (DOIs) acquired through the use of the Crossref Free Services. Individual items of bibliographic metadata and associated DOIs may be cached and incorporated into the user's content and systems. + +### Privacy + +We also have a [privacy policy](https://www.crossref.org/privacy/). + +### Libraries + +You might be able to avoid reading all this documentation if you instead use one of the several excellent libraries that have been written for the Crossref REST API. For example: + +- [crossref-commons](https://gitlab.com/crossref/crossref_commons_py) (Python, developed by Crossref) +- [habanero](https://github.com/sckott/habanero) (Python) +- [serrano](https://github.com/sckott/serrano) (Ruby) +- [rcrossref](https://github.com/ropensci/rcrossref) (R) +- [crossrefapi](https://github.com/fabiobatalha/crossrefapi) (Python) + +If you know of another library you would like to see listed here, please let us know about it via the [issue tracker](https://github.com/Crossref/rest-api-doc/issues). + +### Etiquette + +We want to provide a public, open, and free API for all. And we don't want to unnecessarily burden developers (or ourselves) with cumbersome API tokens or registration processes in order to use the public REST API. For that to work, we ask that you be polite and try not to do anything that will take the public REST API down or otherwise make it unusable for others. Specifically, we encourage the following polite behaviour: + +- Cache data so you don't request the same data over and over again. +- Actively monitor API response times. If they start to go up, back-off for a while. For example, add pauses between requests and/or reduce the number of parallel requests. 
+- Specify a `User-Agent` header that properly identifies your script or tool and that provides a means of contacting you via email using "mailto:". For example: +`GroovyBib/1.1 (https://example.org/GroovyBib/; mailto:GroovyBib@example.org) BasedOnFunkyLib/1.4`. + +This way we can contact you if we see a problem. + +- Report problems and/or ask questions on our [issue tracker](https://github.com/Crossref/rest-api-doc/issues). + +Alas, not all people are polite. And for this reason we reserve the right to impose rate limits and/or to block clients that are disrupting the public service. + +### Good manners = more reliable service. + +But we prefer carrots to sticks. As of September 18th 2017 any API queries that **use HTTPS and have appropriate contact information** will be directed to a special pool of API machines that are reserved for polite users. + +Why are we doing this? Well, we don't want to force users to have to register with us. But this means that if some user of the public server writes a buggy script or ignores timeouts and errors, they can really bring the API service to its knees. What's more, it is very hard for us to identify these problem users because they tend to work off multiple parallel machines and use generic User-Agent headers. They are effectively anonymous. We're starting to have to spend a lot of time dealing with these problems and the degraded performance of the public API is affecting all the polite users as well. + +So... we are keeping the public service as is. It will probably continue to fluctuate widely in performance. But now, if a client connects to the API using HTTPS and provides contact information either in their User-Agent header or as a parameter on their queries, then we will send them to a separate pool of machines. We expect to be able to better control the performance of these machines because, if a script starts causing problems, we can contact the people responsible for the script to ask them to fix it. 
Or, in extremis, we can block it. + +How does it work? Simple. You can do one of two things to get directed to the "polite pool": + +1) Include a "mailto" parameter in your query. For example: + +`https://api.crossref.org/works?filter=has-full-text:true&mailto=GroovyBib@example.org` + +2) Include a "mailto:" in your User-Agent header. For example: + +`GroovyBib/1.1 (https://example.org/GroovyBib/; mailto:GroovyBib@example.org) BasedOnFunkyLib/1.4`. + +Note that this only works if you query the API using HTTPS. You really should be doing that anyway (wags finger). + +##### Frequently anticipated questions + +**Q:** Will you spam me with marketing [bumf](https://en.oxforddictionaries.com/definition/bumf) once you have our contact info? + +**A:** No. We will only use it to contact you about problems with your scripts. + + +**Q:** Is this a secret plot to kill public access to your API? + +**A:** No. It is an attempt to keep the public API reliable. + + +**Q:** What if I provide fake or incorrect contact info? + +**A:** That is not very polite. If there is a problem and you don't respond, we'll block you. + + +**Q:** Does the contact info have to be a real name? + +**A:** No. As long as somebody actually receives and pays attention to email at the address, it can be pseudo-anonymous, or whatever. + + + +#### Rate limits + +From time to time Crossref needs to impose rate limits to ensure that the free API is usable by all. Any rate limits that are in effect will be advertised in the `X-Rate-Limit-Limit` and `X-Rate-Limit-Interval` HTTP headers. + +For ease-of-parsing, the `X-Rate-Limit-Interval` will always be expressed in seconds. 
So, for example, the following tells you that you should expect to be able to perform 50 requests a second: + +``` +X-Rate-Limit-Limit: 50 +X-Rate-Limit-Interval: 1s +``` + +Note that if we wanted to adjust the measurement window, we could specify: + +``` +X-Rate-Limit-Limit: 3000 +X-Rate-Limit-Interval: 60s +``` + + +#### Blocking + +This is always our last resort, and you can generally avoid it if you include contact information in the `User-Agent` header or `mailto` parameter as described above. + +But seriously... this is a bummer. We really want you to use the API. If you are polite about it, you shouldn't have any problems. + +### Use for production services + +What if you want to use our API for a production service that cannot depend on the performance uncertainties of the free and open public API? What if you don't want to be affected by impolite people who do not follow the [API Etiquette](#etiquette) guidelines? Well, if you’re interested in using these tools or APIs for production services, we [have a service-level offering](https://www.crossref.org/services/metadata-delivery/plus-service/) called "Plus". This service provides you with access to all supported APIs and metadata, but with extra service and support guarantees. + +#### Authorization token for Plus service + +When you sign up for the Plus service, you will be issued an API token that you should put in the `Authorization` header of all your REST API requests. This token will ensure that said requests get directed to a pool of machines that are reserved for "Plus" SLA users. 
For example, with [curl](https://curl.haxx.se/): + +``` +curl -X GET \ + https://api.crossref.org/works \ + -H 'Authorization: Bearer yJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJodHRwOi8vY3Jvc3NyZWYub3JnLyIsImF1ZXYZImVuaGFuY2VkY21zIiwianRpIjoiN0M5ODlFNTItMTFEQS00QkY3LUJCRUUtODFCMUM3QzE0OTZEIn0.NYe3-O066sce9R1fjMzNEvP88VqSEaYdBY622FDiG8Uq' \ + -H 'User-Agent: GroovyBib/1.1 (https://example.org/GroovyBib/; mailto:GroovyBib@example.org) BasedOnFunkyLib/1.4' +``` + +Note that you can still be "polite" and identify yourself as well. And, of course, replace the fake token above with the real token. + +## API overview + +The API is generally RESTFUL and returns results in JSON. JSON formats returned by the API are documented [here](https://github.com/Crossref/rest-api-doc/blob/master/api_format.md). + +The API supports HTTP and HTTPS. Examples here are provided using HTTPS. + +You should always url-encode DOIs and parameter values when using the API. DOIs are notorious for including characters that break URLs (e.g. semicolons, hashes, slashes, ampersands, question marks, etc.). + +Note that, for the sake of clarity, the examples in this document do *not* url-encode DOIs or parameter values. + +The API will only work for Crossref DOIs. You can test the registration agency for a DOI using the following route: + +`https://api.crossref.org/works/{doi}/agency` + +Testing the following Crossref DOI: + +`10.1037/0003-066X.59.1.29` + +Using the URL: + +`https://api.crossref.org/works/10.1037/0003-066X.59.1.29/agency` + +Will return the following result: + + { + status: "ok", + message-type: "work-agency", + message-version: "1.0.0", + message: { + DOI: "10.1037/0003-066x.59.1.29", + agency: { + id: "crossref", + label: "Crossref" + } + } + } + +If you use any of the API calls listed below with a non-Crossref DOI, you will get a `404` HTTP status response. Typical agency IDs include `crossref`, `datacite`, `medra` and also `public` for test DOIs. 
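
The advice above about url-encoding DOIs can be sketched in Python as follows. This is a minimal illustration using only the standard library, not an official client: the helper name `agency_url` and the constant `API_BASE` are made up for this example, and encoding the `/` inside the DOI is an assumption about the safest way to pass a DOI as a single path segment.

```python
from urllib.parse import quote

# Hypothetical constant for this sketch; the host is the one used in the examples above.
API_BASE = "https://api.crossref.org"


def agency_url(doi: str) -> str:
    """Build the /works/{doi}/agency URL with the DOI percent-encoded."""
    # safe="" makes quote() encode "/" as well, along with characters such as
    # ";", "#", "&" and "?" that would otherwise break the URL.
    return f"{API_BASE}/works/{quote(doi, safe='')}/agency"


print(agency_url("10.1037/0003-066X.59.1.29"))
```

For the DOI used above, this prints `https://api.crossref.org/works/10.1037%2F0003-066X.59.1.29/agency`.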
+ +## Result types + +All results are returned in JSON. There are three general types of results: + +- Singletons +- Headers-only +- Lists + +The mime-type for API results is `application/vnd.crossref-api-message+json` + +### Singletons + +Singletons are single results. Retrieving metadata for a specific identifier (e.g. DOI, ISSN, funder_identifier) typically returns a singleton result. + +### Headers only + +You can use HTTP HEAD requests to quickly determine "existence" of a singleton. The advantage of this technique is that it is very fast because it does not return any metadata; it only returns headers and an HTTP status code (200=exists, 404=does not exist). + +To determine if member ID `98` exists: + +`curl --head "http://api.crossref.org/members/98"` + +To determine if a journal with ISSN `1549-7712` exists: + +`curl --head "http://api.crossref.org/journals/1549-7712"` + +### Lists +List results can contain multiple entries. Searching or filtering typically returns a list result. A list has two parts: + +- Summary, which includes the following information: + + - status (e.g. "ok", "error") + - message-type (e.g. "work-list") + - message-version (e.g. 1.0.0) + +- Items, which will contain the items matching the query or filter. + +Note that the "message-type" returned will differ from the mime-type: + +- funder (singleton) +- prefix (singleton) +- member (singleton) +- work (singleton) +- work-list (list) +- funder-list (list) +- prefix-list (list) +- member-list (list) + +Normally, an API list result will return both the summary and the items. If you want to just retrieve the summary, you can do so by specifying that the number of rows returned should be zero. + +#### Sort order + +If the API call includes a query, then the sort order will be by the relevance score. If no query is included, then the sort order will be by DOI update date. + +### Selecting which elements to return + +Crossref metadata records can be quite large. 
Sometimes you just want a few elements from the schema. You can "select" a subset of elements to return using the `select` parameter. This can make your API calls much more efficient. For example: + +`http://api.crossref.org/works?sample=10&select=DOI,title` + + +## Resource components +Major resource components supported by the Crossref API are: + +- works +- funders +- members +- prefixes +- types +- journals + +These can be used alone like this + +| resource | description | +|:--------------|:----------------------------------| +| `/works` | returns a list of all works (journal articles, conference proceedings, books, components, etc), 20 per page +| `/funders` | returns a list of all funders in the [Funder Registry](https://github.com/Crossref/open-funder-registry) +| `/members` | returns a list of all Crossref members (mostly publishers) | +| `/types` | returns a list of valid work types | +| `/licenses` | return a list of licenses applied to works in Crossref metadata | +| `/journals` | return a list of journals in the Crossref database | + + +### Resource components and identifiers +Resource components can be used in conjunction with identifiers to retrieve the metadata for that identifier. + +| resource | description | +|:----------------------------|:----------------------------------| +| `/works/{doi}` | returns metadata for the specified Crossref DOI. | +| `/funders/{funder_id}` | returns metadata for specified funder **and** its suborganizations | +| `/prefixes/{owner_prefix}` | returns metadata for the DOI owner prefix | +| `/members/{member_id}` | returns metadata for a Crossref member | +| `/types/{type_id}` | returns information about a metadata work type | +| `/journals/{issn}` | returns information about a journal with the given ISSN | + +### Combining resource components + +The works component can be appended to other resources. 
+ +| resource | description | +|:----------------------------|:----------------------------------| +| `/works/{doi}` | returns information about the specified Crossref `DOI` | +| `/funders/{funder_id}/works`| returns list of works associated with the specified `funder_id` | +| `/types/{type_id}/works` | returns list of works of type `type` | +| `/prefixes/{owner_prefix}/works` | returns list of works associated with specified `owner_prefix` | +| `/members/{member_id}/works` | returns list of works associated with a Crossref member (deposited by a Crossref member) | +| `/journals/{issn}/works` | returns a list of works in the given journal | + +## Parameters + +Parameters can be used to query, filter and control the results returned by the Crossref API. They can be passed as normal URI parameters or as JSON in the body of the request. + +| parameter | description | +|:-----------------------------|:----------------------------| +| `query` | query terms | +| `filter={filter_name}:{value}`| filter results by specific fields | +| `rows={#}` | results per page | +| `offset={#}` (max 10k) | result offset (use `cursor` for larger `/works` result sets) | +| `sample={#}` (max 100) | return random N results | +| `sort={#}` | sort results by a certain field | +| `order={#}` | set the sort order to `asc` or `desc` | +| `facet={#}` | enable facet information in responses | +| `cursor={#}` | deep page through `/works` result sets | + +Multiple filters can be specified by separating name:value pairs with a comma: + + https://api.crossref.org/works?filter=has-orcid:true,from-pub-date:2004-04-04 + +### Example query using URI parameters + + https://api.crossref.org/funders/100000015/works?query=global+state&filter=has-orcid:true&rows=1 + +## Queries + +Free-form search queries can be made. For example, to find works that include `renear` and `ontologies`: + + https://api.crossref.org/works?query=renear+ontologies + +## Field Queries + +Field queries are available on the `/works` route 
and allow for queries that match only particular fields +of metadata. For example, this query matches records that contain the tokens `richard` or `feynman` (or both) +in any author field: + + https://api.crossref.org/works?query.author=richard+feynman + +Field queries can be combined with the general `query` parameter and each other. Each query parameter +is ANDed with the others: + + https://api.crossref.org/works?query.title=room+at+the+bottom&query.author=richard+feynman + +### `/works` Field Queries + +These field queries are available on the `/works` route: + +| Field query parameter | Description | +|-----------------------|-------------| +| `query.title` | Query `title` and `subtitle` | +| `query.container-title` | Query `container-title` (a.k.a. publication name) | +| `query.author` | Query author given and family names | +| `query.editor` | Query editor given and family names | +| `query.chair` | Query chair given and family names | +| `query.translator` | Query translator given and family names | +| `query.contributor` | Query author, editor, chair and translator given and family names | +| `query.bibliographic` | Query bibliographic information, useful for citation look up. Includes titles, authors, ISSNs and publication years | +| `query.affiliation` | Query contributor affiliations | + +## Sorting + +Results from a list response can be sorted by applying the `sort` and `order` parameters. `order` +sets the result ordering, either `asc` or `desc`. `sort` sets the field by which results will be +sorted. Possible values are: + +| Sort value | Description | +|------------|-------------| +| `score` or `relevance` | Sort by relevance score | +| `updated` | Sort by date of most recent change to metadata. Currently the same as `deposited`. 
| +| `deposited` | Sort by time of most recent deposit | +| `indexed` | Sort by time of most recent index | +| `published` | Sort by publication date | +| `published-print` | Sort by print publication date | +| `published-online` | Sort by online publication date | +| `issued` | Sort by issued date (earliest known publication date) | +| `is-referenced-by-count` | Sort by number of times this DOI is referenced by other Crossref DOIs | +| `references-count` | Sort by number of references included in the references section of the document identified by this DOI | + +An example that sorts results in order of publication, beginning with the least recent: + + https://api.crossref.org/works?query=josiah+carberry&sort=published&order=asc + +## Facet counts + +Facet counts can be retrieved by enabling faceting. Facets are enabled by providing facet field names along with a maximum number of returned term values. The larger the number of returned values, the longer the query will take. Some facet fields +can accept a `*` as their maximum, which indicates that all values should be returned. 
+ +Facets are specified with the `facet` parameter: + + https://api.crossref.org/works?rows=0&facet=type-name:* + +| Facet name | Maximum values | Description | +|:-----------|:---------------|-------------| +| `affiliation` | `*` | Author affiliation | +| `funder-name` | `*` | Funder literal name as deposited alongside DOIs | +| `funder-doi` | `*` | Funder DOI | +| `orcid` | 100 | Contributor ORCID | +| `container-title` | 100 | Work container title, such as journal title, or book title | +| `assertion` | `*` | Custom Crossmark assertion name | +| `archive` | `*` | Archive location | +| `update-type` | `*` | Significant update type | +| `issn` | 100 | Journal ISSN (any - print, electronic, link) | +| `published` | `*` | Earliest year of publication | +| `type-name` | `*` | Work type name, such as `journal-article` or `book-chapter` | +| `license` | `*` | License URI of work | +| `category-name` | `*` | Category name of work | +| `relation-type` | `*` | Relation type described by work or described by another work with work as object | +| `assertion-group` | `*` | Custom Crossmark assertion group name | +| `publisher-name` | `*` | Publisher name of work | + +## Filter names + +Filters allow you to narrow queries. All filter results are lists. + + +The following filters are supported for the `/works` route: + +| filter | possible values | description| +|:-----------|:----------------|:-----------| +| `has-funder` | | metadata which includes one or more funder entry | +| `funder` | `{funder_id}` | metadata which include the `{funder_id}` in FundRef data | +| `location` |`{country_name}` | funder records where location = `{country name}`. Only works on `/funders` route | +| `prefix` | `{owner_prefix}` | metadata belonging to a DOI owner prefix `{owner_prefix}` (e.g. 
`10.1016`) | +| `member` | `{member_id}` | metadata belonging to a Crossref member | +| `from-index-date` | `{date}` | metadata indexed since (inclusive) `{date}` | +| `until-index-date` | `{date}` | metadata indexed before (inclusive) `{date}` | +| `from-deposit-date` | `{date}` | metadata last (re)deposited since (inclusive) `{date}` | +| `until-deposit-date` | `{date}` | metadata last (re)deposited before (inclusive) `{date}` | +| `from-update-date` | `{date}` | Metadata updated since (inclusive) `{date}`. Currently the same as `from-deposit-date`. | +| `until-update-date` | `{date}` | Metadata updated before (inclusive) `{date}`. Currently the same as `until-deposit-date`. | +| `from-created-date` | `{date}` | metadata first deposited since (inclusive) `{date}` | +| `until-created-date` | `{date}` | metadata first deposited before (inclusive) `{date}` | +| `from-pub-date` | `{date}` | metadata where published date is since (inclusive) `{date}` | +| `until-pub-date` | `{date}` | metadata where published date is before (inclusive) `{date}` | +| `from-online-pub-date` | `{date}` | metadata where online published date is since (inclusive) `{date}` | +| `until-online-pub-date` | `{date}` | metadata where online published date is before (inclusive) `{date}` | +| `from-print-pub-date` | `{date}` | metadata where print published date is since (inclusive) `{date}` | +| `until-print-pub-date` | `{date}` | metadata where print published date is before (inclusive) `{date}` | +| `from-posted-date` | `{date}` | metadata where posted date is since (inclusive) `{date}` | +| `until-posted-date` | `{date}` | metadata where posted date is before (inclusive) `{date}` | +| `from-accepted-date` | `{date}` | metadata where accepted date is since (inclusive) `{date}` | +| `until-accepted-date` | `{date}` | metadata where accepted date is before (inclusive) `{date}` | +| `has-license` | | metadata that includes any license elements. 
| +| `license.url` | `{url}` | metadata where the license URL equals `{url}` | +| `license.version` | `{string}` | metadata where the license's `applies_to` attribute is `{string}`| +| `license.delay` | `{integer}` | metadata where difference between publication date and the license's `start_date` attribute is <= `{integer}` (in days)| +| `has-full-text` | | metadata that includes any full-text link elements. | +| `full-text.version` | `{string}` | metadata where the full-text link's `content_version` attribute is `{string}`. | +| `full-text.type` | `{mime_type}` | metadata where the full-text link's `content_type` attribute is `{mime_type}` (e.g. `application/pdf`). | +| `full-text.application` | `{string}` | metadata where the full-text link has one of the following intended applications: `text-mining`, `similarity-checking` or `unspecified` | +| `has-references` | | metadata for works that have a list of references | +| `reference-visibility` | `[open, limited, closed]` | metadata for works where references are either `open`, `limited` (to [Metadata Plus subscribers](https://www.crossref.org/services/metadata-delivery/plus-service/)) or `closed` | +| `has-archive` | | metadata which includes the name of an archive partner | +| `archive` | `{string}` | metadata where the value of the archive partner is `{string}` | +| `has-orcid` | | metadata which includes one or more ORCIDs | +| `has-authenticated-orcid` | | metadata which includes one or more ORCIDs where the depositing publisher claims to have witnessed the ORCID owner authenticate with ORCID | +| `orcid` | `{orcid}` | metadata where the ORCID value = `{orcid}` | +| `issn` | `{issn}` | metadata where record has an ISSN = `{issn}`. Format is `xxxx-xxxx`. | +| `isbn` | `{isbn}` | metadata where record has an ISBN = `{isbn}`. | +| `type` | `{type}` | metadata records whose type = `{type}`. 
Type must be an ID value from the list of types returned by the `/types` resource | +| `directory` | `{directory}` | metadata records whose article or serial is mentioned in the given `{directory}`. Currently the only supported value is `doaj`. | +| `doi` | `{doi}` | metadata describing the DOI `{doi}` | +| `updates` | `{doi}` | metadata for records that represent editorial updates to the DOI `{doi}` | +| `is-update` | | metadata for records that represent editorial updates | +| `has-update-policy` | | metadata for records that include a link to an editorial update policy | +| `container-title` | | metadata for records with an exactly matching publication title | +| `category-name` | | metadata for records with an exactly matching category label. Category labels come from [this list](https://www.elsevier.com/solutions/scopus/content) published by Scopus | +| `type` | | metadata for records with type matching a type identifier (e.g. `journal-article`) | +| `type-name` | | metadata for records with an exactly matching type label | +| `award.number` | `{award_number}` | metadata for records with a matching award number. Optionally combine with `award.funder` | +| `award.funder` | `{funder doi or id}` | metadata for records with an award with matching funder. 
Optionally combine with `award.number` | +| `has-assertion` | | metadata for records with any assertions | +| `assertion-group` | | metadata for records with an assertion in a particular group | +| `assertion` | | metadata for records with a particular named assertion | +| `has-affiliation` | | metadata for records that have any affiliation information | +| `alternative-id` | | metadata for records with the given alternative ID, which may be a publisher-specific ID, or any other identifier a publisher may have provided | +| `article-number` | | metadata for records with a given article number | +| `has-abstract` | | metadata for records which include an abstract | +| `has-clinical-trial-number` | | metadata for records which include a clinical trial number | +| `content-domain` | | metadata where the publisher records a particular domain name as the location Crossmark content will appear | +| `has-content-domain` | | metadata where the publisher records a domain name location for Crossmark content | +| `has-domain-restriction` | | metadata where the publisher restricts Crossmark usage to content domains | +| `has-relation` | | metadata for records that either assert or are the object of a relation | +| `relation.type` | | One of the relation types from the Crossref relations schema (e.g. `is-referenced-by`, `is-parent-of`, `is-preprint-of`) | +| `relation.object` | | Relations where the object identifier matches the identifier provided | +| `relation.object-type` | | One of the identifier types from the Crossref relations schema (e.g. 
`doi`, `issn`) | + + +The following filters are supported for the `/members` route: + +| filter | possible values | description| +|:-----------|:----------------|:-----------| +| `has-public-references` | | Member has made their references public for one or more of their prefixes | +| `reference-visibility` | `[open, limited, closed]` | Members who have made their references either `open`, `limited` (to [Metadata Plus subscribers](https://www.crossref.org/services/metadata-delivery/plus-service/)) or `closed` | +| `backfile-doi-count` | {integer} | count of DOIs for material published more than two years ago | +| `current-doi-count` | {integer} | count of DOIs for material published within last two years | + +The following filters are supported for the `/funders` route: + +| filter | possible values | description| +|:-----------|:----------------|:-----------| +| `location` | | funders located in specified country | + + +### Multiple filters + +Multiple filters can be specified in a single query. In such a case, different filters will be applied with AND semantics, while specifying the same filter multiple times will result in OR semantics - that is, specifying the filters: + +- `is-update:true` +- `from-pub-date:2014-03-03` +- `funder:10.13039/100000001` +- `funder:10.13039/100000050` + +would locate documents that are updates, were published on or after 3rd March 2014 and were funded by either the National Science Foundation (`10.13039/100000001`) or the National Heart, Lung, and Blood Institute (`10.13039/100000050`). These filters would be specified by joining each filter together with a comma: + + /works?filter=is-update:true,from-pub-date:2014-03-03,funder:10.13039/100000001,funder:10.13039/100000050 + +### Dot filters + +A filter with a dot in its name is special. The dot signifies that the filter will be applied to some other record type that is related to primary resource record type. 
For example, with work queries, one can filter on works that have an award, where the same award has a particular award number and award-giving funding agency: + + /works?filter=award.number:CBET-0756451,award.funder:10.13039/100000001 + +Here we filter on works that have an award by the National Science Foundation that also has the award number `CBET-0756451`. + +### Notes on owner prefixes + +The prefix of a Crossref DOI does **NOT** indicate who currently owns the DOI. It only reflects who originally registered the DOI. Crossref metadata has an **owner_prefix** element that records the current owner of the Crossref DOI in question. + +Crossref also has member IDs for depositing organisations. A single member may control multiple owner prefixes, which in turn may control a number of DOIs. When looking at works published by a certain organisation, member IDs and the member routes should be used. + +### Notes on dates + +Note that dates in filters should always be of the form `YYYY-MM-DD`, `YYYY-MM` or `YYYY`. Also note that date information in Crossref metadata can often be incomplete. So, for example, a publisher may only include the year and month of publication for a journal article. For a monograph they might just include the year. In these cases the API selects the earliest possible date given the information provided. So, for instance, if the publisher only provided 2013-02 as the published date, then the date would be treated as 2013-02-01. Similarly, if the publisher only provided the year 2013 as the date, it would be treated as 2013-01-01. + +### Notes on incremental metadata updates + +When using time filters to retrieve periodic, incremental metadata updates, +the `from-index-date` filter should be used over `from-update-date`, +`from-deposit-date`, `from-created-date` and `from-pub-date`. The +timestamp that `from-index-date` filters on is guaranteed to be updated +every time there is a change to metadata requiring a reindex. 
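
As a rough sketch of how the filter rules above fit together, the following Python builds a comma-separated `filter` value (repeating a filter name to get OR semantics) and a URL for an incremental harvest based on `from-index-date`. The function names `build_filter` and `incremental_works_url` and the constant `API_BASE` are hypothetical names chosen for this example, and combining `from-index-date` with `cursor=*` and `rows=1000` is one possible choice rather than a recommendation from this document.

```python
from datetime import date
from urllib.parse import urlencode

# Hypothetical constant for this sketch.
API_BASE = "https://api.crossref.org"


def build_filter(pairs):
    """Join (name, value) pairs into a filter string.

    Different names are ANDed by the API; repeating a name ORs its values.
    """
    return ",".join(f"{name}:{value}" for name, value in pairs)


def incremental_works_url(since: date, mailto: str, rows: int = 1000) -> str:
    """Build a /works URL for records indexed on or after `since`."""
    params = {
        # Dates in filters use the YYYY-MM-DD form, which isoformat() produces.
        "filter": build_filter([("from-index-date", since.isoformat())]),
        "rows": rows,       # 1000 is the documented maximum
        "cursor": "*",      # start a deep-paging cursor
        "mailto": mailto,   # contact info for the "polite pool"
    }
    # urlencode() percent-encodes the parameter values.
    return f"{API_BASE}/works?{urlencode(params)}"


print(build_filter([
    ("is-update", "true"),
    ("funder", "10.13039/100000001"),
    ("funder", "10.13039/100000050"),
]))
print(incremental_works_url(date(2019, 1, 31), "GroovyBib@example.org"))
```

A harvester would typically store the date of its last successful run and pass it as `since` on the next run.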
+ +## Result controls + +You can control the delivery and selection of results using the `rows`, `offset` and `sample` parameters. + + If you are expecting results beyond 10K, then use a `cursor` to deep page through the results. Note that not all routes support cursors. + +### Rows + +Normally, results are returned 20 at a time. You can control the number of results returned by using the `rows` parameter. To limit results to 5, for example, you could do the following: + + https://api.crossref.org/works?query=allen+renear&rows=5 + +If you would just like to get the `summary` of the results, you can set the rows to 0 (zero). + + https://api.crossref.org/works?query=allen+renear&rows=0 + +The maximum number of rows you can ask for in one query is `1000`. + +### Offset + +The number of returned items is controlled by the `rows` parameter, but you can select the offset into the result list by using the `offset` parameter. So, for example, to select the second set of 5 results (i.e. results 6 through 10), you would do the following: + + https://api.crossref.org/works?query=allen+renear&rows=5&offset=5 + +Offsets for `/works` are limited to 10K. Use `cursor` (see below) for larger `/works` result sets. + +### Deep paging with cursors + +Using large `offset` values can result in extremely long response times. Offsets in the 100,000s and beyond will likely cause a timeout before the API is able to respond. An alternative to paging through very large result sets (like a corpus used for text and data mining) is to use the API's exposure of Solr's deep paging cursors. Any combination of query, filters and facets may be used with deep paging cursors. While `rows` may be specified along with `cursor`, `offset` and `sample` cannot be used. To use deep paging make a query as normal, but include the `cursor` parameter with a value of `*`.
In this example we will page through all `journal-article` works from member `311`: + + https://api.crossref.org/members/311/works?filter=type:journal-article&cursor=* + +A `next-cursor` field will be provided in the JSON response. To get the next page of results, pass the value of `next-cursor` as the `cursor` parameter. For example: + + https://api.crossref.org/members/311/works?filter=type:journal-article&cursor=AoE/CGh0dHA6Ly9keC5kb2kub3JnLzEwLjEwMDIvdGRtX2xpY2Vuc2VfMQ== + +Note that the actual cursor value will be different from this illustration. + +Clients should check the number of returned items. If the number of returned items is fewer than the number of expected rows then the end of the result set has been reached. Using `next-cursor` beyond this point will result in responses with an empty items list. + +The `cursor` parameter is available on all `/works` resources. + +### Sample + +Being able to select random results is useful for both testing and sampling. You can use the `sample` parameter to retrieve random results. So, for example, the following selects 10 random works: + + https://api.crossref.org/works?sample=10 + +Note that when you use the `sample` parameter, the `rows` and `offset` parameters are ignored.
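The deep paging steps described under "Deep paging with cursors" can be sketched in Python roughly as follows. This is a minimal sketch, not a reference client: `iter_cursor_items` and `fetch_page` are hypothetical names, and `fetch_page` is assumed to perform the HTTP request and return a dict holding the `items` list and the `next-cursor` value from the response. The `fake_fetch` function is only a stand-in so that the loop can be run without network access.

```python
def iter_cursor_items(fetch_page, rows=20):
    """Yield every item of a result set by following `next-cursor` values.

    `fetch_page(cursor, rows)` is a hypothetical callable supplied by the
    caller; it is assumed to return a dict with `items` and `next-cursor`.
    """
    cursor = "*"  # the first request uses cursor=*
    while True:
        page = fetch_page(cursor, rows)
        items = page.get("items", [])
        yield from items
        # Fewer items than requested rows means the end has been reached.
        if len(items) < rows:
            break
        cursor = page["next-cursor"]


def fake_fetch(cursor, rows):
    """Stand-in for a real HTTP call: serves 45 fake records."""
    records = list(range(45))
    start = 0 if cursor == "*" else int(cursor)
    return {"items": records[start:start + rows], "next-cursor": str(start + rows)}


collected = list(iter_cursor_items(fake_fetch, rows=20))
print(len(collected))  # prints 45
```

In a real client, `fetch_page` would send the cursor back as the `cursor` query parameter. Cursor values can contain characters such as `/` and `=`, as in the illustration above, so it is safest to URL-encode them.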
+ + +### Example queries + +**All works published by owner prefix `10.1016` in January 2010** + +``` +https://api.crossref.org/prefixes/10.1016/works?filter=from-pub-date:2010-01,until-pub-date:2010-01 +``` + +**All works funded by `10.13039/100000001` that have a CC-BY license** + +``` +https://api.crossref.org/funders/10.13039/100000001/works?filter=license.url:http://creativecommons.org/licenses/by/3.0/ +``` + +**All works published by owner prefix 10.6064 from February 2010 to February 2013 that have a CC-BY license** + +``` +https://api.crossref.org/prefixes/10.6064/works?filter=license.url:http://creativecommons.org/licenses/by/3.0/,from-pub-date:2010-02,until-pub-date:2013-02 +``` + +**All works funded by `10.13039/100000015` where license = CC-BY and embargo <= 365 days** + +``` +https://api.crossref.org/funders/10.13039/100000015/works?filter=license.url:http://creativecommons.org/licenses/by/3.0/,license.delay:365 +``` +Note that the filters for license URL and maximum license embargo period (license.url and license.delay) combine to filter each document's metadata for a license with both of these properties. + +**All works where the archive partner listed = 'CLOCKSS'** + +``` +https://api.crossref.org/works?filter=archive:CLOCKSS +``` + +**All members with `hind` in their name (e.g. 
Hindawi)** + +``` +https://api.crossref.org/members?query=hind +``` + +**All licenses linked to works published by Elsevier** + +``` +http://api.crossref.org/v1/works?facet=license:*&filter=member:78&rows=0 +``` + +**All licenses applied to works published in the journal `Pathology Research International`** + +``` +https://api.crossref.org/works?facet=license:*&filter=issn:2090-8091 +``` + +**All works with an award numbered roughly `1 F31 MH11745` also awarded by funder with ID `10.13039/100000025`** + +``` +https://api.crossref.org/works?filter=award.number:1F31MH11745,award.funder:10.13039/100000025 +``` + +**The number of DOIs that have references AND where references are `open` faceted by publisher name** + +``` +http://api.crossref.org/v1.0/works?filter=has-references:true,reference-visibility:open&facet=publisher-name:*&rows=0 +``` + +## API versioning + +In theory, the syntax of the API can vary independently of the result representations. In practice, major version changes in either will require changes to API clients and so versioning of the API will apply to both the API syntax and the result representation. + +The API uses a semantic versioning scheme whereby the version number is divided into three parts delimited by periods. The first number represents the "major" release number. The second represents a "minor" release number. + + Version 1.20 + ^ ^ + | | + major | + minor + + **Major** version increments are defined as releases that can break backwards compatibility. Crossref will only commit to supporting the latest two major releases simultaneously and legacy major releases will be supported for no more than nine months. Exceptions to these rules may be made when major releases are required to ensure the security or stability of the system. + +**Minor** version increments are defined as backwards compatible. There is no limit on the number of minor versions that Crossref can roll out. 
Note that client applications should not have dependencies on minor versions, and Crossref will only maintain the latest minor version for the two most recent major versions. + +Adding syntax options or metadata to representations will normally be backwards compatible and will thus normally only trigger minor version changes. Renaming or restructuring syntax options or metadata tends not to be backwards compatible and will thus typically trigger major version changes. + +### How to manage API versions + +If you need to tie your implementation to a specific major version of the API, you can do so by using version-specific routes. The default route redirects to the most recent version of the API. Some older major versions may be available using a version prefix. For example, to access version `v1` of the API: + + https://api.crossref.org/v1/works + +Each major version has no backwards incompatible changes within its public interface. + +## Documentation history + +- V1: 2013-09-08, first draft. +- V2: 2013-09-24, reference platform deployed +- v3: 2013-09-25, reworked filters. Added API versioning doc +- v4: 2013-09-25, more filter changes. +- v5: 2013-09-27, doc mime-type and message-type relationship +- v6: 2013-10-01, updated `sample` & added examples with filters +- v6: 2013-10-01, corrected warning date +- v7: 2013-10-02, fixed typos +- v8: 2013-10-17, updated warning. Added email address +- v9: 2013-12-13, update example urls +- v10: 2013-12-13, /types routes, type filter, issn filter +- v11: 2013-12-14, indexed timestamps, has-archive and archive implemented +- v12: 2014-01-06, directory filter +- v13: 2014-02-10, new `/members`, `/publishers` becomes `/prefixes`, new `member` filter, `publisher` filter becomes `prefix` +- v14: 2014-02-14, new `has-funder` filter.
+- v15: 2014-02-27, new `/licenses` route +- v16: 2014-05-19, new `/journals` route, new CrossMark (updates and update policy) filters, new `sort` and `order` parameters +- v17: 2014-05-19, new `facet` query parameter +- v18: 2014-05-29, new `/works/{doi}/agency` route +- v19: 2014-06-23, new textual filters - `container-title`, `category-name`. +- v20: 2014-06-24, OR filter queries, `type-name` filter. +- v21: 2014-07-01, new `award.number` and `award.funder` relational filters. +- v22: 2014-07-16, changed title to more accurately reflect scope of API. +- v23, 2014-09-01, semantics of multiple filters, dot filters +- v24, 2014-10-15, added info on license of Crossref metadata itself. Doh. +- v25, 2015-05-06, added link to issue tracker. Removed Warning section. +- v26, 2015-10-20, added new filters - `from-created-date`, `until-created-date`, `affiliation`, `has-affiliation`, `assertion-group`, `assertion`, `article-number`, `alternative-id` +- v27, 2015-10-30, added `cursor` parameter to `/works` resources +- v28, 2016-05-09, added link to source of category labels +- v29, 2016-05-24, added field queries +- v30, 2016-09-26, highlight issue tracker +- v31, 2016-10-05, document `has-clinical-trial-number` and `has-abstract` filters +- v32, 2016-10-27, document rate limit headers +- v33, 2016-11-07, guidance on when to use `offset` vs `cursor` +- v34, 2017-04-26, document support for HTTPS. Update examples to use HTTPS.
+- v35, 2017-04-26, document use of head requests to determine `existence` +- v36, 2017-04-27, fixed license route examples to use facet/filter instead +- v37, 2017-04-27, `query.bibliographic` +- v38, 2017-04-27, add v1.1 filters and sort fields +- v39, 2017-04-27, remove mention of dismax +- v40, 2017-04-27, clarify faceting feature +- v41, 2017-04-28, document `sample` max = 100, clarify cursors only work on some routes +- v42, 2017-04-28, life, the universe, and everything +- v43, 2017-04-28, reminder on the wisdom of url-encoding +- v44, 2017-04-28, clarify that field queries apply to `/works` route +- v45, 2017-04-28, document `location` filter for `/funders` route +- v46, 2017-06-14, minor text changes and new funder registry link +- v47, 2017-07-04, clarify `query.affiliation` +- v48, 2017-07-13, correct "first and given" names to "given and family" +- v49, 2017-07-20, move document version history, add section on libraries +- v50, 2017-07-20, add TOC, move document history, add etiquette section, add production use section, general formatting + cleanup +- v51, 2017-07-24, clarified license of the documentation (as opposed to metadata) +- v52, 2017-07-27, removed service notice and what's new section. +- v53, 2017-08-11, mention `full-text.application` filter +- v54, 2017-09-18, add info about new "polite pool" +- v55, 2017-09-21, document `/member` and `/funder` filters. document `publisher-name` facet. document `select` parameter. +- v56, 2018-01-26, add info on frequency of indexing +- v57, 2018-02-01, document ISBN filter +- v58, 2018-02-13, document `reference-visibility` filter for `/works` and `/members` routes +- v59, 2018-02-13, added info about Metadata Plus service. Corrected spelling. Added example of using `reference-visibility` filter. +- v60, 2018-02-22, added info for "Plus" users on use of token in `Authorization` header. +- v61, 2018-02-26, add curl example for use of token.
+- v62, 2018-06-18, clarify how to parse `X-Rate-Limit-Limit-Interval` +- v63, 2018-08-16, remove mistakenly listed `year` facet. `published` is correct facet name. +- v64, 2018-09-04, add text and link to status page. diff --git a/api_format.md b/api_format.md new file mode 100644 index 0000000..931f834 --- /dev/null +++ b/api_format.md @@ -0,0 +1,226 @@ +# Crossref Metadata API JSON Format + +## Versioning + +| Version | Release Date | Comments | +|---------|--------------|----------| +| v1 | 11th July 2016 | First documented version | +| v2 | 26th July 2017 | Add abstract, authenticated-orcid, fix contributor fields | +| v3 | 15th May 2018 | Add peer review fields | + +## Work + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| publisher | String | Yes | Name of work's publisher | +| title | Array of String | Yes | Work titles, including translated titles | +| original-title | Array of String | No | Work titles in the work's original publication language | +| short-title | Array of String | No | Short or abbreviated work titles | +| abstract | XML String | No | Abstract as a JSON string or a JATS XML snippet encoded into a JSON string | +| reference-count | Number | Yes | *Deprecated* Same as `references-count` | +| references-count | Number | Yes | Count of outbound references deposited with Crossref | +| is-referenced-by-count | Number | Yes | Count of inbound references deposited with Crossref | +| source | String | Yes | Currently always `Crossref` | +| prefix | String | Yes | DOI prefix identifier of the form `http://id.crossref.org/prefix/DOI_PREFIX` | +| DOI | String | Yes | DOI of the work | +| URL | URL | Yes | URL form of the work's DOI | +| member | String | Yes | Member identifier of the form `http://id.crossref.org/member/MEMBER_ID` | +| type | String | Yes | Enumeration, one of the type ids from `https://api.crossref.org/v1/types` | +| created | [Date](#date) | Yes | Date on which the DOI was first registered 
| +| deposited | [Date](#date) | Yes | Date on which the work metadata was most recently updated | +| indexed | [Date](#date) | Yes | Date on which the work metadata was most recently indexed. Re-indexing does not imply a metadata change, see `deposited` for the most recent metadata change date | +| issued | [Partial Date](#partial-date) | Yes | Earliest of `published-print` and `published-online` | +| posted | [Partial Date](#partial-date) | No | Date on which posted content was made available online | +| accepted | [Partial Date](#partial-date) | No | Date on which a work was accepted, after being submitted, during a submission process | +| subtitle | Array of String | No | Work subtitles, including original language and translated | +| container-title | Array of String | No | Full titles of the containing work (usually a book or journal) | +| short-container-title | Array of String | No | Abbreviated titles of the containing work | +| group-title | String | No | Group title for posted content | +| issue | String | No | Issue number of an article's journal | +| volume | String | No | Volume number of an article's journal | +| page | String | No | Pages numbers of an article within its journal | +| article-number | String | No | | +| published-print | [Partial Date](#partial-date) | No | Date on which the work was published in print | +| published-online | [Partial Date](#partial-date) | No | Date on which the work was published online | +| subject | Array of String | No | Subject category names, a controlled vocabulary from Sci-Val. 
Available for most journal articles | +| ISSN | Array of String | No | | +| issn-type | Array of [ISSN with Type](#issn-with-type) | No | List of ISSNs with ISSN type information | +| ISBN | Array of String | No | | +| archive | Array of String | No | | +| license | Array of [License](#license) | No | | +| funder | Array of [Funder](#funder) | No | | +| assertion | Array of [Assertion](#assertion) | No | | +| author | Array of [Contributor](#contributor) | No | | +| editor | Array of [Contributor](#contributor) | No | | +| chair | Array of [Contributor](#contributor) | No | | +| translator | Array of [Contributor](#contributor) | No | | +| update-to | Array of [Update](#update) | No | | +| update-policy | URL | No | Link to an update policy covering Crossmark updates for this work | +| link | Array of [Resource Link](#resource-link) | No | URLs to full-text locations | +| clinical-trial-number | Array of [Clinical Trial Number](#clinical-trial-number) | No | | +| alternative-id | String | No | Other identifiers for the work provided by the depositing member | +| reference | Array of [Reference](#reference) | No | List of references made by the work | +| content-domain | [Content Domain](#content-domain) | No | Information on domains that support Crossmark for this work | +| relation | [Relations](#relations) | No | Relations to other works | +| review | [Review](#review) | No | Peer review metadata | + + +## Work Nested Types + +### Funder + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| name | String | Yes | Funding body primary name | +| DOI | String | No | Optional [Open Funder Registry](http://www.crossref.org/fundingdata/registry.html) DOI uniquely identifying the funding body | +| award | Array of String | No | Award number(s) for awards given by the funding body | +| doi-asserted-by | String | No | Either `crossref` or `publisher` | + +### Clinical Trial Number + +| Field | Type | Required | Description |
+|-------|------|----------|-------------| +| clinical-trial-number | String | Yes | Identifier of the clinical trial | +| registry | String | Yes | DOI of the clinical trial registry that assigned the trial number | +| type | String | No | One of `preResults`, `results` or `postResults` | + +### Contributor + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| family | String | Yes | | +| given | String | No | | +| ORCID | URL | No | URL-form of an [ORCID](http://orcid.org) identifier | +| authenticated-orcid | Boolean | No | If true, record owner asserts that the ORCID user completed ORCID OAuth authentication | +| affiliation | Array of [Affiliation](#affiliation) | No | | + +### Affiliation + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| name | String | Yes | | + +### Date + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| date-parts | Array of Number | Yes | Contains an ordered array of `year`, `month`, `day of month`. Note that the field contains a nested array, e.g. `[ [ 2006, 5, 19 ] ]` to conform to citeproc JSON dates | +| timestamp | Number | Yes | Seconds since UNIX epoch | +| date-time | String | Yes | ISO 8601 date time | + +### Partial Date + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| date-parts | Array of Number | Yes | Contains an ordered array of `year`, `month`, `day of month`. Only `year` is required. Note that the field contains a nested array, e.g.
`[ [ 2006, 5, 19 ] ]` to conform to citeproc JSON dates | + +### Update + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| updated | [Partial Date](#partial-date) | Yes | Date on which the update was published | +| DOI | String | Yes | DOI of the updated work | +| type | String | Yes | The type of update, for example `retraction` or `correction` | +| label | String | No | A display-friendly label for the update type | + +### Assertion + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| name | String | Yes | | +| value | String | Yes | | +| URL | URL | No | | +| explanation | URL | No | | +| label | String | No | | +| order | Number | No | | +| group | [Assertion Group](#assertion-group) | No | | + +### Assertion Group + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| name | String | Yes | | +| label | String | No | | + +### License + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| content-version | String | Yes | Either `vor` (version of record,) `am` (accepted manuscript,) `tdm` (text and data mining) or `unspecified` | +| delay-in-days | Number | Yes | Number of days between the publication date of the work and the start date of this license | +| start | [Partial Date](#partial-date) | Yes | Date on which this license begins to take effect | +| URL | URL | Yes | Link to a web page describing this license | + +### Resource Link + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| intended-application | String | Yes | Either `text-mining`, `similarity-checking` or `unspecified` | +| content-version | String | Yes | Either `vor` (version of record,) `am` (accepted manuscript) or `unspecified` | +| URL | URL | Yes | Direct link to a full-text download location | +| content-type | String | No | Content type (or MIME type) of the full-text object | + +### Reference + +| 
Field | Type | Required | Description | +|-------|------|----------|-------------| +| key | String | Yes | | +| DOI | String | No | | +| doi-asserted-by | String | No | One of `crossref` or `publisher` | +| issue | String | No | | +| first-page | String | No | | +| volume | String | No | | +| edition | String | No | | +| component | String | No | | +| standard-designator | String | No | | +| standards-body | String | No | | +| author | String | No | | +| year | String | No | | +| unstructured | String | No | | +| journal-title | String | No | | +| article-title | String | No | | +| series-title | String | No | | +| volume-title | String | No | | +| ISSN | String | No | | +| issn-type | String | No | One of `pissn` or `eissn` | +| ISBN | String | No | | +| isbn-type | String | No | | + +### ISSN with Type + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| value | String | Yes | | +| type | String | Yes | One of `eissn`, `pissn` or `lissn` | + +### Content Domain + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| domain | Array of String | Yes | | +| crossmark-restriction | Boolean | Yes | | + +### Relations + +A hashmap containing relation name, [Relation](#relation) pairs. 
+ +### Relation + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| id-type | String | Yes | | +| id | String | Yes | | +| asserted-by | String | Yes | One of `subject` or `object` | + + +### Review + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| running-number | String | No | | +| revision-round | String | No | | +| stage | String | No | One of `pre-publication` or `post-publication` | +| recommendation | String | No | One of `major-revision` or `minor-revision` or `reject` or `reject-with-resubmit` or `accept` | +| type | String | No | One of `referee-report` or `editor-report` or `author-comment` or `community-comment` or `aggregate` | +| competing-interest-statement | String | No | | +| language | String | No | | diff --git a/demos/api-demo.postman_collection.json b/demos/api-demo.postman_collection.json new file mode 100644 index 0000000..be5049c --- /dev/null +++ b/demos/api-demo.postman_collection.json @@ -0,0 +1,612 @@ +{ + "variables": [], + "info": { + "name": "api-demo", + "_postman_id": "34579ef5-e86d-c116-c607-eeae07a1dae5", + "description": "Crossref REST API demo", + "schema": "https://schema.getpostman.com/json/collection/v2.0.0/collection.json" + }, + "item": [ + { + "name": "members", + "request": { + "url": "https://api.crossref.org/members", + "method": "GET", + "header": [], + "body": {}, + "description": "" + }, + "response": [] + }, + { + "name": "find a member", + "request": { + "url": { + "raw": "https://api.crossref.org/members?query=\"Hindawi\"", + "protocol": "https", + "host": [ + "api", + "crossref", + "org" + ], + "path": [ + "members" + ], + "query": [ + { + "key": "query", + "value": "\"Hindawi\"", + "equals": true, + "description": "" + } + ], + "variable": [] + }, + "method": "GET", + "header": [], + "body": {}, + "description": "" + }, + "response": [] + }, + { + "name": "a specific member", + "request": { + "url": 
"https://api.crossref.org/members/78", + "method": "GET", + "header": [], + "body": {}, + "description": "" + }, + "response": [] + }, + { + "name": "members with open references", + "request": { + "url": { + "raw": "https://api.crossref.org/members?filter=has-public-references:true", + "protocol": "https", + "host": [ + "api", + "crossref", + "org" + ], + "path": [ + "members" + ], + "query": [ + { + "key": "filter", + "value": "has-public-references:true", + "equals": true, + "description": "" + } + ], + "variable": [] + }, + "method": "GET", + "header": [], + "body": {}, + "description": "" + }, + "response": [] + }, + { + "name": "journals", + "request": { + "url": "https://api.crossref.org/journals", + "method": "GET", + "header": [], + "body": {}, + "description": "" + }, + "response": [] + }, + { + "name": "specific journal", + "request": { + "url": "https://api.crossref.org/journals/1476-4687", + "method": "GET", + "header": [], + "body": {}, + "description": "" + }, + "response": [] + }, + { + "name": "types", + "request": { + "url": "https://api.crossref.org/types", + "method": "GET", + "header": [], + "body": {}, + "description": "" + }, + "response": [] + }, + { + "name": "works", + "request": { + "url": "https://api.crossref.org/works", + "method": "GET", + "header": [], + "body": {}, + "description": "" + }, + "response": [] + }, + { + "name": "a specific work", + "request": { + "url": "https://api.crossref.org/works/10.1371/journal.pbio.2001655", + "method": "GET", + "header": [], + "body": {}, + "description": "" + }, + "response": [] + }, + { + "name": "another specific work", + "request": { + "url": "https://api.crossref.org/works/10.6084/m9.figshare.1314859.v1", + "method": "GET", + "header": [], + "body": {}, + "description": "" + }, + "response": [] + }, + { + "name": "what the hell?", + "request": { + "url": "https://api.crossref.org/works/10.6084/m9.figshare.1314859.v1/agency", + "method": "GET", + "header": [], + "body": {}, + "description": 
"" + }, + "response": [] + }, + { + "name": "a Crossref DOI", + "request": { + "url": "https://api.crossref.org/works/10.1371/journal.pbio.2001655/agency", + "method": "GET", + "header": [], + "body": {}, + "description": "" + }, + "response": [] + }, + { + "name": "10 random works", + "request": { + "url": { + "raw": "https://api.crossref.org/works?sample=10", + "protocol": "https", + "host": [ + "api", + "crossref", + "org" + ], + "path": [ + "works" + ], + "query": [ + { + "key": "sample", + "value": "10", + "equals": true, + "description": "" + } + ], + "variable": [] + }, + "method": "GET", + "header": [], + "body": {}, + "description": "" + }, + "response": [] + }, + { + "name": "works that are monographs", + "request": { + "url": { + "raw": "https://api.crossref.org/works?filter=type:monograph", + "protocol": "https", + "host": [ + "api", + "crossref", + "org" + ], + "path": [ + "works" + ], + "query": [ + { + "key": "filter", + "value": "type:monograph", + "equals": true, + "description": "" + } + ], + "variable": [] + }, + "method": "GET", + "header": [], + "body": {}, + "description": "" + }, + "response": [] + }, + { + "name": "works containing \"zika\"", + "request": { + "url": { + "raw": "https://api.crossref.org/works?query=\"zika\"", + "protocol": "https", + "host": [ + "api", + "crossref", + "org" + ], + "path": [ + "works" + ], + "query": [ + { + "key": "query", + "value": "\"zika\"", + "equals": true, + "description": "" + } + ], + "variable": [] + }, + "method": "GET", + "header": [], + "body": {}, + "description": "" + }, + "response": [] + }, + { + "name": "works containing \"zika\" in title", + "request": { + "url": { + "raw": "https://api.crossref.org/works?query.title=\"zika\"", + "protocol": "https", + "host": [ + "api", + "crossref", + "org" + ], + "path": [ + "works" + ], + "query": [ + { + "key": "query.title", + "value": "\"zika\"", + "equals": true, + "description": "" + } + ], + "variable": [] + }, + "method": "GET", + "header": [], + 
"body": {}, + "description": "" + }, + "response": [] + }, + { + "name": "works containing \"zika\" faceted by top ten funders", + "request": { + "url": { + "raw": "https://api.crossref.org/works?query.title=\"zika\"&facet=funder-name:10", + "protocol": "https", + "host": [ + "api", + "crossref", + "org" + ], + "path": [ + "works" + ], + "query": [ + { + "key": "query.title", + "value": "\"zika\"", + "equals": true, + "description": "" + }, + { + "key": "facet", + "value": "funder-name:10", + "equals": true, + "description": "" + } + ], + "variable": [] + }, + "method": "GET", + "header": [], + "body": {}, + "description": "" + }, + "response": [] + }, + { + "name": "works containing \"zika\" faceted by top ten publishers", + "request": { + "url": { + "raw": "https://api.crossref.org/works?query.title=\"zika\"&facet=publisher-name:10", + "protocol": "https", + "host": [ + "api", + "crossref", + "org" + ], + "path": [ + "works" + ], + "query": [ + { + "key": "query.title", + "value": "\"zika\"", + "equals": true, + "description": "" + }, + { + "key": "facet", + "value": "publisher-name:10", + "equals": true, + "description": "" + } + ], + "variable": [] + }, + "method": "GET", + "header": [], + "body": {}, + "description": "" + }, + "response": [] + }, + { + "name": "works- matching a reference", + "request": { + "url": { + "raw": "https://api.crossref.org/works?query.bibliographic=Neylon, C., Pattinson, D., Bilder, G., Lin, J. (2017). On the origin of nonequivalent states: How we can talk about preprints. F1000Research, 6, 608.", + "protocol": "https", + "host": [ + "api", + "crossref", + "org" + ], + "path": [ + "works" + ], + "query": [ + { + "key": "query.bibliographic", + "value": "Neylon, C., Pattinson, D., Bilder, G., Lin, J. (2017). On the origin of nonequivalent states: How we can talk about preprints. 
F1000Research, 6, 608.", + "equals": true, + "description": "" + } + ], + "variable": [] + }, + "method": "GET", + "header": [], + "body": {}, + "description": "" + }, + "response": [] + }, + { + "name": "works that have a full text link + a license", + "request": { + "url": { + "raw": "https://api.crossref.org/works?filter=has-full-text:true,has-license:true", + "protocol": "https", + "host": [ + "api", + "crossref", + "org" + ], + "path": [ + "works" + ], + "query": [ + { + "key": "filter", + "value": "has-full-text:true,has-license:true", + "equals": true, + "description": "" + } + ], + "variable": [] + }, + "method": "GET", + "header": [], + "body": {}, + "description": "" + }, + "response": [] + }, + { + "name": "a work's updates", + "request": { + "url": { + "raw": "https://api.crossref.org/works?filter=updates:10.1007/s00540-006-0468-8", + "protocol": "https", + "host": [ + "api", + "crossref", + "org" + ], + "path": [ + "works" + ], + "query": [ + { + "key": "filter", + "value": "updates:10.1007/s00540-006-0468-8", + "equals": true, + "description": "" + } + ], + "variable": [] + }, + "method": "GET", + "header": [], + "body": {}, + "description": "" + }, + "response": [] + }, + { + "name": "works with orcid faceted by top 10 publishers", + "request": { + "url": { + "raw": "http://api.crossref.org/works?filter=has-orcid:true&facet=publisher-name:10&rows=0", + "protocol": "http", + "host": [ + "api", + "crossref", + "org" + ], + "path": [ + "works" + ], + "query": [ + { + "key": "filter", + "value": "has-orcid:true", + "equals": true, + "description": "" + }, + { + "key": "facet", + "value": "publisher-name:10", + "equals": true, + "description": "" + }, + { + "key": "rows", + "value": "0", + "equals": true, + "description": "" + } + ], + "variable": [] + }, + "method": "GET", + "header": [], + "body": {}, + "description": "" + }, + "response": [] + }, + { + "name": "funders", + "request": { + "url": "https://api.crossref.org/funders", + "method": "GET", + 
"header": [], + "body": {}, + "description": "" + }, + "response": [] + }, + { + "name": "funders in United Kingdom", + "request": { + "url": { + "raw": "https://api.crossref.org/funders?filter=location:United Kingdom", + "protocol": "https", + "host": [ + "api", + "crossref", + "org" + ], + "path": [ + "funders" + ], + "query": [ + { + "key": "filter", + "value": "location:United Kingdom", + "equals": true, + "description": "" + } + ], + "variable": [] + }, + "method": "GET", + "header": [], + "body": {}, + "description": "" + }, + "response": [] + }, + { + "name": "find a funder", + "request": { + "url": { + "raw": "https://api.crossref.org/funders?query=\"Wellcome\"", + "protocol": "https", + "host": [ + "api", + "crossref", + "org" + ], + "path": [ + "funders" + ], + "query": [ + { + "key": "query", + "value": "\"Wellcome\"", + "equals": true, + "description": "" + } + ], + "variable": [] + }, + "method": "GET", + "header": [], + "body": {}, + "description": "" + }, + "response": [] + }, + { + "name": "a specific funder", + "request": { + "url": "https://api.crossref.org/funders/100000936", + "method": "GET", + "header": [], + "body": {}, + "description": "" + }, + "response": [] + }, + { + "name": "a funder sub-organisation", + "request": { + "url": "https://api.crossref.org/funders/100008902", + "method": "GET", + "header": [], + "body": {}, + "description": "" + }, + "response": [] + } + ] +} \ No newline at end of file diff --git a/demos/crossref-api-demo.ipynb b/demos/crossref-api-demo.ipynb new file mode 100644 index 0000000..3790f2c --- /dev/null +++ b/demos/crossref-api-demo.ipynb @@ -0,0 +1,1080 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "source": [ + "## Introduction\n", + "\n", + "The Crossref REST API is entirely based on URLs and is [documented extensively](https://api.crossref.org). 
This means that, in theory, you can simply get all the data that you want using a normal browser. For example, you might want to see the latest DOI records in the Crossref system. You can see this with the following URL:\n", + "\n", + "[`https://api.crossref.org/works`](https://api.crossref.org/works)\n", + "\n", + "\n", + "This means the REST API is pretty easy to use with basic low-level HTTP libraries (e.g. Python's `requests`), but for this tutorial we are going to use a [higher-level Python library](https://github.com/fabiobatalha/crossrefapi) developed by Fabio Batalha C. Santos at [SciELO](http://www.scielo.org).\n", + "\n", + "The examples here are in Python 3. Sorry, but you're going to have to make the move sometime ;)\n", + "\n", + "To use this library you can:\n", + "\n", + "`pip install crossrefapi`\n", + "\n", + "Then, import the library and get ready to look at some `works` data. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "button": false, + "collapsed": true, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "# If viewing in pineapple notebook, uncomment the next two lines and then run the cell.\n", + "#import pineapple\n", + "#%require crossrefapi" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "# If viewing in jupyter notebook, run this cell to install the library.\n", + "!pip install crossrefapi" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "button": false, + "collapsed": true, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "from crossref.restful import Works\n", + "works = Works()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + 
"read_only": false + } + }, + "source": [ + "## Working with \"works\"\n", + "\n", + "Let's start by looking briefly at \"works\". The route refers to items identified by a DOI in the index. These can be articles, books, components, etc.\n", + "\n", + "**TIP:** Crossref does not use \"works\" in the [FRBR](https://en.wikipedia.org/wiki/Functional_Requirements_for_Bibliographic_Records) sense of the word. In Crossref parlance, a \"work\" is just a thing identified by a DOI. In practice, Crossref DOIs are used as citation identifiers. So, in FRBR terms, this means that a Crossref DOI tends to refer to one _expression_ which might include multiple _manifestations_. So, for example, the ePub, HTML and PDF version of an article will share a Crossref DOI because the differences between them should not affect the interpretation or crediting of the content. In short, they can be cited interchangeably. The same is true of the \"accepted manuscript\" and the \"version-of-record\" of that accepted manuscript.\n", + "\n", + "In order to start querying information about works, we need to import the library and make things convenient." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "button": false, + "collapsed": true, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "from crossref.restful import Works\n", + "works = Works()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "source": [ + "Now we are ready to ask our first question: how many Crossref DOI records are indexed by the API?" 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "works.count()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "source": [ + "Note that above I said \"How many Crossref DOIs\". There are several other [DOI registration agencies](https://www.doi.org/registration_agencies.html). Crossref is by far the largest DOI RA, and the other RAs tend to specialize in orthogonal areas (e.g. Music & Video, Local language translations of publications, etc.) but it is important to note that this API will not work with non-Crossref DOIs (though [DataCite](https://www.datacite.org/), another RA, provides a very similar API).\n", + "\n", + "**TIP:** Not all DOIs are Crossref DOIs. If you are having trouble using a DOI with Crossref's API, check to see if it is a Crossref DOI.\n", + "\n", + "So the next obvious question is, how do I tell if a DOI is a Crossref DOI?" 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "works.agency('10.1590/0102-311x00133115')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "works.agency('10.6084/m9.figshare.1314859.v1')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "works.agency('10.5240/B1FA-0EEC-C316-3316-3A73-L')" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "source": [ + "OK, so assuming that we are using a Crossref DOI, how do we get the metadata for it?" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "record = works.doi('10.7554/eLife.09561')\n", + "record" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "source": [ + "This is basically a huge JSON object, so you can retrieve individual elements from it. 
Here is the publisher:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "record['publisher']" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "source": [ + "And here is the license for the \"version of record\":" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "next((item for item in record['license'] if item[\"content-version\"] == \"vor\"))['URL']" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "button": false, + "collapsed": true, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "source": [ + "Um... That was complicated. What does 'vor' mean?\n", + "\n", + "**TIP:** Publishers sometimes record information for multiple versions of the content identified by a DOI. These versions should be interchangeable from the point of view of citation, but sometimes one version has more \"features\" than another. For example, it might be typeset or have references linked, etc. The two versions might also have different licenses and different URLs. The terminology publishers use for identifying versions comes from the [NISO standard called JAV (Journal Article Version)](http://www.niso.org/publications/rp/RP-8-2008.pdf) and, although this terminology is [sometimes problematic](https://f1000research.com/articles/6-608/v1), you should be aware of it. 
In particular, you will see two terms used in Crossref metadata:\n", + "\n", + "- `VOR` = Version of Record\n", + "- `AM` = Accepted Manuscript\n", + "\n", + "\n", + "\n", + "Now that we know what 'vor' means, let's get the link to the full text of the version of record:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "next((item for item in record['link'] if item[\"content-version\"] == \"vor\"))['URL']" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The above has given us a brief overview of how to get a record and elements of a record identified with a Crossref DOI. Obviously, the goal is to do this in bulk. That is, to select and process records for multiple Crossref DOIs. Before we do that, it is helpful to familiarise yourself with some of the other \"routes\" supported by the REST API. This is because more advanced usage of the API typically involves combining information from several routes. " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "button": false, + "collapsed": true, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "source": [ + "## Members\n", + "\n", + "Crossref is a membership organization. DOI records are registered and managed by those members. It is often very useful to break down Crossref DOI records by member. But first let's find out a little bit more about members.\n", + "\n", + "First we import and set up a useful shortcut. 
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "button": false, + "collapsed": true, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "from crossref.restful import Members\n", + "members = Members()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "source": [ + "How many members does Crossref have?" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "members.count()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's look at a particular member, Hindawi:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "pub = next(iter(members.query('Hindawi')))\n", + "pub" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**TIP:** Many people make the mistake of thinking that a \"DOI prefix\" can be used to identify the member responsible for a Crossref DOI. This is not true. DOI prefixes merely serve as a namespace from which a member can create new DOIs without worrying about collisions. But, once created, Crossref DOIs are often transferred between publishers and so a Crossref member will often be responsible for DOIs with a variety of prefixes. 
So, for example, above, Hindawi is responsible for several prefixes:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "prefixes = [p['value'] for p in pub['prefix']]\n", + "prefixes" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**TIP** The most accurate way to refer to a particular Crossref member and *all* their prefixes is through the member's `id`.\n", + "\n", + "So let's look at eLife." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "pub = next(iter(members.query('eLife')))\n", + "pub" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "source": [ + "eLife's Crossref member ID can be accessed as follows:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "pub_id = pub['id']\n", + "pub_id" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "button": false, + "collapsed": true, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "source": [ + "Now we can use this ID to specifically refer to eLife. 
For example:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "pub = members.member(pub_id)\n", + "pub" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "source": [ + "Let's see how many DOIs eLife has registered by year:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "dois_by_year = pub['breakdowns']['dois-by-issued-year']\n", + "dois_by_year" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "source": [ + "Cool, now let's look at some of the publisher data in more friendly formats. We are going to use the pandas library for summarising and visualising the data." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "button": false, + "collapsed": true, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "import pandas as pd\n", + "%matplotlib inline" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "source": [ + "First let's see the publications by year in a nice, sorted table:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "f = pd.DataFrame(dois_by_year)\n", + "f.columns = ['year','dois']\n", + "dois_sorted_by_year = f.sort_values(['year','dois'])\n", + "dois_sorted_by_year" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "source": [ + "Maybe look at this in a graph?" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "dois_sorted_by_year.plot.bar(x='year',y='dois')" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "source": [ + "We can pull this all together and you can look at a number of publishers. 
Try changing the publisher name in the code below to something else:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "publisher_name = 'PLOS'\n", + "pub_id = next(iter(members.query(publisher_name)))['id']\n", + "pub = members.member(pub_id)\n", + "dois_by_year = pub['breakdowns']['dois-by-issued-year']\n", + "f = pd.DataFrame(dois_by_year)\n", + "f.columns = ['year','dois']\n", + "dois_sorted_by_year = f.sort_values(['year','dois'])\n", + "dois_sorted_by_year.plot.bar(x='year',y='dois',figsize=(50, 7))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "source": [ + "A publisher record also contains a useful summary of the member's metadata and the Crossref services that they participate in.\n", + "\n", + "Let's look at what percentage of their metadata includes certain information:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "coverage = [[key,float(pub['coverage'][key])*100] for key in pub['coverage'].keys()]\n", + "coverage" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "f = pd.DataFrame(coverage)\n", + "f.columns = ['metadata','coverage']\n", + "f.plot.barh(x='metadata',y='coverage')" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "source": [ + "Now let's see what Crossref services they participate in:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "button": false, + 
"new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "participation = [[key,pub['flags'][key]] for key in pub['flags'].keys()]\n", + "f = pd.DataFrame(participation)\n", + "f.columns = ['service','paticipation']\n", + "f" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Fun with facets\n", + "\n", + "### ORCID support" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from crossref.restful import Works\n", + "works = Works()\n", + "r = works.filter(has_orcid='true').facet('publisher-name',10)\n", + "orcid_support = [[key,r['publisher-name']['values'][key]] for key in r['publisher-name']['values'].keys()]\n", + "f = pd.DataFrame(orcid_support)\n", + "f.columns = ['publisher','orcids']\n", + "f.plot.barh(x='publisher',y='orcids')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Zika publications" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "r = works.query(title='Zika').facet('publisher-name',10)\n", + "zika_publications = [[key,r['publisher-name']['values'][key]] for key in r['publisher-name']['values'].keys()]\n", + "f = pd.DataFrame(zika_publications)\n", + "f.columns = ['publisher','publications']\n", + "f.plot.barh(x='publisher',y='publications')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Some other resources\n", + "\n", + "### Types" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from crossref.restful import Types\n", + "types = [type['label'] for type in Types().all()]\n", + "types.sort()\n", + "types" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": true + }, + "outputs": [], + "source": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Journals" 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from crossref.restful import Journals\n", + "journals = Journals()\n", + "journals.count()\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "journal = journals.journal('0028-0836')\n", + "journal" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## A slight digression to discuss testing and debugging queries\n", + "\n", + "**TIP** One of the cool things about the library we are using is that you can easily see the REST API URIs that it generates for queries you make to the API. To do this, you simply ask for the URL of the query in question. So, for example, here is the API call generated by a query for \"zika\":" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from crossref.restful import Works\n", + "works = Works()\n", + "works.query('zika').url" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Using samples for testing and to save time" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from crossref.restful import Works\n", + "works = Works()\n", + "zika_sample = [work for work in works.query('zika').sample(10)]\n", + "zika_sample" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "button": false, + "collapsed": true, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "source": [ + "## Some Jisc Examples\n", + "\n", + "### Notify institutions of co-authored works" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "source": [ + "Works records indexed _today_ that have affiliation data" + ] + }, + { + "cell_type": "code", + 
"execution_count": null, + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "import datetime\n", + "works=Works()\n", + "today = datetime.date.today().isoformat()\n", + "works_with_affiliations = [w for w in works.filter(from_online_pub_date=today, has_affiliation='true')]\n", + "print(len(works_with_affiliations))\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's look for publications with a publication date of today and a particular affiliation." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "button": false, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "import datetime\n", + "affiliation=\"Harvard\"\n", + "today = datetime.date.today().isoformat()\n", + "recent_affiliation_pubs = works.filter(from_online_pub_date=today).query(affiliation=affiliation)\n", + "recent_affiliation_pubs.count()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's look at the first record..." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "button": false, + "collapsed": true, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "outputs": [], + "source": [ + "next(iter(recent_affiliation_pubs))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "button": false, + "collapsed": true, + "new_sheet": false, + "run_control": { + "read_only": false + } + }, + "source": [ + "#### A digression on organizational identifiers...\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Integrate funding data into funder policy service" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": true + }, + "outputs": [], + "source": [ + "from crossref.restful import Funders\n", + "funders=Funders()\n", + "funder_name=\"NIH\"\n", + "funder_id=next(iter(funders.query(funder_name)))['id']\n", + "funder_id" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": true + }, + "outputs": [], + "source": [ + "import datetime\n", + "works=Works()\n", + "today = datetime.date.today().isoformat()\n", + "works_with_funding_data = [w for w in works.filter(from_online_pub_date='2017-01-01', funder=funder_id)]\n", + "print(len(works_with_funding_data))" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": true + }, + "outputs": [], + "source": [ + "next(iter(works_with_funding_data))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### " + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.1" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} 
diff --git a/ad_hoc_deposits.md b/deprecated/ad_hoc_deposits.md similarity index 100% rename from ad_hoc_deposits.md rename to deprecated/ad_hoc_deposits.md diff --git a/deprecated/archive_query_api.md b/deprecated/archive_query_api.md new file mode 100644 index 0000000..61eba98 --- /dev/null +++ b/deprecated/archive_query_api.md @@ -0,0 +1,166 @@ +Archive Arrangement Query API +============================= + +## Change History + +| Date | Changes | +|------|---------| +| 2014-01-13 | Initial version | +| 2014-02-04 | Fix some typos | + +## Background + +CrossRef metadata can record the archive arrangement for a piece of content. For example, +a publisher may deposit copies of their published content with an archive organisation +such as CLOCKSS or LOCKSS and record this fact in metadata deposited to CrossRef. Each +DOI record can maintain archive arrangement information separately: + + + + ... + + + + + + + ... + + + +While the archive arrangement information in CrossRef metadata records a publisher's +intent to archive a certain piece of content with a particular organisation, it does +not confirm the existence of an archive copy. A third party wishing to validate the +publisher's claim of an archive arrangement must query the archive organisation +for the archive state of a piece of content. + +To facilitate archive arrangement verification, a common query API is proposed that +allows any third party to look up the archive state of a piece of content identified +by a DOI. + +## DOI Query Path + +A query API implementor must provide a publicly-accessible DOI query path. The path must accept a DOI as a query parameter named `doi` to an HTTP path `/doi/status`: + + http://anarchive.org/doi/status?doi={DOI} + +### DOI Syntax + +The implementor must handle the issues of embedding DOIs within HTTP paths and query strings. 
+Refer to the `Numbering` section of the [DOI Handbook](http://www.doi.org/doi_handbook/2_Numbering.html) for a detailed discussion of the syntax of DOIs. However, the important points are: + +- DOIs are case insensitive +- DOIs can contain characters that are normally reserved characters within the HTTP specification + +## JSON Response + +The response to requests to the DOI query path should always be in the JSON representation +mentioned below, regardless of the representation requested by the client. Therefore, the +client's `Accept` header is always ignored, and the only valid `Content-Type` response header +is `application/json`, or `application/json; charset=UTF-8`, where the charset value may +be set to any value appropriate for the response body. + +A valid JSON response contains a `status`, `message` and `doi`: + +| Name | Optional? | Value | Description | +|------|-----------|-------|-------------| +| status | No | HTTP status code | Must match the HTTP response status code. | +| message | No | Any string | A human readable error or status message. May be empty. | +| doi | No | A DOI without any adornment (`doi:` etc.) e.g. `10.5555/12345678` | The query DOI an archive response refers to. | + +### Archived Copies + +A JSON response may also contain the archive state of one or more archive copies. No +`copies` element, or an empty `copies` element, indicates that no archive copy has been +received for the query DOI. + +Each `copies` entry may contain: + +| Name | Optional? | Value | Description | +|------|-----------|-------|-------------| +| received_at | No | A valid [ISO 8601](http://en.wikipedia.org/wiki/ISO_8601) date or date and time string | Date or date and time at which a copy of content was received for archiving. | +| state | No | `dark` or `light` | Whether an archive copy is light or dark. | +| location | Yes | Any URL | Publicly accessible download location for the archive copy. 
| +| content_version | Yes | `am` or `vor` | Indicates that the archive copy is specifically a copy of the author accepted manuscript (`am`) or publisher version of record (`vor`). | +| content_type | Yes | Any valid [media type](http://en.wikipedia.org/wiki/Internet_media_type) | The content type of the archive copy. | + +### Response Headers + +An HTTP response to the DOI query path may contain any headers, but must include a `Content-Type` +header, and in the case of a redirect must include a `Location` header. + +### Response Examples + +Response for a DOI whose content has been received by the archive, but is currently dark: + + { + "status": 200, + "message": "", + "doi": "10.5555/12345678", + "copies": [ + { + "received_at": "2014-01-13T12:24Z", + "state": "dark", + "content_version": "am", + "content_type": "application/pdf" + }, + { + "received_at": "2014-01-13T12:24Z", + "state": "dark", + "content_version": "vor", + "content_type": "text/xml" + } + ] + } + +Response for a DOI whose content has not been received by the archive: + + { + "status": 200, + "message": "", + "doi": "10.5555/12345678", + "copies": [] + } + +Response for a DOI whose content has been received by the archive and for which there +has been a trigger event, making the content become light: + + { + "status": 200, + "message": "", + "doi": "10.5555/12345678", + "copies": [ + { + "received_at": "2014-01-13T12:24Z", + "state": "light", + "content_version": "am", + "content_type": "application/pdf", + "location": "http://anarchive.org/content/am/10.5555/12345678.pdf" + }, + { + "received_at": "2014-01-13T12:24Z", + "state": "light", + "content_version": "vor", + "content_type": "text/xml", + "location": "http://anarchive.org/content/vor/10.5555/12345678.xml" + } + ] + } + +An error response: + + { + "status": 504, + "message": "Upstream service is currently unavailable.", + "doi": "10.5555/12345678" + } + +### Response Codes + +Responses to the DOI query path should incorporate all 
appropriate HTTP error codes. +Clients should expect to receive any valid HTTP response code, including redirects +and errors. In the case of redirects, clients should follow the `Location` header. +In the case of error codes, clients should implement a reasonable decaying retry +mechanism. + +Clients are not expected to deal with any form of rate-limiting HTTP headers. diff --git a/deprecated/deposit_api.md b/deprecated/deposit_api.md new file mode 100644 index 0000000..d81dbdb --- /dev/null +++ b/deprecated/deposit_api.md @@ -0,0 +1,192 @@ +# CrossRef RESTful Deposit API (Alpha) + +## Versioning + +| Date | Version | Changes | +|------|---------|---------| +| 2014-03-10 | v1 | Initial version | +| 2015-01-15 | v2 | Mention PDF deposits | + +Note: this code is used by the Open Journals System (OJS) integration. It is not intended for use by other Crossref members, who should use the production deposit system at http://doi.crossref.org + +## Background + +CrossRef provides a deposit mechanism for our members and manuscript system vendors. +The current deposit mechanism provides a URL end-point that can accept deposits of +CrossRef metadata. However, this mechanism has some deficiencies, including: + +- an inability to programmatically track the status of a deposit +- an outdated method of returning deposit results to a user (e-mail responses containing + success or failure notices) +- no way of programmatically querying for a historic list of deposits +- no way of programmatically retrieving previously deposited XML. + +This document proposes a RESTful deposit API that attempts to address deficiencies +within the current CrossRef XML deposit mechanism. + +## Extension to the CrossRef REST API + +This document provides an extension to the CrossRef REST API, described +[here](https://github.com/CrossRef/rest-api-doc/blob/master/funder_kpi_api.md). It uses the same +API versioning scheme, JSON response format and concepts of singular and listy +responses. 
The base URI for this API is: + + http://api.crossref.org + +However, the routes described in this documentation must be accessed over HTTPS: + + https://api.crossref.org + +## Authorization + +All routes described in this document require authentication using CrossRef member +credentials. These must be supplied on each request using the HTTP basic authentication +scheme. Member credentials are provided in an HTTP header as described in +[RFC2617](https://www.ietf.org/rfc/rfc2617.txt). To create an `Authorization` header: + +1. Create a string, "username:password" +2. Base64 encode the "username:password" string +3. Insert this base64 string into an `Authorization` header: + + Authorization: Basic {base64 encoded user:password pair} + +## Making a Deposit + + POST /deposits + +Requests to `/deposits` must be authenticated and made over HTTPS. + +Deposits are made by POSTing the contents of a deposit to `/deposits`. If initial +checks against the deposit (for example, XML validation) succeed, a `303` redirect +will be returned with a `Location` header defining the deposit status query link: + + HTTP/1.1 303 See Other + Location: /deposits/1234-5678-1234-5678 + +Making a deposit with cURL: + + $ curl -i -H "Content-Type: application/vnd.crossref.deposit+xml" -u username:password --data-binary @my-deposit.xml https://api.crossref.org/deposits + +Deposits are modified slightly from the submitted form. This is to facilitate +some of the features of this API. Deposit e-mail addresses are changed to an +e-mail address where deposit notifications can be intercepted by this API. +Deposit `batch ID`s are changed to match the deposit ID given out by the API. + +Deposits may be made where the e-mail and `batch ID` are left blank. However, the +elements themselves must be present to pass schema validation. 
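The `Authorization` header steps and the cURL call above can be sketched in Python with only the standard library. This is a minimal sketch under stated assumptions: the helper names `basic_auth_header` and `build_deposit_request` are hypothetical, the credentials and XML payload are placeholders, and the request is only prepared, never sent.

```python
import base64
import urllib.request

# Deposit route as given in this document; must be reached over HTTPS.
DEPOSITS_URL = "https://api.crossref.org/deposits"


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic `Authorization` header (hypothetical helper)."""
    pair = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(pair).decode("ascii")


def build_deposit_request(
    payload: bytes,
    username: str,
    password: str,
    content_type: str = "application/vnd.crossref.deposit+xml",
) -> urllib.request.Request:
    """Prepare, but do not send, a POST to /deposits (hypothetical helper)."""
    return urllib.request.Request(
        DEPOSITS_URL,
        data=payload,
        method="POST",
        headers={
            "Content-Type": content_type,
            "Authorization": basic_auth_header(username, password),
        },
    )
```

Keeping the header construction separate from the request makes it easy to check the encoding step on its own before any credentials leave the machine.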
+ +| XPath | Content change | +|-------|----------------| +| //head/doi_batch_id | Changed to match deposit resource ID (as in `/deposits/{ID}`) | +| //head/email_address | Changed to `labs-notifications@crossref.org` | + +### Specifying a Deposit Content Type + +Deposits POSTed to `/deposits` _must_ specify a deposit content type. This is done +by setting a `Content-Type` header in requests to `/deposits`. It _must_ be set to +one of: + +| Content Type | Description | +|--------------|-------------| +| application/vnd.crossref.deposit+xml | [Full deposit](http://doi.crossref.org/schemas/crossref4.3.4.xsd) | +| application/vnd.crossref.partial+xml | [Partial (resource) deposit](http://doi.crossref.org/schemas/doi_resources4.3.2.xsd) | +| application/pdf | PDF deposit for citation extraction | + +### Specifying a Ping Back URL + + POST /deposits?pingback={url_encoded_url} + +The deposit API can return information on completed or failed deposits to a +user by making an HTTP request to a per-deposit defined URL. Use the `pingback` +parameter to specify a URL encoded ping back URL. The user must specify a URL +that will return a `200` HTTP status response on successfully accepting ping +back information. If the deposit API receives any other HTTP status, or if the +URL is inaccessible for any reason, the API will make repeated requests to the +URL, following a pattern of exponential back off. The maximum number of request +attempts the API will make is undefined. + +### Depositing a Test Deposit + + POST /deposits?test=true + +Set the `test` parameter to `true`, `t` or `1` (any other value is considered false) +to make a test deposit. Such a deposit will go through the normal deposit process +but its contents will not be made live. By default, `test` is false. + +## Listing Previous Deposits + + GET /deposits + +Requests to `/deposits` must be authenticated and made over HTTPS. + +List previous deposits. 
The list of deposits can be paged with the `rows` and +`offset` query parameters (see the +[CrossRef REST API documentation](https://github.com/CrossRef/rest-api-doc/blob/master/funder_kpi_api.md)). + +The `/deposits` route also specifies some filters: + +| Filter | Possible Values | Description | +|--------|-----------------|-------------| +| status | One of `submitted`, `failed` or `completed` | Return only those deposits with given status | +| from-submitted-time | Date | Return only those deposits that were deposited on or after the given date | +| until-submitted-time | Date | Return only those deposits that were deposited on or before the given date | +| doi | DOI | Return only those deposits that deposited against the given DOI | +| test | One of `true`, `t`, `1`, `false`, `f`, `0` | Return only those deposits that are or are not test deposits. By default all deposits, both test and live, are returned. | +| type| Content Type | Return only those deposits with the given content type (mime type) | + +Dates should be of the form `YYYY-MM-DD`, `YYYY-MM` or `YYYY`. + +For more information on filters, including how to specify them, see the filters section +in the [CrossRef REST API documentation](https://github.com/CrossRef/rest-api-doc/blob/master/funder_kpi_api.md). + +## Querying the Status of a Deposit + + GET /deposits/{id} + +Requests to `/deposits/{id}` must be authenticated and made over HTTPS. + +Retrieve the status of a deposit, including submission status and details of any +errors by making a GET request to the redirect URL returned when making a deposit. 
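The listing filters and the redirect-based status link can be sketched in Python. This is a minimal sketch, not a client for the live service: `deposits_list_url` and `status_url_from_location` are hypothetical helper names, and the comma-separated `name:value` filter form is assumed from the `filter=doi:{url_encoded_doi},status:completed` example in this document.

```python
from urllib.parse import urlencode, urljoin

# HTTPS base URI required for the deposit routes in this document.
API_BASE = "https://api.crossref.org"


def deposits_list_url(filters=None, rows=None, offset=None):
    """Build a GET /deposits URL with optional filters and paging (hypothetical helper)."""
    params = {}
    if filters:
        # Assumed filter syntax: name:value pairs joined by commas.
        params["filter"] = ",".join(f"{name}:{value}" for name, value in filters.items())
    if rows is not None:
        params["rows"] = rows
    if offset is not None:
        params["offset"] = offset
    # Keep ':' and ',' literal so the filter stays readable; '/' in DOIs is still escaped.
    query = urlencode(params, safe=":,")
    return f"{API_BASE}/deposits" + (f"?{query}" if query else "")


def status_url_from_location(location: str) -> str:
    """Resolve the `Location` header of the 303 response to an absolute status URL."""
    return urljoin(f"{API_BASE}/deposits", location)
```

For example, `status_url_from_location("/deposits/1234-5678-1234-5678")` resolves the relative `Location` value shown in the deposit example against the HTTPS base URI.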
+ +### Error Types + +| Major Type | Minor Type | +|------------|------------| +| submission | added-with-conflict | +| | version-older-than-last | +| | title-deleted-by-crossref-admin | +| | npe | +| | unique-doi | +| permission | not-your-handle | +| | not-your-prefix | +| | not-your-title | +| | not-your-issn | +| xml-syntax | malformed | +| | schema-validation-fail | +| | bad-character-data | +| | content-in-prolog | +| | bad-character-encoding | +| xml-content | differing-prefixes | +| | invalid-year | +| | submission-version-is-null | + +## Querying the Deposit Status of a DOI + + GET /deposits?filter=doi:{url_encoded_doi} + +Get the deposit history of a DOI. Lists all deposits for a DOI. Useful to combine with +other filters, such as status. For example, check if a given DOI has had a successful +deposit. If it has, those deposits will be returned; if not, no results will be returned: + + GET /deposits?filter=doi:{url_encoded_doi},status:completed + +## Retrieving Original Deposit Objects + + GET /deposits/{id}/data + +Requests to `/deposits/{id}/data` must be authenticated and made over HTTPS. + +The original deposit XML may be retrieved using the `/deposits/{id}/data` route. +The response `Content-Type` will match the content type specified when depositing +the XML. + diff --git a/deprecated/resource_intended_use_hints.md b/deprecated/resource_intended_use_hints.md new file mode 100644 index 0000000..4d3f2ca --- /dev/null +++ b/deprecated/resource_intended_use_hints.md @@ -0,0 +1,205 @@ + +# Resource Intended Use Hints Through CrossRef Metadata + +## Version History + +- V1 2014-01-08, Initial draft +- V2 2014-04-23, Add examples +- V3 2014-04-24, Add 'any' collection property type +- V4 2014-04-25, Incorporate suggestions from Evan Owens +- V5 2014-05-27, Rename 'any' to 'unspecified' + +## The Problem + +The CrossRef schema does not give enough information to users about which resources are appropriate for different user classes and/or applications. 
For example, a publisher might want to be able to provide different lists of resources for the following use cases: + +- A human subscriber accessing the content to read +- A human non-subscriber accessing the content to read +- A human from an entitled funding agency accessing the content to read +- A bot run by a subscriber accessing the content to TDM +- A bot run by a non-subscriber accessing the content to TDM +- A bot run by an entitled funding agency accessing the content to TDM +- A bot run by a subscriber accessing the content under a syndication agreement +- A bot run by a non-subscriber accessing the content under a syndication agreement +- A bot run by an entitled funding agency accessing the content under a syndication agreement + +In the above cases, the content in question might be either "restricted" (e.g. requiring a subscription to access) or "unrestricted" (open access, free). Furthermore, any restrictions may be time-boxed via embargoes. This effectively means that there are sub-cases where the desired behavior might change depending on when the user (bot or human) tries to access the content. + + So for the above use cases the publisher might want the user to select a resource that points to either: + +- A human readable landing page on the publisher's main website. +- A human readable representation of the VOR (e.g. PDF, HTML, etc.) on the publisher's main website. +- A human readable representation of the AAM (e.g. PDF, HTML, etc.) on the publisher's main website. +- A machine readable representation of the VOR (e.g. XML, Plain Text) on the publisher's TDM server +- A machine readable representation of the AAM (e.g. XML, Plain Text) on the publisher's TDM server +- A human readable representation of the VOR (e.g. PDF, HTML, etc.) on the publisher's syndication server. +- A human readable representation of the AAM (e.g. PDF, HTML, etc.) on the publisher's syndication server. 
+ + +Selecting the appropriate resource would seem to hinge on four criteria: + +- Is the agent a human or a bot? +- Is the content under restricted access? For example, is a subscription needed? Is the content currently under embargo? +- Is the agent entitled at this particular time? +- What is the desired application? Viewing, indexing, syndication, TDM? + + +We have used the term "BAV" to describe the "Best Available Version," where "version" refers to the [JAV](http://www.niso.org/publications/rp/RP-8-2008.pdf) version of a document (i.e. AAM, VOR). The idea is that some versions of the document (e.g. VOR) may be under restricted access (e.g. under embargo, require a subscription) and are thus "unavailable" to certain classes of user, while other versions of the document (e.g. AAM) are not under restrictions and are thus "available" to the respective class of user. + +But the above use-cases introduce another facet that may need to be considered when selecting a resource: the "Appropriate Application Resource" (AAR). That is, the use-cases above imply that publishers might want to direct users at resources pointing to systems that are optimised to support a particular application. For example, a publisher might have a server that is reserved for search engines, another for syndication partners, etc. + +## Providing Hints For Selecting a BAV + +The normative HTTP mechanism to handle these requirements technically would be to use content negotiation. That is, an agent could request a resource using a GET and be shown the options for different available versions of the resource in the response headers. The user agent would then select the BAV/AAR from the options presented in the publisher's response. In other words, ideally the system would work the same way that content negotiation works for languages or for browser feature sets. 
+ +However, CrossRef recognises that, in the short/medium term, it would be impractical to expect all of our members to integrate such content negotiation in their publishing sites. Thus we have added the ability for publishers to record attributes of the resource that one would normally use content negotiation for in selecting a BAV. Specifically, we have added the following two attributes to the resource element: + +- mime_type +- content_version + + This allows the publisher to record resources for all the combinations of representation (PDF, HTML, etc.) and version (VOR, AAM) that the publisher supports. In turn, the user can use these attributes to determine a-priori (i.e. without having to do an HTTP query for content negotiation) which versions of the content **exist**. However, it does not give the user any information to a-priori determine which versions of the content are likely to be **available**. In other words, just because the HTTP URI of a particular representation of a particular version of the content is listed, that doesn't mean that the user is guaranteed to be able to access the resource. The resource might require a subscription, it might be under embargo, etc. + + The addition of the `AccessIndicators` section to the CrossRef schema provides a mechanism that allows publishers to provide resource-specific information about the *purported accessibility* of the content pointed to by the resource using extended versions of the NISO-recommended `` and `` elements. The `//AccessIndicators/license_ref` element has been extended with an "applies_to" attribute which can be used to map specific licenses (identified by HTTP URIs) to specific resources based on their content_version. So, for example, this enables a publisher to link one license to the resource that points to the VOR of the content and a different license to the resource that points to the AAM of the content. 
Similarly, the publisher will be able to map the `//AccessIndicators/free-to-read` element to specific resources identified by the content_version attribute in order to indicate that the content pointed to by a resource is available to "view" without any payment or registration preconditions. + + The `` element can also include a `start_date` attribute, which can be used to indicate when a license takes effect. This, in turn, can be used to record simple embargo information and to map that information to a specific version of the `` identified by the `content_type` attribute. + + For example, the following records that the VOR of the content is under a proprietary license from its date of publication on February 3, 2014 and that it is under a CC-BY license a year later on February 3, 2015 **and** that the AAM of the content is available from the date of publication under a CC-BY license: + + + http://www.psychoceramics.org/fulltext/vor/10.5555/12345678 + http://www.psychoceramics.org/fulltext/am/10.5555/12345678 + + http://www.psychoceramics.org/license_v1.html + http://creativecommons.org/licenses/by/3.0/deed.en_US + http://creativecommons.org/licenses/by/3.0/deed.en_US + + +The addition of the `AccessIndicators` allows the publisher to indicate not only what versions of the resource **exist**, but also for the publisher to provide a mechanism for the user to determine a-priori (without doing an HTTP GET for content negotiation) which versions of the resource are likely to be **available**. In short, the combination of `//AccessIndicator` elements and `//collection/resource` elements allows the user to determine the **BAV**. + +## Providing Hints For Selecting an AAR + +But this still leaves us with the question of how to allow the publisher to provide the user with hints as to which version of a resource is appropriate for a particular application. So far we have identified the following applications: + + 1. Viewing (e.g. 
on the publisher's web site, via landing page, etc.) + 2. Indexing (e.g. for search engines, etc.) + + + We currently have three mechanisms for labeling different 'applications' of the resource. + + The first mechanism is implicit. That is, `/doi_data/resource` attributes have normatively recorded a link to the publisher landing page, from which the human reader can get a "viewable" version of the resource. + + The other two are encoded in the `property` attribute of the `` container element. We currently support several values for this property, including `crawler-based` which was intended to be used to identify resources for search engines (i.e. indexing) and `text-mining` which has been introduced for resources designed for TDM bots. What seems to be missing is a value for "syndication." If we were to add a `syndication` attribute value then publishers would be able to record `/collection/resource` elements that meet all the use-cases identified above. + +## Putting It All Together + + If we combine the concept of the BAV and the AAR, we can meet all of the requirements outlined at the start of this document. + + The following example for the sample DOI 10.5555/12345678 says: + +1. The VOR is under a year embargo (proprietary license) from the date of publication +2. After the year, the VOR is available under a CC-BY license +3. The AAM is available under a CC-BY license from the date of publication +4. The humans clicking on the link will be directed to the landing page on the publisher's site +5. That robots requesting the content for a search engine (VOR only) should look for the PDF on the server `docstore.psychoceramics.org`. Access will be restricted as per the publisher's access control system. +6. That robots requesting the content for syndication (AAM or VOR) should look for the PDF on the server `docstore.psychoceramics.org`. Access will be restricted as per the publisher's access control system. +7. 
That robots requesting the content for text and data mining (AAM or VOR) should look for the XML on the server `marklogic.psychoceramics.org`. Access will be restricted as per the publisher's access control system. + +### XML Example + + + 10.5555/12345678 + 20121025161509 + + http://psychoceramics.labs.crossref.org/10.5555-12345678.html + + + + + http://www.psychoceramics.org/license_v1.html + http://creativecommons.org/licenses/by/3.0/deed.en_US + http://creativecommons.org/licenses/by/3.0/deed.en_US + + + + http://marklogic.psychoceramics.org/fulltext/vor/10.5555/12345678.xml + + + http://docstore.psychoceramics.org/fulltext/am/10.5555/12345678.pdf + + + + + + http://docstore.psychoceramics.org/fulltext/vor/10.5555/12345678.pdf + + + http://docstore.psychoceramics.org/fulltext/am/10.5555/12345678.pdf + + + + + + + http://docstore.psychoceramics.org/fulltext/vor/10.5555/12345678.pdf + + + + + http://docstore.psychoceramics.org/fulltext/vor/10.5555/12345678.pdf + + + + +## An 'unspecified' Use Type + +Some members will not want to provide intended use hints. These members can avoid repetition, and ignore intended use hints +altogether by using the 'unspecified' collection property: + + + 10.5555/12345678 + 20121025161509 + + http://psychoceramics.labs.crossref.org/10.5555-12345678.html + + + + + http://www.psychoceramics.org/license_v1.html + http://creativecommons.org/licenses/by/3.0/deed.en_US + http://creativecommons.org/licenses/by/3.0/deed.en_US + + + + http://docstore.psychoceramics.org/fulltext/vor/10.5555/12345678.xml + + + http://docstore.psychoceramics.org/fulltext/vor/10.5555/12345678.pdf + + + http://docstore.psychoceramics.org/fulltext/am/10.5555/12345678.pdf + + + +## Open Questions + +We have suggested that the following values be used for AAR + +- crawler-based (search engines) +- text-mining +- syndication +- unspecified + +Are they enough? + +## Limitations of Proposed System + +The hints provided in CrossRef metadata are just that, hints. 
There is nothing that CrossRef can do to force users to select a particular resource appropriate to their application. Thus, it is still going to be the responsibility of the publisher to check requests and to route them appropriately. + +## Access Control + +Intended use hints do not imply the existence or lack of access control. A publisher may choose to, or choose not to, incorporate access control into links intended for text and data mining or content syndication. This follows the same situation for DOI resolution URLs, where some publishers place access control on some article landing pages. The decision to use or not use access control is left to the publisher, and will most likely be made after consideration of content licensing, collection of APCs, journal business model, government policy and participation in various industry-wide initiatives. + +## Removal of 'tdm' content_version + +This proposed system negates the need for a 'tdm' content version. An implementation of this proposal will also remove the 'tdm' option from +the content_version resource attribute. diff --git a/deprecated/rest_api_koans.md b/deprecated/rest_api_koans.md new file mode 100644 index 0000000..3d4df90 --- /dev/null +++ b/deprecated/rest_api_koans.md @@ -0,0 +1,187 @@ +# CrossRef API Koans + +## Version History + +- V1: 2014-02-26, first draft. +- V2: 2014-02-28, added license examples +- V3: 2017-04-27, replace license route examples with facet/filter examples + +## Overview + +The following short examples show how the CrossRef REST API can be used to issue sophisticated queries against the CrossRef system. + +## Finding out what is in the CrossRef system + +You might start by wondering how much and what kinds of data exist in the CrossRef system. + +### How many DOI records does CrossRef have? + + http://api.crossref.org/works?rows=0 + +### What content types does CrossRef have? + + http://api.crossref.org/types + +### How many journal article DOIs does CrossRef have? 
+ + http://api.crossref.org/types/journal-article/works?rows=0 + +### How many report DOIs does CrossRef have? + + http://api.crossref.org/types/report/works?rows=0 + +But eventually you will probably want to start looking at metadata records + +### Example 1 + + TBD + +### Example 2 + + TBD + +And then you will want to start looking for metadata records that contain specific terms + +### Example 1 + + TBD + +### Example 2 + + TBD + +So let's look at specific records + +### Example 1 + + TBD + +### Example 2 + + TBD + +Interesting. There is license information in there and full text links. + +## How many works have license information? + + http://api.crossref.org/works?filter=has-license:true&rows=0 + +## Get first 25 works that have a license + + http://api.crossref.org/works?filter=has-license:true&rows=25&offset=0 + +## Get second 25 works that have a license + + http://api.crossref.org/works?filter=has-license:true&rows=25&offset=25 + +## How many license types are there? + + http://api.crossref.org/licenses?rows=0 + +## How many works have a CC-BY license? 
+ + http://api.crossref.org/works?rows=0&filter=license.url:http://creativecommons.org/licenses/by/3.0/ + +## See how many works have funder information + + http://api.crossref.org/works?filter=has-funder:true&rows=0 + +## See how many Hindawi works have funder information + + http://api.crossref.org/members/98/works?filter=has-funder:true&rows=0 + + or + + http://api.crossref.org/works?filter=member:98,has-funder:true&rows=0 + +## See how many Elsevier works have funder information + + http://api.crossref.org/members/78/works?filter=has-funder:true&rows=0 + +## How many member publishers CrossRef has + + http://api.crossref.org/members?rows=0 + +## First 25 members: + + http://api.crossref.org/members?rows=25&offset=0 + +## Second 25 members: + + http://api.crossref.org/members?rows=25&offset=25 + +## Overview of Hindawi's participation in CrossRef + + http://api.crossref.org/members?query=hindawi + +though once you have a member ID ('98', in this case), you should use that instead. So above is the same as: + + http://api.crossref.org/members/98 + +## Overview of Elsevier's participation in CrossRef + + http://api.crossref.org/members?query=elsevier + +### same as: + + http://api.crossref.org/members/78 + +## How many works does Elsevier have? + + http://api.crossref.org/members/78/works?rows=0 + +## First 25 Elsevier works + + http://api.crossref.org/members/78/works?rows=25&offset=0 + +## Second 25 Elsevier works + + http://api.crossref.org/members/78/works?rows=25&offset=25 + +## How many Elsevier works have license links? + + http://api.crossref.org/members/78/works?filter=has-license:true&rows=0 + +## How many Elsevier works have full text links + + http://api.crossref.org/members/78/works?filter=has-full-text:true&rows=0 + +## What license types does Elsevier support? 
+ + http://api.crossref.org/works?facet=license:*&filter=member:78&rows=0 + +## Overview of Hindawi's participation in CrossRef + + http://api.crossref.org/members?query=hindawi + +### same as: + + http://api.crossref.org/members/98 + +## How many works does Hindawi have? + + http://api.crossref.org/members/98/works?rows=0 + +## How many Hindawi works have license links? + + http://api.crossref.org/members/98/works?filter=has-license:true&rows=0 + +## How many Hindawi works have full text links + + http://api.crossref.org/members/98/works?filter=has-full-text:true&rows=0 + +## What license types does Hindawi support? + + http://api.crossref.org/works?facet=license:*&filter=member:98&rows=0 + +## What license type does the journal with a particular ISSN support + + http://api.crossref.org/works?facet=license:*&filter=issn:2090-8091 + +## What licenses does a researcher with a particular ORCID publish under + + http://api.crossref.org/works?facet=license:*&filter=orcid:0000-0003-1340-5202 + + + + diff --git a/deprecated/rest_api_tour.md b/deprecated/rest_api_tour.md new file mode 100644 index 0000000..0a6606d --- /dev/null +++ b/deprecated/rest_api_tour.md @@ -0,0 +1,185 @@ +# CrossRef TDM API Tour + +## Version History + +- V1: 2014-04-18, first draft. +- V2: 2014-11-10, edits for CR workshop +- v3: 2017-04-27, fix license examples + +## Overview + +The following short examples show how the CrossRef REST APIs can be used to provide cross-publisher support for TDM applications. This demonstration is a bit of a paradox: it is targeted at a non-technical audience that wants to understand a little bit about the technical infrastructure that researchers can leverage for TDM applications. 
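For readers who prefer a script to a browser, the `rows` and `offset` paging used in the example URIs (for instance "First 25" and "Second 25" works) can be sketched in Python. This is a minimal sketch that only builds URLs: `page_url` is a hypothetical helper name, and the zero-based page numbering is an assumption made for illustration.

```python
from urllib.parse import urlencode

# Base URI used by the example URIs in this document.
API_BASE = "http://api.crossref.org"


def page_url(route: str, page: int, rows: int = 25, **params) -> str:
    """Return the URL for a zero-based page of `rows` results on `route` (hypothetical helper)."""
    if page < 0 or rows < 0:
        raise ValueError("page and rows must be non-negative")
    # Extra parameters (e.g. filter, query, facet) come first, then the paging controls.
    query = dict(params, rows=rows, offset=page * rows)
    # Keep ':', ',', '*' and '/' literal so filters and facets read like the examples.
    return f"{API_BASE}{route}?" + urlencode(query, safe=":,*/")
```

Calling `page_url("/members/78/works", 1)` yields the same URI as the "Second 25 Elsevier works" example, and passing `rows=0` produces a count-only request.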
+ +## Technical notes + +In many cases, you can simply paste the example URIs into a browser's URL box, but if you want to see the resulting JSON formatted correctly, then we recommend that you install one of the following plugins: + +- Firefox Users: [JSONView](http://jsonview.com/) +- Chrome Users: [JSONView](https://chrome.google.com/webstore/detail/jsonview/chklaanhfefbnpoihckbnefhakgolnmc) + +## A fake problem + +A researcher is interested in text mining a set of literature mentioning the word "blood" + +## Identifying a Corpus + +Once a researcher has identified a problem, they need to identify the corpus that they want to explore using TDM. This corpus might be small (a few hundred items) or it might be large (millions of items). But in either case, it is likely that the corpus will span multiple publishers in multiple countries and with multiple business models. + +CrossRef is not a discovery service. We expect that many third parties, both for profit and non-profit, will provide researchers with the discovery tools needed in order to identify the corpus of literature that they wish to mine for their particular application. As long as these third-party services allow the researcher to easily download the DOIs of the content they wish to mine, then + +Having said that, CrossRef does provide some primitive metadata-based discovery tools which we will use to demonstrate a process for identifying a corpus. + +## Finding out what is in the CrossRef system + +How many members does CrossRef have? + + http://api.crossref.org/members?rows=0 + +Who are they? Let's look at first 100 members + + http://api.crossref.org/members?rows=100 + +And the second 100 members + + http://api.crossref.org/members?rows=100&offset=100 + +How many DOI records does CrossRef have? + + http://api.crossref.org/works?rows=0 + +What content types does CrossRef have? + + http://api.crossref.org/types + +How many journal article DOIs does CrossRef have? 
+ + http://api.crossref.org/types/journal-article/works?rows=0 + +How many proceedings article DOIs does CrossRef have? + + http://api.crossref.org/types/proceedings-article/works?rows=0 + +But eventually you will probably want to start looking at metadata records. Let's search for records that have the word "blood" in the metadata and see how many there are. + + http://api.crossref.org/works?query=%22blood%22&rows=0 + +Let's look at some of the results. + + http://api.crossref.org/works?query=%22blood%22& + +Now let's look at one of the records + + http://api.crossref.org/works/10.1155/2014/413629 + +Interesting. The record has ORCIDs, fulltext links, and license links. You need license and fulltext links to text and data mine the content. + +How many works have license information? + + http://api.crossref.org/works?filter=has-license:true&rows=0 + +How many license types are there? + + http://api.crossref.org/licenses?rows=0 + +How many works have a CC-BY license? + + http://api.crossref.org/works?rows=0&filter=license.url:http://creativecommons.org/licenses/by/3.0/ + + +OK, let's see how many records with the word "blood" in the metadata have license information and full text links + + http://api.crossref.org/works?filter=has-license:true,has-full-text:true&query=blood&rows=0 + +Let's download the results and download the content locally to TDM + +We could just get them all: + + http://api.crossref.org/works?filter=has-license:true,has-full-text:true&query=blood&rows=884 + +But for the purposes of this demo, let's subdivide them by publisher. + + +First let's get a sample of Elsevier titles: + +What is Elsevier's CrossRef member id number? + + http://api.crossref.org/members?query=elsevier + +Now what DOIs does Elsevier have that match our criteria? + + http://api.crossref.org/members/78/works?filter=has-license:true,has-full-text:true&query=blood&rows=50 + +Now what is Hindawi's CrossRef member id number? 
+ + http://api.crossref.org/members?query=hindawi + +Now let's get some Hindawi articles that match the criteria: + + http://api.crossref.org/members/98/works?filter=has-license:true,has-full-text:true&query=blood&rows=50 + + + + +## Other examples + +See how many works have funder information + + http://api.crossref.org/works?filter=has-funder:true&rows=0 + +See how many Hindawi works have funder information + + http://api.crossref.org/members/98/works?filter=has-funder:true&rows=0 + + or + + http://api.crossref.org/works?filter=member:98,has-funder:true&rows=0 + +See how many Elsevier works have funder information + + http://api.crossref.org/works?filter=member:78,has-funder:true&rows=0 + + +Overview of Hindawi's participation in CrossRef + + http://api.crossref.org/members?query=hindawi + + +Overview of Elsevier's participation in CrossRef + + http://api.crossref.org/members?query=elsevier + +How many works does Elsevier have? + + http://api.crossref.org/members/78/works?rows=0 + +First 25 Elsevier works + + http://api.crossref.org/members/78/works?rows=25&offset=0 + +Second 25 Elsevier works + + http://api.crossref.org/members/78/works?rows=25&offset=25 + +How many Elsevier works have license links? + + http://api.crossref.org/members/78/works?filter=has-license:true&rows=0 + +How many Elsevier works have full text links + + http://api.crossref.org/members/78/works?filter=has-full-text:true&rows=0 + +What license types does Elsevier support? + + http://api.crossref.org/works?facet=license:*&filter=member:78 + +What license types does Hindawi support? 
+ + http://api.crossref.org/works?facet=license:*&filter=member:98 + +What license type does the journal with a particular ISSN support + + http://api.crossref.org/works?facet=license:*&filter=issn:2090-8091 + +What licenses does a researcher with a particular ORCID publish under + + http://api.crossref.org/works?facet=license:*&filter=orcid:0000-0003-1340-5202 diff --git a/deprecated/scratch.md b/deprecated/scratch.md new file mode 100644 index 0000000..8b0b227 --- /dev/null +++ b/deprecated/scratch.md @@ -0,0 +1,153 @@ +# Content "Syndication" through CrossRef Metadata +## Current situation + +### Normative behavior for resolving CrossRef DOIs + +When CrossRef DOIs are resolved via a browser, they redirect to the HTTP URI recorded by the publisher in the element `doi_data/resource`. The URI recorded in `doi_data/resource` typically points to a "landing page." This landing page, in turn, typically presents: + +1. Human readable bibliographic metadata +2. Link(s) that enable one to acquire one or more human-readable full text representations of the resource (e.g. PDF, HTML, ePub) and/or one or more versions of the resource (e.g. AAM, VOR, etc.) + +Because the publisher controls the landing page, they can show/hide appropriate acquisition links according to their access control requirements and the status of the user viewing the landing page. + +Note that not all CrossRef DOIs behave this way, but the vast majority do. The above is normative behavior for CrossRef DOIs. + +Some aspects of this normative behavior are worth highlighting: + +1) The default behavior is designed for humans +2) The publisher system needs to make decisions about what the user is allowed to access from the landing page +3) This mechanism is currently used (abused?) by numerous text mining tools, etc. via screen scraping techniques as it is often the most convenient (only) way to get full text representations of the content identified by the DOI. 
+ +### Extended CrossRef DOI resolution mechanisms + +#### "As-Crawled URLs" + +In order to better support the use of CrossRef DOIs by search engines, the CrossRef metadata schema was extended. Specifically, CrossRef members wanted to be able to: + +1. Direct search engines to platforms that were designed to handle the additional load imposed by search engines +2. Provide search engines with representations that were optimized for indexing (and, occasionally, which where degraded for "reading"). + + CrossRef introduced the `` element. The collection element is a container element that can include any number of "alternative" `` elements. Each `/collection/resource` element, in turn, includes a "crawler" attribute so that specific search engine crawlers could be directed at specific `resource` URIs. + +In order to use this mechanism, search engines needed to either: + +a) Make use of CrossRef's proprietary API for accessing the metadata for a DOI and looking up the alternative resources. +b) Subscribe to dumps of CrossRef's metadata. + +As such, no major search engine uses this mechanism for indexing CrossRef member web sites. Similarly, very few CrossRef members make use of the `collection/resource` element to support search engines. Of those that do record "as-crawled" URLs, none record separate URIs for different crawlers. + +A typical section for as-crawled resources looks like this: + + + + http://downloads.hindawi.com/journals/ijps/2013/435073.pdf + + + http://downloads.hindawi.com/journals/ijps/2013/435073.pdf + + + http://downloads.hindawi.com/journals/ijps/2013/435073.pdf + + + http://downloads.hindawi.com/journals/ijps/2013/435073.pdf + + + http://downloads.hindawi.com/journals/ijps/2013/435073.pdf + + + + +In 2007, the "crawler" attribute was extended to support the iParadigms crawler for CrossCheck, but again, very few publishers make use of it. + +#### Content Negotiation + +In 2011 CrossRef introduced content negotiation for CrossRef DOIs. 
This mechanism provides a standard way for tools to request machine-readable representations of a CrossRef DOI's metadata. It also encourages both publishers and users of CrossRef metadata to follow standard linked-data mechanisms for recording alternative and related resource representations in their metadata using HTTP URIs. + +#### Text and Data Mining (aka Prospect) + +In order to support text and data mining applications, CrossRef is encouraging publishers to has proposed using content negotiation and the current CrossRef metadata schema to support recording links to resource representations designed for text and data mining. + + + +But it is entirely up to the user as to how they make use of this information. + +This is a critical point, because even though we are providing 'hints' to the user on which version of the resource is likely to be available to them, there is nothing the CrossRef can do to force the user to make use of these hints. There is nothing we can do to stop an un-subscribed user from trying to retrieve the VOR of content that is restricted to subscribers. It is still entirely up to the publisher to appropriately **enforce** the guidelines that they have documented in the CrossRef metadata. + + + +"Appropriate Application Resource" + + + + + + + + + + + + + + + + + +crawler="subsciber" +crawler="public" +crawler="syndicator" + + +(TBD- note that multiple resolution slightly more complicated) + +This effectively means that doi_data/resource + + + means mediated access for humans + + + +In theory, this resource can point to "the thing itself" or to "something abou" + +we then have following sections: + +"as-crawled" for search engines + +"text-mining" misnomer- but set that aside for a bit + +What are things we want to distnguish: + +Batch v Query + +Human v Bot + +*minimally* they want to text mine what they can see- e.g. the PDF. So conflating this made sense- until Elsevier came around. 
+ +OPDS has what they call "aquisition links" where rel attribute is set to a URI indicating what is being pointed to. They have the following types: + +http://opds-spec.org/acquisition/open-access for Open Access publications +http://opds-spec.org/acquisition/buy for publications that you can buy +http://opds-spec.org/acquisition/borrow for publications that you can borrow +http://opds-spec.org/acquisition/subscribe for publications that you can subscribe to +http://opds-spec.org/acquisition/sample to sample a publication +http://opds-spec.org/acquisition when none of the other values are appropriate or you don't have additional information + +These are problematic because they conflate various things. For example, something can CC-BY-SA , but paid for. For example: + +http://www.amazon.co.uk/Little-Brother-Cory-Doctorow-ebook/dp/B008TGKXWW/ref=sr_1_1?s=books&ie=UTF8&qid=1389085868&sr=1-1&keywords=Little+Brother + +And at the same time CC-BY-SA and for free: + +http://craphound.com/littlebrother/download/ + + + + + http://annalsofpsychoceramics.labs.crossref.org/fulltext/10.5555/515151.pdf + + + http://annalsofpsychoceramics.labs.crossref.org/fulltext/10.5555/515151.xml + + + + diff --git a/examples/full-crossmark.xml b/examples/full-crossmark.xml index e8c953e..f2ea129 100644 --- a/examples/full-crossmark.xml +++ b/examples/full-crossmark.xml @@ -4,14 +4,14 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ai="http://www.crossref.org/AccessIndicators.xsd" xmlns:fr="http://www.crossref.org/fundref.xsd" - xsi:schemaLocation="http://www.crossref.org/schema/4.3.4 + xsi:schemaLocation="http://www.crossref.org/schema/4.3.4 http://www.crossref.org/schema/deposit/crossref4.3.4.xsd"> 123456 1385985547 - Karl Ward - kward@crossref.org + Jane Bloggs + jbloggs@example.com creftest diff --git a/examples/partial-crossmark.xml b/examples/partial-crossmark.xml index 90163ec..6f609d4 100644 --- a/examples/partial-crossmark.xml +++ b/examples/partial-crossmark.xml @@ 
-4,13 +4,13 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ai="http://www.crossref.org/AccessIndicators.xsd" xmlns:fr="http://www.crossref.org/fundref.xsd" - xsi:schemaLocation="http://www.crossref.org/doi_resources_schema/4.3.2 + xsi:schemaLocation="http://www.crossref.org/doi_resources_schema/4.3.2 http://www.crossref.org/schema/deposit/doi_resources4.3.2.xsd"> 123456 - Karl Ward - kward@crossref.org + Jane Bloggs + jbloggs@example.com diff --git a/examples/partial-funders.xml b/examples/partial-funders.xml index f334f3f..b168c95 100644 --- a/examples/partial-funders.xml +++ b/examples/partial-funders.xml @@ -3,13 +3,13 @@ xmlns="http://www.crossref.org/doi_resources_schema/4.3.2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:fr="http://www.crossref.org/fundref.xsd" - xsi:schemaLocation="http://www.crossref.org/doi_resources_schema/4.3.2 + xsi:schemaLocation="http://www.crossref.org/doi_resources_schema/4.3.2 http://www.crossref.org/schema/deposit/doi_resources4.3.2.xsd"> 123456 - Karl Ward - kward@crossref.org + Jane Bloggs + jbloggs@example.com diff --git a/examples/partial-licenses.xml b/examples/partial-licenses.xml index b7fd749..9676f8f 100644 --- a/examples/partial-licenses.xml +++ b/examples/partial-licenses.xml @@ -3,13 +3,13 @@ xmlns="http://www.crossref.org/doi_resources_schema/4.3.2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ai="http://www.crossref.org/AccessIndicators.xsd" - xsi:schemaLocation="http://www.crossref.org/doi_resources_schema/4.3.2 + xsi:schemaLocation="http://www.crossref.org/doi_resources_schema/4.3.2 http://www.crossref.org/schema/deposit/doi_resources4.3.2.xsd"> 123456 - Karl Ward - kward@crossref.org + Jane Bloggs + jbloggs@example.com diff --git a/examples/partial-resources.xml b/examples/partial-resources.xml index b439e8e..7a74c1a 100644 --- a/examples/partial-resources.xml +++ b/examples/partial-resources.xml @@ -2,13 +2,13 @@ 123456 - Karl Ward - kward@crossref.org + Jane Bloggs + 
jbloggs@example.com diff --git a/funder_kpi_api.html b/funder_kpi_api.html deleted file mode 100644 index 2119984..0000000 --- a/funder_kpi_api.html +++ /dev/null @@ -1,675 +0,0 @@ - - - - - - funder_kpi_api - - - - - -
-

CrossRef APIs to support key performance indicators (KPIs) for funding agencies

- -

Version History

- -
    -
  • V1: 2013–09–08, first draft.
  • -
  • V2: 2013–09–24, reference platform deployed
  • -
  • v3: 2013–09–25, reworked filters. Added API versioning doc
  • -
  • v4: 2013–09–25, more filter changes.
  • -
  • v5: 2013–09–27, doc mime-type and message-type relationship
  • -
  • v6: 2013–10–01, updated sample & added examples with filters
  • -
  • v6: 2013–10–01, corrected warning date
  • -
  • v7: 2013–10–02, fixed typos
  • -
  • v8: 2013–10–17, updated warning. Added email address
  • -
- -

Background

- -

See the document, CrossRef metadata best practice to support key performance indicators (KPIs) for funding agencies, for background.

- -

Warning

- -

The API described here is in alpha. If you encounter problems with the API or the documentation, please report them to:

- -
-
- - - -

Overview

- -

The API is generally RESTful and returns results in JSON.

- -

The API will only work for CrossRef DOIs. You can test the registration agency for a DOI using the following convention:

- -

http://doi.crossref.org/doiRA/{doi}

- -

So testing the following CrossRef DOI:

- -

10.1037/0003-066X.59.1.29

- -

Will return the following result:

- -
[
-    {
-         DOI: "10.1037/0003-066X.59.1.29",
-         RA: "CrossRef"
-    }
-]
-
- -

If you use any of the API calls listed below with a non-CrossRef DOI, you will get a 404 HTTP status response with the message “non-CrossRef DOI.”
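
As a rough illustration, the registration-agency check described above could be scripted along these lines. The helper names `doi_ra_url` and `is_crossref_doi` are hypothetical, and the sketch assumes the response has already been fetched and parsed into a list of objects with `DOI` and `RA` keys, as in the sample result.

```python
from urllib.parse import quote

# Endpoint taken from the convention described above.
DOI_RA_ENDPOINT = "http://doi.crossref.org/doiRA/"


def doi_ra_url(doi: str) -> str:
    """Build the registration-agency lookup URL for a DOI (hypothetical helper)."""
    # Keep "/" unescaped so the prefix/suffix separator of the DOI survives.
    return DOI_RA_ENDPOINT + quote(doi, safe="/")


def is_crossref_doi(ra_result: list) -> bool:
    """Return True if any entry in the parsed result names CrossRef as the RA."""
    return any(entry.get("RA") == "CrossRef" for entry in ra_result)
```

Running this check before calling the other routes is one way to avoid the `404` response for non-CrossRef DOIs.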

- -

Results Overview

- -

All results are returned in JSON. There are two general types of results:

- -
    -
  • Singletons
  • -
  • Lists
  • -
- -

The mime-type for API results is application/vnd.crossref-api-message+json

- -

Singletons

- -

Singletons are single results. Retrieving metadata for a specific identifier (e.g. DOI, ISSN, funder_identifier) typically returns a singleton result.

- -

Lists

- -

List results can contain multiple entries. Searching or filtering typically returns a list result. A list has two parts:

- -
    -
  • Summary, which includes the following information: - -
      -
    • status (e.g. “ok”, error)
    • -
    • message-type (e.g. “work-result-list”)
    • -
    • message-version (e.g. 1.0.0 )
    • -
  • -
  • Items, which will contain the items matching the query or filter.
  • -
- -

Note that the “message-type” returned will differ from the mime-type. There are six message-types:

- -
    -
  • funder (singleton)
  • -
  • publisher (singleton)
  • -
  • work (singleton)
  • -
  • work-result-list (list)
  • -
  • funder-result-list (list)
  • -
  • publisher-result-list (list)
  • -
- -

Normally, an API list result will return both the summary and the items. If you want to just retrieve the summary, you can do so by specifying that the number of rows returned should be zero.
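
A client may want to branch on whether a response is a singleton or a list. The following is a minimal sketch of that distinction, using only the six message-types listed above; the function name `result_kind` is an assumption made for illustration.

```python
# The six message-types listed above, grouped by result shape.
SINGLETON_TYPES = {"funder", "publisher", "work"}
LIST_TYPES = {"work-result-list", "funder-result-list", "publisher-result-list"}


def result_kind(message_type: str) -> str:
    """Classify a message-type as "singleton" or "list" (hypothetical helper)."""
    if message_type in SINGLETON_TYPES:
        return "singleton"
    if message_type in LIST_TYPES:
        return "list"
    raise ValueError(f"unknown message-type: {message_type!r}")
```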

- -

Sort order

- -

If the API call includes a query, then the sort order will be by the relevance score. If no query is included, then the sort order will be by DOI update date.

- -

Resource Components

- -

Major resource components supported by the CrossRef API are:

- -
    -
  • works
  • -
  • funders
  • -
  • publishers
  • -
- -

These can be used alone like this

- - ---- - - - - - - - - - - - - - - - - - - - - - - -
| resource | description |
|:---|:---|
| `/works` | returns a list of all works (journal articles, conference proceedings, books, components, etc.), 20 per page. |
| `/funders` | returns a list of all funders in the FundRef Registry |
| `/publishers` | returns a list of all publishers. |
-

Resource components and identifiers

- -

Resource components can be used in conjunction with identifiers to retrieve the metadata for that identifier.

- - ---- - - - - - - - - - - - - - - - - - - - - - - -
| resource | description |
|:---|:---|
| `/works/{doi}` | returns metadata for the specified CrossRef DOI. |
| `/funders/{funder_id}` | returns metadata for specified funder and its suborganizations |
| `/publishers/{owner_prefix}` | returns metadata for the specified publisher. |
-

Combining resource components

- -

Resource components can be combined to narrow down selections.

- - ---- - - - - - - - - - - - - - - - - - - - - - - -
| resource | description |
|:---|:---|
| `/works/{doi}/funders` | returns list of funders associated with the specified CrossRef DOI |
| `/funders/{funder_id}/works` | returns list of works associated with the specified funder_id |
| `/publishers/{owner_prefix}/works` | returns list of works associated with specified owner_prefix |
-

Parameters

- -

Parameters can be used to query, filter and control the results returned by the CrossRef API. They can be passed as normal URI parameters or as JSON in the body of the request.

- - ---- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| parameter | description |
|:---|:---|
| `query` | limited DisMax query terms |
| `filter={filter_name}:{value}` | filter results by specific fields |
| `rows={#}` | results per page |
| `offset={#}` | page offset |
| `sample={#}` | return random N results |
-

Example query using URI parameters

- -
http://api.crossref.org/funders/100000015/works?query=electron+pairs&filter=has-orcid:true&rows=1
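
A URL like the one above can be assembled programmatically. This is a minimal sketch, not an official client: the function `build_url` and its signature are assumptions made for illustration, and it only covers the parameters described in this document.

```python
from urllib.parse import urlencode

BASE_URL = "http://api.crossref.org"


def build_url(path, query=None, filters=None, rows=None, offset=None, sample=None):
    """Assemble an API URL from a resource path and optional parameters."""
    params = {}
    if query is not None:
        params["query"] = query
    if filters:
        # Multiple filters are joined with commas as name:value pairs;
        # booleans are written in lowercase (true/false).
        params["filter"] = ",".join(
            f"{name}:{str(value).lower() if isinstance(value, bool) else value}"
            for name, value in filters.items()
        )
    for name, value in (("rows", rows), ("offset", offset), ("sample", sample)):
        if value is not None:
            params[name] = value
    # Leave ":", "," and "/" unescaped so filter values stay readable.
    query_string = urlencode(params, safe=":,/")
    return BASE_URL + path + ("?" + query_string if query_string else "")


print(build_url("/funders/100000015/works",
                query="electron pairs",
                filters={"has-orcid": True},
                rows=1))
```

Leaving `:`, `,` and `/` unescaped mirrors the URLs shown in this document; a stricter client could percent-encode them instead.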
-
- -

Example query using JSON in body of GET request

- -

Note that if you include a body in your GET request, any URI parameters will be ignored. In short, you cannot mix URI parameters and JSON queries.

- -

To use the API using JSON, pass the JSON in the body of the HTTP GET request like this:

- -
curl -X GET -H "Content-Type: application/json" -d '{"query": "psychoceramics", "offset": 40, "rows": 20, "filter": {"has-orcid": true, "publisher": "10.5555"}}'  http://api.crossref.org/works
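
For readers who prefer a script to `curl`, the same request can be sketched with Python's standard library. The helper name `json_query_request` is hypothetical, and the sketch only constructs the request object; nothing is sent until it is passed to `urllib.request.urlopen`.

```python
import json
from urllib.request import Request


def json_query_request(url: str, body: dict) -> Request:
    """Build a GET request that carries its query as a JSON body."""
    data = json.dumps(body).encode("utf-8")
    return Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="GET",  # a body normally implies POST, so force GET explicitly
    )


request = json_query_request(
    "http://api.crossref.org/works",
    {
        "query": "psychoceramics",
        "offset": 40,
        "rows": 20,
        "filter": {"has-orcid": True, "publisher": "10.5555"},
    },
)
```

Because URI parameters are ignored when a body is present, the sketch puts every parameter in the JSON body and none in the URL.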
-
- -

Queries

- -

Queries support a subset of DisMax, so, for example you can refine queries as follows.

- -

Works that include “renear” but not “ontologies”

- -
http://api.crossref.org/works?query=renear+-ontologies
-
- -

or using JSON

- -
curl -X GET -H "Content-Type: application/json" -d '{"query": "renear -ontologies"}'  http://api.crossref.org/works
-
- -

Filter Names

- -

Filters allow you to narrow queries. All filter results are lists. The following filters are supported:

- - ----- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| filter | possible values | description |
|:---|:---|:---|
| `funder` | `{funder_id}` | metadata which includes the `{funder_id}` in FundRef data |
| `publisher` | `{owner_prefix}` | metadata belonging to the publisher identified by `{owner_prefix}` (e.g. `10.1016`) |
| `from-update-date` | `{date}` | metadata updated since (inclusive) `{date}` |
| `until-update-date` | `{date}` | metadata updated before (inclusive) `{date}` |
| `from-pub-date` | `{date}` | metadata where published date is since (inclusive) `{date}` |
| `until-pub-date` | `{date}` | metadata where published date is before (inclusive) `{date}` |
| `has-license` | | metadata that includes any `<license_ref>` elements. |
| `license.uri` | `{uri}` | metadata where `<license_ref>` value equals `{uri}` |
| `license.content-version` | `{string}` | metadata where the `<license_ref>`’s `applies_to` attribute is `{string}` |
| `license.max-embargo-days` | `{integer}` | metadata where difference between publication date and the `<license_ref>`’s `start_date` attribute is <= `{integer}` |
| `has-full-text` | | metadata that includes any full text `<resource>` elements. |
| `fulltext.content-version` | `{string}` | metadata where `<resource>` element’s `content_version` attribute is `{string}`. |
| `fulltext.content-type` | `{mime_type}` | metadata where `<resource>` element’s `content_type` attribute is `{mime_type}` (e.g. `application/pdf`). |
| `public-references` | | metadata where publishers allow references to be distributed publicly. |
| `has-archive` | | metadata which includes name of archive partner[1] |
| `archive` | `{string}` | metadata where value of archive partner is `{string}`[1] |
| `has-orcid` | | metadata which includes one or more ORCIDs |
| `orcid` | `{orcid}` | metadata where `<orcid>` element’s value = `{orcid}` |
-

Notes on owner prefixes

- -

The prefix of a CrossRef DOI does NOT indicate who currently owns the DOI. It only reflects who originally registered the DOI. CrossRef metadata has an owner_prefix element that records the current owner of the CrossRef DOI in question.

- -

Notes on dates

- -

Note that dates in filters should always be of the form YYYY-MM-DD. Also note that date information in CrossRef metadata can often be incomplete. So, for example, a publisher may only include the year and month of publication for a journal article. For a monograph they might just include the year. In these cases the API selects the earliest possible date given the information provided. So, for instance, if the publisher only provided 2013-02 as the published date, then the date would be treated as 2013-02-01. Similarly, if the publisher only provided the year 2013 as the date, it would be treated as 2013-01-01.
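
The "earliest possible date" rule can be expressed in a few lines. This is a sketch of the rule as described here, not the API's actual implementation; the function name `earliest_date` is an assumption.

```python
from datetime import date


def earliest_date(partial: str) -> date:
    """Expand YYYY, YYYY-MM or YYYY-MM-DD to the earliest matching date."""
    parts = [int(piece) for piece in partial.split("-")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"expected YYYY, YYYY-MM or YYYY-MM-DD, got {partial!r}")
    # Pad any missing month and day with 1, the earliest possible value.
    year, month, day = (parts + [1, 1])[:3]
    return date(year, month, day)


print(earliest_date("2013-02"))  # 2013-02-01
print(earliest_date("2013"))     # 2013-01-01
```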

- -

Result controls

- -

You can control the delivery and selection of results using the rows, offset and sample parameters.

- -

Rows

- -

Normally, results are returned 20 at a time. You can control the number of results returned by using the rows parameter. To limit results to 5, for example, you could do the following:

- -
http://api.crossref.org/works?query=allen+renear&rows=5
-
- -

If you would just like to get the summary of the results, you can set the rows to 0 (zero).

- -
http://api.crossref.org/works?query=allen+renear&rows=0
-
- -

The maximum number of rows you can ask for in one query is 1000.

- -

Offset

- -

The number of returned items is controlled by the rows parameter, but you can select the Nth set of rows by using the offset parameter. So, for example, to select the second set of 5 results (i.e. results 6 through 10), you would do the following:

- -
http://api.crossref.org/works?query=allen+renear&rows=5&offset=2
-
- -

Sample

- -

Being able to select random results is useful for both testing and sampling. You can use the sample parameter to retrieve random results. So, for example, the following selects 10 random works:

- -
http://api.crossref.org/works?sample=10
-
- -

Note that when you use the sample parameter, the rows and offset parameters are ignored.

- -

Example Queries

- -

All works published by owner prefix 10.5555 in January 2010

- -
http://api.crossref.org/publishers/10.5555/works?filter=from-pub-date:2010-01,until-pub-date:2010-01
-
- -

All works published by owner prefix 10.5555 that have a CC-BY license

- -
http://api.crossref.org/publishers/10.5555/works?filter=license.uri:http://creativecommons.org/licenses/by/3.0/deed.en_US
-
- -

All works published by owner prefix 10.5555 in February 2015 that have a CC-BY license

- -
http://api.crossref.org/publishers/10.5555/works?filter=license.uri:http://creativecommons.org/licenses/by/3.0/deed.en_US,from-pub-date:2015-02,until-pub-date:2015-02
-
- -

All works funded by 10.13039/100005235 where license = CC-BY and embargo <= 365 days

- -
http://api.crossref.org/funders/10.13039/100005235/works?filter=license.uri:http://creativecommons.org/licenses/by/3.0/deed.en_US,license.max-embargo-days:365
-
- -

All works funded by X where the archive partner listed = ‘LOCKSS’

- -

Coming soon.

- -

Versioning

- -

In theory, the syntax of the API can vary independently of the result representations. In practice, major version changes in either will require changes to API clients and so versioning of the API will apply to both the API syntax and the result representation.

- -

The API uses a semantic versioning scheme whereby the version number is divided into three parts delimited by periods. The first number represents the “major” release number. The second represents a “minor” release number and the third represents an “internal” release number.

- -
Version 1.20.31
-        ^  ^  ^
-        |  |  |
-    major  |  |
-       minor  |
-       internal
-
- -

Major version increments are defined as releases that can break backwards compatibility. CrossRef will only commit to supporting the latest two major releases simultaneously and legacy major releases will be supported for no more than nine months. Exceptions to these rules may be made when major releases are required to ensure the security or stability of the system.

- -

Minor version increments are defined as backwards compatible. There is no limit on the number of minor versions that CrossRef can roll out. Note that client applications should not have dependencies on minor versions.

- -

Internal version increments are simply used to keep track of development versions of the API. They should never have any effect on client applications.

- -

Adding syntax options or metadata to representations will normally be backwards compatible and will thus normally only trigger minor version changes. Renaming or restructuring syntax options or metadata tends not to be backward compatible and will thus typically trigger major version changes.

- -

How to manage API versions

- -

If you need to tie your implementation to a specific major version of the API, you can do so by using content-negotiation and specifying the version of the API in the ACCEPT header as follows:

- -
 application/vnd.crossref-api-message+json; version=1.0
-
- -

Minor version numbers will be ignored in ACCEPT headers as they are by definition backwards compatible.

- -

If you omit a specific version in your ACCEPT header, the system will default to using the latest version of the API.
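
To make the versioning rules concrete, here is a small sketch that splits a version string into its three parts and builds an `ACCEPT` header value pinned to a major version. The names `ApiVersion`, `parse_version` and `accept_header` are hypothetical; only the mime-type and the `version=` parameter come from this document.

```python
from typing import NamedTuple, Optional

MIME_TYPE = "application/vnd.crossref-api-message+json"


class ApiVersion(NamedTuple):
    major: int
    minor: int
    internal: int


def parse_version(text: str) -> ApiVersion:
    """Split a "major.minor.internal" string into its numeric parts."""
    major, minor, internal = (int(part) for part in text.split("."))
    return ApiVersion(major, minor, internal)


def accept_header(major: Optional[int] = None) -> str:
    """Build an ACCEPT header value, optionally pinned to a major version."""
    if major is None:
        # No version given: the system defaults to the latest API version.
        return MIME_TYPE
    # Minor versions are ignored by the server, so only the major is pinned.
    return f"{MIME_TYPE}; version={major}.0"


version = parse_version("1.20.31")
print(accept_header(version.major))
```

Since minor and internal releases are described as backwards compatible, a client only needs to keep track of the major number.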

- -

Error messages

- -

There will be no errors, and therefore error messages will be unnecessary. But seriously… coming soon.

- - - - - -
-
-
    - -
  1. -

    Not implemented yet.  ↩

    -
  2. - -
-
- -
- - \ No newline at end of file diff --git a/funder_kpi_api.md b/funder_kpi_api.md deleted file mode 100644 index bdb35d9..0000000 --- a/funder_kpi_api.md +++ /dev/null @@ -1,293 +0,0 @@ -# CrossRef APIs to support key performance indicators (KPIs) for funding agencies - -## Version History - -- V1: 2013-09-08, first draft. -- V2: 2013-09-24, reference platform deployed -- v3: 2013-09-25, reworked filters. Added API versioning doc -- v4: 2013-09-25, more filter changes. -- v5: 2013-09-27, doc mime-type and message-type relationship -- v6: 2013-10-01, updated `sample` & added examples with filters -- v6: 2013-10-01, corrected warning date -- v7: 2013-10-02, fixed typos -- v8: 2013-10-17, updated warning. Added email address - -## Background - -See the document, [CrossRef metadata best practice to support key performance indicators (KPIs) for funding agencies](http://api.crossref.org/docs/funder_kpi_metadata_best_practice.html), for background. - -## Warning - -The API described here is in alpha. If you encounter problems with the API or the documentation, please report them to: - -![](http://labs.crossref.org/wp-content/uploads/2013/01/labs_email.png) - - -## Overview - -The API is generally RESTFUL and returns results in JSON. - -The API will only work for CrossRef DOIs. You can test the registration agency for a DOI using the following convention: - -`http://doi.crossref.org/doiRA/{doi}` - -So testing the following CrossRef DOI: - -`10.1037/0003-066X.59.1.29` - -Will return the following result: - - [ - { - DOI: "10.1037/0003-066X.59.1.29", - RA: "CrossRef" - } - ] - - -If you use any of the API calls listed below with a non-CrossRef DOI, you will get a `404` HTTP status response with the message "non-CrossRef DOI." - -## Results Overview - -All results are returned in JSON. 
There are two general types of results: - -- Singletons -- Lists - -The mime-type for API results is `application/vnd.crossref-api-message+json` - - -### Singletons - -Singletons are single results. Retrieving metadata for a specific identifier (e.g. DOI, ISSN, funder_identifier) typically returns in a singleton result. - -### Lists -Lists results can contain multiple entries. Searching or filtering typically returns a list result. A list has two parts: - -- Summary, which include the following information: - - status (e.g. "ok", error) - - message-type (e.g. "work-list" ) - - message-version (e.g. 1.0.0 ) -- Items, which will will contain the items matching the query or filter. - - -Note that the "message-type" returned will differ from the mime-type. There are six message-types: - -- funder (singleton) -- publisher (singleton) -- work (singleton) -- work-result-list (list) -- funder-result-list (list) -- publisher-result-list (list) - - -Normally, and API list result will return both the summary and the items. If you want to just retrieve the summary, you can do so by specifying that the number of rows returned should be zero. -### Sort order - -If the API call includes a query, then the sort order will be by the relevance score. If no query is included, then the sort order will be by DOI update date. - - -## Resource Components -Major resource components supported by the CrossRef API are: - -- works -- funders -- publishers - -These can be used alone like this - -| resource | description | -|:--------------|:----------------------------------| -| `/works` | returns a list of all works (journal articles, conference proceedings, books, components, etc), 20 per page. 
-| `/funders` | returns a list of all funders in the [FundRef Registry](http://www.crossref.org/fundref/fundref_registry.html) -| `/publishers` |r eturns a list of all publishers.| - - -### Resource components and identifiers -Resource components can be used in conjunction with identifiers to retrieve the metadata for that identifier. - -| resource | description | -|:----------------------------|:----------------------------------| -| `/works/{doi}` | returns metadata for the specified CrossRef DOI. | -| `/funders/{funder_id}` | returns metadata for specified funder **and** its suborganizations | -| `/publishers/{owner_prefix}` | returns metadata for the specified publisher. | - -### Combining resource components - -Resource components can be combined to narrow down selections. - -| resource | description | -|:----------------------------|:----------------------------------| -| `/works/{doi}/funders` | returns list of funders associated with the specified CrossRef `DOI` | -| `/funders/{funder_id}/works`| returns list of works associated with the specified `funder_id` | -| `/publishers/{owner_prefix}/works` | returns list of works associated with specified `owner_prefix` | - - -## Parameters - -Parameters can be used to query, filter and control the results returned by the CrossRef API. They can be passed as normal URI parameters or as JSON in the body of the request. 
- -| parameter | description | -|:-----------------------------|:----------------------------| -| `query` | limited [DisMax](https://wiki.apache.org/solr/DisMax) query terms | -| `filter={filter_name}:{value}`| filter results by specific fields | -| `rows={#}` | results per per page | -| `offset={#}` | result offset | -| `sample={#}` | return random N results | - - - - - -### Example query using URI parameters - - http://api.crossref.org/funders/100000015/works?query=electron+pairs&filter=has-orcid:true&rows=1 - -### Example query using JSON in body of GET request - -Note that if you include a body in your `GET` request, any URI parameters will be ignored. In short, you cannot mix URI parameters and JSON queries. - -To use the API using JSON, pass the JSON in the body of the HTTP GET request like this: - - curl -X GET -H "Content-Type: application/json" -d '{"query": "psychoceramics", "offset": 40, "rows": 20, "filter": {"has-orcid": true, "publisher": "10.5555"}}' http://api.crossref.org/works - -## Queries - -Queries support a subset of [DisMax](https://wiki.apache.org/solr/DisMax), so, for example you can refine queries as follows. - -**Works that include "renear" but not "ontologies"** - - http://api.crossref.org/works?query=renear+-ontologies - -or using JSON - - curl -X GET -H "Content-Type: application/json" -d '{"query": "renear -ontologies"}' http://api.crossref.org/works - - - -## Filter Names - -Filters allow you to narrow queries. All filter results are lists. The following filters are supported: - - -| filter | possible values | description| -|:-----------|:----------------|:-----------| -| `funder` | `{funder_id}` | metadata which include the `{funder_id}` in FundRef data | -| `publisher` | `{owner_prefix}` | metadata belongs to published identified by `{owner_prefix}` (e.g. 
`10.1016` ) | -| `from-update-date` | `{date}` | metadata updated since (inclusive) `{date}` | -| `until-update-date` | `{date}` | metadata updated before (inclusive) `{date}` | -| `from-pub-date` | `{date}` | metadata where published date is since (inclusive) `{date}` | -| `until-pub-date` | `{date}` | metadata where published date is before (inclusive) `{date}` | -| `has-license` | | metadata that includes any `` elements. | -| `license.url` | `{url}` | metadata where `` value equals `{url}` | -| `license.version` | `{string}` | metadata where the ``'s `applies_to` attribute is `{string}`| -| `license.delay` | `{integer}` | metadata where difference between publication date and the ``'s `start_date` attribute is <= `{integer}` (in days)| -| `has-full-text` | | metadata that includes any full text `` elements. | -| `full-text.version` | `{string}` | metadata where `` element's `content_version` attribute is `{string}`. | -| `full-text.type` | `{mime_type}` | metadata where `` element's `content_type` attribute is `{mime_type}` (e.g. `application/pdf`). | -| `public-references` | | metadata where publishers allow references to be distributed publically. | -| `has-archive` | | metadata which include name of archive partner[^*] | -| `archive` | `{string}` | metadata which where value of archive partner is `{string}`[^*] | -| `has-orcid` | | metadata which includes one or more ORCIDs | -| `orcid` | `{orcid}` | metadata where `` element's value = `{orcid}` | - -[^*]: Not implemented yet. - -### Notes on owner prefixes - -The prefix of a CrossRef DOI does **NOT** indicate who currently owns the DOI. It only reflects who originally registered the DOI. CrossRef metadata has an **owner_prefix** element that records the current owner of the CrossRef DOI in question. - -### Notes on dates - -Note that dates in filters should always be of the form `YYYY-MM-DD`. Also not that date information in CrossRef metadata can often be incomplete. 
So, for example, a publisher may only include the year and month of publication for a journal article. For a monograph they might just include the year. In these cases the API selects the earliest possible date given the information provided. So, for instance, if the publisher only provided 2013-02 as the published date, then the date would be treated as 2013-02-01. Similarly, if the publisher only provided the year 2013 as the date, it would be treated at 2013-01-01. - -## Result controls - -You can control the delivery and selection results using the `rows`, `offset` and `sample` parameters. - -### Rows - -Normally, results are returned 20 at a time. You can control the number of results returns by using the `rows` parameter. To limit results to 5, for example, you could do the following: - - http://api.crossref.org/works?query=allen+renear&rows=5 - -If you would just like to get the `summary` of the results, you can set the rows to 0 (zero). - - http://api.crossref.org/works?query=allen+renear&rows=0 - -The maximum number rows you can ask for in one query is `1000`. - -### Offset - -The number of returned items is controlled by the `rows` parameter, but you can select the `Nth` set of `rows` by using the `offset` parameter. So, for example, to select the second set of 5 results (i.e. results 6 through 10), you would do the following: - - http://api.crossref.org/works?query=allen+renear&rows=5&offset=2 - -### Sample - -Being able to select random results is useful for both testing and sampling. You can use the `sample` parameter to retrieve random results. So, for example, the following select 10 random works: - - http://api.crossref.org/works?sample=10 - -Note that when you use the `sample` parameter, the `rows` and `offset` parameters are ignored. 
- - -### Example Queries - -**All works published by owner prefix `10.1016` in January 2010** - - http://api.crossref.org/publishers/10.5555/works?filter=from-pub-date:2010-01,until-pub-date:2010-01 - -**All works funded by funder_id that have a CC-BY license** - - http://api.crossref.org/publishers/10.5555/works?filter=license.uri:http://creativecommons.org/licenses/by/3.0/deed.en_US - -**All works published by owner prefix 10.5555 in February 2015 that have a CC-BY license** - - http://api.crossref.org/publishers/10.5555/works?filter=license.uri:http://creativecommons.org/licenses/by/3.0/deed.en_US,from-pub-date:2015-02,until-pub-date:2015-02 - -**All works funded by `10.13039/100005235` where license = CC-BY and embargo <= 365 days** - - http://api.crossref.org/funders/10.13039/100005235/works?filter=license.uri:http://creativecommons.org/licenses/by/3.0/deed.en_US,license.max-embargo-days:365 - - -**All works funded by X where the archive partner listed = 'LOCKSS'** - -Coming soon. - - - -## Versioning - -In theory, the syntax of the API can vary independently of the result representations. In practice, major version changes in either will require changes to API clients and so versioning of the API will apply to both the API syntax and the result representation. - -The API uses a semantic versioning scheme whereby the version number is divided into three parts delimited by periods. The first number represents the "major" release number. The second represents a "minor" release number and the third represents an "internal" release number. - - Version 1.20.31 - ^ ^ ^ - | | | - major | | - minor | - internal - - **Major** version increments are defined as releases that can break backwards compatibility. CrossRef will only commit to supporting the latest two major releases simultaneously and legacy major releases will be supported for no more than nine months. 
Exceptions to these rules may be made when major releases are required to ensure the security or stability of the system. - -**Minor** version increments are defined as backwards compatible. There is no limit on the number of minor versions that CrossRef can roll out. Note that client applications should not have dependencies on minor versions. - -**Internal** version increments are simply used to keep track of development versions of the API. They should never have any effect on client applications. - -Adding syntax options or metadata to representations will normally be backwards compatible and will thus normally only trigger minor version changes. Renaming or restructuring syntax options or metadata tends not to be backward compatible and will thus typically trigger major version changes. - -### How to manage API versions - -If you need to tie your implementation to a specific major version of the API, you can do so by using content-negotiation and specifying the version of the API in the `ACCEPT` header as follows: - - application/vnd.crossref-api-message+json; version=1.0 - -Minor version numbers will be ignored in `ACCEPT` headers as they are by definition backwards compatible. - -If you omit a specific version in your `ACCEPT` header, the system will default to using the latest version of the API. - -## Error messages - -There will be no errors, and therefore error messages will be unnecessary. But seriously… coming soon. diff --git a/funder_kpi_metadata_best_practice.html b/funder_kpi_metadata_best_practice.html deleted file mode 100644 index 974badf..0000000 --- a/funder_kpi_metadata_best_practice.html +++ /dev/null @@ -1,514 +0,0 @@ - - - - - - funder_kpi_metadata_best_practice - - - - - -
-

CrossRef metadata best practice to support key performance indicators (KPIs) for funding agencies

- -

Version History

- -
    -
  • V01: 2013–09–08, first draft.
  • -
  • V02: 2013–09–09, add examples + links.
  • -
  • V03: 2013–09–10, adjust title. Correct typos.
  • -
  • V04: 2013–09–12, changed AAM to AM.
  • -
  • V05: 2013–09–18, incorporated corrections, suggestions from D. Shotton.
  • -
  • V06: 2013–09–23, added <free-to-read> element info. Updated warning.
  • -
  • V07: 2013–09–24, emphasize that publishers must deposit funder identifiers, when they exist in the FundRef Registry.
  • -
  • V08: 2013–11–04, Added FAQ about schema interpretation and usage
  • -
  • V09: 2013–12–02, Added XML deposit examples
  • -
  • v10: 2013–12–03, Updated <free-to-read/> element documentation to reflect latest NISO work. Added information about licensing CrossRef metadata to FAQ (hint, no license required for free APIs). Added labs email address. Changed formatting.
  • -
- -

Warning

- -

As of 2013–12–03 the <free-to-read> element has not yet been incorporated into the CrossRef deposit schema.

- -

If you encounter problems with the API or the documentation, please report them to:

- -
-
- - - -

Background

- -

Funding agencies and publishers are interested in being able to measure Key Performance Indicators (KPIs) related to mandates such as the February 22nd OSTP memo on Public Access to the Results of Federally Funded Research. CrossRef is extending its FundRef Application Programming Interfaces (APIs) to enable funding agencies and publishers to query CrossRef metadata in support of generating such KPIs. Organisations such as CHORUS and SHARE can make use of these APIs in order to create KPI Dashboards measuring, amongst other things:

- -
    -
  1. Publications relating to research funded by particular agencies.
  2. -
  3. The licenses under which said publications have been released.
  4. -
  5. The location of the full text of the Best Available Version (BAV) for said publications for both reading and Text & Data Mining (TDM) applications.
  6. -
  7. The long-term preservation arrangements that have been made for the VOR of said publications.
  8. -
- -

The CrossRef extended APIs, of course, will only work if publishers supply the appropriate metadata. This document outlines the metadata that publishers will need to provide in order to support such KPI reporting.

- -

Conventions

- -

Although this document is not an RFC, it will follow the conventions of rfc2119 in the use of the following terms:

- -
    -
  1. must - This word, or the terms “REQUIRED” or “SHALL”, mean that the definition is an absolute requirement for meeting best practice.
  2. -
  3. must not - This phrase, or the phrase “SHALL NOT”, mean that the definition is an absolute prohibition for meeting best practice.
  4. -
  5. should - This word, or the adjective “RECOMMENDED”, mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.
  6. -
  7. should not - This phrase, or the phrase “NOT RECOMMENDED”, mean that there may exist valid reasons in particular circumstances when the particular behavior is acceptable or even useful, but the full implications should be understood and the case carefully weighed before implementing any behavior described with this label.
  8. -
- -

Summary

- -

In order to support basic agency and publisher KPIs:

- -
    -
  • Publishers must record funder information in their CrossRef deposits
  • -
  • Publishers must deposit the FundRef funder identifiers corresponding to their funder names where these exist in the FundRef Registry
  • -
  • Publishers should record award numbers when possible.
  • -
  • The Publisher should record funding information within CrossMark records if they are either implementing CrossMark or are planning to implement CrossMark within the next two years.
  • -
  • Publishers should record licensing information if they have it by means of a URI specifying the license under which the publication is made.
  • -
  • If publishers do not have licensing information, they should record a placeholder URI and fill in the target of the URI once they have agreed on licensing information.
  • -
  • Publishers should record full text links to the readable version(s) of the document. This may include different resources for the Version of Record (VOR) and Author Accepted Manuscript (AM).
  • -
  • Publishers should record full text links to representations of the document that are made available for TDM. These may be the same or different to the “readable” versions of the document pointed to above.
  • -
  • Where they are recording multiple versions of the document (e.g. AM & VOR), the publisher should map licensing information to the specific resource versions.
  • -
  • Publishers should record full text links to archived versions of the document identified by the CrossRef DOI.
  • -
- -

In order to enhance the utility of CrossRef metadata to agencies and in order to enable more sophisticated agency/publisher KPIs:

- -
    -
  • Publishers should consider participating in CrossMark in order to record updates, errata, corrigenda, retractions and withdrawals.
  • -
  • Publishers should consider depositing abstracts using CrossRef’s JATS abstract element support.
  • -
  • Publishers should consider collecting and depositing ORCIDs for publication authors.
  • -
  • Publishers should consider making the bibliographic metadata and references for documents resulting from agency funding maximally available by overriding CrossRef opt-outs using the <metadata_distribution_opt> and <reference_distribution_opt> elements.
  • -
- -

Funding information

- -

CrossRef supports the recording of funding information for a publication via its FundRef program. FundRef defines an open, standard registry of funder names and funder identifiers that can be used in order to increase the accuracy of the funding information recorded. Although FundRef supports recording award_numbers along with funder identifiers, FundRef does not define standards for recording award numbers as practice varies greatly across funders.

- -

To support funder KPIs, members must deposit funder metadata using the specifications defined for the FundRef program. Specifically, when depositing metadata you:

- -
    -
  1. must include funder information.
  2. -
  3. must not deposit your funder names without at least trying to map them to FundRef identifiers in the FundRef registry. Depositing funder names that are included in the FundRef registry, but without their respective FundRef Funder Identifiers, will pollute the FundRef metadata and lower the value of the service for all participants. Note that the KPI APIs will only work for Funder metadata that includes FundRef Funder Identifiers.
  4. -
  5. should include award numbers in FundRef metadata when possible. Although the standard KPI API does not make direct use of award numbers, individual agencies may be able to make use of included award numbers where found.
  6. -
  7. should deposit FundRef data as part of a CrossMark record if you (the publisher) already are (or are planning to become) a participant in CrossMark. There are two reasons for this: First, it ensures that the Funder Metadata is available both in a standard machine readable format AND via a standard UI for readers. Second, it ensures that the Funder metadata is made maximally reusable via a CC Zero license waiver. Note that publishers do not need to have implemented CrossMark yet to deposit Funder metadata via CrossMark. We expect that publishers may take a year or more before they have fully implemented all of CrossMark’s features.
  8. -
- -

See CrossRef’s Help pages for Technical details on depositing FundRef metadata.

- -

License information

- -

One of the main drivers for the FundRef KPI API is that many funders are required to report on the public availability of the results and publications arising from funder-financed research. Funders are therefore interested in understanding how publications related to funded research are licensed.

- -

To deposit license information, publishers must use the <license_ref> element. The value of the <license_ref> element must be a stable HTTP URI which points to a human-readable document that either includes (or guides the reader to) any copyright and/or licensing information related to the CrossRef DOI of the content. The URI must point either to a location on the publisher’s site or to the stable location of any well-known licenses such as those of the Creative Commons.

- -

Note that it is entirely acceptable to record a <license_ref> URI as a “placeholder.” If you are still working out specific licensing terms, the URI you record should point to a blank page or even a simple re-assertion of the document’s copyright. There is a big difference between recording at least some <license_ref> URI and recording no <license_ref> URI at all. The former indicates an intent to eventually clarify licensing information, whereas the latter indicates that the licensing information is likely to remain ambiguous.

- -

Use of the <license_ref> element is best explained through examples.

- -

The <license_ref> for content licensed under the popular CC-BY license, would look like this:

- -
<license_ref>http://creativecommons.org/licenses/by/3.0/deed.en_US</license_ref>
-
- -

Whereas the Journal of Psychoceramics might record that its content is licensed under a proprietary license like this:

- -
<license_ref>http://www.psychoceramics.org/license_v1.html</license_ref>
-
- -

You can deposit multiple <license_ref> elements, so the following would indicate that a document was available under a dual license (e.g. one for commercial applications and one for non-commercial applications).

- -
<license_ref>http://www.psychoceramics.org/non_commercial_license_v1.html</license_ref>
-<license_ref>http://www.psychoceramics.org/commercial_license_v1.html</license_ref>
-
- -

Embargos

- -

Publishers may want to record that a document is under embargo. In other words, that it is available under access control and a proprietary license for a set period of time, after which it is available under an open license. Publishers wishing to record embargoes should use the optional start_date attribute on the <license_ref> element.

- -

For example, the following records that the content is under a proprietary license from its date of publication on February 3, 2014 and that it is under a CC-BY license a year later on February 3, 2015:

- -
<license_ref start_date="2014-02-03">http://www.psychoceramics.org/license_v1.html</license_ref>
-<license_ref start_date="2015-02-03">http://creativecommons.org/licenses/by/3.0/deed.en_US</license_ref>
-
- -

Note that the value of the start_date attribute must be recorded using the format YYYY-MM-DD. The start_date attribute can be combined with multiple <license_ref> elements to indicate that a document is under a proprietary license during an embargo, but that it is then under a dual (commercial/non-commercial) license a year later:

- -
<license_ref start_date="2014-02-03">http://www.psychoceramics.org/license_v1.html</license_ref>
-
-<license_ref start_date="2015-02-03">http://www.psychoceramics.org/non_commercial_license_v1.html</license_ref>
-
-<license_ref start_date="2015-02-03">http://www.psychoceramics.org/commercial_license_v1.html</license_ref>
-
- -

Note that there is no corresponding end_date attribute for the <license_ref> element. This is because including end dates could introduce ambiguities. For example:

- -
    -
  • Open Licenses, such as CC, do not have “end dates”.
  • -
  • With end dates, it would be possible to inadvertently record “gaps” between licenses.
  • -
- -

You might ask why one should record a license that starts in the future? Wouldn’t it be better to just update the <license_ref> element at the time the license changes? By recording that another license takes effect in the future, you are informing the consumer of the metadata that the current restricted license is only for the embargo period. In short, you are recording the intent to change the license when the embargo is done. Furthermore, providing additional metadata for a current publication at some future date is an additional chore for the publisher that might well be overlooked.
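
The embargo logic above can be sketched as a lookup of the licenses in effect on a given day. This is a minimal sketch under stated assumptions, not CrossRef's implementation: the function name `licenses_in_effect` and the tuple representation are hypothetical, a missing start_date is treated as the publication date, and the sketch assumes that the most recently started license(s) supersede earlier ones (which is how the embargo examples read, but is not spelled out as a rule). It ignores the applies_to attribute.

```python
from datetime import date
from typing import List, Optional, Tuple

# One (start_date or None, license URI) pair per <license_ref> element.
LicenseRef = Tuple[Optional[date], str]


def licenses_in_effect(on: date, published: date,
                       refs: List[LicenseRef]) -> List[str]:
    """Return the license URIs assumed to be in effect on `on`."""
    # A license_ref without a start_date is treated as starting at publication.
    started = [(start or published, uri) for start, uri in refs
               if (start or published) <= on]
    if not started:
        return []
    # Assumption: the latest start date wins; ties model dual licensing.
    latest = max(start for start, _ in started)
    return [uri for start, uri in started if start == latest]
```

With the dual-license embargo example above, a date during the embargo yields only the proprietary license URI, and a date on or after 2015-02-03 yields both the commercial and non-commercial license URIs.
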

- -

In the above examples, the <license_ref> element is unqualified and should therefore be considered to apply to the content pointed to by any <resource> URIs included in the CrossRef metadata. The CrossRef metadata schema supports recording different licenses for different versions of the resource and this will be discussed below. However, first let’s look at the role the <resource> element plays in providing funding agency KPIs.

- -

Recording links to full text and/or archived versions of documents, etc.

- -

Funders are not just interested in reporting on the licensing terms of publications resulting from funder-financed research. They are also interested in making sure that the full text content of the BAV is made available for reading, automated processing and archiving.

- -

To this end, publishers need to be able to record links to the full text of the content to which a DOI refers. Additionally, publishers will want to offer different versions (e.g. AM or VOR) and different representations (e.g. PDF for viewing, XML for TDM, etc.) of the content tailored for specific applications.

- -

The <resource> element in CrossRef metadata is most often used to record an HTTP URI pointing at the publisher’s landing page for the publication identified by the CrossRef DOI in question. However, the CrossRef schema has long supported the recording of multiple <resource> elements in order to enable, for example:

- -
    -
  • Multiple resolution
  • -
  • Search engine indexing
  • -
  • CrossCheck indexing
  • -
- -

CrossRef has extended the ability to record multiple <resource> elements in order to allow the recording of URIs which point to the full text of content identified by the CrossRef DOI. The publisher can record multiple representations of the full text (e.g. PDF, XML, plain text) using the new mime_type attribute and then, through their access control systems, control who is able to reach which representation and under which conditions.

- -

Note that, by recording a <resource> that points to the full text, you are not necessarily guaranteeing that the URI will be accessible.

- -

Note also that the publisher could theoretically choose only to deposit <resource> elements for full text representations once an embargo has ended. However, this approach may prove fraught, as any mistakes or delays in the redeposit process might lead the funding agency to believe that the publisher has not made the relevant content accessible at the end of the embargo period.

- -

Further detail on using the <resource> element for recording links to full text can be found on the Prospect support site and in the CrossRef deposit schema documentation for the <collection> and <resource> elements.

- -

Different licenses for different versions of the content

- -

Some publishers may want to record different licenses for different versions of the <resource> element recorded in CrossRef metadata. For example, one <resource> element may point to a URI intended for subscribed readers, while another <resource> element may point to a version of the document intended for Text and Data Mining (TDM) applications. Similarly, a publisher may choose to apply one license to the “Author Accepted Manuscript” (AM) and another to the “Version of Record” (VOR).

- -

To accommodate these scenarios, the <license_ref> element supports an applies_to attribute. Similarly, the <resource> element has been extended to support a content_version attribute. Publishers can use these element/attribute combinations to apply specific licenses to specific versions of the resource. For example, to indicate the “VOR” version of a document is licensed under a proprietary license, but that the “AM” version of the same document is licensed under an open license, the <license_ref> and <resource> elements could be combined like this:

- -
<license_ref applies_to="vor">http://www.psychoceramics.org/license_v1.html</license_ref>
-
-<!-- … -->
-
-<license_ref applies_to="am">http://creativecommons.org/licenses/by/3.0/deed.en_US</license_ref>
-
-<!--- other CrossRef metadata -->
-
-<resource content_version="vor">http://www.psychoceramics.org/fulltext/vor/10.5555/12345678</resource>
-
-<!-- … -->
-
-<resource content_version="am">http://www.psychoceramics.org/fulltext/am/10.5555/12345678</resource>
-
- -

The <license_ref> and <resource> elements along with their respective start_date, applies_to, and content_version attributes can all be combined to support more complex assertions. So, for example, the following says that a document is only available under a proprietary license for readers during an embargo period, but is then available to the public for reading under a more open license and for non-commercial TDM applications under a specific TDM license:

- -
<license_ref start_date="2014-02-03" applies_to="vor">http://www.psychoceramics.org/license_v1.html</license_ref>
-
-<!-- … -->
-
-<license_ref start_date="2015-02-03" applies_to="am">http://www.psychoceramics.org/open_license.html</license_ref>
-
-<!-- … -->
-
-<license_ref start_date="2015-02-03" applies_to="tdm">http://www.psychoceramics.org/nc_tdm_license.html</license_ref>
-
-<!--- other CrossRef Metadata -->
-
-<resource content_version="vor">http://www.psychoceramics.org/fulltext/vor/10.5555/12345678</resource>
-
-<!-- … -->
-
-<resource content_version="am">http://www.psychoceramics.org/fulltext/am/10.5555/12345678</resource>
-
-<resource content_version="tdm">http://www.psychoceramics.org/fulltext/tdm/10.5555/12345678.xml</resource>
-
- -

Detailed information on recording licensing information in CrossRef metadata can be found in the CrossRef schema documentation for the <license_ref> element.

- -

“Libre” vs “Gratis”

- -

The license information recorded in the <license_ref> element can tell you what you are allowed to do with the resources the licenses point to, but it does not say anything about whether or not there is a monetary charge involved. In order to allow a publisher to record whether access to the content requires payment, CrossRef supports a new <free-to-read/> element. The <free-to-read/> element is an empty element. It can include two attributes, a start_date and an end_date. The <free-to-read/> element works as follows:

- -
    -
  • The presence of a <free-to-read/> element in CrossRef metadata should be interpreted to mean that the full text content pointed to by the DOI resource is available “gratis” during the time period specified by the start_date and end_date attributes.
  • -
  • If the element only includes a start_date attribute, then the element should be interpreted to mean that the content pointed to by the DOI resource will be made gratis from start_date on.
  • -
  • If the element only includes an end_date attribute, then the element should be interpreted to mean that the content pointed to by the DOI resource will be made gratis from the publication date to and including the end_date.
  • -
  • If the element has no start_date or end_date attributes, then the element should be interpreted to mean that the content pointed to by the DOI resource is available “gratis” from the date of publication on.
  • -
  • If the element is not present in the DOI record, one should not assume that the resource pointed at by the DOI is available to read “gratis”.
  • -
- -
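
The interpretation rules in the list above can be sketched as a single predicate. This is a minimal sketch, not a CrossRef API: the function name `is_gratis` and its parameters are hypothetical, and it assumes that both start_date and end_date are inclusive (the text only states inclusivity explicitly for end_date).

```python
from datetime import date
from typing import Optional


def is_gratis(on: date, published: date, has_free_to_read: bool,
              start_date: Optional[date] = None,
              end_date: Optional[date] = None) -> bool:
    """Return True if the rules above say the content is gratis on `on`."""
    if not has_free_to_read:
        # No <free-to-read/> element: do not assume the content is gratis.
        return False
    # A missing start_date means "from the date of publication on".
    effective_start = start_date if start_date is not None else published
    if on < effective_start:
        return False
    # A missing end_date means the gratis period is open-ended.
    return end_date is None or on <= end_date
```

For example, an element with only a start_date of 2015-02-03 makes the content gratis from that day on, while a record without the element is never reported as gratis.
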

When the <free-to-read> element is combined with the <license_ref> element, the publisher can record sophisticated information about the availability and reusability of content. For example:

- -

restrictive licenses and possibly a payment

- -
<license_ref>http://tinypublisher.org/licenses/proprietary.html</license_ref>
-
- -

restrictive licenses and no payment (e.g. free copy of an article from a subscription journal)

- -
<free-to-read/>
-<!-- … -->
-<license_ref>http://tinypublisher.org/licenses/proprietary.html</license_ref>
-
- -

unrestrictive licenses and possibly a payment (e.g. a CC-BY licensed novel for sale on Amazon)

- -
<license_ref>http://creativecommons.org/licenses/by/3.0/deed.en_US</license_ref>
-
- -

unrestrictive licenses and no payment

- -
<free-to-read/>
-<!-- … -->
-<license_ref>http://creativecommons.org/licenses/by/3.0/deed.en_US</license_ref>
-
- -

Bonus points

- -

The more metadata that publishers record for publications arising from agency funded research, the more useful that metadata will be to said agencies and the more value they will see from publishers. Whereas the above sections detail metadata elements that agencies will expect in order to be able to compile basic KPIs and offer portal services, additional metadata will allow agencies to create even more sophisticated KPIs and services. As such, publishers should seriously consider depositing the following additional metadata elements in their CrossRef deposits.

- -

Distributing standard bibliographic metadata

- -

Metadata deposited to CrossRef is made available freely via numerous CrossRef query APIs. However all deposited metadata is subject to opt-outs in the case of bulk distribution APIs and data dumps. In order to make sure that bibliographic metadata for publications arising from agency funding is maximally available, publishers should consider setting the value of the <metadata_distribution_opts> element for DOIs to any. Further details can be found in CrossRef’s schema documentation for the <metadata_distribution_opts> element.

- -

Distributing references

- -

References made in publications arising from agency funding can provide agencies with an overview of what literature is considered important in the fields that they fund. Many publishers deposit references to CrossRef as part of their participation in CrossRef’s CitedBy service. However, participation in CitedBy does not automatically make references available via CrossRef’s standard APIs. In order for publishers to distribute references along with standard bibliographic metadata, publishers need to set the <reference_distribution_opt> element to any for each DOI deposit where they want to make references openly available. By setting this element, references for the DOI will be distributed without restriction through all of CrossRef’s APIs and bulk metadata dumps. Further details can be found in CrossRef’s schema documentation for the <reference_distribution_opt> element.

- -

CrossMark

- -

CrossMark provides a standard mechanism for alerting researchers to updates to published documents, including corrections, errata, corrigenda, retractions and withdrawals. Use of the CrossMark service sends a signal to researchers and agencies that publishers are committed to maintaining the integrity of the scholarly record.

- -

Additionally, CrossMark provides a standard, cross-publisher user interface that researchers can use to view FundRef information and licensing information. This user interface works both from publisher landing pages and from published PDFs. More information can be found on the CrossMark support site.

- -

Abstracts

- -

Many funding agencies are interested in building custom portals that highlight agency-funded research. In order to provide users of these portals with the best experience, agencies will want, where possible, to display abstracts of publications along with their standard bibliographic metadata.

- -

CrossRef supports the deposit of abstracts conforming to the JATS abstract element. Further details can be found in the CrossRef Schema Documentation of the <abstract> element.

- -

ORCIDs

- -

ORCIDs are unique identifiers for researchers. CrossRef supports the deposit of ORCIDs for authors. The presence of ORCIDs in CrossRef metadata will, in turn, allow agencies to tie agency funded research publications directly to researchers. Widespread use of ORCIDs in CrossRef deposits could even let agencies start to develop publication KPIs for researchers that they fund. Further details on CrossRef’s ORCID support can be found in the CrossRef Schema Documentation of the <ORCID> element.

- -

Frequently Asked Questions

- -

Q: What license applies to the metadata retrieved by the CrossRef APIs to support key performance indicators (KPIs) for funding agencies? -
-A: CrossRef asserts no claims of ownership to individual items of bibliographic metadata and associated Digital Object Identifiers (DOIs) acquired through the use of the CrossRef Free Services. Individual items of bibliographic metadata and associated DOIs may be cached and incorporated into the user’s content and systems. More information can be found on our web site.

- -

Q: What does it mean if a <license_ref> element has no start_date attribute? -
-A: This should be interpreted to mean that the <license_ref> applies from the earliest publication date.

- -

Q: What does it mean if there is no applies_to attribute for the <license_ref> element? -
-A: This should be interpreted to mean that the license_ref applies to all the <resource> elements in the record.

- -

Q: What does it mean if the <resource> element doesn’t have a content_version attribute? -
-A: This should be interpreted to mean that any <resource> elements point to the version of record (‘vor’).

- -

Q: What does it mean if there is no correspondence between existing <license_ref> applies_to attributes and existing <resource> content_version attributes? -
-A: This probably means the publisher made a mistake depositing the metadata.
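
The defaults described in the answers above can be sketched as a matching rule between <license_ref> and <resource> elements. This is an illustrative sketch only: the function name `licenses_for_resource` and the tuple representation are hypothetical, and start_date handling is deliberately left out.

```python
from typing import List, Optional, Tuple

# One (applies_to or None, license URI) pair per <license_ref> element.
LicenseRef = Tuple[Optional[str], str]


def licenses_for_resource(content_version: Optional[str],
                          license_refs: List[LicenseRef]) -> List[str]:
    """Return the license URIs that apply to one <resource> element."""
    # A <resource> without content_version is read as the version of record.
    version = content_version or "vor"
    # A <license_ref> without applies_to applies to every <resource>.
    return [uri for applies_to, uri in license_refs
            if applies_to is None or applies_to == version]
```

An empty result for a resource whose content_version has no matching applies_to corresponds to the last case above: the deposit probably contains a mistake.
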

- -

XML Deposit Examples

- -

Full Deposits

- -

Full deposits use the standard deposit schema.

- - - -

Partial Deposits

- -

Partial deposits use the resource deposit schema.

- -

Partial deposits update only part of a DOI’s metadata. In the CrossRef help system they are referred to as resource deposits, but it is not just resources that can be provided as a partial deposit. Licenses, funding information and CrossMarks can also be provided as partial deposits.

- -

Many partial deposits can be provided in a single batch deposit. The <body> element can contain any number of partial deposits of any type, some of which may be partial deposits for the same DOI. For example, two partial deposits could be provided for the same DOI, one updating funding information, the other updating license information.

- - - - - - -
- - \ No newline at end of file diff --git a/funder_kpi_metadata_best_practice.md b/funder_kpi_metadata_best_practice.md index b8a5a01..6c4c0d0 100644 --- a/funder_kpi_metadata_best_practice.md +++ b/funder_kpi_metadata_best_practice.md @@ -1,7 +1,41 @@ -# CrossRef metadata best practice to support key performance indicators (KPIs) for funding agencies - - -## Version History +# Using Crossref metadata to enable auditing of conformance to funder mandates: A Guide for publishers + +## Table of Contents + + + +- [Using Crossref metadata to enable auditing of conformance to funder mandates: A Guide for publishers](#using-crossref-metadata-to-enable-auditing-of-conformance-to-funder-mandates-a-guide-for-publishers) + - [Table of Contents](#table-of-contents) + - [Version History](#version-history) + - [Contact info](#contact-info) + - [Background](#background) + - [Conventions](#conventions) + - [Summary](#summary) + - [Funding information](#funding-information) + - [License information](#license-information) + - [Embargoes](#embargoes) + - [Recording links to full text and/or archived versions of documents, etc.](#recording-links-to-full-text-andor-archived-versions-of-documents-etc) + - [Different licenses for different versions of the content](#different-licenses-for-different-versions-of-the-content) + - ["Libre" vs "Gratis"](#libre-vs-gratis) + - [Recording third party archive arrangements](#recording-third-party-archive-arrangements) + - [Assigning and registering DOIs at acceptance](#assigning-and-registering-dois-at-acceptance) + - [Assigning and registering DOIs for manuscripts that the publisher *has* made avaialble online](#assigning-and-registering-dois-for-manuscripts-that-the-publisher-has-made-avaialble-online) + - [Assigning and registering DOIs for manuscripts that the publisher *has not yet* made available online](#assigning-and-registering-dois-for-manuscripts-that-the-publisher-has-not-yet-made-available-online) + - [Bonus points](#bonus-points) + - 
[Distributing standard bibliographic metadata](#distributing-standard-bibliographic-metadata) + - [Distributing references](#distributing-references) + - [CrossMark](#crossmark) + - [Abstracts](#abstracts) + - [ORCIDs](#orcids) + - [Frequently Asked Questions](#frequently-asked-questions) + - [XML Deposit Examples](#xml-deposit-examples) + - [Full Deposits](#full-deposits) + - [Partial Deposits](#partial-deposits) + - [Registered Content Deposits](#registered-content-deposits) + + + +## Document Version History - V01: 2013-09-08, first draft. - V02: 2013-09-09, add examples + links. @@ -14,24 +48,35 @@ - V09: 2013-12-02, Added XML deposit examples - v10: 2013-12-03, Updated `` element documentation to reflect latest NISO work. Added information about licensing CrossRef metadata to FAQ (hint, no license required for free APIs). Added labs email address. Changed formatting. - v11: 2013-12-11, Added third party archive arrangements section. Updated examples to include archive locations. +- v12: 2015-11-27, Revisions to describe deposit workflow to support alerting funders/institutions when content has been "accepted". Pointers to latest schemas. General cleanup. +- v13: 2016-03-22, Updated mentions of "FunRef" to "Funding Data" and "Open Funder Registry" as appropriate. Updated "CrossRef" to "Crossref." Updated section on registered content to indicate impending finalization of process. +- v14: 2017-07-20, Cleanup links. -## Warning -As of 2013–12–03 the `` element has not yet been incorporated into the CrossRef deposit schema. - -If you encounter problems with the API or the documentation, please report them to: +## Contact info -![](http://labs.crossref.org/wp-content/uploads/2013/01/labs_email.png) +If you encounter problems with the API or the documentation, please report them using our [issue tracker](https://github.com/CrossRef/rest-api-doc/issues). 
## Background -Funding agencies and publishers are interested in being able to measure Key Performance Indicators (KPIs) related to mandates such the February 22nd OSTP memo on *[Public Access to the Results of Federally Funded Research](http://www.whitehouse.gov/blog/2013/02/22/expanding-public-access-results-federally-funded-research)*. -CrossRef is extending its FundRef Application Programming Interfaces (APIs) to enable funding agencies and publishers to query CrossRef metadata in support of generating such KPIs. Organisations such as *[CHORUS](http://publishers.org/press/107/)* and *[SHARE](http://www.arl.org/news/arl-news/2773-shared-access-research-ecosystem-proposed-by-aau-aplu-arl)* can make use of these APIs in order to create KPI Dashboards measuring, amongst other things: -1. Publications relating to research funded by particular agencies. -2. The licenses under which said publications have been released. -3. The location of the full text of the Best Available Version (BAV) for said publications for both reading and Text & Data Mining (TDM) applications. -4. The long-term preservation arrangements that have been made for the VOR of said publications. +Funders are increasingly setting mandates around publications that result from research they have funded. The mandates include specifications about licenses, embargoes, and notifications of publication acceptance and/or publication. This poses logistical problems for all the parties involved. Funders will need a way to track outputs from thousands of publishers. Publishers will need a standard and efficient way to demonstrate conformance to the mandates. All the stakeholders in the process (funders, publishers, institutions and researchers) will span disciplines, institutions, geographies and jurisdictions. Crossref was set up specifically to deal with these sorts of multiple bilateral relationships. 
+ +Crossref has extended its metadata schemas and Application Programming Interfaces (APIs) to enable funding agencies, institutions and publishers to use Crossref as a metadata source that can be used to track research that is subject to these mandates and to ensure that said research is being disseminated according to the requirements of the mandates. + +Funders, institutions, publishers and third parties providing research information management tools (e.g. *[CHORUS](https://www.chorusaccess.org/)*, *[SHARE](http://www.arl.org/news/arl-news/2773-shared-access-research-ecosystem-proposed-by-aau-aplu-arl)*, *[Symplectic](http://symplectic.co.uk)*) can make use of Crossref APIs and metadata in order to identify: -The CrossRef extended APIs, of course, will only work if publishers supply the appropriate metadata. This document outlines the metadata that publishers will need to provide in order to support such KPI reporting. +- Publications relating to research supported by particular funders. +- Publications from particular researchers identified by their ORCID ID. +- The bibliographic metadata for said publications. +- The licenses under which said publications have been released. +- Any embargoes applied to said publications. +- The location of the full text of the Best Available Version (BAV) for said publications for both reading and text & data mining (TDM) applications. +- The long-term preservation arrangements that have been made for the VOR of said publications. +- The ORCIDs associated with those publications. +- Any updates (errata, corrigenda, retractions) applied to said publications. + +This data can be propagated by publishers at any time after publication acceptance, according to the requirements of the relevant mandates. + +The Crossref extended APIs, of course, will only work if publishers supply the appropriate metadata and follow the specified metadata deposit workflows. 
This document outlines the metadata that publishers will need to provide and the metadata deposit workflows they will need to support in order to advertise their conformance to funder mandates. ## Conventions @@ -41,50 +86,54 @@ Although this document is not an RFC, it will follow the conventions of __[rfc21 2. __must not__ - This phrase, or the phrase "SHALL NOT", mean that the definition is an absolute prohibition for meeting best practice. 3. __should__ - This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course. 4. __should not__ - This phrase, or the phrase "NOT RECOMMENDED", mean that there may exist valid reasons in particular circumstances when the particular behavior is acceptable or even useful, but the full implications should be understood and the case carefully weighed before implementing any behavior described with this label. +5. __may__ - This word, or the adjective "OPTIONAL", mean that an item is truly optional. One vendor may choose to include the item because a particular marketplace requires it or because the vendor feels that it enhances the product while another vendor may omit the same item. An implementation which does not include a particular option MUST be prepared to interoperate with another implementation which does include the option, though perhaps with reduced functionality. In the same vein an implementation which does include a particular option MUST be prepared to interoperate with another implementation which does not include the option (except, of course, for the feature the option provides.) 
## Summary -In order to support basic agency and publisher KPIs: +In order to advertise conformance to funder mandates, Crossref members: -- Publishers __must__ record funder information in their CrossRef deposits -- Publishers __must__ deposit the FundRef funder identifiers corresponding to their funder names where these exist in the FundRef Registry -- Publishers __should__ record award numbers when possible. -- The Publisher __should__ record funding information within CrossMark records __if__ they are either implementing CrossMark or are planning to implement CrossMark within the next two years. -- Publishers __should__ record licensing information if they have it by means of a URI specifying the license under which the publication is made. +- __must__ record funder information in their Crossref deposits +- __must__ deposit the funder identifiers corresponding to their funder names where these exist in the Open Funder Registry +- __should__ record award numbers when possible. +- __should__ record Funding Data within CrossMark records __if__ they are either implementing CrossMark or are planning to implement CrossMark within the next two years. +- __should__ record licensing information if they have it by means of a URI specifying the license under which the publication is made. - If publishers do not have licensing information, they __should__ record a placeholder URI and fill in the target of the URI once they have agreed on licensing information. -- Publishers __should__ record full text links to the readable version(s) of the document. This may include different resources for the Version of Record (VOR) and Author Accepted Manuscript (AM). -- Publishers __should__ record full text links to representations of the document that are made available for TDM. These may be the same or different to the "readable" versions of the document pointed to above. +- __should__ record full text links to the readable version(s) of the document. 
This may include different resources for the Version of Record (VOR) and Author Accepted Manuscript (AM). +- __should__ record full text links to representations of the document that are made available for TDM. These may be the same or different to the "readable" versions of the document pointed to above. - Where they are recording multiple versions of the document (e.g. AM & VOR), the publisher __should__ map licensing information to the specific resource versions. -- Publishers __should__ record full text links to archived versions of the document identified by the CrossRef DOI. -- Publishers __should__ record archive arrangements made with third party archiving organizations where the document identified by the CrossRef DOI is archived with the third party. +- __should__ record full text links to archived versions of the document identified by the Crossref DOI. +- __should__ record archive arrangements made with third party archiving organizations where the document identified by the Crossref DOI is archived with the third party. + +In order to enhance the utility of Crossref metadata to funders and in order to enable more sophisticated funder/publisher KPIs, Crossref members: + +- __should__ consider participating in CrossMark in order to record updates, errata, corrigenda,retractions and withdrawals. +- __should__ consider depositing abstracts using Crossref's JATS abstract element support. +- __should__ consider collecting and depositing ORCIDs for publication authors. +- __should__ consider making the bibliographic metadata and references for documents resulting from agency funding maximally available by overriding Crossref opt-outs using the `` and `` elements. 
-In order to enhance the utility of CrossRef metadata to agencies and in order to enable more sophisticated agency/publisher KPIs: +In order to alert funders of relevant publications as soon as possible, Crossref members: -- Publishers __should__ consider participating in CrossMark in order to record updates, errata, corrigenda,retractions and withdrawals. -- Publishers __should__ consider depositing abstracts using CrossRef's JATS abstract element support. -- Publishers __should__ consider collecting and depositing ORCIDs for publication authors. -- Publishers __should__ consider making the bibliographic metadata and references for documents resulting from agency funding maximally available by overriding CrossRef opt-outs using the `` and `` elements. +- __should__ consider assigning and registering DOIs at acceptance - ## Funding information -CrossRef supports the recording of funding information for a publication via its [FundRef](http://www.crossref.org/fundref/) program. FundRef defines an open, standard [registry of funder names and funder identifiers](http://www.crossref.org/fundref/fundref_registry.html) that can be used in order to increase the accuracy of the funding information recorded. Although FundRef supports recording award_numbers along with funder identifiers, FundRef does __not__ define standards for recording award numbers as practice varies greatly across funders. +Crossref supports the recording of funding information for a publication via its [Funding Data](http://www.crossref.org/fundingdata/index.html) program. The Open Funder Registry defines an open, standard [registry of funder names and funder identifiers](https://www.crossref.org/services/funder-registry/) that can be used in order to increase the accuracy of the funding information recorded. Although Funding Data supports recording award_numbers along with funder identifiers, Crossref does __not__ define standards for recording award numbers as practice varies greatly across funders. 
-To support funder KPIs, members __must__ deposits funder metadata using the specifications defined for the FundRef program. Specifically, when depositing metadata you: +To support funder KPIs, members __must__ deposit funder metadata using the specifications defined for the Funder Data program. Specifically, when depositing metadata you: -1. __must__ include funder information. -2. __must not__ deposit your funder names without at least trying to map them to FundRef identifiers in the FundRef registry. Depositing funder names that are included in the FundRef registry, but without their respective FundRef Funder Identifiers, will pollute the FundRef metadata and lower the value of the service for all participants. Note that the KPI APIs will only work for Funder metadata that includes FundRef Funder Identifiers. -3. __should__ include award numbers in FundRef metadata when possible. Although the standard KPI API does not make direct use of award numbers, individual agencies may be able to make use of included award numbers where found. -4. __should__ deposit FundRef data as part of a CrossMark record if you (the publisher) already are (or *are planning* to become) a participant in CrossMark. There are two reasons for this: First, it ensures that the Funder Metadata is available __both__ in a standard machine readable format __AND__ via a standard UI for readers. Second, it ensures that the Funder metadata is made maximally reusable via a CC Zero license waiver. Note that publishers do not __need__ to have implemented CrossMark yet to deposit Funder metadata via CrossMark. We expect that publishers may take a year or more before they have fully implemented all of CrossMark's features. +1. __must__ include funder information. +2. __must not__ deposit your funder names without at least trying to map them to identifiers in the Open Funder Registry. 
Depositing funder names that are included in the Open Funder Registry, but without their respective funder identifiers, will pollute the Funder Data and lower the value of the service for all participants. Note that the KPI APIs will only work for Funder Data that includes Open Funder Registry Identifiers. +3. __should__ include award numbers in Funder Data when possible. Although the standard KPI API does not make direct use of award numbers, individual agencies may be able to make use of included award numbers where found. +4. __should__ deposit Funder Data as part of a CrossMark record if you (the publisher) already are (or *are planning* to become) a participant in CrossMark. There are two reasons for this: First, it ensures that the Funder Data is available __both__ in a standard machine readable format __AND__ via a standard UI for readers. Second, it ensures that the Funder Data is made maximally reusable via a CC Zero license waiver. Note that publishers do not __need__ to have implemented CrossMark yet to deposit Funder metadata via CrossMark. We expect that publishers may take a year or more before they have fully implemented all of CrossMark's features. -See CrossRef's Help pages for [Technical details on depositing FundRef metadata.](http://help.crossref.org/#fundref) +See Crossref's Help pages for [Technical details on depositing Funder Data.](https://support.crossref.org/hc/en-us/articles/214360746-Funding-data-overview) ## License information -One of the main drivers for the FunRef KPI API is that many funders are required to report on the public availability of the results and publications arising from funder-financed research. Funders are therefor interested in understanding how publications related to funded research are licensed. +One of the main drivers for the Funder Data KPI API is that many funders are required to report on the public availability of the results and publications arising from funder-financed research. 
Funders are therefor interested in understanding how publications related to funded research are licensed. -To deposit license information, publishers __must__ use the `` element. The value of the `` element __must__ be a stable HTTP URI which points to a human-readable document that either includes (or guides the reader to) any copyright and/or licensing information related to the CrossRef DOI of the content. The URI __must__ point either to a location on the publisher's site or to the stable location of any well-known licenses such as those of the Creative Commons. +To deposit license information, publishers __must__ use the `` element. The value of the `` element __must__ be a stable HTTP URI which points to a human-readable document that either includes (or guides the reader to) any copyright and/or licensing information related to the Crossref DOI of the content. The URI __must__ point either to a location on the publisher's site or to the stable location of any well-known licenses such as those of the Creative Commons. Note that it is entirely acceptable to record a `` URI as a "placeholder." If you are still working out specific licensing terms, the URI you record __should__ point to a blank page or even a simple re-assertion of the document's copyright. There is a big difference between recording at least some `` URI and recording no `` URI at all. The former indicates an intent to eventually clarify licensing information, whereas the latter indicates that the licensing information is likely to remain ambiguous. @@ -103,7 +152,7 @@ You can deposit multiple `` elements- so the following would indica http://www.psychoceramics.org/non_commercial_license_v1.html http://www.psychoceramics.org/commercial_license_v1.html -### Embargos +### Embargoes Publishers may want to record that a document is under embargo. 
In other words, that it is available under access control and a proprietary license for a set period of time, after which it is available under an open license. Publishers wishing to record embargoes __should__ use the optional `start_date` attribute on the `` element. @@ -123,36 +172,36 @@ The `start_date` attribute can be combined with multiple `` element Note that there is __no__ corresponding `end_date` attribute for the `` element. This is because including end dates could introduce ambiguities. For example: -- Open Licenses, such as CC, do not have "end dates". -- With end dates, it would be possible to inadvertently record "gaps" between licenses. +- Open Licenses, such as CC, do not have "end dates". +- With end dates, it would be possible to inadvertently record "gaps" between licenses. You might ask why one should record a license that starts in the future? Wouldn't it be better to just update the `` element at the time the license changes? By recording that another license takes effect in the future, you are informing the consumer of the metadata that the current restricted license is only for the embargo period. In short, you are recording the __intent__ to change the license when the embargo is done. Furthermore, providing additional metadata for a current publication at some future date is an additional chore for the publisher that might well be overlooked. -In the above examples, the `` element is unqualified and should therefor be considered to apply to the content pointed to by any `` URIs included in the CrossRef metadata. The CrossRef metadata schema supports recording different license for different versions of the resource and this will be discussed below. However, first let's look at at the role the `` element plays in providing funding agency KPIs. +In the above examples, the `` element is unqualified and should therefor be considered to apply to the content pointed to by any `` URIs included in the Crossref metadata. 
The Crossref metadata schema supports recording different licenses for different versions of the resource and this will be discussed below. However, first let's look at the role the `` element plays in providing funding agency KPIs. ## Recording links to full text and/or archived versions of documents, etc. -Funders are not just interested in reporting on the licensing terms of publications resulting from funder-financed research. They are also interested in making sure that the full text content of the BAV is made available for reading, automated processing and archiving. +Funders are not just interested in reporting on the licensing terms of publications resulting from funder-financed research. They are also interested in making sure that the full text content of the BAV is made available for reading, automated processing and archiving. To this end, publishers need to be able to record links to the full text of the content to which a DOI refers. Additionally, publishers will want to offer different versions (e.g. AM or VOR) and different representations (e.g. PDF for viewing, XML for TDM, etc.) of the content tailored for specific applications. -The `` element in CrossRef metadata is most often used to record an HTTP URI pointing at the publisher's landing page for the publication identified by the CrossRef DOI in question. However, the CrossRef schema has long supported the recording of multiple `` elements in order to enable, for example: +The `` element in Crossref metadata is most often used to record an HTTP URI pointing at the publisher's landing page for the publication identified by the Crossref DOI in question. 
However, the Crossref schema has long supported the recording of multiple `` elements in order to enable, for example: - Multiple resolution - Search engine indexing - CrossCheck indexing -CrossRef has extended the ability to record multiple `` elements in order to allow the recording of URIs which point to the full text of content identified by the CrossRef DOI. The publisher can record multiple representations of the full text (e.g. PDF, XML, plain text) using the new `mime_type` attribute and then, through their access control systems, control who is able to reach which representation and under which conditions. +Crossref has extended the ability to record multiple `` elements in order to allow the recording of URIs which point to the full text of content identified by the Crossref DOI. The publisher can record multiple representations of the full text (e.g. PDF, XML, plain text) using the new `mime_type` attribute and then, through their access control systems, control who is able to reach which representation and under which conditions. -Note that, by recording a `` that points to the full text, you are not necessarily guaranteeing that the URI will be accessible +Note that, by recording a `` that points to the full text, you are not necessarily guaranteeing that the URI will be accessible Note also that the publisher could theoretically choose only to deposit `` elements for full text representations once an embargo has ended. However, this approach may prove fraught, as any mistakes or delays in the redeposit process might lead the funding agency to believe that the publisher has not made the relevant content accessible at the end of the embargo period. 
-Further detail on using the `` element for recording links to full text can be found on the [Prospect support site](http://prospectsupport.labs.crossref.org/full-text-uris-technical-details/) and in the CrossRef deposit schema documentation for the [ `` ](http://www.crossref.org/schema/documentation/4.3.4/4.3.4.html#collection) and [ `` ](http://www.crossref.org/schema/documentation/4.3.4/4.3.4.html#resource) elements. +Further detail on using the `` element for recording links to full text can be found on the [Text & data mining support site](https://support.crossref.org/hc/en-us/articles/214298866-Full-Text-URIs-Technical-Details) and in the Crossref deposit schema documentation for the [ `` ](http://data.crossref.org/reports/help/schema_doc/4.4.0/schema_4_4_0.html#collection) and [ `` ](http://data.crossref.org/reports/help/schema_doc/4.4.0/schema_4_4_0.html#resource) elements. ## Different licenses for different versions of the content -Some publishers may want to record different licenses for different versions of the `` element recorded in CrossRef metadata. For example, one `` element may point to a URI intended for subscribed readers, while another `` element may point to a version of the document intended for Text and Data Mining (TDM) applications. Similarly, a publisher may choose to apply one license to the "Author Accepted Manuscript" (AM) and another to the "Version of Record" (VOR). +Some publishers may want to record different licenses for different versions of the `` element recorded in Crossref metadata. For example, one `` element may point to a URI intended for subscribed readers, while another `` element may point to a version of the document intended for Text and Data Mining (TDM) applications. Similarly, a publisher may choose to apply one license to the "Author Accepted Manuscript" (AM) and another to the "Version of Record" (VOR). To accommodate these scenarios, the `` element supports an `applies_to` element. 
Similarly, the `` element has been extended to support a `content_version` attribute. Publishers can use these element/attribute combinations to apply specific licenses to specific versions of the resource. For example, to indicate the "VOR" version of a document is licensed under a proprietary license, but that the "AM" version of the same document is licensed under an open license, the `` and `` elements could be combined like this: @@ -162,7 +211,7 @@ To accommodate these scenarios, the `` element supports an `applies http://creativecommons.org/licenses/by/3.0/deed.en_US - + http://www.psychoceramics.org/fulltext/vor/10.5555/12345678 @@ -182,7 +231,7 @@ The `` and `` elements along with their respective `start http://www.psychoceramics.org/nc_tdm_license.html - + http://www.psychoceramics.org/fulltext/vor/10.5555/12345678 @@ -192,22 +241,22 @@ The `` and `` elements along with their respective `start http://www.psychoceramics.org/fulltext/tdm/10.5555/12345678.xml -Detailed information on recording licensing information in CrossRef metadata can be found in the CrossRef schema documentation for the [ `` ](http://www.crossref.org/schema/documentation/4.3.4/AccessIndicators_xsd.html#license_ref) element. +Detailed information on recording licensing information in Crossref metadata can be found in the Crossref schema documentation for the [ `` ](http://www.crossref.org/schema/documentation/4.3.4/AccessIndicators_xsd.html#license_ref) element. ## "Libre" vs "Gratis" -The license information recorded in the `` element can tell you what you are allowed to do with the resources the licenses point to, but they do not say anything about whether or not there is a monetary charge involved. -In order to allow a publisher to record whether access to the content requires payment, CrossRef supports a new `` element. The `` element is an empty element. It can include two attributes, a `start_date` and an `end_date`. 
The `` elements works as follows: +The license information recorded in the `` element can tell you what you are allowed to do with the resources the licenses point to, but they do not say anything about whether or not there is a monetary charge involved. +In order to allow a publisher to record whether access to the content requires payment, Crossref supports a new `` element. The `` element is an empty element. It can include two attributes, a `start_date` and an `end_date`. The `` element works as follows: - - The presence of a element in CrossRef metadata __should_ be interpreted to mean that the full text content pointed to by the DOI resource is available "gratis" during the time period specified by the start_date and end_date attributes. - - If the element only includes a `start_date` attribute, then the element __should__ be interpreted to mean that the content pointed to by the DOI resource will be made gratis from `start_date` on. + - The presence of a element in Crossref metadata __should__ be interpreted to mean that the full text content pointed to by the DOI resource is available "gratis" during the time period specified by the start_date and end_date attributes. + - If the element only includes a `start_date` attribute, then the element __should__ be interpreted to mean that the content pointed to by the DOI resource will be made gratis from `start_date` on. - If the element only includes an `end_date` attribute, then the element __should__ be interpreted to mean that the content pointed to by the DOI resource will be made gratis from the publication date to and including the `end_date`. - If the element has __no__ `start_date` __or__ `end_date` attributes, then the element __should__ be interpreted to mean that the content pointed to by the DOI resource is available "gratis" from the date of publication on. 
- If the element is not present in the DOI record, one __should not__ assume that the resource pointed at by the DOI is available to read "gratis". When the `` element is combined with the `` element, the publisher can record sophisticated information about the availability and reusability of content. For example: - + **restrictive licenses and possibly a payment** http://tinypublisher.org/licenses/proprietary.html @@ -232,48 +281,114 @@ In order to allow a publisher to record whether access to the content requires p Funders may be concerned that publisher links to full-text content will become unavailable in exceptional circumstances. They may stipulate that content is archived with a third party archiving organization, and may even suggest a list of acceptable archive organizations with which documents should be archived. -Publishers can record the archive arrangement or archive intention of a document using the `` element in CrossRef deposit metadata. Any number of archive locations can be specified, for example a document may be archived with both `Portico` and `CLOCKSS`: +Publishers can record the archive arrangement or archive intention of a document using the `` element in Crossref deposit metadata. Any number of archive locations can be specified, for example a document may be archived with both `Portico` and `CLOCKSS`: -CrossRef maintains a vocabulary of archive locations within the CrossRef deposit schema. The latest list of possible archive location values can be found in the documentation for the [ `` element ](http://www.crossref.org/schema/documentation/4.3.4/4.3.4.html#archive). +Crossref maintains a vocabulary of archive locations within the Crossref deposit schema. The latest list of possible archive location values can be found in the documentation for the [ `` element ](http://data.crossref.org/reports/help/schema_doc/4.4.0/schema_4_4_0.html#archive). 
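As a rough illustration of the "gratis" interpretation rules listed in the "Libre" vs "Gratis" section, the following Python sketch evaluates whether content should be treated as free to read on a given day. It is a sketch under stated assumptions, not part of any Crossref tooling: the names `GratisWindow` and `is_gratis` are hypothetical, the element is modelled only by its `start_date` and `end_date` attributes, and both boundary dates are assumed to be inclusive (the source states this explicitly only for `end_date`).

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class GratisWindow:
    """Hypothetical model of the gratis-access element's two optional attributes."""
    start_date: Optional[date] = None
    end_date: Optional[date] = None


def is_gratis(window: Optional[GratisWindow], published: date, on: date) -> bool:
    """Apply the interpretation rules to decide if content is gratis on day `on`."""
    if window is None:
        # Element absent from the DOI record: gratis access must not be assumed.
        return False
    # With no start_date, the gratis period begins at the publication date.
    start = window.start_date or published
    if on < start:
        return False
    # With no end_date, the gratis period is open-ended.
    if window.end_date is not None and on > window.end_date:
        return False
    return True


published = date(2014, 1, 1)
embargoed = GratisWindow(start_date=date(2015, 1, 1))
print(is_gratis(embargoed, published, date(2014, 6, 1)))
print(is_gratis(embargoed, published, date(2015, 6, 1)))
```

Modelling "element absent" as `None` keeps it distinct from "element present with no attributes", which the rules treat very differently.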
+ +## Assigning and registering DOIs at acceptance + +Funders and institutions would like to be notified of impending publications as early as possible. Some mandates, like the [one issued by HEFCE](http://www.hefce.ac.uk/pubs/year/2015/CL,202015/), even require that institutional repositories be notified of an impending relevant publication as soon as the manuscript is accepted, even if said manuscript has not yet been made available online. + +### Assigning and registering DOIs for manuscripts that the publisher *has* made available online + +Crossref has always supported the deposit of DOIs for accepted manuscripts __if__ said manuscripts have also been made available online. This is a common practice that goes under various names including “publish ahead of print,” “article in progress,” “article in press,” “online ahead of print,” “online first”, etc. Crossref rules state that in this situation, Crossref members: + + - __may__ assign DOIs to accepted manuscripts + - __should__ carry over the DOI assigned to the accepted manuscript to the final published version. + - __should__ update the metadata for the DOI with that of the final published version. + + These rules reflect that, though there may be significant value added by the publisher between acceptance and final publication, the accepted and the final published version are interchangeable from a citation point of view. This is because the transition from acceptance to the final published version should not introduce any changes that are likely to affect the interpretation or crediting of the work. + +### Assigning and registering DOIs for manuscripts that the publisher *has not yet* made available online + +Crossref will support a new mechanism and workflow to support the registration of DOIs for accepted manuscripts __before__ they are made publicly available online. This feature can be used by publishers as a mechanism for informing funders and institutions of impending publications. 
To use this, publishers will deposit a special type of Crossref record called "registered content." + +The schema and rules governing the "registered content" type attempt to balance the publisher's desire to control publicity around their content with the requirements that funders and institutions have to know as soon as possible when content governed under their mandates has been accepted for publication. Once the publication is made available online (either as an accepted manuscript or version of record), then the publisher can simply redeposit and replace the "registered content" record with a full metadata record using a Crossref schema appropriate to the publication (e.g. journal article). + +DOIs for "registered content" will not resolve to a publisher's landing page. Rather, they will resolve to a landing page controlled by Crossref. This landing page will __minimally__ display the DOI, the acceptance date, and an "intent to publish statement" which, by default, will read as follows (with the appropriate {variables} filled in): + +>> The DOI {DOI} has been registered for content that was accepted for publication by {publisher name} on {date_of_acceptance}. When this content is available, the publisher will update this DOI, at which point it will automatically redirect you to the copy on the publisher's site. + +Publishers will be able to apply limited customizations to the landing page. These include: + +- a custom "intent to publish" statement which will replace the default one provided by Crossref. +- a publisher logo to display at the top of the landing page. +- the display of all provided optional extra metadata such as funder identifiers, ORCID ids, license information, etc. +- a CrossMark, to handle situations in which a publisher rescinds an acceptance. + +If the publisher provides metadata beyond that required, it will be displayed on the landing page below the "intent to publish" statement. 
"Registered content" records will also be made available through the Crossref REST API and through Crossref metadata search. + +By having Crossref control the landing page for "registered content" we can ensure that: + +- there is always a landing page, even for content that is not yet available online +- metadata is displayed consistently for registered content +- members do not abuse the lightweight metadata requirements of "registered content" in order to register other content types more easily. + +The schema for "registered content" will only support a minimal subset of metadata that can be used by funders and institutions to detect and flag impending publications that are relevant to them. + +Registered content DOI records: + +- __must__ include a DOI +- __must__ include a date of acceptance +- __must__ include the publisher name +- __must__ be replaced with appropriate full metadata using an appropriate schema for the content type when the publisher makes the content publicly available +- __should__ include an "intent to publish statement." If the publisher does not provide an "intent to publish statement" of their own, then Crossref will provide a default statement. +- __may__ include a logo to display at the top of the landing page +- __may__ include a custom "intent to publish statement." +- __should__ include funder information +- __should__ include Open Funder Registry funder identifiers corresponding to their funder names where these exist in the registry +- __should__ include ORCIDs +- __should__ include license information +- __should__ include author affiliation information +- __may__ include the publication title +- __may__ include the item title (e.g. article title) + +The "registered content" type is specifically designed for publishers who want to be able to inform funders and institutions that an article has been accepted for publication, even when the publisher is for some reason unable to make the accepted manuscript publicly available. 
The landing page and metadata for the "registered content" type allow the publisher to provide as much data as is needed to notify interested parties, yet not reveal commercially or promotionally sensitive information about their upcoming publications. When the publisher is ready to make the content available, they simply have to replace the temporary "registered content" record with a permanent DOI record. ## Bonus points -The more metadata that publishers record for publications arising from agency funded research, the more useful that metadata will be to said agencies and the more value they will see from publishers. Where as the above sections details metadata elements that agencies will __expect__ in order to be able to compile basic KPIs and offer portal services, additional metadata will allow agencies to create even more sophisticated KPIs and services. As such, publishers should seriously consider depositing the following additional metadata elements in their CrossRef deposits. +The more metadata that publishers record for publications arising from agency-funded research, the more useful that metadata will be to said agencies and the more value they will see from publishers. Whereas the sections above detail metadata elements that agencies will __expect__ in order to be able to compile basic KPIs and offer portal services, additional metadata will allow agencies to create even more sophisticated KPIs and services. As such, publishers should seriously consider depositing the following additional metadata elements in their Crossref deposits. #### Distributing standard bibliographic metadata -Metadata deposited to CrossRef is made available freely via numerous CrossRef query APIs. However all deposited metadata is subject to opt-outs in the case of bulk distribution APIs and data dumps. 
In order to make sure that bibliographic metadata for publications arising from agency funding is maximally available, publishers __should__ consider setting the value of the `` element for DOIs to `any`. Further details can be found in [CrossRef's schema documentation for the `` element.](http://www.crossref.org/schema/documentation/4.3.4/NO_NAMESPACE.html#metadata_distribution_opts.att_metadata_distribution_opts) +Metadata deposited to Crossref is made available freely via numerous Crossref query APIs. However all deposited metadata is subject to opt-outs in the case of bulk distribution APIs and data dumps. In order to make sure that bibliographic metadata for publications arising from agency funding is maximally available, publishers __should__ consider setting the value of the `` element for DOIs to `any`. Further details can be found in [Crossref's schema documentation for the `` element.](http://www.crossref.org/schema/documentation/4.3.4/NO_NAMESPACE.html#metadata_distribution_opts.att_metadata_distribution_opts) #### Distributing references -References made in publications arising from agency funding can provide agencies with an overview of what literature is considered important in the fields that they fund. Many publishers deposit references to CrossRef as part of their participation CrossRef's [CitedBy](http://www.crossref.org/citedby/index.html) service. However, participation in CitedBy does not automatically make references available via CrossRef's standard APIs. In order for publishers to distribute references along with standard bibliographic metadata, publishers need to set the `` element to `any` for each DOI deposit where they want to make references openly available. By setting this element, references for the DOI will be distributed without restriction through all of CrossRefs APIs and bulk metadata dumps. 
Further details can be found in [CrossRef's schema documentation for the `` element.](http://www.crossref.org/schema/documentation/4.3.4/4.3.4.html#reference_distribution_opts.att) +References made in publications arising from agency funding can provide agencies with an overview of what literature is considered important in the fields that they fund. Many publishers deposit references to Crossref as part of their participation in Crossref's [CitedBy](http://www.crossref.org/citedby/index.html) service. However, participation in CitedBy does not automatically make references available via Crossref's standard APIs. In order for publishers to distribute references along with standard bibliographic metadata, publishers need to set the `` element to `any` for each DOI deposit where they want to make references openly available. By setting this element, references for the DOI will be distributed without restriction through all of Crossref's APIs and bulk metadata dumps. Further details can be found in [Crossref's schema documentation for the `` element.](http://www.crossref.org/schema/documentation/4.3.4/4.3.4.html#reference_distribution_opts.att) +Metadata deposited to CrossRef is made available freely via numerous CrossRef query APIs. However, all deposited metadata is subject to opt-outs in the case of bulk distribution APIs and data dumps. In order to make sure that bibliographic metadata for publications arising from agency funding is maximally available, publishers __should__ consider setting the value of the `` element for DOIs to `any`. Further details can be found in [CrossRef's schema documentation for the `` element.](http://www.crossref.org/help/schema_doc/4.3.6/4_3_6.html#metadata_distribution_opts.att) + +#### Distributing references + +References made in publications arising from agency funding can provide agencies with an overview of what literature is considered important in the fields that they fund. 
Many publishers deposit references to CrossRef as part of their participation in CrossRef's [CitedBy](http://www.crossref.org/citedby/index.html) service. However, participation in CitedBy does not automatically make references available via CrossRef's standard APIs. In order for publishers to distribute references along with standard bibliographic metadata, publishers need to set the `` element to `any` for each DOI deposit where they want to make references openly available. By setting this element, references for the DOI will be distributed without restriction through all of CrossRef's APIs and bulk metadata dumps. Further details can be found in [CrossRef's schema documentation for the `` element.](http://www.crossref.org/help/schema_doc/4.3.6/4_3_6.html#reference_distribution_opts.att) #### CrossMark -[CrossMark](http://www.crossref.org/crossmark/) provides a standard mechanism for alerting researchers to updates to published documents- including corrections, errata, corrigenda retractions and withdrawals. Use of the CrossMark service sends a signal to researchers and agencies that publishers are committed to maintaining the integrity of the scholarly record. +[CrossMark](http://www.crossref.org/crossmark/) provides a standard mechanism for alerting researchers to updates to published documents, including corrections, errata, corrigenda, retractions and withdrawals. Use of the CrossMark service sends a signal to researchers and agencies that publishers are committed to maintaining the integrity of the scholarly record. -Additionally, CrossMark also provides a standard, cross-publisher, user interface that researchers can use to view FundRef information and licensing information. This user interface works both from publisher landing pages and from published PDFs. 
More information can be found on the [CrossMark support site](http://crossmarksupport.crossref.org/) +Additionally, CrossMark also provides a standard, cross-publisher, user interface that researchers can use to view Funder Data and licensing information. This user interface works both from publisher landing pages and from published PDFs. More information can be found on the [CrossMark support site](http://crossmarksupport.crossref.org/) #### Abstracts Many funding agencies are interested in building custom portals that highlight agency-funded research. In order to provide users of these portals with the best experience, agencies will want, where possible, to display abstracts of publications along with their standard bibliographic metadata. -CrossRef supports the deposit of abstracts conforming to the [JATS](http://jats.nlm.nih.gov/) abstract element. Further details can be found in the [CrossRef Schema Documentation of the `` element](http://www.crossref.org/schema/documentation/4.3.4/JATS1.html#abstract). +Crossref supports the deposit of abstracts conforming to the [JATS](http://jats.nlm.nih.gov/) abstract element. Further details can be found in the [Crossref Schema Documentation of the `` element](http://www.crossref.org/schema/documentation/4.3.4/JATS1.html#abstract). #### ORCIDs -[ORCID](http://www.orcid.org/)s are unique identifiers for researchers. CrossRef supports the deposit of ORCIDs for authors. The presence of ORCIDs in CrossRef metadata will, in turn, allow agencies to tie agency funded research publications directly to researchers. Widespread use of ORCIDs in CrossRef deposits could even let agencies start to develop publication KPIs for researchers that they fund. Further details on CrossRef's ORCID support can be found in the [CrossRef Schema Documentation of the `` element](http://www.crossref.org/schema/documentation/4.3.4/4.3.4.html#ORCID) +[ORCID](http://www.orcid.org/)s are unique identifiers for researchers. 
Crossref supports the deposit of ORCIDs for authors. The presence of ORCIDs in Crossref metadata will, in turn, allow agencies to tie agency funded research publications directly to researchers. Widespread use of ORCIDs in Crossref deposits could even let agencies start to develop publication KPIs for researchers that they fund. Further details on Crossref's ORCID support can be found in the [Crossref Schema Documentation of the `` element](http://www.crossref.org/schema/documentation/4.3.4/4.3.4.html#ORCID) + ## Frequently Asked Questions -**Q:** What license applies to the metadata retrieved by the [CrossRef APIs to support key performance indicators (KPIs) for funding agencies](funder_kpi_api.html)? +**Q:** What license applies to the metadata retrieved by the [Crossref APIs to support key performance indicators (KPIs) for funding agencies](funder_kpi_api.html)?
-**A:** CrossRef asserts no claims of ownership to individual items of bibliographic metadata and associated Digital Object Identifiers (DOIs) acquired through the use of the CrossRef Free Services. Individual items of bibliographic metadata and associated DOIs may be cached and incorporated into the user's content and systems. More information can be found [on our web site](http://www.crossref.org/requestaccount/). +**A:** Crossref asserts no claims of ownership to individual items of bibliographic metadata and associated Digital Object Identifiers (DOIs) acquired through the use of the Crossref Free Services. Individual items of bibliographic metadata and associated DOIs may be cached and incorporated into the user's content and systems. **Q:** What does it mean if a `` element has no `start_date` attribute?
@@ -297,16 +412,16 @@ CrossRef supports the deposit of abstracts conforming to the [JATS](http://jats. ### Full Deposits -Full deposits use the [standard deposit schema](http://www.crossref.org/schema/deposit/crossref4.3.4.xsd). +Full deposits use the [standard deposit schema](http://data.crossref.org/reports/help/schema_doc/4.4.0/4.4.0.html). - [Full deposit](examples/full.xml) - [Full deposit with CrossMark](examples/full-crossmark.xml) ### Partial Deposits -Partial deposits use the [resource deposit schema](http://doi.crossref.org/schemas/doi_resources4.3.2.xsd). +Partial deposits use the [resource deposit schema](http://data.crossref.org/schemas/doi_resources4.3.2.xsd). -Partial deposits update only part of a DOI's metadata. In the CrossRef help system +Partial deposits update only part of a DOI's metadata. In the Crossref help system they are referred to as **resource deposits**, but it is not just resources that can be provided as a partial deposit. Licenses, funding information and CrossMarks can also be provided as partial deposits. @@ -320,4 +435,7 @@ one updating funding information, the other updating license information. - [Partial deposit of funding information without CrossMark](examples/partial-funders.xml) - [Partial deposit of license information without CrossMark](examples/partial-licenses.xml) - [Partial deposit of a CrossMark with license and funding information](examples/partial-crossmark.xml) - + +### Registered Content Deposits + +*coming soon* diff --git a/labs_email.png b/labs_email.png deleted file mode 100644 index 0da924e..0000000 Binary files a/labs_email.png and /dev/null differ diff --git a/rest_api.md b/rest_api.md new file mode 100644 index 0000000..d73f454 --- /dev/null +++ b/rest_api.md @@ -0,0 +1,4 @@ +# Crossref REST API + +[We have moved the main documentation page.](https://github.com/CrossRef/rest-api-doc) +