Members can participate in Cited-by by completing the following steps:
1. Deposit references for one or more prefixes as part of your content registration process. Use your Participation Report to track your progress with depositing references. This step is not mandatory, but it is highly recommended to ensure that your citation counts are complete.
2. We match the metadata in the references to DOIs to establish Cited-by links in the database. As new content is registered, we automatically update the citations and, if you have Cited-by alerts enabled, notify you of the new links.
3. Display the links on your website. We recommend displaying the citations you retrieve on DOI landing pages.
If you are a member through a Sponsor, you may have access to Cited-by through your sponsor – please contact them for more details. OJS users can use the Cited-by plugin.
Citation matching
Members sometimes submit references without including a DOI tag for the cited work. When this happens, we look for a match based on the metadata provided. If we find one, the reference metadata is updated with the DOI and we add the "doi-asserted-by": "crossref" tag. If we don’t find a match immediately, we will try again at a later date.
There are some references for which we won’t find matches, for example where the DOI has been registered with an agency other than Crossref (such as DataCite), or where the reference points to an object without a DOI, such as some conferences, manuals, blog posts, and journal articles.
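To see how the references of a single work break down, you can group them by how each DOI was supplied. The following is a minimal sketch, assuming reference records shaped like dictionaries with optional `DOI` and `doi-asserted-by` keys. The function name, the `publisher` value for member-supplied DOIs, and the sample records (which use made-up DOIs) are assumptions for illustration, not taken from this page.

```python
from collections import Counter


def summarize_reference_dois(references):
    """Count references by who asserted the DOI; references without a DOI are 'unmatched'."""
    counts = Counter()
    for ref in references:
        if not ref.get("DOI"):
            # No DOI tag was supplied and no match has been found (yet).
            counts["unmatched"] += 1
        else:
            # "crossref" means the DOI was added by matching; other values are
            # assumed to mean the DOI came with the deposited reference.
            counts[ref.get("doi-asserted-by", "unknown")] += 1
    return dict(counts)


# Made-up sample records, for illustration only.
sample_references = [
    {"key": "ref1", "DOI": "10.5555/example-1", "doi-asserted-by": "publisher"},
    {"key": "ref2", "DOI": "10.5555/example-2", "doi-asserted-by": "crossref"},
    {"key": "ref3", "unstructured": "A manual with no DOI."},
]

print(summarize_reference_dois(sample_references))
```

A high `unmatched` count does not necessarily indicate a problem, since some cited objects have no Crossref DOI to match against.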
To perform matching, we first check if a DOI tag is included in the reference metadata. If so, we assume it is correct and link the corresponding work. If there isn’t a DOI tag, we perform a search using the metadata supplied and select candidate results by thresholding. The best match is found through a further validation process. Learn more about how we match references. The same process is used for the results shown on our Simple Text Query tool.
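The three stages described above (trust an existing DOI tag, search and threshold candidates, validate the best one) can be sketched in a few lines. This is an illustrative toy, not Crossref's actual implementation: the in-memory `INDEX`, the token-overlap `similarity` score, both threshold values, and the use of a stricter threshold as the "validation" step are all assumptions chosen to keep the example self-contained.

```python
import re

# Hypothetical in-memory index standing in for registered metadata (made-up records).
INDEX = {
    "10.5555/example-1": "Doe J. Matching metadata at scale. Journal of Examples 2019",
    "10.5555/example-2": "Roe R. Persistent identifiers in practice. Example Press 2021",
}


def tokens(text):
    """Lowercase alphanumeric word tokens of a string, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(a, b):
    """Jaccard overlap of word tokens (an assumed stand-in for a real search score)."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta and tb else 0.0


def match_reference(reference, index=INDEX,
                    candidate_threshold=0.3, validation_threshold=0.6):
    """Return a DOI for the reference, or None if no confident match is found."""
    # Stage 1: a supplied DOI tag is assumed to be correct.
    if reference.get("DOI"):
        return reference["DOI"]

    # Stage 2: search with the supplied metadata and keep candidates above a threshold.
    query = reference.get("unstructured", "")
    scored = [(doi, similarity(query, text)) for doi, text in index.items()]
    candidates = [(doi, score) for doi, score in scored if score >= candidate_threshold]
    if not candidates:
        return None

    # Stage 3: validate the best candidate (here, simply a stricter threshold).
    best_doi, best_score = max(candidates, key=lambda c: c[1])
    return best_doi if best_score >= validation_threshold else None


print(match_reference({"unstructured": "Doe, J. (2019). Matching metadata at scale. Journal of Examples."}))
print(match_reference({"unstructured": "A blog post about gardening 2021"}))
```

Separating candidate selection from validation lets the first stage stay permissive while the second stage guards against false links; a reference that passes neither stage is left unmatched and can be retried later.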
All citations to a work are returned in the corresponding Cited-by query.
Page owner: Isaac Farley | Last updated 2023-April-28