Metadata Literature Review

Today we continue to share outputs from the Metadata 2020 projects. We are excited to announce a peer-reviewed academic literature review, now published in RIO Journal. The review presents insights gained from comparing a range of articles that address the challenges and opportunities present in scholarly communications metadata.

  Gregg WJ, Erdmann C, Paglione LAD, Schneider J, Dean C (2019) A literature review of scholarly communications metadata. Research Ideas and Outcomes 5: e38698.  


The idea for the review originated with members of the Researcher Communications project who identified a need for a comprehensive review of the challenges, opportunities, and gaps facing metadata stakeholders.

Share your thoughts! By sharing and inviting comment on this review, Metadata 2020 seeks to facilitate future research and conversations between stakeholders. We look forward to reading your comments and contributions in RIO Journal; just highlight some text and add your annotations, or click the “Review Article” button in the upper left corner.

In addition, a collection of articles reviewed for the paper and other related material can be found in the ScienceOpen collection created by Will Gregg, Stephanie Dawson, and Christopher Erdmann (DOI: 10.14293/S2199-1006.1.SOR-UNCAT.CLOD0SO.v1).

A summary of the review’s findings is presented below.


Publishers

Though calls to clean up metadata frequently fall on publishers, demonstrations of metadata’s ROI are increasingly recognized and publishers are projected to invest more in quality metadata. Publishers are also at the forefront of adopting emerging technologies for automatic generation of metadata, including full-text semantic analysis. Other stakeholders stand to benefit from hearing from publishers directly about their day-to-day practices.

Service Providers

A lack of consistent metadata standards among service providers, or inconsistencies between publishers and service providers, causes problems for resource discovery and for accessing full text content. Initiatives for open metadata and usage data, transparent pricing and contracts, allowance for community input, and use of well-established international standards to promote interoperability would bring positive change of the sort modeled in other industries.


Researchers

The scholarly communications lifecycle begins and ends with the researcher. Nevertheless, this group faces challenges when it comes to metadata: the literature notes problems creating consistent metadata for article submissions that remains accurate over time, as well as significant challenges in managing metadata for research data. Researchers stand to benefit greatly from improved metadata; opportunities for future research include studies of researchers across disciplines, metadata quality assessments, and surveys of researcher practices.


Funders

Research funders now commonly require that researchers submit data management plans to promote accessibility and reproducibility of research. Funders receive criticism, however, for failing to provide guidance or tools for managing data. Opportunities for funders reside in this area as well as in utilizing metadata that promotes the fiscal health and long-term success of funding organizations.


Librarians

Library systems are likely the places where end users most frequently encounter metadata, and the challenges that librarians face are those faced by the metadata supply chain as a whole. The publisher- and vendor-supplied metadata on which librarians rely can be inconsistent or unreliable, leading to problems for the end user in accessing resources. Librarians are in an especially good position to advocate for quality metadata, though the roles of librarians may not be clear to those coming from outside the scope of this wide profession.

Data Curators and Repositories

Data curation has grown in importance with movements for access to data and a view that the entire research process – and not just the final product – has value. Data curators and repositories face metadata challenges, however, when it comes to long-term preservation, describing subject-specific content while promoting access by the general public, and promoting and developing means to cite research data. Promising initiatives have emerged regarding the storage, description, and citation of data but remain far from universally adopted.

All Stakeholders

It is clear from the literature that stakeholders depend on the quality of one another’s metadata, though only a small body of work is dedicated to studying the complex interactions between them. Attempts to diagram scholarly communications address this issue in part but have not focused explicitly on metadata creation and use. The desire to speak to all parties that interact with metadata may guide literature in the future.

About the author

Will Gregg is affiliated with Metadata 2020’s Researcher Communications Project. He is a recent graduate of the Master’s of Library and Information Science program at Simmons University where he specialized in metadata and encoding protocols for archival arrangement and archives-adjacent projects in the digital humanities. He now lives in New Hampshire.