When it comes to richer metadata, know what you don't know

Full disclosure: I am not a metadata expert. What I am is a branding and marketing consultant specialising in scholarly communications. I’ve been working in the industry since 1999 and in that time I’ve done a whole lot of research with pretty much every type of audience you can imagine.

Hundreds of in-depth interviews and dozens of focus groups later, I know a thing or two about publishers, librarians, researchers, funders and other stakeholders in the scholarly ecosystem. I’ve done research on pricing and products, access and impact…you name it. (Now that you mention it, I’ve done naming research as well.) But metadata? Now that’s a whole other ball game.

So when the opportunity arose to interview a host of experts in the field, I jumped at the chance. My brief was to figure out what “richer” metadata really means, explore the big, overarching benefits, and learn more about the challenges the community faces in reaching this ambitious goal.

In the course of my research I had the pleasure of speaking one-on-one with the following experts in their respective fields:

Theodora Bloom, Executive Editor, BMJ
Patricia (Trisha) Cruse, Executive Director and Manager Strategic Partnerships, DataCite
Stefanie Haustein, Post Doc Researcher, University of Montreal
Natalia Manola, Project Manager, OpenAIRE
Eva Méndez, Professor of Information & Document Management, University of Madrid
Cameron Neylon, Professor of Research Communications at the Centre for Culture and Technology at Curtin University
Mark Patterson, Managing Executive Editor, eLife
Scott Plutchak, Director of Digital Data Curation Strategies, University of Alabama
Kristen Ratan, Co-Founder and Executive Director of the Collaborative Knowledge Foundation (Coko)
Mike Taylor, Head of Metrics Development, Digital Science
Roy Tennant, Senior Program Officer, OCLC

Here’s an overview of the insights we gained from the research along with a sampling of comments from the people we spoke with:

Insight 1:

The story begins and ends with the researcher

If it’s good for the researcher, it’s good for everyone. What do they need? How will richer metadata make their lives easier?

“If I’m talking to a researcher, their automatic, deer-in-the-headlights look is that ‘you’re asking me to do more stuff’. You want to cast it such that we’re very careful what we’re asking for so that it’s clear that there are bigger payoffs.”

Insight 2:

Metadata is the means, not the goal

Demonstrate the importance of the interconnected whole.

“I feel like it could make things more complex. If we need a new collection of attributes that we agree upon then we have to fill it up. Perhaps that becomes another metadata standard. It’s like toothbrushes. Everybody likes the idea but everybody wants to use his or her own. It could happen that we create something that we don’t need.”

Insight 3:

Remember, there is no one size fits all

Pay attention to the detail. Embrace the diversity.

“The minimum we need is one author, one title and one publication date but we still need to give room for details. It’s definitely not just more; it’s knowing what’s behind it. For me it’s also central that you don’t try to do a one size fits all because that will never work. If you can live with the problem that some things aren’t complete, detail is better. It’s so much richer. Accept the diversity.”

Insight 4:

Consider the blank slate

Reimagine, reinvent. Walk and talk like techies. Synchronization and automation are key.

“This happens all the time in the tech industry. They got together to reinvent cloud hosting because they didn’t want to be reliant on Amazon. They invested in shared open infrastructure that everyone could benefit from. They’re all competing but they can be differentiating. Have a look at OpenStack, if you look at the website it’s all these large competing tech companies.”

Insight 5:

Metadata is a messy business

Minimal viable record? Been there, done that. Fine grain the metadata and accept the notion that records will be incomplete.

“Don’t try to make it minimal! For example, if you’re just saying publication date, what does that mean? The minimal record should distinguish between received, reviewed, accepted, deposited online, and then published in the journal.”
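For the technically minded, here is a rough sketch of what that quote could look like in practice. It is only an illustration: the class name, field names and dates below are my own assumptions, not part of any existing metadata standard. The point is simply that each milestone gets its own slot, and any slot can stay empty.

```python
from dataclasses import dataclass, fields
from datetime import date
from typing import Optional


@dataclass
class PublicationDates:
    """Hypothetical fine-grained dates for one article; any may be unknown."""
    received: Optional[date] = None
    reviewed: Optional[date] = None
    accepted: Optional[date] = None
    deposited_online: Optional[date] = None
    published_in_journal: Optional[date] = None

    def known(self) -> dict:
        """Return only the milestones that have actually been recorded."""
        return {f.name: getattr(self, f.name)
                for f in fields(self)
                if getattr(self, f.name) is not None}

    def missing(self) -> list:
        """Return the names of milestones with no recorded date."""
        return [f.name for f in fields(self)
                if getattr(self, f.name) is None]


# Illustrative values only: an incomplete record is still a usable record.
dates = PublicationDates(received=date(2017, 3, 1), accepted=date(2017, 6, 15))
print(dates.known())
print(dates.missing())
```

A record like this keeps the detail where it exists and makes the gaps explicit, rather than collapsing everything into a single “publication date”.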

Insight 6:

Every day, storytelling

Explore the world of possibilities but keep it real.

“Don’t ask about metadata but ask them ‘Where are you wasting a lot of time?’ The everyday researcher has no clue about what’s happening in the background. It’s all about demonstrating what you’ve done as a researcher. Ask about their everyday problems.”

Insight 7:

Drive the agenda

Crossref can take the lead but it shouldn’t be publisher-centric.

“In the industry we’ve done a lot of things by committee and we might want to push things harder than that. Drive the agenda rather than follow it.”

Insight 8:

Walk before you run

Our approach should be methodical and considered.

“I think of this as a sort of wicked problem where the boundaries are not well defined. It requires people with a lot of different skill sets to work on it.”

Insight 9:

Connect all the dots

Get all the players to sit at the table (publishers, funders, governments, platforms, etc.). Scope beyond journals.

“The challenge is being able to simplify and focus. What is it that you’re really trying to achieve? Don’t get lost in the possibilities.”

Insight 10:

Create a sense of urgency

Demonstrate the cost benefit and the opportunity cost. Make it obvious that the payoffs are bigger than the sacrifices. Make it a help, not a hindrance.

“That’s why I like the metaphor of agile development: you need to demonstrate the cost of low-quality metadata. That is the key.”

Insight 11:

Job one: Define richer metadata

Make sure we’re all speaking the same language and share the same vocabulary.

“It’s like having fantastic telephone lines but if you’re speaking a foreign language, the two parties can’t communicate effectively. The best technology in the world won’t help.”

Keep an eye out for additional posts where I’ll expand on these insights and provide highlights of Metadata 2020 initiatives that seek to answer some of the issues raised. In the meantime we hope you’ll take a few minutes to consider the following questions in the context of your own organization.

As metadata advocates, what do you need that you don’t have? Do researchers need to know or care about metadata?
How can we demonstrate the value of metadata? Is the value of metadata something your organization discusses?
How do we make better use of what we have? Can we strike a balance between consistency and flexibility?
What lessons can we learn from other industries?
Are you willing to sacrifice completeness for detail?
Do you have stories to share?
Let us know what you think. We would love to hear from you.

About the author

Paula Reeves is a senior marketing communications consultant with extensive experience working in global markets, having lived and worked in the US, Europe, and the UK. Over the course of her career, Paula has provided strategic consultation and led projects for many of the world’s largest scholarly publishers – developing campaigns for flagship brands across the industry and working on complex issues such as discoverability, accessibility, and reproducibility. In the course of her work, she has facilitated workshops, led focus groups, and conducted one-on-one interviews with hundreds of stakeholders—from funders and publishers to researchers and librarians—gaining invaluable breadth and depth of insight into scholarly research and dissemination.