Coming to Terms with Data

By Christa Williford posted 09-14-2012 09:15

Much of CLIR's activity in recent months relates to the use and management of digital research data. The new CLIR/DLF Postdoctoral Fellowships in Data Curation for the Sciences and Social Sciences, partially funded by the Alfred P. Sloan Foundation, are one attempt to address the "problem of data." It is CLIR's hope that these fellows, by collaborating with (or becoming) tomorrow's leading data curators and data archivists, will help envision solutions to our academic culture's "data problems."

At their recent kickoff seminar at Bryn Mawr College, the new fellowship recipients, their professional mentors, and invited speakers wrestled with the concepts associated with research data and research data curation from multiple perspectives. Through their conversations, it soon became clear that the definitions of these terms are by no means consistent or clear. What, after all, distinguishes research data from other forms of data or, indeed, from digital files of any kind?

The answer to the question "What is data?" is as broad and complicated as the answer to the question "What is research?" Then there is the further question, "What is data curation?" How are data curation activities distinct from data storage or archiving? As one seminar guest, Clifford Lynch, admitted, where data curation is concerned we have "terrible language problems" that must be addressed in order to create effective strategies for dealing with data.

Sayeed Choudhury has offered a useful model for coming to terms with these latter questions, but much more remains to be done to translate this model into plans, policies, training programs, or divisions of roles and responsibilities. The "messiness" of data, to use Choudhury's word, is inherent and we shouldn't expect it to change, but our terminology can become more precise.

As we move forward with the Data Curation Fellows, we are eager to hear from our constituents about their own experiences in designing data services at their institutions. How do you define research data and data curation? Who shares responsibility for them, and what changes are you implementing to make "the problem of data" less daunting? How have your notions about these concepts and strategies evolved over time?

For those of you grappling with these issues at your institutions, I draw your attention to an opportunity: CLIR is now recruiting prospective hosts for the 2013-2015 Postdoctoral Fellows in Data Curation for the Sciences and Social Sciences. For details, please visit the program's web pages.