Addressing tensions between rights and access in CLIR’s proposed digitization program

By Christa Williford posted 11-13-2014 13:54


As two previous posts here have noted, we at CLIR have spent much time this year considering the feasibility of building a national digitization program on the model of Cataloging Hidden Special Collections and Archives. There seems to be strong interest among our sponsors and other constituents in our pursuing this possibility, and we hope to be in a position to announce specific plans about this new venture by the end of 2014.

In some ways, we can view the transition to digitization as a natural evolution of our previous efforts to expose critically important materials in ways that advance knowledge. Yet, even if scholarship remains the focus of our program, the work of creating full digital surrogates of rare and unique collections raises challenging legal questions about intellectual property and privacy. Cultural heritage professionals must grapple with these questions before digitizing content that is (or could be) under legal protection.

Fortunately, leaders in the cultural heritage communities have already invested much time and energy into helping institutions set policies and procedures that maximize full access to materials while attending to the complex legal questions inherent in their digitization.

Peter Hirtle, Emily Hudson, and Andrew Kenyon published Copyright and Cultural Heritage Institutions: Guidelines for Digitization for U.S. Libraries, Archives, and Museums in 2009; this comprehensive work includes guidelines for libraries on taking a “risk management” approach to digitization practice, rather than conducting research into intellectual property issues on an item-by-item basis. Earlier that year, the Society of American Archivists’ (SAA) Intellectual Property Working Group released its Orphan Works: Statement of Best Practices. In 2010, OCLC Research held a symposium for the Research Libraries Group membership that outlined key challenges and promoted “Well-intentioned practice for putting digitized collections of unpublished materials online,” while the Association of Research Libraries (ARL) released its “Principles to Guide Vendor/Publisher Relations in Large-Scale Digitization Projects of Special Collections Materials.”

In 2012, ARL published a special issue of Research Library Issues titled Special Collections and Archives in the Digital Age. This issue includes a model deed of gift, a model digitization agreement, and an essay by Kevin Smith that neatly outlines four strategies for assessing possible legal risks of digitization efforts. The Berkeley Digital Library Copyright Project, with the support of the Alfred P. Sloan Foundation, has produced numerous papers that explore the many legal issues relevant to digitization in cultural heritage institutions. Individual institutions and consortia, such as the Triangle Research Libraries Network, have begun drafting and sharing intellectual property policy strategy documents that govern their digitization programs.

Dealing with privacy concerns can be even more challenging than addressing questions of intellectual property raised in the course of digitization work. Without careful monitoring, exposing whole collections online can potentially violate the privacy of donors or other subjects. Health, educational, financial or legal records are a few of the more obvious types of content that require special attention, and such items appear frequently among personal papers and often within other kinds of unique collections. Libraries, archives, and museums have both legal and ethical obligations to provide access to such materials only in circumstances where there is minimal or no risk to the privacy of an individual or family.

Discussions about privacy have long been important to librarians and archivists, but the past year has seen reanimated interest and activity in this area, some of which is fueled by increased focus on digitization and access and some of which focuses on risks associated with the use of digital content. At the most recent SAA annual meeting, archivists (including representatives of a project funded through Cataloging Hidden Special Collections and Archives) discussed the potential of “privacy aware processing” methodologies for efficiently creating access. Just last week, a panel discussion at the Charleston Conference focused on “Privacy in the Digital Age: Publishers, Libraries, and Higher Education.”

As an administrator of a new grant program, CLIR would be in no position to offer any grant applicant or recipient legal advice; however, staff will monitor current standards and practices relating to intellectual property and privacy to help us assess applicants’ approaches to legal issues related to digitization. It will also be important that we establish clear, consistent requirements for our program, at least for intellectual property. These will have to reflect the values expressed in the policy of the program’s potential funder, The Andrew W. Mellon Foundation.

While program guidelines are still in development and subject to the foundation’s approval, our approach will involve a program-specific intellectual property agreement that all recipients would be required to sign as a condition of accepting the grant. In this binding agreement, enforceable by CLIR, the recipient institution would accept full legal responsibility for all project activities. It would require that recipients make all digital content created through their projects available for noncommercial scholarly and educational purposes and that all metadata describing that content be explicitly dedicated to the public domain.

Our requirement of a public domain dedication for project metadata was inspired by the policy adopted last year for the Digital Public Library of America (DPLA), which was itself inspired by action taken by the global digital library leader Europeana in 2012. CLIR’s rationale for following the DPLA’s example of requiring (in DPLA Director Dan Cohen’s words) a “maximally permissive” license to metadata is to promote the widespread and long-term discoverability of digital content, even in cases where access to the content itself is legally restricted in some way. While metadata are essentially statements of fact and so (under U.S. law) not copyrightable, a public domain dedication removes any doubt about the legal status of the descriptions of digitized materials and the purposes for which those descriptions may be used. In the fast-evolving world of the Web, allowing those descriptions to be used and re-used without limitations or additional requirements will help ensure that records of today’s “unhidden” collections persist over time.

That said, allowing metadata to be used for any purpose does not eliminate the need to establish and promote good practices for sharing and using the data created by collecting institutions. The DPLA’s Data Use Best Practices Statement summarizes why proper citation and attribution remain critical to building effective digital libraries. We hope that recipient institutions will find ways to make it easy for those who use their collections to credit sources correctly and to understand and abide by any legal restrictions applicable to the content they digitize.

With these goals in mind, DPLA has been working with global partners to develop a standardized set of machine-actionable rights statements that can be incorporated into metadata; the most recent version of the DPLA’s metadata application profile indicates that future versions of the profile will require use of these standardized rights statements in metadata aggregated into DPLA collections. CLIR will watch this initiative closely and will look to grant applicants and recipients for ways these standards can help individual institutions streamline their own approaches to digitization and metadata creation.
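
To make the distinction between rights in metadata and rights in content concrete, here is a minimal sketch in Python of how a single descriptive record might carry both. It is an illustration under stated assumptions, not a rendering of the DPLA metadata application profile: the field names (`metadata_rights`, `content_rights`), the helper functions, and the example.org rights URI are all hypothetical. Only the CC0 address is a real Creative Commons URI.

```python
import json

# Real URI of the Creative Commons CC0 1.0 public domain dedication.
CC0_URI = "https://creativecommons.org/publicdomain/zero/1.0/"


def build_record(title, content_rights_uri, content_rights_note):
    """Build a descriptive record (hypothetical field names).

    The description itself is dedicated to the public domain, while the
    digitized content carries its own, separately stated rights.
    """
    # A machine-actionable statement needs a resolvable identifier,
    # so free-text rights values are rejected here.
    if not content_rights_uri.startswith(("http://", "https://")):
        raise ValueError("content rights must be given as a URI")
    return {
        "title": title,
        "metadata_rights": CC0_URI,
        "content_rights": {
            "uri": content_rights_uri,
            "note": content_rights_note,
        },
    }


def is_metadata_reusable(record):
    """Return True when the record's description is dedicated via CC0."""
    return record.get("metadata_rights") == CC0_URI


# Placeholder rights URI: example.org is reserved for illustration only.
record = build_record(
    "Letters, 1920-1935",
    "https://example.org/rights/educational-use-only",
    "Available for noncommercial scholarly and educational purposes",
)
print(json.dumps(record, indent=2))
```

Keeping the two statements in separate fields means an aggregator can reuse the description freely even when the content itself remains restricted.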

As we continue to plan for our new program, CLIR is interested to hear from you about resources that you have found helpful in understanding and addressing legal issues arising in your work to digitize rare and unique materials. What guidelines and best practices seem most appropriate to your collections and institutional contexts? What barriers unreasonably inhibit access to collections protected by law, and how might these barriers be surmounted? How might CLIR use its new program as an opportunity to facilitate conversation and promote good practices related to intellectual property and privacy in the context of digitization projects?

We would welcome suggestions of additional resources or responses to any of these questions in the comments below, or privately at




12-02-2014 10:26

This comment addresses risk analysis and perceived copyright concerns. Jean Dryden has published some interesting data about risk analysis of digitization in cultural institutions across both the U.S. and Canada. Her doctoral dissertation, completed at the University of Toronto in 2008, investigated the copyright practices of Canadian archival repositories in making their holdings available on the Internet. She was also the Principal Investigator for a similar U.S. comparative study funded by IMLS that investigated the copyright practices of American archival repositories in digitization projects and their impact on users. She published the results of the U.S. study in the Fall/Winter 2011 issue of American Archivist.
At the 2011 SAA annual conference, Peter Hirtle, Jean Dryden, and I presented papers about risk analysis for digitization and online access. My paper (Risqué Business: Risks and Realities of Digitizing Artists' Papers) was a case study of the Archives of American Art's large-scale digitization initiative begun in 2005. The paper argued that fair use and transformative use are strong defenses, particularly if the institution is digitizing entire series and collections, and that there are other risks to consider in addition to copyright, such as appropriateness, privacy, and third party issues. All in all, the benefits far outweighed the risks.
The Archives of American Art began digitizing entire manuscript collections on a large scale in 2005 with the goal of providing open online access. Because of our long history of microfilming for access, we assumed a calculated high-risk approach from the beginning, relying on fair use. For example, we do not seek permissions from third parties represented in the papers; we digitize portions of published materials, such as news clippings and published content found in scrapbooks (not published books in their entirety); we rely on our processing archivists' appraisal skills to flag content not suitable for scanning, such as PII, banking records, pornography, photographs of nude children, etc. But, for the most part, our goal is to digitize entire series/collections. In other words, we've been implementing the tenets outlined in the Well Intentioned Practice... paper for nearly 10 years now with only positive outcomes. Now, with circa 140 fully digitized archival collections measuring over 1,000 linear feet and represented online by 2 million digital images, we've had only 2 complaints regarding copyright. One complaint resulted in making one digitized photograph available online as a thumbnail only, and the second only required seeking the approval of the creator/copyright holder. We do have a "take down" policy, but have never removed any digital content.
When copyright is known, we provide proper credit in our item-level interface, but no credits are given when we digitize entire collections because we do not create item-level metadata. Rather, we repurpose the EAD finding aid as the only descriptive metadata for the online interface. For archivists concerned with copyright violation, presenting digital content as entire series or entire collections also supports a transformative use argument (in my opinion).
Barbara Aikens
Head of Collections Processing
Archives of American Art, Smithsonian Institution

11-20-2014 18:46

Michael, this is an incredibly helpful comment, which gives a much richer context to the issues at play than we were able to do in the post.
The point we were trying to get across in the post was that we are aware that our grant applicants are grappling with a lot of complex intellectual property challenges in their digitization work, and that we are trying to create a program that makes space for applicants to think creatively about the ways they address those challenges. Digitizing only public domain materials because it is "safer" or "easier" does a disservice to scholarship, and we want our applicants to prioritize what is important from a scholarly perspective.
But there is still a pretty distinct difference between opening the door to digitization of collections still protected by law and creating opportunities for institutions to introduce new restrictions over public domain materials. Everything we have been hearing from our consultants during our research toward program development is in agreement with the Europeana Public Domain Charter and the community leaders you cite, and that is certainly consonant with the spirit in which we have been developing our plans and crafting the IP agreement that applicants would commit to signing.
From a very practical standpoint of program design, the tension is this: when drawing up our guidelines and creating an application we must think very carefully about what we mandate as requirements, what we state as strong recommendations, and what we leave our more knowledgeable applicants and reviewers to negotiate during the review process. We have a choice to make here: do we as CLIR staff accept the administrative burden of assessment and enforcement that would accompany a mandate against these practices, or do we allow applicants to tell us why their proposed practices make sense and serve the public interest and leave it to our reviewers to judge on a case-by-case basis what is reasonable and worthy of funding (creating opportunities for learning and dialogue about these issues in the process)? Both options have advantages and drawbacks. Until we have a firm set of guidelines in place and approved, this is still an unsettled question. I would be very interested in hearing from potential applicants about what their preference would be. Is there genuine utility to setting up a strict eligibility requirement for those working with public domain materials? Is the tide in the community already shifting so strongly that introducing a requirement of this kind is even necessary? We will undoubtedly be looking to our panel as we make firm choices and review our eligibility requirements if we are given an opportunity to move forward.
I am drafting our follow-up post even now and this tension between requirements/mandates and recommendations may actually be an ideal starting point. Many, many thanks.

11-20-2014 14:03

Thank you for this lucid and constructive overview of the rights and access landscape surrounding the planning of Hidden Collections 2.0 ;)
I have a question/concern about this statement:
"[Our intellectual property terms] would require that recipients make all digital content created through their projects available for noncommercial scholarly and educational purposes and that all metadata describing that content be explicitly dedicated to the public domain."

The metadata terms are great (!), but if I'm reading that sentence correctly, it seems that we (CLIR) would allow awardees who are digitizing materials which are public domain in their analog forms (16th century maps, for example) to publish the digitized versions of those materials under a noncommercial license. I hope that this is not so.
From my perspective, the tide is running the other way. Smart, forward-looking organizations like the Yale University museums, the Rijksmuseum, the Getty, the National Gallery of Art (USA), and Europeana have realized that releasing public domain materials under noncommercial licenses impedes progress towards their goals, weakens their reputations and collaborative relationships, harms their financial viability, and that the restrictive, noncommercial license is contrary to the very act of scholarship and learning itself.
To be a leader, and to encourage the broadest and most profound impact in society, CLIR must adopt a stance that echoes Europeana's Public Domain Charter (2010), which states,

Digitization of Public Domain content does not
create new rights over it: works that are in the
Public Domain in analogue form continue to be in
the Public Domain once they have been digitized.

Furthermore, if CLIR allowed awardees to enclose public domain materials in more restrictive licensing structures, it would be encouraging collecting institutions to adopt unprofitable and self-defeating business practices. (See Simon Tanner's seminal study, Reproduction charging models & rights policy for digital images in American art museums, a Mellon Foundation study, 2004.)
...Not to mention the fact that such practices are contrary to the vital purpose of the public domain: to stimulate the creation of new works. As a point-of-reference, 1/3 of respondents to a 2014 survey conducted for the College Art Association's Fair Use project said "that they had avoided or abandoned a project due to actual or perceived inability to obtain permission to use third-party works." (See, page 49)

As a part of the review panel for Hidden Collections over the last several years, and as a current CLIR Presidential Distinguished Fellow, I've been inspired by the organizations who choose to release their digitized collections into the public domain. They are not always large, well-funded institutions (some of them are quite small), but they are always organizations, leaders, and teams who want to create the most good in society from the work they do. I've seen the positive effect that these teams have on scholarship, teaching, learning, and publishing, and on the opinions and policies of their peers.
CLIR is in a position to scale the impact of its investment in the next phase of the Hidden Collections program by insisting that awardees publish their public domain analog resources as public domain digital resources. In this way, CLIR will set the gold standard for the next decade of digitization and public access; it will ensure that the public domain can continue to play its vital role in society; and it will enable thousands or millions of valuable resources to be found, shared, and used by future generations of scholars, learners, and creators.