Guidelines for Preserving New Forms of Scholarship

Introduction

Background

Scholars are making extensive use of new digital technologies to express their research. Publishers, in turn, are working to support increasingly complex publications that are not easily represented in print. Examples include publications with embedded visualizations, multimedia, data, complex interactive features, maps, or annotations, as well as publications that depend on third-party platforms or APIs, such as YouTube or Google Maps. These publications present formidable challenges for long-term preservation.

To study this challenge, a group of digital preservation institutions, libraries, and university presses worked together on Enhancing Services to Preserve New Forms of Scholarship, a project funded by the Andrew W. Mellon Foundation and led by New York University Libraries. Publishing organizations included NYU Press, Michigan Publishing, the University of Minnesota Press, UBC Press, and Stanford University Press. Preservation service organizations included CLOCKSS, Portico, and the libraries of the University of Michigan and NYU. Together, they examined a variety of enhanced ebooks and identified which features can be preserved at scale using currently available tools. Their findings, combined with the knowledge and research of experts in preservation, publishing, and copyright, resulted in this set of guidelines and best practices. These guidelines were organized and authored by Deb Verhoff, Digital Collections Manager, NYU Libraries; Jonathan Greenberg, Digital Scholarly Publishing Specialist, NYU Libraries / NYU Press; and Karen Hanson, Senior Research Developer, Portico, ITHAKA. The principal investigator (PI) for the project is David Millman, Associate Dean for Technology/Chief Information Officer, NYU Division of Libraries.

Intended audience

These recommendations are intended to help publishers create digital publications that are more likely to be preservable. They are meant to be shared with authors, editors, digital production staff, software developers, and those who design and maintain publishing platforms. We hope that publishers and platforms will adapt these guidelines to create versions that take into account local workflows, technologies, and cultures. The guidelines were derived from research performed by professional preservation services that work at scale. They will also aid the wider publishing and preservation communities, from the individual content creator to those who steward digital collections and ensure long-term access.

These guidelines are neither categorical nor prescriptive in nature. They represent a variety of approaches for improving preservability, out of which only a subset may apply to a particular publication. Publishers, authors, platform developers, and preservationists will sometimes need to weigh technological creativity against preservability, and some pathbreaking digital work may always resist preservation. Nonetheless, in order to make these decisions, those involved with digital publishing should understand the implications of complex digital features for digital preservation. As these guidelines demonstrate, many pitfalls are easy to avoid if planning begins early in the publication process.

Terminology

Publication resources: The digital materials that make up the publication. “Resource” does not necessarily refer to a specific file but to the content and form. Some resources may consist of multiple files or be expressed in different formats. The publication itself is a resource. If separate from the body of the publication, each of the supporting items is also a resource: each figure graphic, video clip, piece of software, dataset, 3D visualization, etc. The publisher can work with a preservation service to determine which publication resources are required to represent its core intellectual components, and which files are appropriate to represent each resource.

Core intellectual components: The aspects of a publication that are considered integral to the understanding of it. Rather than pertaining to specific digital resources or renditions of a publication, this is a more abstract sense of the facets of the work. For example: the linear text divided by section headings, the media placed at specific locations within the text, the additional digital resources supplied by the author with descriptive information tied to them. This kind of abstraction is helpful in preservation for communicating requirements and then designing a strategy that covers the important aspects of a work.

Keywords

The following keywords were applied to the guidelines to aid with navigation.

  • embedded resources: Non-text resources such as images, audio, video, visualizations, etc. that appear in a publication.
  • export packages: A package of content created to represent a publication for the purpose of transferring it to a preservation service.
  • EPUB: An electronic publication file format and technical standard originally published by the International Digital Publishing Forum and now maintained by the W3C.
  • planning: Bringing a preservation mindset into pre-production and design phases.
  • publishing platforms: Software used to manage and provide access to one or more publications.
  • rights: Issues pertaining to copyright, terms of use/service, or licensing of publication resources.
  • software and data: In general these refer to raw and executable materials offered in support of the key points of the text.
  • third party dependencies: A component of a publication that is dependent on a service or platform that is outside of the control of the publisher.
  • web based publications: Publications designed primarily to be presented in the form of a website rather than downloaded as a document (as with EPUB or PDF).

Citing or linking to this work

This website was created to publish the latest guidelines and aid in navigation. These guidelines are a work in progress. As with most things on the web, the URLs are likely to change or disappear over time. We have published a full static, print-ready version in NYU's Faculty Digital Archive, which can be referenced for formal citations. Here is a sample citation:

If citing a webpage is preferred, we have made this website web archive friendly. To cite or link to specific pages on this site, one option is to use the Robustify your links service, which archives the page in a public web archive and supplies a link to use for citation.
