Digital objects as manuscripts:
How to select material that is born digital for long-term preservation

Ellis Weinberger[*]
CEDARS Project Officer
Cambridge University Library

Based on a presentation to the Cambridge Libraries Group
15 June 1999

1 Questions

In order to decide whether to archive a digital object, we must consider three things:

Do we want to preserve this digital object?
May we preserve this digital object?
Can we preserve this digital object?

The content of a digital object may be, for example, a monograph, a journal, or a database. The object may be delivered, for example, on CD-ROM or on-line.

2 Do we want this digital object?

Will preserving this digital object help our library to do better what it already does?

2.1 Focus on strength

The digital objects preserved in a library should contribute to the subject strengths of the library. The purpose of the library and the needs of the users of the library must be clearly defined before selection decisions can be made. A particular object might be preserved because it belongs in an existing special collection.

2.2 Take care

The digital objects should be approved by subject specialists who also understand the scale of the costs involved in preserving them. The objects must also be analysed by experts in the field of digital preservation, to determine the best method of preservation. The costs of preserving a digital object are hard to calculate, but one can reasonably assume that they will be high. In general, thinking of digital objects as manuscript collections with perpetual conservation needs will help us understand the challenges involved.

3 May we preserve this digital object?

Is the publisher of this digital object willing to offer support and encouragement in connection with the preservation of this digital object?

3.1 Details of the digital object

In order to preserve a digital object for posterity, we will need detailed information about the structure and content of the files, and about the software used by the digital object. We can gain access to such information only if the publisher agrees to a library preserving the digital object.

3.2 Rights to preserve

When we talk to the publishers of a digital object about preserving it, we need to check whether they have the right to permit us to preserve the software and the content, or whether the rights belong to somebody else. When the publishers received permission to use material in the digital object, they may not have received the right to allow anyone else to use that material. If the publishers are not legally in a position to give us permission to preserve the digital object, we will have to get permission from the owner of the copyright.

4 Can we preserve this digital object?

Since digital objects are not as stable as books, we have to devote greater resources to their care. Because caring for digital objects will be more expensive than caring for books, our policy for choosing digital objects has to be more rigorous than our policy for choosing books.

For example, we can look at the very basic cost of storing 1 gigabyte, which is less than the capacity of two CD-ROMs, for a year. We can assume that the computing service will charge between 25 and 50 pounds sterling a year. Preserving the digital object for posterity will therefore be expensive. We can assume that the cost of storage will fall, but if we are serious about preserving a digital object, we need to budget for at least 20 years at current storage prices.
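The arithmetic behind this budget can be sketched in a few lines of Python. This is only an illustration: the function name and parameters are hypothetical, the rates are the assumed 25 to 50 pounds per gigabyte per year quoted above, and prices are held flat over the whole period, as the text recommends for budgeting. It covers storage alone, not staff, software, or migration.

```python
def storage_cost_range(gigabytes, years, low_rate=25.0, high_rate=50.0):
    """Return (low, high) total storage cost in pounds sterling.

    Hypothetical helper: the rates are pounds per gigabyte per year,
    taken from the assumed computing service charge in the text and
    kept constant for the whole period.
    """
    if gigabytes < 0 or years < 0:
        raise ValueError("gigabytes and years must be non-negative")
    gigabyte_years = gigabytes * years
    return gigabyte_years * low_rate, gigabyte_years * high_rate


# One gigabyte kept for the 20-year minimum suggested in the text.
low, high = storage_cost_range(gigabytes=1, years=20)
print(f"Storage budget: {low:.0f} to {high:.0f} pounds sterling")
```

On these assumptions, a single gigabyte held for 20 years costs between 500 and 1,000 pounds sterling in storage charges alone.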

4.1 Paper and other things

Good vegetable ink, on good acid-free paper, stored in a cool, dry, dark room, will last a thousand years. Vellum will last longer. Ink on paper is very stable.

If, however, we take a digital object stored on any kind of electronic or magnetic medium, put it in a cool, dry, dark room, walk away, and come back in as little as five years, we will probably not be able to use it. The medium may have deteriorated, the hardware needed to read the medium may have disappeared, or the software needed to interpret the information on the medium may have become unavailable. We will probably have to migrate the information to new platforms every five years, forever, in order to ensure preservation.

The libraries thinking about preserving a digital object need to be sure that they have the expertise to preserve it, and the money to pay for the people, the software, and the hardware they will need to do so.

The structure of the digital object will need to be analysed, and the digital object will then need to be migrated to another physical form, which may prove more stable in the medium term, or which may facilitate future migrations.

5 Digital objects as manuscripts

Choosing digital objects for preservation will be easier if we think of them as fragile manuscript collections, and not as books.

5.1 Costs

When we consider accepting a manuscript collection into a library archive, we think about the cost of the collection to the library. The collection will need storage space, and may also need cataloguing, conservation, or special storage facilities. Every time a library is offered material, these issues are considered.

In the same way, the preservation needs of each digital object may be different. Every digital object will tend to be a special case. Each digital object to be preserved will have specific hardware and software needs, which will have to be discovered, confirmed, and paid for.

5.2 Access

When a manuscript collection is accepted into a library, access and copyright terms need to be negotiated. The terms may be different for each collection of manuscripts which the library receives.

In the same way, each digital object will probably have its own set of preservation and access conditions, which will have to be negotiated with the publisher. The publisher may not be able to give the library the legal right to preserve the digital object, so the copyright position of each digital object will need to be checked.

6 Conclusion

As we have seen, preserving digital objects will be expensive. We need to be sure before we start that preserving the digital object will be in the best interests of our library, that we are legally entitled to preserve the digital object, and that we have the technical resources to preserve the digital object.

The Consortium of University Research Libraries Exemplars in Digital Archives project, CEDARS, will provide guidance in best practice regarding storage techniques, metadata elements, collection management policy, and copyright policy, for the preservation of material that is born digital.

With thanks to Miss Patricia Killiard, Dr Mark Nicholls, and Professor Stefan Reif.

