Russell, Kelly - Cost Elements Of Digital Preservation

Cost elements of digital preservation

http://www.leeds.ac.uk/cedars/documents/CIW01r.html

Cost elements of digital preservation
Kelly Russell and Ellis Weinberger
Draft of 31 May 2000

1 Introduction

Although not a great deal is known about the costs of preserving complex digital objects over time, there is an accepted or perceived wisdom within the library community that it will be more expensive and more intensive than preservation of traditional library materials. Although it may be too early to make meaningful comparisons of the costs of digital vs traditional preservation, one thing is certain: the costs of preserving digital materials will be different from those for other materials and will require resource commitments of a different nature on an ongoing basis. The ongoing costs of digital preservation are also likely to span a more extended timeframe than traditional preservation, and different technical strategies may well prescribe quite different costing timeframes and schedules.

This document will attempt to identify some of the main cost elements that libraries can expect to encounter when considering digital preservation as part of their ongoing collection management function. It is divided into two parts: part one provides an introduction and overview of some of the general issues associated with the costs of digital preservation, and part two provides a detailed breakdown of specific cost elements.

This paper makes use of a number of quite specific terms, many of which are based on the Open Archival Information System (OAIS) reference model. Some of these terms are defined in Annex A. For more detailed discussion of OAIS and its terminology please refer to the OAIS reference manual.

1.1 A Timeframe for Digital Preservation

The costs of preservation always represent an ongoing commitment, whether for digital or traditional materials. However, there is a growing realisation that the time between an object’s "creation" and its preservation is shrinking rapidly for digital materials. Preservation will increasingly need to be addressed at the time of acquisition or even creation of the digital resource.
For these new digital materials it is not yet clear what commitment over the long term will mean for libraries. In part this will depend on the archiving model in which the preservation occurs and how responsibility is allocated, the technical strategy chosen for preservation and the type of access required. Regardless of these variations, digital preservation will require ongoing resources.

It is important to recognise that different technical strategies for preservation and for access have different cost timeframes. For example, if an archive adopts a migration strategy which moves the digital object into current software, action (and therefore resources) will be necessary each time a software upgrade occurs. By comparison, if a migration strategy is adopted in which materials are migrated into standard formats on ingest into the archive, then action (and therefore resources) to migrate that object will be required less frequently.

1.2 The lifecycle of a digital resource

Unlike more traditional library materials, digital resources represent a continuum. Where a book is published, put on a shelf to be accessed and preserved only when the object begins to deteriorate, digital materials once created require some sort of ongoing "re-creation" (migration, refreshing onto new media etc.) in order to ensure access is preserved. For digital materials the link between creation and preservation is much more important because decisions about the way a digital object is created influence how (or indeed whether) it can be preserved. Likewise, decisions taken at the time of preservation can affect how (or indeed whether) the material can be accessed in future. Therefore the "costs" of preservation start at creation of the resource. In this sense the creation of a digital object is the true starting point for digital preservation.
For libraries involved in digitisation projects this means preservation of the digital files must be considered when the project begins, and it must be budgeted for! Many other digital resources are created outside the library, however, and work with publishers and other content creators will be critical to encourage the adoption of appropriate standards and technologies which will help rather than hinder preservation.

1.3 Cost/Benefit

The costs of preserving digital materials need to be considered in light of the relative benefits. Digital preservation will inevitably be about trade-offs. As with investments of any kind, what you put in tends to be reflected in what you get out. Decisions to save money could compromise the completeness of the preservation. However, incurring enormous costs to preserve a complex digital object to which no one requests access is also undesirable. This suggests a preservation strategy which is appropriate to the perceived value of the digital object. However, the long-term value of digital materials can be difficult to determine, particularly when rapidly changing technology requires decisions about long-term value before this has a chance to be proved through a period of use! Analysing the benefits of preservation is inextricably linked with policies for selection of materials for archiving.

1.4 Selection of Material for Digital Preservation

In considering the selection of materials it is important to consider both existing collection management policies within the institution, including an object’s suitability as part of the collection, and technical considerations to do with the specific digital object and its requirements for continuing access. These are discussed separately below, but it should be clear that they must be considered together.

1.4.1 Collection Management Policy Issues

Preservation is part of a suite of activities associated with collection management including selection, organisation and access.
As such, for all materials these functions affect one another. However, for digital materials, as has been suggested above, creation/acquisition and preservation are inextricably linked, and decisions about preserving materials for the long term should reflect selection policy for the collection as a whole. If, for example, a library maintains a selection policy which describes areas of specialisation for the collection as a whole, it is material in these areas which should be considered for preservation, whether digital or not. However, where no formal selection policy exists for the whole collection, a good place to start for some types of objects is at the point of selection for digitisation. A great deal of work is currently ongoing on selection criteria for digitisation which might be of use when considering the long-term value of all types of digital content.

1.4.2 Technical Considerations

Preservation of digital material necessarily involves consideration of technical issues and of what will be necessary to render the object from bits and bytes into a meaningful digital object. However, there are a number of levels for consideration between a digital object’s bits and bytes and the functionality and properties that make the digital object what it is to a user. For digital materials, simply maintaining a bytestream does not necessarily ensure the digital material will be preserved at a level acceptable to the archive and its users. "Access" can be at a variety of levels for digital materials, ranging from access to the full range of functionality and content to simple access to the ‘bare bones’ intellectual content. The level at which a digital resource is archived and maintained will depend on value judgements made by the archivist. In the Cedars
project this is called assessing a digital object’s "significant properties". Determining the significant properties of a digital object will dictate the amount of information or "metadata" (including detailed technical metadata called "representation information") that must be stored alongside the bytestream to ensure the object is accessible to that level. A digital object’s significant properties are not assumed to be empirical; archives will make judgements at levels appropriate to fulfil their preservation responsibilities and meet the needs of the archive’s user communities. For example, in some cases archives will need to ensure exact replication of a digital object for legal purposes. This requires preservation of the object’s full functionality and will have significant associated costs. These costs need to be weighed against the desirability or necessity of preserving the object.

For digital materials the preservation of complex functionality may prove considerably more costly than preservation of the basic intellectual content. In general, the more complex the digital object, the more involved (and resource intensive) the digital preservation. The question that must be asked is whether the object’s perceived long-term value is worth the expense of preserving the ‘bells and whistles’. One way of reducing the costs of preservation is to encourage the use of standard or system-independent file formats either on creation (preferable) or on migration. Material created in this type of environment will require less preservation action to ensure access is maintained over time.

1.5 Collaborative Approaches

It is unlikely that in the UK (or elsewhere) libraries will be able to rely on duplication of effort to ensure the preservation of digital materials. The level of commitment, resource and expertise required to archive digital material will mean co-ordination across the library sector will be critical.
As with all things, there are economies of scale associated with a collaborative approach. Cooperative collection management at the point of selection, acquisition and access to traditional library collections is already taking place across existing consortia and proving very effective. For digital preservation, collaboration may be carried out at different stages or in relation to various aspects of digital preservation, and this might significantly reduce the costs for a single organisation. For example, collaboration might take place for selection of materials, for copyright negotiations or for administration of the archives. There are a variety of options to be explored. However, it should be noted that the factors influencing cost may differ depending on whether the collaboration is occurring regionally, nationally or internationally.

2 Cost Elements for Digital Preservation

The cost of preserving an object will depend on many factors. As suggested above there will be a multitude of other considerations which will affect how these costs are made manifest, and it may be that any decision-making based on the following elements is best expressed as a matrix. The elements have been listed according to the order in which they will tend to occur within the collection manager’s workflow.

The elements below are those which are most closely associated with preservation of a digital object and ensuring the object remains accessible over the long term. However, it should be recognised that for digital materials it is not always easy (or even possible) to separate costs of preservation from costs of access. It may be that an institution’s investment in technical infrastructure for providing access to digital materials also supports a preservation function, and in this sense the cost is shared across both preservation and access. Likewise, costs for providing resource discovery and delivery of materials from the archive may also vary depending on the extent to which the archive is integrated into existing collection management functions where access arrangements are shared across a range of collections. While acknowledging that it is not always possible in practice to distinguish between preservation costs and costs for providing access, this list of cost elements attempts to focus on preservation activities specifically.

1. Selecting a particular digital object for preservation. It is likely that there will be two different representative groups involved in the selection of material for long-term preservation. These are Collection Managers (e.g. archivists, subject specialists) and Systems Managers, who will need to act in consultation with one another on issues relating to the long-term retention of digital materials. The collection manager can provide advice about usage or about the relative value of the object to the overall collection. The systems manager can discuss the cost of specific technical issues such as required conversion, migration or even emulation, as well as the necessary technical metadata or representation information. Selection decisions may be based on existing policy documents or, in some cases, taken on an object by object or collection by collection basis. More time will be required if there is no existing policy for selection. There may be collaborative agreements (across consortia) about preservation responsibility which, in time, may make this less time-consuming (and therefore less costly), but which can be very costly at the outset. As mentioned above, where possible this activity should reflect the library’s collection management policies (i.e. preservation decisions may be made on acquisition of the material). There are a number of selection policies for digital materials available, such as the National Library of Australia’s Guidelines for the Selection of Online Australian Publications Intended for Preservation and those of the Berkeley Digital Library Sunsite.

2. Negotiating the right to preserve the object. This will include the time of the negotiator and the time of the person drafting and exchanging the agreements. This may also include detailed consideration of the object to assess all the relevant rights holders including rights holders of software and underlying technologies. It may be that in the case of some materials, the publisher does not own the rights for the underlying technology and this will require separate negotiations. Based on the Cedars Project experience, this is likely to be a lengthy process.

3. Negotiating the right to provide access to the preserved object. Time will also be required for negotiating access arrangements for materials stored in the archive if end-users are to have short-term or near-term access to archived materials. This may not apply to all archives. Like negotiations for preservation, negotiations for access may require considerable time and expertise.

4. Determining the appropriate technical strategy for preservation and continuing access. This will include the time taken to ensure the digital object is adequately prepared for archiving as well as the resources for agreeing on a specific preservation strategy for continuing access (e.g. migration or emulation). This will require detailed consideration of the digital object to determine its Significant Properties and, based on this decision, determining the underlying technical requirements for preservation. This element may include the cost of purchase or design of any software or hardware needed to prepare an object for archiving. Resources will also be required to determine the best technical strategy for providing continuing access to material in the archive (i.e. migration or emulation). This will be determined by agreement on an object’s significant properties. For example, an object
which requires preservation of its "look and feel" may require the development or enhancement of emulation tools.

5. Validating the completeness of the object on delivery to the archive. This will include the time taken to obtain any necessary documentation and the time spent checking the object received against documentation received relating to the object. For many digital objects this may require significant human resource.
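As a rough illustration of checking a received object against its accompanying documentation, the sketch below compares the delivered files with a manifest. It assumes, purely for illustration, that the documentation takes the form of a mapping from relative file path to an expected SHA-256 digest; the paper does not prescribe any such format, and the names `validate_delivery` and `sha256_of` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate_delivery(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the files under `root` with a manifest.

    `manifest` maps relative path -> expected SHA-256 digest
    (an assumed format, not one defined by the paper).
    """
    delivered = {
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    }
    report: dict[str, list[str]] = {"missing": [], "unexpected": [], "corrupt": []}
    for name, expected in sorted(manifest.items()):
        if name not in delivered:
            report["missing"].append(name)        # documented but not received
        elif sha256_of(root / name) != expected:
            report["corrupt"].append(name)        # received but content differs
    # Files that arrived without being mentioned in the documentation.
    report["unexpected"] = sorted(delivered - set(manifest))
    return report
```

An automated check of this kind covers only file-level completeness. Judging whether the documentation itself is adequate remains the human task described in this element.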

6. Producing Metadata. This will include study of the documentation provided with the object and/or an inspection of the item itself and will draw upon information gathered when determining the technical strategy (element 4). Depending on how (or whether) the archiving function is integrated with existing collection management activities, some metadata may be collected or incorporated from existing cataloguing or other metadata records. Development of appropriate representation information or detailed technical metadata should also be represented in the preservation metadata and will require specific technical expertise. Metadata costs will also need to accommodate the gathering of rights management information. See elements 2 and 3 above on the right to preserve and provide access to an object.

7. Storing files. This will include maintenance and purchase of hardware, software, and transfer of files from generation to generation of storage media as well as the periodic inspection of stored files and of the storage media itself. The creation of backup copies etc will also be included in this element.

8. Administering the archive. This will include the costs involved in following the developments in technology and law which will make a difference to preservation of the object, and updating the archive. It may also include the costs of changing the archive system in accordance with changes in archive policy. It should also include staff costs (salaries, overheads, training/retraining/skills upgrading), insurance, building overheads (heat/light/air/security protection), certification/compliance etc.
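The suggestion that decisions based on these elements might be expressed as a matrix, together with the point in section 1.1 that different strategies carry different cost timeframes, can be sketched roughly as follows. Everything here is an assumption made for illustration: the `CostElement` structure, the split into a one-off and a recurring cost, and above all the figures, which are arbitrary units and not estimates from the Cedars project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostElement:
    name: str
    one_off: float       # cost incurred once, at the outset
    recurring: float     # cost of each repeated action
    interval_years: int  # years between repeated actions; 0 means never repeated


def total_cost(element: CostElement, horizon_years: int) -> float:
    """One-off cost plus one recurring charge per full interval in the horizon."""
    if element.interval_years <= 0:
        return element.one_off
    repeats = horizon_years // element.interval_years
    return element.one_off + repeats * element.recurring


def cost_matrix(elements: list[CostElement], horizon_years: int) -> dict[str, float]:
    """Total cost per element over the chosen horizon."""
    return {e.name: total_cost(e, horizon_years) for e in elements}


# Hypothetical figures in arbitrary units, for illustration only.
strategies = [
    # Migration into current software: action at every software upgrade.
    CostElement("migrate at each software upgrade", 0.0, 10.0, 2),
    # Migration into standard formats on ingest: dearer up front, rarer later.
    CostElement("normalise on ingest", 30.0, 10.0, 10),
]
print(cost_matrix(strategies, 20))
# {'migrate at each software upgrade': 100.0, 'normalise on ingest': 50.0}
```

The same structure could hold one row per cost element (selection, rights negotiation, storage and so on), which makes it visible which costs fall at the outset and which recur throughout the life of the archive.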

Annex A Definitions of Terms

Action: any activity associated with preservation requiring resources.

Resources: funding commitment, either in the form of direct payment or of human time and expertise.

Collection Manager: used broadly to mean librarian, archivist, subject specialist etc.

Systems Manager: used broadly to mean technical specialist.

Preservation strategy: a digital preservation strategy is a particular technical approach to the preservation of digital materials. Broadly speaking there are three main technical approaches to preserving digital materials: technology preservation, technology emulation and data migration. The first two focus on the technology itself. In each of these, it is understood that, in order to preserve the functionality of any digital resource, action must be taken to preserve the technical environment which originally created and ran it. Data migration strategies focus on the need to maintain the digital files in a format which is accessible using "current technology" and require regular migration from one technical environment to a newer one. The appropriateness of a digital preservation strategy for an object will be determined by agreement on the object’s "significant properties".

Significant Properties: those technical characteristics agreed by the archive or by the collection manager to be most important for preserving the digital object over time. For digital materials, simply maintaining a bytestream does not necessarily ensure the digital material will be preserved at a level acceptable to the archive and its users. A digital object’s significant properties are not assumed to be empirical; archives will make judgements at levels appropriate to fulfil their preservation responsibilities and meet the needs of the archive’s user communities. For Cedars, it is the creation and maintenance of the detailed metadata associated with the object’s significant properties which is the backbone of an archive’s preservation function.
Significant Properties: A simple example. If an archive takes deposit of a PDF electronic journal and decides that the significant properties are only the text within the journal, there may be no need to store information about the PDF environment but only to include information about retrieving (or rendering) an ASCII text file. These are decisions that must be made by the collection manager or archivist (often in consultation with technicians over what is possible and the associated costs).

Significant Properties: A more complex example. An electronic journal is published via the web as HTML. The "significant properties" are deemed to include the internal hypertext links as well as the multimedia functions (e.g. sound and video clips). It is at this full level of functionality that preservation will occur. Although end-users currently access the journal in HTML, these pages are created on the fly from SGML. For archive purposes the archive takes the SGML files. Therefore the information (or representation network) which is developed includes technical descriptions of the objects, including information about the systems and the software necessary to run the video and sound, as well as less complex information about retrieving the text and images.

Metadata for Preservation

The effective use of digital resources in an archive will rely on a robust system of resource description for the purposes of resource discovery, managing access and ensuring preservation of the resources. Metadata research continues to generate interest world-wide; to date, most activity has focused on metadata for resource discovery. However, there is increasing awareness that effective digital archives will depend on the creation and storage of relevant descriptive information (metadata) required to support a chosen preservation strategy (i.e. migration, emulation or technology preservation). This information will need to describe the data in detail, including file format and software and hardware platforms. It may also contain information about rights management and access control. Specifically, preservation metadata will take two forms:
  • descriptive information, which includes general resource description as well as rights management information and descriptions of actions taken for the purposes of preservation
  • representation information, which maps the stored data into more meaningful concepts, i.e. systems information which renders simple bits and bytes into a meaningful digital object; for example, the ASCII definition which maps data (bits) into readable symbols.
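The ASCII case mentioned above can be made concrete with a short sketch. It assumes the stored bits are grouped into 8-bit bytes, each holding one ASCII code; the function name `bits_to_ascii` is hypothetical.

```python
def bits_to_ascii(bits: str) -> str:
    """Interpret a string of '0'/'1' characters as a sequence of 8-bit ASCII codes."""
    if len(bits) % 8 != 0:
        raise ValueError("expected a whole number of 8-bit bytes")
    # Each group of eight bits is one code point in the ASCII table.
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("ascii")


# The bare bytestream, written out bit by bit.
stored = "".join([
    "01000011", "01100101", "01100100",
    "01100001", "01110010", "01110011",
])
print(bits_to_ascii(stored))  # prints "Cedars"
```

Without the knowledge that the bits are to be read as 8-bit ASCII codes, the stored string is only a sequence of zeros and ones. That knowledge is the representation information.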

The Open Archival Information System Reference Model

The Open Archival Information System (OAIS) Reference Model has been developed by the Consultative Committee for Space Data Systems (CCSDS) to provide a conceptual framework and reference tool for defining a digital archive. It describes a specific functional model of both people and systems requirements for implementing a digital archive. The reference model could also be applied to a non-digital archive. The OAIS is undergoing the ISO process and its publication as a standard is expected later this year; the Cedars project has provided a demonstrator project based on it. The importance of OAIS to the archiving community is undeniable, but its usefulness to research libraries and archives is largely unexplored. The NEDLIB project is also implementing the OAIS model within the context of the deposit of electronic materials for archiving.

10/4/2007 16:56
