Metadata

Summary/Best Practices

Digital preservation metadata begins with a basic inventory of your digital materials and escalates to generating full PREMIS metadata within a robust digital preservation system. By committing to stewardship of your digital files at the start of your projects and programs, you position yourself for the long-term maintenance of your digital assets, regardless of your current infrastructure. And if you already have a significant backlog of materials to preserve, there are simple things you can do to describe and organize your content in anticipation of a cohesive digital preservation strategy.



Step by Step


Level 0 to Level 1

  • Inventory of content and its storage location
  • Ensure backup and non-collocation of inventory
  1. Create your digital object inventory: conduct a high-level overview of your digital assets (including born-digital content!), capturing the following information (an Excel spreadsheet is just fine at this point):

    • Size of existing collection: number of items and storage requirements
    • Storage location: where is the content stored, and how much redundancy is there (i.e., how many copies)?
    • Format of objects
    • Relationships between objects
    • Anticipated growth rate: existing and new collections
    • Existing metadata (technical, administrative, descriptive, etc.)
    • Copyright/access restrictions
    • Search functionality for objects
    • Vulnerabilities
  2. Where are you storing your inventory? Make sure it is in two places, both within and separate from the content of your collections.
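
As a minimal sketch of the two steps above, the following Python script walks a collection folder, writes a basic inventory as a CSV file, and copies that inventory to a second location. The column names and the function names build_inventory and copy_inventory are illustrative assumptions, not a prescribed schema; fields such as rights, relationships, or vulnerabilities would still need to be filled in by hand.

```python
import csv
import os
import shutil
from pathlib import Path

# Illustrative column names (an assumption); adapt them to your own sheet.
FIELDS = ["path", "size_bytes", "extension"]


def build_inventory(collection_dir, inventory_csv):
    """Walk collection_dir and write one CSV row per file found."""
    collection_dir = Path(collection_dir)
    rows = []
    for root, _dirs, files in os.walk(collection_dir):
        for name in files:
            file_path = Path(root) / name
            rows.append({
                # Path relative to the collection root, with forward slashes
                "path": file_path.relative_to(collection_dir).as_posix(),
                "size_bytes": file_path.stat().st_size,
                # The extension is only a rough stand-in for "format"
                "extension": file_path.suffix.lower().lstrip("."),
            })
    rows.sort(key=lambda row: row["path"])
    with open(inventory_csv, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return rows


def copy_inventory(inventory_csv, backup_dir):
    """Copy the inventory file into a second location and return the new path."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(inventory_csv, backup_dir))
```

Calling build_inventory on your collection folder and then copy_inventory once for a folder inside the collection and once for a separate drive or server would satisfy the "two places" guideline. Summing the size_bytes column gives a first estimate of storage requirements.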

Level 1 to Level 2

  • Store administrative metadata
  • Store transformative metadata and log events

In moving from Level 1 to Level 2, we shift from a retroactive position to a proactive one, capturing data and information at the outset of projects and developing accompanying workflows to do so. Specifically, we seek to tackle administrative metadata. In addition, we start to keep log files for our existing data and note any transformative event (file migration, refreshing of data, etc.) that occurs to our files.
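
One lightweight way to note transformative events is an append-only log with one line per event. The sketch below assumes a JSON Lines file; the function name log_event and the field names are hypothetical and not part of any standard.

```python
import json
from datetime import datetime, timezone


def log_event(log_path, file_name, event_type, note=""):
    """Append one event record to a JSON Lines log and return the record."""
    entry = {
        # UTC timestamp in ISO 8601 form, recorded when the event is logged
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "file": file_name,
        "event": event_type,  # e.g. "migration" or "refresh"
        "note": note,
    }
    # Append mode preserves earlier entries, so the log acts as an audit trail
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")
    return entry
```

Because entries are only ever appended, the file records events in the order they were logged. Like the inventory, it should be kept in more than one place.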

What is administrative metadata?

  • Information about rights and reproduction or other access requirements
  • Selection criteria or archiving policy for digital content
  • Audit trails or logs created by a digital asset management system, including persistent identifiers
  • Administrative metadata may also encompass repository-like information, such as billing information or contractual agreements for deposit of digitized resources into a repository

Metadata can be stored alongside your files as a basic README text file, or as part of your overall inventory tracking sheet. Be sure to maintain multiple copies of your data, and establish methods for determining the “authoritative” copy.
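
A README of this kind can be as simple as one "Field: value" line per piece of administrative metadata. The following is a minimal sketch; the function name write_readme and the default file name README.txt are assumptions, and the fields you record should follow your own policies.

```python
from pathlib import Path


def write_readme(directory, metadata, filename="README.txt"):
    """Write metadata (a dict) as 'Field: value' lines into directory."""
    readme_path = Path(directory) / filename
    lines = [f"{key}: {value}" for key, value in metadata.items()]
    readme_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return readme_path
```

For instance, passing a dictionary with keys such as "Rights", "Selection criteria", and "Authoritative copy" would produce a three-line text file stored next to the content it describes.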


Level 2 to Level 3

  • Store standard technical and descriptive metadata

In moving from Level 2 to Level 3, we adjust our digital production and digital archiving workflows to accommodate the capture of both technical and descriptive metadata. Given that “provenance” takes on a greater significance within the born-digital world, we recommend taking different approaches for a digitization workflow versus a digital archiving workflow.

Within a digital production workflow (converting analog material to digital), we recommend embedding both technical and descriptive metadata within the file header. This will ensure that the file is still “knowable” should it get separated from its original context. Please note that when capturing with a DSLR camera, many technical metadata attributes are automatically recorded in the file header.

Within a digital archiving workflow, we recommend generating README text files within the same directory in which you store the digital data (similar to how you might be storing your administrative metadata). This will ensure that important provenance information is not accidentally overwritten by embedding metadata in the files themselves.

Technical metadata

  • Information about how the file was created and its format, whether it is born-digital or the result of analog-to-digital capture. This can include scanner/camera information, color space/profiles, capture date, operator, and any inhibitors on the file.
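
Some file-level technical metadata can be gathered with nothing more than the Python standard library. The sketch below records name, size, a MIME type guessed from the file extension, and last-modified time; the function name and keys are illustrative assumptions. It does not read embedded header fields such as camera model or color profile, which require a dedicated metadata tool.

```python
import mimetypes
from datetime import datetime, timezone
from pathlib import Path


def basic_technical_metadata(path):
    """Return a small dict of file-level technical metadata for path."""
    path = Path(path)
    stat = path.stat()
    # guess_type looks only at the file name, not at the file contents
    mime_type, _encoding = mimetypes.guess_type(path.name)
    return {
        "file_name": path.name,
        "size_bytes": stat.st_size,
        "mime_type": mime_type or "unknown",
        "last_modified": datetime.fromtimestamp(
            stat.st_mtime, tz=timezone.utc
        ).isoformat(),
    }
```

The resulting dictionary can be written into a README text file or added as columns in the inventory sheet.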

Descriptive metadata  

  • Descriptive metadata enables identification, location, and retrieval of information resources by users, often including the use of controlled vocabularies for classification and indexing, and links to related resources. (DCC citation)

Tools:

Still image metadata:

Audio files:


Level 3 to Level 4

  • Store standard preservation metadata

Moving from Level 3 to Level 4 requires a fully fledged digital preservation system, such as Archivematica, Preservica, or a similar system. Such a system allows for the systematic capture of PREMIS data and "events" to support the long-term readability and accessibility of your files. Even before you have such a system in place, it is useful to understand what PREMIS documents and how your digital objects conform to its data model.

PREMIS Data Model:

  • Provenance: Who has had custody/ownership of the digital object?
  • Authenticity: Is the digital object what it purports to be?
  • Preservation Activity: What has been done to preserve the digital object?
  • Technical Environment: What is needed to render and use the digital object?
  • Rights Management: What intellectual property rights (IPR) must be observed?

PREMIS Objects have:

  • a unique identifier for the object (type and value)
  • fixity information such as a checksum and the algorithm used to derive it
  • the size of the object
  • the format of the object, which can be specified directly or by linking to a format registry
  • the original name of the object
  • information about its creation
  • information about inhibitors (passwords or encryption, for example)
  • information about its significant properties
  • information about its environment (see below)
  • where and on what medium it is stored
  • relationships with other objects and other types of entities
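
To make a few of these properties concrete, the sketch below gathers an identifier, fixity information, size, a rough format, and the original name for a single file. It is an informal Python dictionary under assumed key names, not valid PREMIS XML; the function name describe_object is hypothetical, and the file extension stands in for a proper format identification.

```python
import hashlib
import uuid
from pathlib import Path


def describe_object(path, algorithm="sha256"):
    """Collect a handful of PREMIS-style Object properties for one file."""
    path = Path(path)
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large files need not fit in memory
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return {
        "identifier": {"type": "UUID", "value": str(uuid.uuid4())},
        "fixity": {"algorithm": algorithm, "digest": digest.hexdigest()},
        "size_bytes": path.stat().st_size,
        # Extension only; a format registry lookup would be more reliable
        "format": path.suffix.lower().lstrip(".") or "unknown",
        "original_name": path.name,
    }
```

Recomputing the digest later and comparing it with the stored value is the basic fixity check: a mismatch means the file has changed since it was described.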

PREMIS Events:

  • a unique identifier for the event (type and value)
  • the type of event (creation, ingestion, migration, etc.)
  • the date and time the event occurred
  • a detailed description of the event
  • a coded outcome of the event
  • a more detailed description of the outcome
  • agents involved in the event and their roles
  • objects involved in the event and their roles
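
The Event properties listed above map naturally onto a small record type. The following sketch uses a Python dataclass with assumed field names; it illustrates the shape of the information rather than the PREMIS schema itself, and a preservation system would normally generate such records for you.

```python
import uuid
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass
class PreservationEvent:
    """Informal stand-in for a PREMIS Event (field names are assumptions)."""
    event_type: str            # e.g. "creation", "ingestion", "migration"
    detail: str                # detailed description of the event
    outcome: str               # coded outcome, e.g. "success" or "failure"
    outcome_detail: str = ""   # more detailed description of the outcome
    agents: list = field(default_factory=list)   # e.g. {"name": ..., "role": ...}
    objects: list = field(default_factory=list)  # e.g. {"identifier": ..., "role": ...}
    identifier_type: str = "UUID"
    identifier_value: str = field(default_factory=lambda: str(uuid.uuid4()))
    date_time: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )
```

Calling asdict on an instance yields a plain dictionary that can be serialized to JSON or appended to an event log.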

Preservation Metadata in Context