I attended the 16th International Conference on Digital Preservation (iPRES 2019) as part of my research travel project funded by the Gordon Darling Foundation. My previous post discussed the tutorial and workshop I attended on day one. Here I will discuss a few selected highlights from the conference.

Towards a Universal Virtual Interactor (UVI) for Digital Objects

Euan Cochrane (Yale University Library), Klaus Rechert (OpenSLX GmbH), Ethan Gates (Yale University Library)

Photograph by Sebastiaan ter Burg, CC BY 4.0 / flickr
Cochrane discussed the Universal Virtual Interactor (UVI) that forms part of the Emulation-as-a-Service Infrastructure (EaaSI) program and how it builds on many years of preceding work. This includes the Universal Virtual Computer design concept from IBM, the Dioscuri emulator design from the National Library of the Netherlands, the Keeping Emulation Environments Portable (KEEP) project and the Baden-Württemberg Functional Long-Term Archiving (bwFLA) project, which developed the suite of tools referred to as Emulation as a Service (EaaS).

UVI was described as a two-part process: (1) a file is provided to a program interface for analysis, which suggests pre-configured emulators that can (2) interact with or render it. Analysis is based on format identification (using Siegfried and DROID), dates (e.g. created, modified) and any further metadata provided or available. Cochrane used a Works Word Processor file, which does not render accurately in modern Microsoft Word, to demonstrate how the UVI uses a script to automatically open the file in the appropriate software once the emulated environment opens in a web browser.
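To make the two-part idea concrete, here is a minimal Python sketch of how a suggestion step could work. It is not how EaaSI implements the UVI. The Environment class, the format labels, the years and the "closest date first" ranking are all assumptions I have made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Environment:
    """A hypothetical pre-configured emulated environment."""
    name: str
    software_year: int        # release year of the installed software
    opens_formats: frozenset  # format labels the software can open


def suggest_environments(format_label, modified_year, environments):
    """Return environments that can open the format, closest in date first."""
    candidates = [e for e in environments if format_label in e.opens_formats]
    return sorted(candidates, key=lambda e: abs(e.software_year - modified_year))


# Illustrative data only: names, labels and years are made up.
ENVIRONMENTS = [
    Environment("env-a", 1995, frozenset({"works-wp", "plain-text"})),
    Environment("env-b", 2003, frozenset({"works-wp"})),
    Environment("env-c", 2010, frozenset({"plain-text"})),
]

# Step 1 (identification and dates) is assumed to have produced these values.
suggestions = suggest_environments("works-wp", 1996, ENVIRONMENTS)
print([e.name for e in suggestions])
```

In a real workflow the format label and date would come from an identification tool and the file's metadata rather than being typed in by hand.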

Cochrane highlighted how documentation and discovery are an important part of the project, where they are documenting various aspects of software such as what file formats it can import, save to and export. As I mentioned in my IDCC19 reflection discussing EaaSI, copyright law is a challenge for a global emulation network because it spans multiple legal jurisdictions, but they are using the fair-use rights available in the United States of America to facilitate the sharing of environments across the current network.


Digital Preservation and Enterprise Architecture Collaboration at the University of Melbourne: A Meeting of Mindsets 

Jaye Weatherburn (University of Melbourne), Lyle Winton (University of Melbourne) and Sean Turner (University of Melbourne)
Photograph by Matthew Burgess
Weatherburn, Winton and Turner provided a great example of working together to achieve a common goal. Their presentation style was engaging and I enjoyed their 'real talk', where they sat down for a question and answer session about a meeting of mindsets between digital preservation and enterprise architecture. They highlighted how collaboration has been a driver for greater visibility and understanding of digital preservation across the University of Melbourne, which is now on their Enterprise Architecture Roadmap as an important socio-technical ecosystem. Turner discussed key information that is useful for IT, where a focus on standards and models such as OAIS is useful because it provides a framework and terminology to understand the challenge of digital preservation. Concrete understanding of digital preservation was the starting point, with an awareness of terms like 'born-digital collections' and their importance.

Their talk highlighted the importance of dedicated, permanent roles (rather than a project approach) with the ability to establish person-to-person relationships and share mindsets. Enterprise architecture aims to reduce complexity and cost through standardisation, with a strong focus on cost and effectiveness, and digital preservation is now seen as a key component of this. Their collaborative approach has resulted in more understanding across the organisation that digital preservation is more than technology, and also requires ongoing work around resourcing, policy, process and governance.


Cloud Atlas: Navigating the Cloud for Digital Preservation 

Andrea Goethals (National Library of New Zealand), Jefferson Bailey (Internet Archive), Roslynn Ross (Library and Archives Canada) and Nicholas Taylor (Stanford Libraries)

This panel discussion offered contrasting institutional perspectives on the potential or the perils of the cloud for digital preservation, featuring case studies on how memory institutions can leverage the cloud in deliberate and mission-supporting ways, and how some are working to build alternative, community-based infrastructures.

Roslynn Ross discussed questions to ask when setting course for cloud storage, including how will we move our collection to the cloud? How will we preserve integrity? How will we deal with privacy and copyright? How will we ensure security? What is our exit strategy? She said that cloud providers guarantee they will get your data back, but not in what format, so you need to be clear on that. She spoke about how Library and Archives Canada is taking an iterative approach by choosing a pilot project/collection to begin with.

Nicholas Taylor said that "The Cloud" is playing a growing role in digital preservation but which "The Cloud" we use, and how we use it, matters both for our missions and the likely success of our efforts. He discussed threats to digital information and the difference between commercial and community cloud, highlighting pilot models and values-aligned partnerships to build private clouds. He said that our values as memory institutions suggest that there are questions we should be asking of large, for-profit cloud service providers and asked whether we can claim to have custody and intellectual control over content stored in a commercial cloud. He also highlighted opaque data integrity with commercial providers, where we must trust that the service is performing fixity checks and it may be prohibitively expensive to retrieve content to perform hashing.
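For readers less familiar with fixity checking, the Python sketch below shows the basic idea: compute a checksum of a file and compare it with one recorded earlier. This is a generic illustration, not something shown in the panel. The function names and the choice of SHA-256 are my own assumptions. The check needs to read every byte of the file, which is why retrieving content from a commercial provider just to hash it can become expensive.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute a SHA-256 checksum, reading the file in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path, recorded_checksum):
    """Return True if the file still matches the checksum recorded earlier."""
    return sha256_of(path) == recorded_checksum.lower()


# Small demonstration using a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"digital preservation")
    recorded = sha256_of(sample)           # checksum recorded at ingest
    unchanged = fixity_ok(sample, recorded)

    sample.write_bytes(b"digital preservatiom")  # simulate silent corruption
    change_detected = not fixity_ok(sample, recorded)

print(unchanged, change_detected)
```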

Jefferson Bailey spoke about how the Internet Archive runs their own data centers, how their archiving service supports the archive itself and that they do not monetise input and output. He said that only 20% of users download their data, which highlights that they trust the Internet Archive as a storage provider.

During discussions, concerns were raised about the cost of storage and retrieval in the cloud as collections grow. The Internet Archive was used as an example of the financial benefit of working on premises rather than outsourcing when working at scale, where it would cost three or four years of the Internet Archive budget to retrieve their entire storage/data from a certain commercial provider. There was conversation on keeping a local copy, where the cloud copy could be seen as redundancy. Panelists were asked for their thoughts on the long-term reliability of the cloud, where Ross said you really need to start with your exit strategy in mind.


The Integrated Preservation Suite: Scaled and automated preservation planning for highly diverse digital collections

Peter May (British Library), Maureen Pennock (British Library), David A. Russo (British Library)

Photograph by Sebastiaan ter Burg, CC BY 4.0 / flickr

Peter May discussed the Integrated Preservation Suite (IPS) at the British Library, which aims to enhance preservation planning capability through automation and documentation. He highlighted how digital preservation knowledge is generated at the Library through collection profiles (documentation to understand what collections are about, their preservation intent), help desk (where colleagues from across the Library can send requests, whether it is about rendering issues in the reading room, or curators dealing with a new digital acquisition), projects (such as Emerging Formats) and file format assessments (detailed understanding on particular formats and their preservation risks).

May described IPS as a suite of tools and services providing a way to manage knowledge and facilitate access to it in an automated way. He said that the knowledge base underpins a lot of activity and was developed using their own data model. They pull information from various sources, where their own knowledge is treated as a separate data source. To avoid duplication of information across multiple sources, information is first held in a staging area where a user compares existing data and determines whether to keep/discard/merge new information. While this is currently a manual process, they aim to automate this in the future. He mentioned that the National Library of Australia provided them with a spreadsheet documenting links between file formats, software and hardware which expanded their knowledge base to help determine what file formats software can create, render, validate and extract metadata from.
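The staging step can be pictured with a short Python sketch. This is only my own illustration of a keep/discard/merge decision, not the British Library's data model or code. The record layout, the apply_decision function and the rule that existing values win during a merge are all assumptions.

```python
def apply_decision(knowledge_base, key, staged, decision):
    """Apply a reviewer's decision about a staged record.

    Returns a new knowledge base and leaves the original untouched.
    """
    updated = dict(knowledge_base)
    if decision == "keep":
        # Accept the staged record, replacing anything already held.
        updated[key] = dict(staged)
    elif decision == "merge":
        # Assumption: existing values take precedence, staged values fill gaps.
        merged = dict(staged)
        merged.update(updated.get(key, {}))
        updated[key] = merged
    elif decision != "discard":
        raise ValueError(f"unknown decision: {decision!r}")
    return updated


# Illustrative records only: the format and tool names are made up.
knowledge_base = {"format-x": {"renders_in": "tool-a"}}
staged_record = {"renders_in": "tool-b", "validated_by": "tool-c"}

merged_base = apply_decision(knowledge_base, "format-x", staged_record, "merge")
discarded_base = apply_decision(knowledge_base, "format-x", staged_record, "discard")
print(merged_base["format-x"])
```

Automating this step, as May described, would mean replacing the reviewer's decision with rules about which source to trust for which kind of information.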

Discussion at the end of this session highlighted the potential community benefits of this project, where it could possibly tie in well with the Preservation Action Registry and also be helpful to others if the knowledge base was publicly available. May commented that it has been thought about, but there are no definitive plans on how they will do that at this point.

I was really interested in the concept of collection profiles and preservation plans, particularly when it comes to complex collections. I later discussed with May whether they intend to create a profile and/or plan for each individual collection, where we agreed it would not be necessary for all incoming collections given the analogous nature of some. I can see this in my current work, where it would be useful to create an overall collection profile and preservation plan for all incoming born-digital photographs for example, but look at more specific plans for incoming born-digital manuscript collections that might be more heterogeneous in scope and file formats.


The Australasia Preserves Story: Building a digital preservation community of practice in the Australasian region

Jaye Weatherburn (University of Melbourne)

Photograph by Sebastiaan ter Burg, CC BY 4.0 / flickr
Australasia Preserves' presence at iPRES was evident with Weatherburn's poster presentation on building capacity through collaboration. The poster aimed to highlight the growth of the digital preservation community of practice and generate discussion and input on how to build on the initiative. I had conveniently printed a bunch of flyers before I left Australia and decided to stand in solidarity with Jaye and help prompt discussion, answer questions and promote the community. The flyers proved popular and there were a few questions on whether we had any stickers - something to consider for next time!


You can find all of the papers I have discussed, and much more, on the iPRES2019 website (https://ipres2019.org/program/conference-programme/) and in the collaborative notes on Google Drive.

Cover image: Taken during the canal cruise on the way to the conference dinner, 18 September 2019. Photograph by Matthew Burgess