Personal Digital Preservation: How I Manage My Digital Life


I have been taking photos with digital cameras since 2002. I still have a lot of those photos, but on two separate occasions a computer or hard drive failure resulted in the loss or corruption of some images. I have previously discussed my attitude to naming files over the years. Thankfully, my education in both photography and information management improved that significantly, and I recently finished going through my archive, giving everything a meaningful file name and organising it in a meaningful way. But the question remains: how can I mitigate the risk of losing my digital files? Can I do anything to salvage the corrupt photos?

My first instance of a hard drive failure occurred in 2006. I was backing everything up to an external hard drive and occasionally to CD. It had been months since I had backed up to CD so I lost some photos. I wrote a personal blog entry at the time, stating "this enforces the fact that digital photography is not safe and I should take more precautions in the future". In 2007 my PC hard drive failed right after I purchased my first iMac. I was due to have some event photography published in a local magazine but missed the deadline due to the failure. I wrote at the time, "I no longer trust technology. It hates me. It's the second time in two years a hdd has died and i've lost my work."

Since then, my backup system has not been much of a system. Until recently, my archive was spread across multiple hard drives, with at least two copies of everything. I was also using DVDs until mid-2009. At the end of 2016, I decided it was time to bring my archive onto a single backup system so that I had everything in one accessible location. This led to the purchase of a two-bay RAID enclosure, which I set up in a RAID 1 (drive mirroring) configuration using two identical 3TB hard drives. I was still using an iMac as my main computer at the time, so I formatted it as Apple's HFS+.

I recently decided to build myself a computer for the first time, which presented its own challenges (I have nightmares about thermal paste). Switching from OS X to Windows 10 is problematic with an HFS+ formatted external hard drive. There are programs that let Windows computers read and write Apple-formatted hard drives (such as Paragon HFS+ or MacDrive), but that is not ideal. I made the decision to copy the archive onto a dedicated hard drive in my new computer before erasing the external hard drive and reformatting it for Windows, which provided the perfect opportunity to assess and organise my archive. I knew there were corrupted images in my archive but hadn't tried to do anything about them until now.

The photo at the beginning of this post is an example of one of the many corrupted Canon CR2 camera raw files I have in my archive from the time of the second hard drive failure. These images were recovered by a family member at the time, but obviously not every recovery was a success. All of my backup copies appear to have the same corrupted files. I am thankful that I had started shooting in the camera raw file format by then, because most cameras embed a JPEG preview image in a raw file, which gives me some options for recovery. Depending on the camera model and manufacturer, the embedded JPEG may be full resolution or smaller.

So how do you extract a JPEG from a camera raw file? By using a tool I rely on a lot, both personally and professionally: ExifTool by Phil Harvey. ExifTool has a lot of great uses, particularly when it comes to digital photographs and metadata. Using the command line tool, it is possible to extract the JPEG image from the raw file and then copy the metadata from the original file into it (Harvey provides an example under "copying examples" in the ExifTool documentation). Unfortunately for me, the camera model I was using at the time embedded smaller resolution previews, so I will never have the full resolution images again, but a low resolution JPEG is better than nothing!
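
For anyone wanting to try the same thing, here is a rough sketch of how the two ExifTool steps could be scripted from Python. It is adapted from my reading of the ExifTool documentation rather than being a definitive recipe: the function names and the "_preview.jpg" suffix are my own choices for illustration, and it assumes exiftool is installed and on your PATH. With badly corrupted files, either step may fail for individual images.

```python
import shutil
import subprocess

# Hypothetical naming choice: photo.cr2 -> photo_preview.jpg
PREVIEW_SUFFIX = "_preview.jpg"


def extract_previews_command(folder):
    """Build the ExifTool command that writes each CR2's embedded preview to a JPEG."""
    return [
        "exiftool",
        "-b",                            # output the tag value as binary data
        "-PreviewImage",                 # the embedded JPEG preview tag
        "-w", "%d%f" + PREVIEW_SUFFIX,   # same folder, same base name, new suffix
        "-ext", "cr2",                   # only process CR2 files
        str(folder),
    ]


def copy_metadata_command(folder):
    """Build the ExifTool command that copies metadata from each CR2 into its preview JPEG."""
    return [
        "exiftool",
        "-tagsfromfile", "@",                  # read tags from the original CR2
        "-srcfile", "%d%f" + PREVIEW_SUFFIX,   # write them into the matching preview JPEG
        "-overwrite_original",                 # do not keep *_original backup copies
        "-ext", "cr2",
        str(folder),
    ]


def recover_previews(folder):
    """Run both steps in order; requires exiftool to be installed."""
    if shutil.which("exiftool") is None:
        raise RuntimeError("exiftool was not found on PATH")
    for command in (extract_previews_command(folder), copy_metadata_command(folder)):
        # check=False: ExifTool reports an error status if any single file fails,
        # which is likely with corrupted raw files, so keep going regardless.
        subprocess.run(command, check=False)
```

Calling recover_previews on a copy of a folder of CR2 files (never the only copy) should leave a preview JPEG next to each raw file that still has a readable preview.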

Preview JPEG extracted from corrupted Canon CR2 camera raw file.

Once I have finished extracting all of the JPEGs from my corrupted CR2 files, my next step will be to reformat my external RAID hard drive as NTFS for Windows and copy everything back across from my computer. Before I do that, I will make a backup copy of the files currently on my new computer to a regular, portable external hard drive (encrypted with BitLocker). This will become my offsite copy, stored at a relative's house.

Ultimately, I will have three separate devices, each with a copy of all of my digital files. Technically this will be four copies, because the RAID 1 enclosure mirrors its contents across two drives. To combat the issue of file corruption, I plan to use the BagIt specification to store checksums that can be validated on a regular basis. It is not ideal for working files, though, as any change to a file will change its checksum.
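
To illustrate the checksum idea, here is a minimal sketch in Python using only the standard library. It is not a full BagIt implementation (a real bag also has a bagit.txt declaration and keeps its files in a data directory); it simply writes a manifest in the same "checksum, then path" style as BagIt's manifest-sha256.txt and checks files against it later. The function names are my own.

```python
import hashlib
from pathlib import Path

MANIFEST_NAME = "manifest-sha256.txt"


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large raw files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(root):
    """Record a checksum for every file under root; returns the number of files listed."""
    root = Path(root)
    lines = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.name != MANIFEST_NAME:
            relative = path.relative_to(root).as_posix()
            lines.append(f"{sha256_of(path)}  {relative}")
    (root / MANIFEST_NAME).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


def verify_manifest(root):
    """Return a list of (path, problem) pairs for files that are missing or changed."""
    root = Path(root)
    problems = []
    manifest = (root / MANIFEST_NAME).read_text(encoding="utf-8")
    for line in manifest.splitlines():
        if not line.strip():
            continue
        expected, relative = line.split(maxsplit=1)
        target = root / relative
        if not target.is_file():
            problems.append((relative, "missing"))
        elif sha256_of(target) != expected:
            problems.append((relative, "checksum mismatch"))
    return problems
```

Running write_manifest once on a finished folder and verify_manifest every few months would flag any file that has silently changed or disappeared. It also shows why this suits an archive better than working files: any legitimate edit will show up as a mismatch until the manifest is rewritten.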

It is hard to follow digital preservation standards on a personal level because it is not cheap. Good practice includes keeping multiple independent copies that are geographically separated, using different storage technologies, and actively monitoring storage so that any problems are detected and corrected quickly (Digital Preservation Handbook). I did not even bother looking at the costs involved in cloud storage for my almost 3TB digital archive, and I am not sure if there are any consumer equivalents to the digital asset and preservation management systems used by collecting institutions. My experience has taught me a lesson, though: it is important to do the best you can within your means to back up, monitor and organise your digital life. Digital preservation is an ongoing activity. I have previously made the mistake of just putting things on a hard drive, CD or DVD and thinking they were safe. I will not make that same mistake again.