The How We Digitize series has been looking at the process of creating files for UBC Digital Collections. But how do we store them all for the long term?
Well, that is a BIG issue!
In general, for each item you see displayed in Digital Collections we have one unedited master file and one edited file. These images typically range in size from about 20 MB to 200 MB each. For some maps imaged on the Contex Scanner we have single files as big as 1.5 GB and some video files from the Westland collection are larger than 20 GB. This adds up quickly as we are constantly adding new material. We currently have at least 69 TB of data on our network storage!
Our network storage drives are great for everyday use. They are fast and easy to use, with automatic backup snapshots and geographically distributed redundancy built in. However, UBC Libraries has been working on building a Trusted Digital Repository. Progress towards that goal can be measured by using the Trustworthy Repositories Audit & Certification (TRAC) checklist. TRAC takes a holistic look at the organization and the requirements to responsibly steward digital information into the future. Is there trained staff and sustainable funding? Are preservation policies in place? Does the organization have the technology and infrastructure necessary to properly store the digital objects?
These are HARD questions! But, as the TRAC assessment points out, guaranteeing the long-term survival of our materials requires more than just big storage drives: we really need a Digital Preservation strategy.
With so much electronic information in our world today, it's hard to realize that digital is fragile. If you drop a print book in a dry spot, it could survive on its own for centuries. Not so with an ebook! Digital files are completely dependent on technology to make them usable, which means they face many challenges.
In the high-tech world of rapid change, the most obvious challenge is obsolescence. The software necessary to open the file or the hardware necessary to read the storage media may disappear. Just think, could you get your old WordPerfect documents off the stack of 3½-inch floppies you used in 1990? Probably not. The hardware to read the floppies and the software to open the files are already very rare, even if the bytes survived intact without errors!
Which brings us to media degradation. The physical materials our digital files are stored on decay or become damaged. Unlike printed paper, no digital storage media are truly stable. The electric or magnetic charges storing the data on the media may slowly be lost. Furthermore, items such as hard drives depend on moving parts and complicated circuitry that have expected life spans of less than a decade. Even optical media decay as the reflective layer breaks down. If your photo CD is scratched, will you be able to recover any of the images?
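To make the idea of damaged bytes a little more concrete, here is a minimal Python sketch of a fixity check, a common way to notice when a stored file no longer matches what was originally saved. It is an illustration only, not a description of our own systems: the function names sha256_of and is_unchanged and the one-megabyte chunk size are assumptions made up for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks
    so that very large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: Path, recorded_digest: str) -> bool:
    """Return True if the file still matches a digest recorded earlier."""
    return sha256_of(path) == recorded_digest
```

The digest would be recorded when a file is first stored and recomputed later. A mismatch means the bytes have changed, but it cannot say why, and it cannot repair the file. That still takes a good second copy.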
Why do we do digital preservation? Because we NEED to—digital files require active and continuous management to remain usable. If we want to preserve our Digital Collections for future users, we need to take positive steps to steward our data assets.
You can learn more about what we are doing in the next post!