October 10 is Electronic Records Day, and October is American Archives Month! We’re celebrating the work of archivists and the importance of archives with a series of blog posts about the Presidential libraries. The records created by Presidents while in office will become part of the National Archives, and eventually will be used by researchers. Here’s how it happens!
Today’s post comes from John Laster, Director of the Presidential Materials Division at the National Archives.
It is American Archives Month! This is an opportunity to celebrate our profession—all that we have accomplished and the exciting challenges that await us in the future. For me, there is nothing more exciting—or daunting—than the challenges that digital records pose for archivists.
As the Director of the Presidential Materials Division, I see firsthand the issues playing out when born-digital Presidential records are transferred every four or eight years and then again through the following steps of the lifecycle as these records are searched, reviewed, and made available.
Digital records are nothing new, but with each passing year they become more prevalent and intertwined in our professional lives. With Presidential records, we have gone from receiving basic email created by the Reagan administration to preparing to accept a wide range of information from social media sites used by President Obama’s Office of Digital Strategy. The variety of electronic records created grows. And the volume grows. The Clinton administration created less than 4 terabytes while the George W. Bush administration generated nearly 80 terabytes.
I have been asked to reflect on the challenges that digital records pose for archivists. Many meetings of professional organizations contain important and nuanced discussions of this topic. A simple Google search reveals a wealth of interesting and thought-provoking pieces devoted to this issue. Born-digital records can challenge our archival assumptions about series, arrangement, and even description. But for most of us, theoretical discussions have to be balanced by the practical question of how we can do our work.
Using absolute terms can be dangerous. But I think it is safe to say that there has never been a time when archivists have had to depend so heavily on non-archivists in order to do our work. We rely on IT professionals to help us ingest electronic records and to help us develop ways of preserving, reviewing, and releasing those records in an electronic format. This presents us with one of the biggest challenges we face. Our Information Technology Office has a strong understanding of what we archivists need. However, working with IT professionals outside of the agency requires that archivists learn to speak their language in order to ensure that the services they provide are the ones we need.
So what do we need? We need to receive the records. But even this seemingly simple task is complex.
Transferring records has moved well beyond hardware being passed from creators to archivists. There is the challenge of sheer logistics: a certain amount of physics is involved just in getting digital content transferred in an appropriate amount of time. Formats pose challenges as well, because we need to make sure we capture the data and can preserve it forever. Under the terms of the Presidential Records Act, anything that rises to the level of record is transferred to the Archives as permanent. This obligates us to figure out how to ingest (or transform appropriately before ingest) a wide variety of formats being used by the records creators.
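To give a rough sense of the "physics" involved, here is a minimal back-of-the-envelope sketch in Python. The 80-terabyte figure comes from the volume mentioned earlier in this post; the link speeds and the `efficiency` factor are illustrative assumptions, not a description of how NARA actually moves records.

```python
SECONDS_PER_DAY = 86_400

def transfer_days(terabytes: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Estimate days needed to move `terabytes` of data over a network link.

    Assumes decimal units (1 TB = 10**12 bytes, 1 Gbps = 10**9 bits/s).
    `efficiency` is a hypothetical fudge factor (0 < efficiency <= 1) for
    protocol overhead and contention; 1.0 means a perfectly used link.
    """
    bits = terabytes * 1e12 * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / SECONDS_PER_DAY

if __name__ == "__main__":
    # Hypothetical link speeds, chosen only for illustration.
    for gbps in (0.1, 1.0, 10.0):
        print(f"80 TB at {gbps} Gbps: {transfer_days(80, gbps):.1f} days")
```

Even on an ideal 1 Gbps link, 80 terabytes works out to about a week of continuous transfer before any verification or re-sending, which is one reason shipping physical media can still be part of the conversation.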
While we strive to keep up with developments and changes in the ways records are being created, we nonetheless need to analyze formats to determine their suitability for transfer and ingest. Format analysis and planning for export and transfer require a significant amount of attention on the part of archivists and information professionals working on these tasks. Since we strive to preserve and provide access to the data and not the systems, proprietary formats can sometimes prove a challenge to export and ingest. We never want to be in a situation of perpetuating proprietary formats.
Search and accessibility challenge us as well. Traditional searches of electronic records will return a word or phrase with 100% reliability, which can be useful at times. But when you must search common words across a large body of records, say the approximately 200 million emails NARA received from the George W. Bush administration, finding needed documents can become daunting. For instance, Boolean searches are often inadequate for the many subject-based requests received under the provisions of the Freedom of Information Act. Taking advantage of strides made in the e-discovery field requires integration with the tools and systems that we do have in order to make search and access as seamless as possible.
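A toy sketch in Python can illustrate why exact-match Boolean search struggles with subject-based requests. The four "emails" below are invented for the example, and the tiny index is an assumption made for illustration; it is not a description of any system NARA uses.

```python
from collections import defaultdict

# Invented sample messages, for illustration only.
docs = {
    1: "Meeting about the energy policy briefing",
    2: "Lunch meeting moved to Tuesday",
    3: "Draft energy bill comments attached",
    4: "Policy on office parking and meeting rooms",
}

def build_index(documents):
    """Map each lowercased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word.strip(".,;:!?")].add(doc_id)
    return index

def boolean_and(index, *terms):
    """Return ids of documents that contain every one of the given terms."""
    sets = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*sets) if sets else set()

index = build_index(docs)
common_word_hits = boolean_and(index, "meeting")        # matches docs 1, 2, 4
subject_hits = boolean_and(index, "energy", "policy")   # matches doc 1 only
```

The common word "meeting" matches three of the four messages, most of them irrelevant to any particular subject, while the query for "energy" AND "policy" misses message 3, which is arguably about energy policy but never uses the word "policy". Scaled up to hundreds of millions of emails, both problems grow accordingly.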
We must continue to seek ways to integrate new and better technologies into our work here at the National Archives and Records Administration, such as technology that will allow us to speed the redaction of personally identifiable information (PII) or identify advisory information that must be withheld. Technology is key if NARA hopes to be able to review such large volumes of born-electronic records in order to make them available to the public.
The challenges of born-digital materials loom large over the archival profession. We have a tremendous opportunity, however, to study and use the flexibility of archival theory to adequately preserve and provide these records to an increasingly interested public. It is an incredibly exciting time to be an archivist!