Category: appraisal

Archival Photographs as Art: A Part of Larry Sultan’s Legacy

Larry Sultan was famed as both a photographer and archives researcher. He passed away on Sunday, December 13th, 2009 and his obituary in the New York Times describes his use of archival photographs as “harnessing found photographs for the purposes of art while using them as a way to examine the society that produced them”. The 59 photographs, selected in collaboration with Mike Mandel from a broad assortment of corporate and government archives, were originally displayed and published as a collection named ‘Evidence’ in 1977. A reprint of Evidence was published in 2004, including a new scholarly essay and additional images not in the original.

The Stephen Wirtz Gallery has a number of images from the 2004 exhibition available online and features this great summary of the original project:

Sultan and Mandel created the series Evidence with documentary photographs mined from image banks of government institutions, corporations, scientific research facilities, and police departments. An NEA grant gave the artists a persuasive edge in gaining access to these resources, and images were selected for their mysterious and perplexing subject matter. The series was presented in an exhibition at the San Francisco Museum of Modern Art in 1977, and simultaneously collected in the book Evidence, which is recognized among the most important publications in the history of photography. Removed from their original contexts and repositioned without references to their sources, these images challenged the viewer to examine the conceptual concern of identifying meaning and authorship in the creation and consideration of the art photograph.

I used WorldCat to find the closest copy of Evidence and happily found a copy of the 1977 imprint at the Art Library at the University of Maryland, College Park. It had been a long time since I had looked at photographs on paper and bound in a book rather than on a computer monitor. I love the idea of re-purposing archival images – but I was also fascinated to realize that the word ‘archive’ does not appear anywhere in the publication. Even the description above mentions ‘image banks’, not ‘archives’.

The organizations thanked at the start of the book included major corporations, U.S. federal agencies and a long list of highway, fire and police departments. Sultan and Mandel seemed to focus their research efforts in California and Washington, DC – perhaps due to a need to limit their travel. While today one would likely still need to travel to many archives to find images like those used in Evidence, there are so many images available online (at least for preview). How would someone approach a project like this now?

It is so easy to create a slide show or website featuring images from repositories from around the world. Even the images that have not been digitized have a decent chance of at least being mentioned in an online finding aid. The recently introduced Flickr Galleries make it easy to select up to 18 images from across Flickr – like my November Flickr Commons Photos of the Month Gallery. Also, much of the online culture of reuse encourages giving proper attribution for materials.

Part of Evidence’s power is the extraction of the images from their original context and their unexplained juxtaposition with one another. Finding and harvesting an image online would make it much harder to entirely strip that context away to leave the raw image behind. I can imagine a web-wide hunt for an image’s origin. While that might be fun (maybe an archives answer to the DARPA Network Challenge?), it would not be the same as a sleek hardback book with 59 stark, unlabeled, black-and-white photos that sits on the shelf of an art library.

I find it poetic that Evidence’s photos are a perfect example of a ‘secondary value’ of archival records, even though the images were literally evidential records necessary for the carrying out of daily business. That said, I don’t believe that ‘possibly useful to future artists’ is a typical reason given for retaining and preserving archival records. We are just lucky that artists have been (and will almost certainly continue to be) innovative in their hunt for inspiration.

If you have the opportunity, I encourage you to sit quietly with a copy of Evidence. The images include landscapes, explosions, deep pits, plants, rocks, people, planes, machinery, wires and a car on fire. My laundry list of contents cannot begin to do the images justice – but I hope that they might whet your appetite.

This combination of gallery exhibition and book has inspired me to wonder about other similar projects that specifically leverage archival images for artistic purposes. Please list any that you are aware of in the comments (be they in gallery exhibitions or published volumes).

Archivists and New Technology: When Do The Records Matter?

Navigating the rapidly changing landscape of new technology is a major challenge for archivists. As quickly as new technologies come to market, people adopt them and use them to generate records. Businesses, non-profits and academic institutions constantly strive to find ways to be more efficient and to cut their budgets. New technology often offers the promise of cost reductions. In this age of constantly evolving software and technological innovation, how do archivists know when a new technology is important or established enough to take note of? When do the records generated by the latest and greatest technology matter enough to save?

Below I have included two diagrams that seek to illustrate the process of adopting new technology. I think they are both useful in aiding our thinking on this topic.

The first is the “Hype Cycle”, as proposed by analyst Jackie Fenn at Gartner Group. It breaks down the phases that new technologies move through as they progress from their initial concept through to broad acceptance in the marketplace. The generic version of the Hype Cycle diagram below is from the Wikipedia entry on hype cycle.

Gartner Hype Cycle (Wikipedia)

Each summer, Gartner comes out with a new update on Where Are We In The Hype Cycle?. Last summer, microblogging was just entering the ‘Peak of Inflated Expectations’, public virtual worlds were sliding down into the ‘Trough of Disillusionment’ and location aware applications were climbing back up the ‘Slope of Enlightenment’. There is even a book about it: Mastering the Hype Cycle: How to Choose the Right Innovation at the Right Time.

The other diagram is the Technology Adoption Lifecycle from Geoffrey Moore’s Crossing the Chasm. It looks at the technology cycle from the perspective of bringing new technology to market. How do you cross the chasm between early adopters and the general population?

Technology Adoption Lifecycle (Wikipedia)

Archivists need to consider new technology from two different perspectives: when to use it to further their own goals as archivists, and when to address the need to preserve records being generated by new technology. A fair bit of attention has been focused on figuring out how to get archivists up to speed on new web technology. In August 2008, ArchivesNext posted about hunting for Web 2.0 related sessions at SAA2008 and Friends Told Me I Needed A Blog posted about SAA and the Hype Cycle shortly thereafter.

But how do we know when a technology is ‘important enough’ to start worrying about the records it generates? Do we focus our energy on technology that has crossed the chasm and been adopted by the ‘early majority’? Do we watch for signs of adoption by our target record creators?

I expect that the answer (such as there can be one answer!) will be community specific. As I learned in the 2007 SAA session about preserving digital records of the design community, waiting for a single clear technology or software leader to appear can lead to lost or inaccessible records. Archivists working with similar records already come together to support one another through round tables, mailing lists and conference sessions. I have noticed that I often find the most interesting presentations are those that discuss the challenges a specific user community is facing in preserving their digital records. The 2008 SAA session about hybrid analog/digital literary collections discussed issues related to digital records from authors. Those who worry about records captured in geographic information systems (GIS) were trying to sort out how to define a single GIS electronic record when last I dipped my toes into their corner of the world in the Fall of 2006.

It is not feasible to imagine archivists staying ahead of every new type of technology and attempting to design a method for archiving every possible type of digital record being created. What we can do is make it a priority for a designated archivist within every ‘vertical’ community (government, literary, architecture… etc) to keep their ear to the ground about the use of technology within that community. This could be a community of practice of its own: a group that shares info about the latest trends they are seeing while sharing their best practices for handling the latest types of records being seen.

The good news is that archivists aren’t the only ones who want to be able to preserve access to born digital records. Consider Twitter, which only provides easy access to recent tweets. A whole raft of third-party tools built to archive data from Twitter are already out there, answering the demand for a way to back up people’s tweets.

I don’t think archivists always have the luxury of waiting for technology to be adopted by the majority of people and to reach the ‘Plateau of Productivity’. If you are an archivist who works with a community that uses cutting edge technology, you owe it to your community to stay in the loop with how they do their work now. Just because most people don’t use a specific technology doesn’t mean that an individual community won’t pick it up and use it to the exclusion of more common tools.

The design community mentioned above spoke of working with those creating the tools for their community to ensure easy archiving down the line. In our fast paced world of innovation, a subset of archivists needs to stay involved with the current business practices of each vertical being archived. This group can work together to identify challenges, brainstorm solutions, build relationships with the technology communities and then disseminate best practices throughout the archives community. I did find a web page for the SAA’s Technology Best Practices Task Force and its document Managing Electronic Records and Assets: A Working Bibliography, but I think that I am imagining something more ongoing, more nimble and more tied into each of the major communities that archivists must support. Am I describing something that already exists?

Vice President Ruled Part of Executive Branch: Cheney’s Records Must Be Preserved

CNN’s headline is Cheney must keep records, judge orders. The very short version of all this is that the Citizens for Responsibility and Ethics in Washington (CREW) sued “Vice President Richard B. Cheney in his official capacity, the Executive Office of the President (“EOP”), the Office of the Vice President (“OVP”), the National Archives and Records Administration (“NARA”), and Dr. Allen Weinstein, Archivist of the United States, in his official capacity” to force everyone involved to “preserve all vice presidential records, broadly defined to encompass all records relating to the vice president carrying out his constitutional, statutory or other official or ceremonial duties” (see the CREW site article: Court Orders Cheney to Preserve Records in CREW Lawsuit).

Turns out that a judge agrees with CREW and has ordered that:

Defendants shall preserve throughout the pendency of this litigation all documentary material, or any reasonably segregable portion thereof created or received by the Vice President, his staff, or a unit or individual of the Office of the Vice President whose function is to advise and assist the Vice President, in the course of conducting activities which relate to or have an effect upon the carrying out of the constitutional, statutory, or other official or ceremonial duties of the Vice President, without regard to any limiting definitions that Defendants may believe are appropriate

I love that last bit – keep it all, even if you don’t think you should.  The court order finishes by saying that they should still give the records to NARA as long as NARA is going to treat them as covered by the Presidential Records Act (see NARA’s PRA page or Wikipedia’s PRA page – I will let you guess which is easier to read).

Is it bad of me to be excited that this is being treated as front page news? As of 9:30pm September 20th 2008, CNN is featuring the article in its ‘prime top left with a big photo’ spot and the New York Times has a link off the main page to their article: Cheney Is Ordered to Preserve Wide Set of Records. They say that there is no such thing as bad publicity. I would like to believe that front page news stories such as this one help improve understanding of archives in general (and NARA in particular).

SAA2008: Preservation and Experimentation with Analog/Digital Hybrid Literary Collections (Session 203)

floppy disks

The official title of Session 203 was Getting Our Hands Dirty (and Liking It): Case Studies in Archiving Digital Manuscripts. The session chair, Catherine Stollar Peters from the New York State Archives and Records Administration, opened the session with a high level discussion of the “Theoretical Foundations of Archiving Digital Manuscripts”. The focus of this panel was preserving hybrid collections of born digital and paper based literary records. The goal was to review new ways to apply archival techniques to digital records. The presenters were all archivists without IT backgrounds who are building on others’ work … and experimenting. She also mentioned that this work impacts researchers, historians, and journalists.

For each of the presenters, I have listed below the top challenges and recommendations. If you attended the session, you can skip forward to my thoughts.

Norman Mailer’s Electronic Records

Challenges & Questions:

  • 3 laptops and nearly 400 disks of correspondence
  • While the letters might have been dictated or drafted by Mailer, all the typing, organization and revisions done on the computer were done by his assistant Judith McNally. This brings into question issues of who should be identified as the record creator. How do they represent the interaction between Mailer & McNally? Who is the creator? Co-Creators?
  • All the laptops and disks were held by Judith McNally. When she died all of her possessions were seized by county officials. All the disks from her apartment were eventually recovered over a year later – but it causes issues of provenance. There is no way to know who might have viewed/changed the records.

Revelations and Recommendations:

What is accessioning and processing when dealing with electronic records? What needs to be done?

  • gain custody
  • gather information about creator’s (or creators’) use of the electronic records. In March 2007 they interviewed Mailer to understand the process of how they worked together. They learned that the computers were entirely McNally’s domain.
  • number disks, computers (given letters), other digital media
  • create disk catalog – to reflect physical information of the disk. Include color of ink, underlining, etc. At this point the disk has never been put into a computer. This captures visual & spatial information
  • gather this info from each disk: file types, directory structure & file names
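
The last step in that list (file types, directory structure and file names) lends itself to a small script. What follows is only a rough sketch in Python, not the method the presenters described: it assumes the disk's contents have already been copied to a working directory, and the names inventory_disk and count_extensions are hypothetical. A file extension is treated as a hint at the file type, nothing more.

```python
import os
from collections import Counter
from pathlib import Path


def inventory_disk(root):
    """Walk a copied disk's directory tree and list every file found."""
    root = Path(root)
    entries = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            path = Path(dirpath) / name
            entries.append({
                # Directory relative to the top of the copy ("." is the top).
                "directory": Path(dirpath).relative_to(root).as_posix(),
                "file_name": name,
                # The extension is only a hint at the file type, not proof.
                "extension": path.suffix.lower(),
                "size_bytes": path.stat().st_size,
            })
    return entries


def count_extensions(entries):
    """Tally how many files carry each extension."""
    return Counter(entry["extension"] for entry in entries)
```

Running inventory_disk against a working copy rather than the original media keeps the original untouched, which fits the point above that the disk catalog is made before a disk is ever put into a computer.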

The ideal for future collections of this type is archivist involvement earlier – the earlier the better.

Papers of Peter Ganick

  • Speaker: Melissa Watterworth
  • Featured Collection: Papers of Writer and Small Press Publisher Peter Ganick, Thomas J Dodd Research Center, University of Connecticut

Challenges & Questions:

  • What are the primary sources of our modern world?
  • How do we acquire and preserve born digital records as trusted custodians?
  • How do we preserve participatory media – maybe we can learn from those who work on performance art?
  • How do we incrementally build our collections of electronic records? Should we be preserving the tools?
  • Timing of acquisition: How actively should we be pursuing personal archives? How can we build trust with creators and get them to understand the challenges?
  • Personal papers are very contextual – order matters. Does this hold true for born digital personal archives? What does the networking aspect of electronic records mean – how does it impact the idea of order?
  • First attempt to accession one of Peter Ganick’s laptops and the archivist found nothing she could identify as files.. she found fragments of text – hypertext work and lots of files that had questionable provenance (downloaded from a mailing list? his creations?). She had to sit down next to him and learn about how he worked.
  • He didn’t understand at first what her challenges were. He could get his head around the idea of metadata and issues of authenticity. He had trouble understanding what she was trying to collect.
  • How do we arrange and keep context in an online environment?
  • Biggest tech challenge: are we holding on for too long to ideas of original order and context?
  • Is there a greater challenge in collecting earlier in the cycle? What if the creator puts restrictions on groupings or chooses to withdraw them?
  • Do we want to create contracts with donors? Is that practical?

Revelations and Recommendations:

  • Collect materials that had high value as born digital works but were at a high risk of loss.
  • Build infrastructure to support preservation of born digital records.
  • Go back to the record creator to learn more about his creative process. They used to acquire records from Ganick every few years.. that wasn’t frequent enough. He was changing the tools he used and how he worked very quickly. She made sure to communicate that the past 30 years of policy wasn’t going to work anymore. It was going to have to evolve.
  • Created a ‘submission agreement’ about what kinds of records should be sent to the archive. He submitted them in groupings that made sense to him. She reviewed the records to make sure she understood what she was getting.
  • Considering using PDF/A to capture snapshot of virtual texts.
  • Looked to model of ‘self archiving’ – common in the world of professors to do ongoing accruals.
  • What about ’embedded archivists’? There is a history of this in the performing arts and NGOs and it might be happening more and more.

George Whitmore Papers

Challenges & Questions:

  • How do you establish identity in a way that is complete and uncorrupted? How do you know it is authentic? How do you make an authentic copy? Are these requirements unreasonable and unachievable?

Revelations and Recommendations:

  • Refresh and replicate files on a regular schedule.
  • They have had good success using Quick View Plus to enable access to many common file formats. On the downside, it doesn’t support everything and since it is proprietary software there are no long term guarantees.
  • In some cases they had to send CP/M files to a 3rd party to have them converted into WordStar and have the ASCII normalized.
  • Varied acquisition notes.. and accession records.. loan form with the 3rd party who did the conversion that summarized the request.. they did NOT provide information about what software was used to convert from CP/M to DOS. This would be good information to capture in the future.
  • Proposed an expansion of the standards to include how electronic records were migrated in the <processinfo> processing notes.
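
To make that last recommendation concrete, here is a minimal sketch of what a migration note inside a <processinfo> element could look like, built with Python's standard library. This is an illustration under assumptions, not the wording the presenters proposed: the function name build_processinfo, the phrasing of the note and the sample values are all hypothetical. Only the <processinfo>, <head> and <p> element names come from EAD itself.

```python
import xml.etree.ElementTree as ET


def build_processinfo(source_format, target_format, converted_by, software=None):
    """Build an EAD-style <processinfo> note describing a file migration."""
    processinfo = ET.Element("processinfo")
    ET.SubElement(processinfo, "head").text = "Processing Information"
    # Say so explicitly when the conversion software was never documented.
    tool = software if software else "not recorded"
    ET.SubElement(processinfo, "p").text = (
        f"Electronic files were migrated from {source_format} to "
        f"{target_format} by {converted_by}. Conversion software: {tool}."
    )
    return processinfo


# Illustrative values only, loosely based on the CP/M example above.
note = build_processinfo("CP/M", "DOS", "a third-party vendor")
print(ET.tostring(note, encoding="unicode"))
```

Writing "not recorded" instead of leaving the field out makes the gap visible to later researchers, which is the situation described above where nobody captured what software did the conversion.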

Questions & Answers

Question: As part of a writers community, what do we tell people who want to know what they can DO about their records. They want technical information.. they want to know what to keep. Current writers are aware they are creating their legacy.

Answer: Michael: The single best resource is the InterPARES 2 Creator Guidelines. The Beinecke has adapted them to distribute to authors. Melissa: Go back to your collection development policies and make sure to include functions you are trying to document (like process.. distribution networks). Also communities of practice (acid free bits) are talking about formats and guidelines like that. Gabriela: People often want to address ‘value’. Right now we don’t know how to evaluate the value of electronic drafts – it is up to authors.

Question: Cal Lee: Not a question so much as an idea: the world of digital forensics and security and the ‘order of volatility’ dictate that everyone should always be making a full disk copy bit by bit before doing anything else.

Comment: On digital forensic tools – there is lots of historical and editing history of documents in the software… also deleted files are still there.
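
The idea of a bit by bit copy can be sketched in a few lines of Python. This is a simplified illustration only, assuming the source is a file or device path that can be opened for reading. It is not a substitute for real forensic imaging tools, it does nothing about write protection of the source, and it will simply stop with an error on an unreadable sector. The function names image_disk and file_digest are hypothetical.

```python
import hashlib


def image_disk(source_path, image_path, chunk_size=1024 * 1024):
    """Copy every byte of source_path to image_path and return a SHA-256 digest."""
    digest = hashlib.sha256()
    with open(source_path, "rb") as source, open(image_path, "wb") as image:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:
                break
            image.write(chunk)
            # Hash the same bytes that were written to the image.
            digest.update(chunk)
    return digest.hexdigest()


def file_digest(path, chunk_size=1024 * 1024):
    """Recompute the SHA-256 digest of a file, for later comparison."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Recording the digest returned by image_disk at the time of copying, and comparing it later against file_digest of the stored image, gives one simple way to show that the copy has not changed since it was made.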

Question: Have you seen examples of materials that are coming into the archive where the digital materials are working drafts for a final paper version? This is in contrast to others that are electronic experiments.

Answer: Yes, they do think about this. It can affect arrangement and how the records are described. The formats also impact how things are preserved.

Question: Access issues? Are you letting people link to them from the finding aids? How is the documents’ authenticity protected?

Answer: DSpace gives you a new version anytime you want it (the original bitstream) .. lots of cross linking supports people finding things from more than one path. In some cases documents (even electronic) can only be accessed from within the on site reading room.

Question: What is your relationship like with your IT folks?

Answer: Gabriela: Our staff has been very helpful. We use ‘legacy’ machines to access our content. They build us computers. They are also not archivists, so there is a little divide about priorities and the kind of information that I am interested in.. but it has been a very productive conversation.

Question: (For Melissa) Why didn’t you accept Peter’s email (Melissa had said they refused a submission of email from Peter because it didn’t have research value)?

Answer: The submission was rejected because it included personal medical emails. The agreement with Peter didn’t include an option to selectively accept (or weed) what was given.

Question: In terms of gathering information from the creators.. do you recommend a formal/recorded interview? Or a more informal arrangement in which you can contact them anytime on an ongoing basis?

Answer: Melissa: We do have more formal methods – ‘documentation study’ style approaches. We might do literature reviews.. Ultimately the submission agreement is the most formal document we have. Gabriela: It depends on what the author is open to.. formal documentation is best.. but if they aren’t willing to be recorded, then you take what you can get!

My Thoughts

I am very curious to see how best practices evolve in this arena. I wonder how stories written using something like Google Documents, which auto-saves and preserves all versions for future examination, will impact how scholars choose to evaluate the evolution of documents. There have already been interesting examinations of the evolution of collaborative documents. Consider this visual overview of the updates to the Wikipedia entry for Sarah Palin created by Dan Cohen and discussed in his blog post Sarah Palin, Crowdsourced. Another great example of this type of visual experience of a document being modified was linked to in the comments of that post: Heavy Metal Umlaut: The Movie. If you haven’t seen this before – take a few minutes to click through and watch the screencast which actually lets you watch as a Wikipedia page is modified over time.

While I can imagine that there will be many things to sort out if we try to start keeping these incredibly frequent snapshot save logs (disk space? quantity of versions? authenticity? author preferences to protect the unpolished versions of their work?) – I still think that being able to watch the creative process this way will be valuable in some situations. I also believe that over time new tools will be created to automate the generation of document evolution visualizations and movies (like the two I link to above) that make it easy for researchers to harness this sort of information.

Perhaps there will be ways for archivists to keep only certain parts of the auto-save versioning. I can imagine an author who does not want anyone to see early drafts of their writing (as is apparently also the case with architects and early drafts of their designs) – but who might be willing for the frequency of updates to be stored. This would let researchers at least understand the rhythm of the writing – if not the low level details of what was being changed.
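
As a rough sketch of that idea, the Python below reduces a version history to nothing more than a count of saves per day and throws the text of each draft away. Everything here is an assumption for illustration: the version history is imagined as a list of (timestamp, text) pairs, and the function name writing_rhythm is hypothetical. It does not reflect how any particular word processor actually stores its versions.

```python
from collections import Counter
from datetime import datetime


def writing_rhythm(versions):
    """Reduce a version history to save counts per day, discarding the text."""
    # Only the timestamp of each save is used; the draft text is never kept.
    counts = Counter(saved_at.date() for saved_at, _text in versions)
    return dict(sorted(counts.items()))


# Illustrative history: three saves across two days.
history = [
    (datetime(2008, 9, 1, 9, 0), "first draft"),
    (datetime(2008, 9, 1, 9, 5), "first draft, revised"),
    (datetime(2008, 9, 3, 14, 0), "second draft"),
]
print(writing_rhythm(history))
```

An archive could retain only the output of something like writing_rhythm, giving researchers the rhythm of the writing while honoring an author's wish to keep early drafts private.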

I love the photo I found for the top of this post. I admit to still having stacks of 3 1/2 inch floppy disks. I have email from the early days of BITNET. I have poems, unfinished stories, old resumes and SQL scripts. For the moment my disks live in a box on the shelf labeled ‘Old Media’. Lucky me – I at least still have a computer with a floppy drive that can read them!

Image Credit: oh messy disks by Blude via flickr.

As is the case with all my session summaries from SAA2008, please accept my apologies in advance for any cases in which I misquote, overly simplify or miss points altogether in the post above. These sessions move fast and my main goal is to capture the core of the ideas presented and exchanged. Feel free to contact me about corrections to my summary either via comments on this post or via my contact form.

After The Games Are Over: Olympic Archival Records

What does an archivist ponder after she turns off the Olympics? What happens to all the records of the Olympics after the closing ceremonies? Who decides what to keep? Not knowing any Olympic Archivists personally, I took to the web to see what I could find.

Olympics.org uses the tag line “Official Website of the Olympic Movement” and includes information about The International Olympic Committee’s Historical Archives. They even have an Olympic Medals Database with all the results from all the games.

The most detailed list of Olympics archives that I could find is the Olympic Studies International Directory listing of Archives & Olympic Heritage sites. It is from this page that I found my way to records from the Sydney Olympic Park Authority.

The Olympic Television Archive Bureau (OTAB) website explains that this UK based company “has over 30,000 hours of the most sensational sports footage ever seen, uniquely available in one library”  and aims to provide “prompt fulfilment of your Olympic footage requirements”.

Then I thought to dig into the Internet Archive. What a great treasure trove for all sorts of interesting Olympic bits!

First I found a Universal Newsreel from the 1964 Olympics in Tokyo (embedded below).

I also found a 2002 Computer Chronicles episode Computer Technology and the Olympics which explores the “high-tech innovations that ran the 2002 Winter Olympic Games” (embedded below).

Other fun finds included a digitized copy of a book titled The Olympic games, Stockholm, 1912 and the oldest snapshot of the Beijing 2008 website (from December of 2006). Seeing the 2008 Summer Games pages in the archive made me curious. I found the old site of the official Athens summer games from 2004 which kindly states: “The site is no longer available, please visit http://www.olympic.org or http://en.beijing2008.com/”. The Internet Archive has a bit more than that on the athens2004.com archive page – though some clicking through definitely made it clear that not all of the site was crawled. Lucky for us we can still see the Athens 2004 Olympics E-Cards you could send!

Then I turned to explore NARA’s assorted web resources. I found a few photos on the Digital Vaults website (search on the keyword Olympics). A search in the Archival Research Catalog (ARC) generates a long list – including footage of the US National Rifle Team in the 1960 Olympics in Italy.

My favorite items from NARA’s collections are in the Access to Archival Databases (AAD). First I found this telegram from the American Embassy in Ottawa to the Secretary of State in Washington DC (Document ID # 1975OTTAWA02204) sent in June 1975:

 1. EMBASSY APPRECIATES DEPARTMENT’S EFFORTS TO ASSIST CONGEN IN CARING FOR VIPS WHO CERTAINLY WILL ARRIVE FOR 1976 OLYMPIC GAMES WITHOUT TICKETS OR LODGING. HAS DEPARTMENT EXPLORED POSSIBILITY OF OBTAINING 4,000 TICKETS ON CONSIGNMENT BASIS FROM MONTGOMERY WARD, WITH UNDERSTANDING THAT, AS TICKETS ARE SOLD, PROCEEDS WILL BE REMITTED? PERHAPS SUCH AN ARRANGEMENT COULD BE WORKED OUT WITH FURTHER UNDERSTANDING THAT UNSOLD TICKETS BE RETURNED TO MONTGOMERY WARD AT SOME SPECIFIED DATE PRIOR TO BEGINNING OF EVENTS.

2. EMBASSY WILL FURNISH AMOUNT REQUIRED TO RESERVE SIX DOUBLE ROOMS FOR PERIOD OF GAMES. AT PRESENT HOTEL OWNERS AND OLYMPIC OFFICIALS ARE IN DISAGREEMENT AS TO AMOUNTS THAT MAY BE CHARGED FOR ROOMS DURING OLYMPIC PERIOD. NEGOTIATIONS ARE CURRENTLY BEING CARRIED OUT AND AS SOON AS ROOM RATES HAVE BEEN ESTABLISHED, QUEEN ELIZABETH HOTEL MANAGER WILL ADVISE US OF THEIR REQUIREMENTS TO RESERVE THE SIX DOUBLE ROOMS.

Immediately beneath that one, I found this telegram from October 1975 (Document Number 1975STATE258427):

SUBJECT:INVITATION TO PRESIDENT FORD AND SECRETARY
KISSINGER TO ATTEND OLYMPIC GAMES IN AUSTRIA,
FEBRUARY 4-15, 1976

THE EMBASSY IS REQUESTED TO INFORM THE GOA THAT MUCH TO THE PRESIDENT’S AND THE SECRETARY’S REGRET, THE DEMANDS ON THEIR SCHEDULES DURING THAT PERIOD WILL NOT MAKE IT POSSIBLE FOR THEM TO ATTEND THE WINTER GAMES. KISSINGER

There are definitely a lot of moving parts to Olympic Archival Records. So many nations participate, and each new host country has the option to handle records however it sees fit. I explored this whole question two years ago and came up against the fact that control over the archival records produced by each Olympics was really in the hands of the hosting committee and their country. A quick glance down the list of Archives & Olympic Heritage sites I mentioned above gives you an idea of all the different corners of the world in which one can find Olympic Archival Records in both government and independent repositories. Given that clearly not all Olympic Games are represented in that list, it makes me wonder what we will see on this front from China now that the closing ceremony is complete.

I also suspect that with each Olympic Games we increase the complexity of the electronic records being generated. Would it be worthwhile to create an online collection for each games – as has been done for the Hurricane Digital Memory Bank or The September 11 Digital Archive, but extend it to include access to Olympic electronic records data sets? The sheer quantity of information is likely overwhelming – but I suspect there is a lot of interesting information that people would love to examine.

Update: For those of you (like me) who wondered what Montgomery Ward had to do with Olympic Tickets – take a look at Tickets For The ’76 Olympics Go On Sale Shortly At Montgomery Ward over in the Sports Illustrated online SI Vault. Sports Illustrated’s Vault is definitely another interesting source of information about the Olympic Games. If my post above has made you nostalgic for Olympics gone by – definitely take a look at the current Summer Games feature on their front page. I couldn’t figure out a permanent link to this feature, but if I ever do I will update this post later.

Will Crashed Hard Drives Ever Equal Unlabeled Cardboard Boxes?

Photo of Crashed Hard Drive - wonderferret on Flickr

How many of us have an old hard drive hanging around? I am talking about the one you were told was unfixable. The one that has 3 bad sectors. The one they replaced and handed to you in one of those distinctive anti-static bags. You know the ones I mean – the steely grey translucent plastic ones that look like they should contain space food.

I have more than one ‘dead’ hard drive. I can’t quite bring myself to throw them out – but I have no immediate plans to try and reclaim their files.

I know that there are services and techniques for pulling data off otherwise inaccessible hard drives. You hear about it in court cases and see it on TV shows. A quick Google search on hard drive rescue turns up businesses like Disk Data Recovery.

Do archivists already make it a policy to hunt not just for computers, but for discarded and broken hard drives lurking in filing cabinets and desk drawers? Compare this to a carton of documents that needs special treatment to permit access to the records it contains and yet is appraised as valuable. If the treatment required were within budgetary and time constraints – it would be performed. Mold, bugs, rusty staples, photos that are stuck together… archivists generally know where to get the answers they need to tackle these sorts of problems. I suspect that a hard drive advertised or discovered to be broken would be treated more like an empty box than a moldy box.

For now I would stack this challenge near the bottom of the list, below archiving digital records that we can access easily but that run on old hardware or software. Still, I can imagine a time when standard hard drive rescue techniques will need to be a tool for the average archivist.

The Edges of the GIS Electronic Record

I spent a good chunk of the end of my fall semester writing a paper ultimately titled “Digital Geospatial Records: Challenges of Selection and Appraisal”. I learned a lot – especially with the help of archivists out there on the cutting edge who are trying to find answers to these problems. I plan on a number of posts with various ideas from my paper.

To start off, I want to consider the topic of defining the electronic record in the context of GIS. One of the things I found most interesting in my research was the fact that defining exactly what a single electronic record consists of is perhaps one of the most challenging steps.

If we start with the SAA’s glossary definition of the term ‘record’ we find the statement that “A record has fixed content, structure, and context.” The notes go on to explain:

Fixity is the quality of content being stable and resisting change. To preserve memory effectively, record content must be consistent over time. Records made on mutable media, such as electronic records, must be managed so that it is possible to demonstrate that the content has not degraded or been altered. A record may be fixed without being static. A computer program may allow a user to analyze and view data many different ways. A database itself may be considered a record if the underlying data is fixed and the same analysis and resulting view remain the same over time.

This idea presents some major challenges when you consider data that does not seem ‘fixed’. In the fast moving and collaborative world of the internet, Geographic Information Systems are changing over time – but the changes themselves are important. We no longer live in a world in which the way you access a GIS is via a CD which has a specific static version of the map data you are considering.
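The fixity requirement quoted above, being able to "demonstrate that the content has not degraded or been altered", is commonly approached with checksums. Here is a minimal Python sketch of that idea. The function names and the sample parcel data are my own illustrative assumptions, not part of the SAA glossary or of any particular archival system.

```python
import hashlib


def fixity_digest(content: bytes) -> str:
    """Return a SHA-256 hex digest summarizing the exact bytes of a record."""
    return hashlib.sha256(content).hexdigest()


def is_unchanged(content: bytes, recorded_digest: str) -> bool:
    """Check current content against the digest recorded at accession time."""
    return fixity_digest(content) == recorded_digest


# Hypothetical sample data: one tiny extract from a GIS layer.
original = b"parcel_id,zoning\n1001,residential\n"
recorded = fixity_digest(original)  # stored alongside the record

# The same extract after someone has edited a value.
altered = b"parcel_id,zoning\n1001,commercial\n"
```

Calling `is_unchanged(original, recorded)` confirms the stored bytes still match, while `is_unchanged(altered, recorded)` fails. That is exactly the difficulty with a live GIS: a checksum can only vouch for a snapshot, and the system it was taken from keeps changing.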

One of the InterPARES 2 case studies I researched for my paper was the Preservation of the City of Vancouver GIS database (aka VanMap). Via a series of emails exchanged with the very helpful Evelyn McLellan (who is working on the case study) I learned that the InterPARES 2 researchers concluded that the entire VanMap system is a single record. This decision was based on the requirement of ‘archival bond’ to be present in order for a record to exist. I have included my two favorite definitions of archival bond from the InterPARES 2 dictionary below:

archival bond
n., The network of relationships that each record has with the records belonging in the same aggregation (file, series, fonds). [Archives]

n., The originary, necessary and determined web of relationships that each record has at the moment at which it is made or received with the records that belong in the same aggregation. It is an incremental relationship which begins when a record is first connected to another in the course of action (e.g., a letter requesting information is linked by an archival bond to the draft or copy of the record replying to it, and filed with it. The one gives meaning to the other). [Archives]

I especially appreciate the second definition above because its example gives me a better sense of what is meant by ‘archival bond’ – though I need to do more reading on this to get a better grasp of its importance.

Given the usage of VanMap by public officials and others, you can imagine that the state of the data at any specific time is crucial to determining the information used for making key decisions. Since a map may be created on the fly using multiple GIS layers but never saved or printed – it is only the knowledge that someone looked at the information at a particular time that would permit those down the road to look through the eyes of the decision makers of the past. Members of the VanMap team are now working with the Sustainable Archives & Library Technologies (SALT) lab at the San Diego Supercomputer Center (SDSC) to use data grid technology to permit capturing the changes to VanMap data over time. My understanding is that a proof of concept has been completed that shows how data from a specific date can be reconstructed.
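To make the idea of reconstructing data "as of" a specific date more concrete, here is a small Python sketch. It is not how the SALT lab's data grid or VanMap actually work (I don't know those internals). It simply assumes that every change to a layer is logged with an effective date, and the `Change` class, layer names and feature ids are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Change:
    effective: date
    layer: str
    feature_id: str
    value: Optional[str]  # None means the feature was deleted


def state_as_of(changes: List[Change], when: date) -> Dict[str, Dict[str, str]]:
    """Replay logged changes in date order, stopping after the given date."""
    state: Dict[str, Dict[str, str]] = {}
    for change in sorted(changes, key=lambda c: c.effective):
        if change.effective > when:
            break
        layer = state.setdefault(change.layer, {})
        if change.value is None:
            layer.pop(change.feature_id, None)
        else:
            layer[change.feature_id] = change.value
    return state


# Hypothetical change log for two layers.
log = [
    Change(date(2006, 1, 10), "zoning", "parcel-1001", "residential"),
    Change(date(2006, 3, 2), "bike_routes", "route-7", "proposed"),
    Change(date(2006, 6, 15), "zoning", "parcel-1001", "commercial"),
    Change(date(2006, 9, 1), "bike_routes", "route-7", None),
]
```

With this log, `state_as_of(log, date(2006, 4, 1))` shows parcel-1001 still zoned residential and route-7 still proposed, which is what a decision maker would have seen that spring, even though both facts have since changed.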

In contrast with this approach we can consider what is being done to preserve GIS data by the Archivist of Maine in the Maine GeoArchives. In his presentation titled “Managing GIS in the Digital Archives”, delivered at the 2006 Joint Annual Meeting of NAGARA, COSA, and SAA on August 3, 2006, Jim Henderson explained their approach of appraising individual layers to determine if they should be accessioned into the archive. If it is determined that the layer should be preserved, then issues of frequency of data capture are addressed. They have chosen a pragmatic approach and are currently putting these practices to the test in the real world in an ambitious attempt to prevent data loss as quickly as is feasible.

My background is as a database designer and developer in the software industry. In my database life, a record is usually a row in a database table – but when designing a database using Entity-Relationship Modeling (and I will admit I am of the “Crow’s Feet” notation school and still get a smile on my face when I see the cover of the CASE*Method: Entity Relationship Modelling book) I have spent a lot of time translating what would have been a single ‘paper record’ into the combination of rows from many tables.

The current system I am working on includes information concerning legal contracts. Each of these exists as a single paper document outside the computers – but in our system we distribute information that is needed to ‘rebuild’ the contract into many different tables. One for contact information – one for standard clauses added to all the contracts of this type – another set of tables for defining financial formulas associated with the contract. If I then put on my archivist hat and I didn’t just choose to keep the paper agreement, I would of course draw my line around all these different records needed to rebuild the full contract. I see that there is a similar definition listed as the second definition in the InterPARES 2 Terminology Dictionary for the term ‘Record’:

n., In data processing, a grouping of interrelated data elements forming the basic unit of a file. A Glossary of Archival and Records Terminology (The Society of American Archivists)
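For readers who don't live in databases, here is a rough Python sketch (using the built-in sqlite3 module) of what ‘rebuilding’ one logical contract from rows in several tables looks like. The table names, columns and sample rows are invented for illustration and are not the schema of the system I work on. I have also left out the financial formula tables to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE contact (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE contract (
        id INTEGER PRIMARY KEY,
        title TEXT,
        contract_type TEXT,
        contact_id INTEGER REFERENCES contact(id)
    );
    CREATE TABLE standard_clause (contract_type TEXT, position INTEGER, body TEXT);

    INSERT INTO contact VALUES (1, 'A. Example');
    INSERT INTO contract VALUES (1, 'Licensing Agreement', 'license', 1);
    INSERT INTO standard_clause VALUES ('license', 2, 'Termination terms.');
    INSERT INTO standard_clause VALUES ('license', 1, 'Grant of rights.');
    """
)


def rebuild_contract(conn: sqlite3.Connection, contract_id: int) -> dict:
    """Gather the rows from several tables that together form one contract."""
    title, contract_type, contact_name = conn.execute(
        "SELECT c.title, c.contract_type, p.name "
        "FROM contract c JOIN contact p ON p.id = c.contact_id "
        "WHERE c.id = ?",
        (contract_id,),
    ).fetchone()
    clauses = [
        body
        for (body,) in conn.execute(
            "SELECT body FROM standard_clause "
            "WHERE contract_type = ? ORDER BY position",
            (contract_type,),
        )
    ]
    return {"title": title, "contact": contact_name, "clauses": clauses}
```

Calling `rebuild_contract(conn, 1)` pulls one row from `contract`, one from `contact` and two from `standard_clause`. No single row is ‘the record’ in the archival sense. The line has to be drawn around all of them.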

Just in this brief survey we can see three very different possible views on where to draw a line around what constitutes a single Geographic Information System electronic record. Is it the entire database, a single GIS layer or some set of data elements which create a logical record? Is it worthwhile trying to contrast the definition of a GIS record with the definition of a record when considering analog paper maps? I think the answer to all of these questions is ‘sometimes’.

What is especially interesting about coming up with standard approaches to archiving GIS data is that I don’t believe there is one answer. Saying ‘GIS data’ is about as precise as saying ‘database record’ or ‘entity’ – it could mean anything. There might be a best answer for collaborative online atlases, another best answer for a state government managed geographic information library, and yet another best answer for corporations dependent on GIS data for doing their business.

I suspect that a thorough analysis of the information stored in a GIS, how it is/was created, how often it changes and how it was used will determine the right approach for archiving these born digital records. There are many archivists (and IT folks and map librarians and records managers) around the world who have a strong sense of panic over the imminent loss of geospatial data. As a result, people from many fields are trying different approaches to stem the loss. It will be interesting to consider these varying approaches (and their varying levels of success) over the next few years. We can only hope that a few best practices will rise to the top quickly enough that we can ensure access to vital geospatial records in the future.