
The Yahoo! Time Capsule

Yahoo! is creating a time capsule. The first paragraph of the Yahoo! Time Capsule Overview concludes by claiming “This is the first time that digital data will be gathered and preserved for historical purposes”. Excuse me? What has the Internet Archive been doing since 1996? What are the Hurricane Digital Memory Bank and The September 11 Digital Archive doing? And that is just off the top of my head – the list could go on and on.

I think that what they are doing (collecting digital content from around the world for 30 days, then giving the time capsule to the Smithsonian Folkways Recordings in Washington, DC) is great. I am not sure what the bit about being “beamed along a path of laser light into space” is all about – but it sounds sort of cool. To add an entry, it must be put under one of 10 themes: Love, Anger, Fun, Sorrow, Faith, Beauty, Past, Now, Hope or You. It seems like an interesting attempt at organizing what could otherwise be just an endless stream of images. At the time of this post, they had 15,564 contributions over the course of the first 3 days. I even explored some of what they have – it is pretty. It reminded me a bit of the America 24/7 project from a few years back – though with more types of media and an aim to record a snapshot of the world, not just America.

They have another ridiculous claim on the main time capsule page: “This first-ever collection of electronic anthropology captures the voices, images and stories of the online global community.”

Go ahead and make a fabulous digital archive of contributions from around the world, Yahoo!, but please stop claiming that you invented the idea. I can’t be the only person who is frustrated by the way they are presenting this. Please tell me I am not alone!

Archival Transcriptions: for the public, by the public

There is a recent thread on the archives listserv that talks about transcriptions – specifically for small projects or those that have little financial support. There is even a case in which there is no easy OCR answer due to the state of the digitized microfilm records.
One of the suggestions was to use some combination of human effort to read the documents – either into a program that would transcribe them, or to another human who would do the typing. It made me wonder what it would look like to make a place online where people who wanted to could volunteer their transcription time. In the case where the records are already digitized and viewable, this seems like an interesting approach.

Something like this already exists for the genealogy world over at the USGenWeb Archives Project. They have a long list of different projects listed here. Though the interface is a bit confusing, the spirit of the effort is clear – many hands make light work. Precious genealogical resources can be digitized, transcribed and added to this archive to support the research of many by anyone – anywhere in the world.

Of course, in the case of transcribing archival records there are challenges to be overcome. How do you validate what is transcribed? How do you provide guidance and training for people working from anywhere in the world? If I have figured out that a particular shape is a capital S in a specific set of documents, that could help me (or an OCR program) as I progress through the documents, but if I only see one page from a series – I will have to puzzle through that one page without the support of my past experience. Perhaps that would encourage people to keep helping with a specific set of records? Maybe you give people a few sample pages with validated transcriptions to practice with? And many records won’t be that hard to read – easy for a human’s eye but still a challenge for an OCR program.
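One possible answer to the validation question is to have two volunteers transcribe the same page independently and only send the page to a human reviewer when their versions disagree. The Python sketch below is just an illustration of that idea, not a description of any existing project: the function names and the 0.98 threshold are my own assumptions, and it uses the standard library's difflib to score how similar two transcriptions are.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Collapse runs of whitespace so line breaks and double spaces don't count as differences."""
    return " ".join(text.split())


def agreement(first, second):
    """Return a similarity score between 0.0 and 1.0 for two transcriptions of the same page."""
    return SequenceMatcher(None, normalize(first), normalize(second)).ratio()


def needs_review(first, second, threshold=0.98):
    """Flag the page for a human reviewer when the two versions differ too much.

    The threshold is an arbitrary assumption; a real project would tune it.
    """
    return agreement(first, second) < threshold
```

Two volunteers who read a name differently (say, "Samuel Smith" versus "Lemuel Smith") would trip the review flag, while versions that differ only in spacing would pass. The same comparison could be run against the validated practice pages to give new volunteers feedback.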

The optimist in me hopes that it could be a tempting task for those who want to volunteer but don’t have time to come in during the normal working day. Transcribing digitized records can be done in the middle of the night in your pajamas from anywhere in the world. Talk about increasing your pool of possible volunteers! I would think that it could even be an interesting project for high school and college students – a chance to work with primary sources. With careful design, I can even imagine providing an option to select from a preordained set of subjects or tags (or, in a folksonomy-friendly environment, the option to add any tags that the transcriber deems appropriate) – though that may be another topic worthy of its own exploration independent of transcription.

The initial investment for a project like this would come from building a framework to support a distributed group of volunteers. You would need an easy way to serve up a record or group of records to a volunteer and prevent duplication of effort – but this is an old problem with good solutions from the configuration management world of software development and other collaboration work environments.
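To make the "check out" idea concrete, here is a minimal Python sketch of one way to hand out pages so that no two volunteers work on the same page at once. Everything in it (the PageQueue class, the one-hour lease, the method names) is hypothetical and only for illustration. A real system would keep this state in a database, but the logic would be similar: a page is leased to one volunteer, and it goes back into the pool if the lease runs out before anything is submitted.

```python
import time


class PageQueue:
    """Hands out pages to volunteers, one volunteer per page at a time."""

    def __init__(self, page_ids, lease_seconds=3600):
        self.lease_seconds = lease_seconds
        self.pending = list(page_ids)  # pages nobody is working on
        self.leases = {}               # page_id -> (volunteer, expiry time)
        self.done = {}                 # page_id -> submitted transcription

    def _reclaim_expired(self, now):
        """Return abandoned pages to the pending list."""
        for page_id, (_, expires) in list(self.leases.items()):
            if expires <= now:
                del self.leases[page_id]
                self.pending.append(page_id)

    def check_out(self, volunteer, now=None):
        """Lease the next available page to a volunteer, or return None if none is free."""
        now = time.time() if now is None else now
        self._reclaim_expired(now)
        if not self.pending:
            return None
        page_id = self.pending.pop(0)
        self.leases[page_id] = (volunteer, now + self.lease_seconds)
        return page_id

    def submit(self, volunteer, page_id, text, now=None):
        """Accept a transcription only from the volunteer who currently holds the lease."""
        now = time.time() if now is None else now
        holder = self.leases.get(page_id)
        if holder is None or holder[0] != volunteer or holder[1] <= now:
            return False
        del self.leases[page_id]
        self.done[page_id] = text
        return True
```

A volunteer's browser would call check_out to get a page and submit when the typing is finished. The lease is the piece borrowed from version control style check-outs: it prevents duplicated effort without letting an abandoned page stay locked forever.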

It makes a nice picture in my mind – a slow, but steady, team effort to transcribe collections like the Colorado River Bed Case (2,125 pages of digitized microfilm at the University of Utah’s J. Willard Marriott Library) – mostly done from people’s homes on their personal computers in the middle of the night. A central website for managing digitized archival transcriptions could give the research community the ability to vote on the next collection that warrants attention. Admit it – you would type a page or two yourself, wouldn’t you?

SAA 2007 Session Proposal Submitted

Abby submitted the completed panel proposal for our “Preserving Context and Original Order in a Digital World” panel for SAA 2007. We recruited both a 3rd person to join our panel (Jean-François Blanchette) and a panel chair (L. Rebecca Johnson Melvin). We also earned an endorsement from the EAD Roundtable. Now all we can do is try not to think about it.

Thanks to everyone for your encouragement and support.

Reflections on Blogging at SAA 2006

Mark A. Matienzo’s recent post (and its related comments) On what “archives blogs” are and what ArchivesBlogs is not got me thinking about my experience of blogging SAA 2006 again (as well as making me want to send out a special thank you to everyone for their kind words – as much as I am writing for myself, I will admit to being encouraged that there are others who find my posts worth reading).

Since there was no internet available in the rooms where the panels were held, I found myself taking notes on my laptop. 37 pages of notes later, sitting at home alone trying to convert those notes into coherent posts, I sometimes found it hard not to be overwhelmed. It was interesting to try to strike a balance between sharing the ideas the panelists had presented and including my own insights. I think what I ended up with was a decent mix – with the opportunity to include ideas about the connections among many of the panel topics, as well as other ideas and websites from outside the conference. On the downside – I never did finish writing up all the talks I took notes on. The scale of the task got to me – and I realized that I had started to wish I could write about something else. So I did!

I do wonder how different my posts would have been if I could have posted them live. I think that I would have covered a greater breadth of speakers – but with a loss of depth. I would have had less opportunity to reflect on how the speakers’ talks connected with the rest of the archival world – especially those examples and other ideas I was able to link to as a result of my extra time.

I hope that we (i.e., anyone who wants to try their hand at it) can coordinate a broader group of bloggers at SAA 2007 in Chicago, both to share the ideas presented with those who could not attend and to permit further reflection on connections among all the new ideas that might otherwise be hard to share. The library community is ahead of us on this front. Take a look at the page for the Public Library Association’s recent conference in Boston. This page gives people an easy link to view the posts from the PLA 2006 conference – while spreading the work among many keyboards. Perhaps there is a place for something like this in the future of archives conferences.

Records Speaking to the Present: Voices Not Silenced

When I composed my main essay for my application to University of Maryland’s MLS program, I wrote about why I was drawn to their Archives Program. I told them I revel in hearing the voices of the past speak through records such as those at I love the power that records can wield – especially when they can be accessed digitally from anywhere in the world. It is this sort of power that let me see the ship manifests and the names of the boats on which my grandparents came to this country (such as The Finland ).

All this came rushing back to me while reading the September 18th article 2 siblings reunited after being separated in Holocaust. The grandsons of a Holocaust survivor looked up their grandmother in Yad Vashem’s central database of Shoah Victims’ Names – and found an entry stating that she had died during the Holocaust. One thing led to another – and two siblings who thought they had lost each other 65 years earlier were reunited.

The fact that access to records can bring people together across time speaks to me at a very primal level. So now you know – I am a romantic and an optimist (okay, if you have been reading my blog already – this shouldn’t come as any surprise). I want to believe that people who were separated long ago can be reunited – either through words or in person. This isn’t the first story like this – a quick search in Google News turned up others – such as this Holocaust reunion story from 2003.

This led me to do more research into how archival records are being used to find people lost during the Holocaust.

The Red Cross Holocaust Tracing Center has researched 28,000 individuals – and found over 1,000 of them alive since 1990. The FAQ on their website states that they believe there to be over 280,000 Holocaust survivors and family members in the United States alone and that they believe their work may continue for many years. As much as I love the idea of finding a way to provide access to digitized records – it is easy to see why the Tracing Center isn’t going away anytime soon. First of all – consider their main data sources – lots of private information that likely does NOT belong someplace where it can be read by just anyone:

While the American Red Cross has been providing tracing for victims of WWII and the Nazi regime since 1939, impetus for the creation of the center occurred in 1989 with the release of files on 130,000 people detained for forced labor and 46 death books containing 74,000 names from Auschwitz. Microfilm copies released to the International Committee of the Red Cross (ICRC) by the Soviet Union provided the single largest source of information since the end of WWII.

The staff of the center have also forged strong ties with the ICRC’s International Tracing Service in Arolsen, Germany – and get rapid turnaround times for their queries as a result. They have access to many organizations, archives and museums around the world in their hunt for evidence of what happened to individuals. They use all the records they can find to discover the answers to the questions they are asked – to be the detectives that families need to discover what happened to their loved ones. To answer the questions that have never been answered.

The archive of the USC Shoah Foundation Institute for Visual History and Education consists of 52,000 testimonies of survivors and other witnesses to the Holocaust collected in 56 countries and 32 languages from 1994 through 2000. These video testimonies document experiences before, during and after the Holocaust. It is the sort of firsthand documentation that just could not have existed without the vision and efforts of many. They say on their FAQ page:

Now that this unmatched archive has been amassed, the Shoah Foundation is engaged in a new and equally urgent mission: to overcome prejudice, intolerance, and bigotry – and the suffering they cause – through the educational use of the Foundation’s visual history testimonies… Currently, the Foundation is committed to making these videotaped testimonies accessible to the public as an international educational resource. Simultaneously, an intensive program of cataloguing and indexing the testimonies is underway. This process will eventually enable researchers and the general public to access information about specific people, places, and experiences mentioned in the testimonies in much the same way as an index permits a reader to find specific information in a book.

The testimonies also serve as a basis for a series of educational materials such as interactive web exhibits, documentary films, and classroom videos developed by the Shoah Foundation.

I guess I am not sure where I am going with this – other than to point out a dramatic array of archives that are touching the lives of people right now. Consider this post a fan letter to all the amazing people who have shepherded these collections (and in some cases their digital counterparts) into the twenty-first century, where they will continue to help people hear the voices of their ancestors.

I have more ideas brewing on how these records compare and contrast with those about the survivors and those who were lost to 9/11, the Asian Tsunami and Katrina. How do these types of records compare with the Asian Tsunami Web Archive or the Hurricane Digital Memory Bank? Where will the grandchildren of those who lost their homes to Katrina go in 30 years to find out what street the family home used to be on? Who will bear witness to the people lost in Asia to the Tsunami? Lots to think about.