Conversations and articles about the problem of archiving and accessing e-mail are often accompanied by the wringing of hands or the shrugging of shoulders. It has often seemed to me that figuring out how to archive and facilitate access to e-mail is a challenge that most people would rather ignore because it seems so difficult (and because there are plenty of other things that need work too).
“In October 2003 the US Federal Energy Regulatory Commission placed 200,000 of Enron’s internal emails from 1999-2002 into the public domain as part of its ongoing investigations.” So says TrampolineSystems on their fascinating website that lets you explore those 200,000 public domain e-mails using their SONAR platform (that stands for Social Networks and Relevance). I would highly recommend taking a look and browsing around the Enron e-mails.
It appears that SONAR somehow tags the emails without human intervention – though they do not state this specifically one way or the other. The implication from the SONAR PR page is that you plug in the platform – and you instantly have this new access to your information. It is my impression that this works for either a fixed collection of e-mails (as is the case with the Enron emails) – or for an active live e-mail collection that is changing over time.
I like the social network Visualizer and the way it shows you how people are related to one another as represented by their e-mail correspondence. I like the theme and people tag clouds. I like the ease with which I can search for and read emails. I like how clearly they specify what you searched on at the top of your e-mail result list – and how many e-mails, people and themes the list represents.
On the other hand, there are a number of things I wish I could do. I wish that it were clear to me what order the emails are listed in when I do a search on a term. I searched on the word ‘pager’ – and received 2,012 emails in no obvious order (most likely relevance – but that is not at all clear). I would like to be able to re-sort the results (by date, for example). I would like to be able to combine multiple tags and people to get a scoped list of emails between two people on a specific set of themes.
Just as in traditional archival collections – there is some non-unique information in the mix. I found a generic Hotwire promotional email while looking at the theme The Insider (4th hit on the list). While I suppose spam and legitimate e-mail ads (i.e., ones you asked for) are interesting – perhaps software that selects e-mail for permanent retention could filter some of these out.
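To make that wish concrete, here is a minimal sketch of the kind of scoped, date-sorted query I have in mind. It says nothing about how SONAR actually works: the `Email` record, the `scoped` function and the sample names are all hypothetical, and I am assuming each e-mail already carries a set of theme tags.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Email:
    # Hypothetical record; the field names are assumptions, not SONAR's data model.
    sender: str
    recipients: set[str]
    sent: date
    themes: set[str] = field(default_factory=set)


def scoped(emails, people, themes):
    """Return e-mails exchanged among `people` that carry every theme in `themes`, oldest first."""
    people, themes = set(people), set(themes)
    hits = [
        e for e in emails
        if e.sender in people                    # written by one of the chosen people
        and e.recipients & (people - {e.sender})  # and sent to another of them
        and themes <= e.themes                   # and tagged with all requested themes
    ]
    return sorted(hits, key=lambda e: e.sent)


# Made-up sample data, purely for illustration.
emails = [
    Email("alice", {"bob"}, date(2001, 5, 2), {"pager", "trading"}),
    Email("bob", {"alice", "carol"}, date(2000, 11, 20), {"pager"}),
    Email("carol", {"alice"}, date(2001, 1, 15), {"pager"}),
    Email("alice", {"bob"}, date(2001, 3, 1), {"lunch"}),
]

result = scoped(emails, {"alice", "bob"}, {"pager"})
```

With this sample data, `result` holds only the two ‘pager’ e-mails that alice and bob sent each other, in date order. Sorting by a different key (sender, say) would just mean swapping the `key` function.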
I like clicking on things in the Visualizer and seeing the social networks hidden within the e-mails – but that gets old quickly unless you are looking for something very specific. I found myself wanting more context. Who are these people? What are their jobs? How are they ‘officially’ related in the corporate hierarchy? How do these e-mails compare with a timeline of events? What about the content of attachments (they don’t seem to be part of this interface)? All of this information could be linked into this interface in such a way as to improve an outsider’s understanding of this amazing landscape of 200,000 e-mails.
All in all I think it is an excellent starting point and I applaud them for trying to find an answer to the email question rather than just ignoring the problem.