
Category: EAD

EAD, Encoded Archival Description, is an international standard for the XML encoding of archival finding aids.

SAA2008 Here I Come! After the Revolution: Unleashing the Power of EAD

I got the word just before the holidays – the panel proposal of which I was a part has been accepted for SAA 2008 in San Francisco. The title of the panel is ‘After the Revolution: Unleashing the Power of EAD’ and the working title for my paper/presentation is ‘Visualizing Archival Collections: Leveraging the Power of EAD’.

My co-presenters are Max Evans (currently of the NHPRC, soon to be of the LDS Church Historical Department) and Elizabeth Yakel (of University of Michigan, School of Information). Jodi Allison-Bunnell from Northwest Digital Archives, Orbis Cascade Alliance is our panel Chair.

This is the description of our panel that we submitted with our proposal:

Encoded Archival Description (EAD) was created in 1995 to increase uniformity and interoperability of data about archival collections to facilitate discovery. It has yet to realize that goal: most online finding aids merely recreate paper documents. Speakers will demonstrate how the structured, standardized nature of EAD can form the basis of user-friendly interfaces and finding aids that can accommodate multiple perspectives and utilize graphical and visual interfaces–while faithfully recording and presenting the context, structure, and content of the collection. Panelists will also address the challenges of unleashing the power of EAD, including normalizing XML, the lack of standard values for cross-institutional aggregation of data, and different approaches to subject terms, with a discussion of the technological and practical issues that surround them. The session relates to the SAA strategic priorities of technology and public awareness and engages elemental questions of revolutionary and evolutionary change.

My portion of the panel will focus on my ArchivesZ information visualization project. I will discuss both the power of this type of graphical interface to archival collections and the roadblocks to its practical implementation. My plan is to continue the work I started last Spring over the course of this Spring and Summer – and show off a new version of ArchivesZ in San Francisco (as well as online here of course!).

Here are the descriptions of Max, Elizabeth and Jodi’s planned contributions (cribbed from our proposal submission):

  • Max Evans will explore the fundamental purposes of finding aids and explore what can be done to leverage EAD’s structure to render graphical, informative, and elegant finding aids online.
  • Elizabeth Yakel will discuss usability test findings and how these were incorporated into the EAD-based Polar Bear Expedition Digital Collections to allow communities to engage with collections in new ways.
  • Jodi Allison-Bunnell brings a lively interest in user-centered presentations of finding aids that emerge from her work as manager of a five-state EAD consortium.

I am so pleased and excited. So – who is planning on going to San Francisco in August? I hope to see you there.

Image Credit: Society of American Archivists, ARCHIVES 2008: Archival R/Evolution & Identities web page.

Visualizing Archival Collections

As I mentioned earlier, I am taking an Information Visualization class this term. For our final class project I managed to inspire two other classmates to join me in creating a visualization tool based on the structured data found in the XML version of EAD finding aids.

We started with the XML of the EAD finding aids from University of Maryland’s ArchivesUM and the Library of Congress Finding Aids. My teammates have written a parser that extracts various things from the XML such as title, collection size, inclusive dates and subjects. Our goal is to create an innovative way to improve the exploration and understanding of archival collections using an interactive visualization.
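As a rough illustration of the kind of extraction described above (this is not my teammates' parser), here is a minimal Python sketch. It assumes an un-namespaced, EAD 2002 style file in which the collection-level title, inclusive dates and extent sit under archdesc/did and the subjects sit under controlaccess. The sample document, the function name parse_finding_aid and the dictionary keys are all made up for the example; real finding aids vary a lot in how they encode these fields.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified finding aid used only for this example.
SAMPLE_EAD = """<ead>
  <archdesc level="collection">
    <did>
      <unittitle>Example Family Papers</unittitle>
      <unitdate type="inclusive">1850-1920</unitdate>
      <physdesc><extent>12.5 linear feet</extent></physdesc>
    </did>
    <controlaccess>
      <subject>Maryland</subject>
      <subject>Agriculture</subject>
    </controlaccess>
  </archdesc>
</ead>"""


def _text(node):
    """Return all text inside an element, or None if the element is missing."""
    if node is None:
        return None
    return "".join(node.itertext()).strip()


def parse_finding_aid(xml_text):
    """Pull title, inclusive dates, extent and subjects from one EAD document."""
    root = ET.fromstring(xml_text)  # root is the <ead> element
    did = root.find("./archdesc/did")
    if did is None:
        raise ValueError("no collection-level <did> found")
    return {
        "title": _text(did.find("unittitle")),
        "inclusive_dates": _text(did.find("unitdate[@type='inclusive']")),
        "extent": _text(did.find("physdesc/extent")),
        "subjects": [
            _text(s) for s in root.findall("./archdesc/controlaccess//subject")
        ],
    }


record = parse_finding_aid(SAMPLE_EAD)
print(record)
```

Files that use the EAD schema namespace would need namespace-aware paths, and extent statements would still need to be normalized into a number before they could be summed.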

Our main targets right now are to use a combination of subjects, years and collection size to give users a better impression of the quantity of archival materials that fit various search criteria. I am a bit obsessed about using the collection size as a metric for helping users understand the quantity of materials. If you do a search for a book in a library’s catalog – getting 20 hits usually means that you are considering 20 books. If you consider archival collections – 20 hits could mean 20 linear feet (20 collections each of which is 1 linear foot in size) or it could mean 2000 linear feet (20 collections each of which is 100 linear feet in size). Understanding this difference is something that visualization can help us with. Rather than communicating only the number of results – the visualization will communicate the total size of collections assigned each of the various subjects.
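The arithmetic behind that example can be sketched in a few lines of Python. This is only an illustration: the record layout (a list of dictionaries with 'subjects' and 'linear_feet' keys) and the function name are assumptions for the example, not the data model of our tool.

```python
from collections import defaultdict


def total_feet_by_subject(collections):
    """Sum collection sizes (in linear feet) under each assigned subject."""
    totals = defaultdict(float)
    for collection in collections:
        for subject in collection["subjects"]:
            totals[subject] += collection["linear_feet"]
    return dict(totals)


# Two hypothetical result sets, each with 20 hits for the same subject.
small_hits = [{"subjects": ["Maryland"], "linear_feet": 1.0} for _ in range(20)]
large_hits = [{"subjects": ["Maryland"], "linear_feet": 100.0} for _ in range(20)]

small_totals = total_feet_by_subject(small_hits)
large_totals = total_feet_by_subject(large_hits)

print(len(small_hits), small_totals)  # 20 hits, 20 linear feet
print(len(large_hits), large_totals)  # 20 hits, 2000 linear feet
```

Both searches report the same number of hits, but the totals differ by a factor of one hundred, which is the difference the visualization is meant to surface.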

I have uploaded two preliminary screen mockups, one here and the second here, to try to get at my ideas for how this might work.

Not reflected in the mock-ups is what could happen when a user clicks on the ‘related subject’ bars. Depending on where they click, one of two things could happen. If they click on the ‘related subject’ bar WITHIN the boundaries of the selected subject (in the case above, that would mean within the ‘Maryland’ box), then the search would filter further to show only those collections that have both the ‘Maryland’ tag and the newly added one. The ‘related subjects’ list and displayed year distribution would change accordingly as well. If, instead, the user clicks on a ‘related subject’ bar OUTSIDE the boundary of the selected subject – then that subject would become the new (and only) selected subject and the displayed collections, related subjects and years would change accordingly.
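The two click behaviors can be written down as a small piece of selection logic. The Python below is a sketch of the idea only, not code from our project; the function names, the inside_selected_box flag and the sample collections are all invented for the example.

```python
def click_related_subject(selected, clicked, inside_selected_box):
    """Return the new set of selected subjects after a click on a related-subject bar."""
    if inside_selected_box:
        # Click within the selected subject's box: narrow the current search.
        return selected | {clicked}
    # Click outside the box: the clicked subject replaces the selection.
    return {clicked}


def matching_collections(collections, selected):
    """Keep only collections tagged with every selected subject."""
    return [c for c in collections if selected <= set(c["subjects"])]


# Hypothetical collections for illustration.
collections = [
    {"title": "A", "subjects": ["Maryland", "Agriculture"]},
    {"title": "B", "subjects": ["Maryland"]},
    {"title": "C", "subjects": ["Agriculture"]},
]

narrowed = click_related_subject({"Maryland"}, "Agriculture", inside_selected_box=True)
replaced = click_related_subject({"Maryland"}, "Agriculture", inside_selected_box=False)

narrowed_titles = [c["title"] for c in matching_collections(collections, narrowed)]
replaced_titles = [c["title"] for c in matching_collections(collections, replaced)]
print(narrowed_titles)  # collections tagged with both subjects
print(replaced_titles)  # collections tagged with the newly selected subject
```

After either kind of click, the related subjects list and the year distribution would be recomputed from whatever matching_collections returns.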

So that is what we have so far. If you want to keep an eye on our progress, our team has a page up on our class wiki about this project. I have a ton of ideas of other things I would love to add to this (my favorite being a map of the world with indications of where the largest amount of archival materials can be found based on a keyword or subject search) – but we have to keep our feet on the ground long enough to actually build something for our class project. This is probably a good thing. Smaller goals make for a greater chance of success.

Spring 2007: Access and Information Visualization

I don’t often post explicitly about my experiences as a graduate student – but I want to let everyone know about the focus of my studies for the next four months. I am taking two courses that I hope will complement one another. One course is on Archival Access (description, MARC, DACS, EAD and theory). The other is on Information Visualization over in the Computer Science department.

My original hope was that in my big Information Visualization final project I might get the opportunity to work with some aspect of archives and/or digital records. I want to understand how to improve access and understanding of the rich resources in the structured digital records repositories in archives around the world. What has already happened just one week into the term is that I find myself cycling through multiple points of view as I do my readings.

How can we support interaction with archival records by taking advantage of the latest information visualization techniques and tools? We can make it easier to understand what records are in a repository – both analog and digital records. I have been imagining interactive visual representations of archives collections, time periods, areas of interest and so forth. When you visit an archives’ website – it can often be so hard to get your head around the materials they offer. I suspect that this is often the case even when you are standing in the same building as the collections. In my course on appraisal last term we talked a lot about examining the collections that were already present on the path to creating a collecting policy. I am optimistic about ways that visualizing this information could improve everyone’s understanding of what an archives contains, for archivists and researchers alike.

Once I get myself to stop those daydreams… I move on to the next set of daydreams. What about the products of these visual analytics tools? How do we capture interactive visualizations in archives? This seems like a greater challenge than the average static digital record (as if there really is such an animal as an ‘average’ digital record). I can see a future in which major government and business decisions are made based on the interpretation of such interactive data models, graphs and charts. Instead of needing just the ‘records’ – don’t we need a way to recreate the experience that the original user had when interacting with the records?

This (unsurprisingly) takes me back to the struggle of how to define exactly what a record is in the digital world. Is the record a still image of a final visualization? Can this actually capture the full impact of an interactive and possibly 3D visualization? With information visualization being such a rich and dynamic field I feel that there is a good chance that the race to create new methods and tools will zoom far ahead of plans to preserve its products.

I think some of my class readings will take extra effort (and extra time) as my mind cycles through these ideas. I think that a lot of this will come out in my posts over the next four months. And I still have strong hopes for rallying a team in my InfoViz class to work on an archives related project.

Session 305: Extended Archival Description Part I – Archives of American Art

Session 305 included perspectives from three digital collections which are trying to use EAD and meta data to solve real world problems of navigation and access. This post addresses the presentation by the first speaker, Barbara Aikens from the Archives of American Art at the Smithsonian.

The Archives of American Art (AAA) has over 4,500 collections focusing on the history of American art. They received a 3.6 million dollar grant from the Terra Foundation to fund their 5 year project. They had already been using EAD as their standard for online finding aids since 2004. They also had already looked into digitizing their microfilmed holdings and they believe that the history of microfilming at AAA made the transition to scanning entire collections at the item level easier than it might otherwise have been. So far they have digitized 11 full collections (45 linear feet).

Their organization of the digitized files was based on collection code, box and folder. Basing their template on the EAD Cookbook, AAA used NoteTab Pro to create their XML EAD finding aid. I wonder how they might be able to take advantage of the open source software tools being developed such as Archon and the Archivists’ Toolkit (if you are interested in these packages, keep your eye open for my future post looking at them each in detail). There was some mention of re-purposing DCDs, but I was not clear about what they were describing.

The resulting online finding aid lets you read all the information you would expect to find in a finding aid (see an example), as well as permitting you to drill down into each series or container to view a list of folders. Finally the folder view provides thumbnails on the left and a big image on the right. Note that this item level folder view includes very basic folder meta data and a link back to that folder’s corresponding series page. There is no meta data for any of the images of individual items. This approach for organizing and viewing digitized collections is workable for large collections. The context is well communicated and the user’s experience is very like that of going through a collection while physically visiting an archive. First you use the finding aid to locate collections of interest. Next you examine the Series and/or Container descriptions to locate the types of information for which you are looking. Finally, you can drill down to folders with enticing names to see if you can find what you need.

As an experiment, I tested the ‘Search within Collections/Finding Aids’ option by searching for “Downtown Gallery” and for gallery artist files to see if I was given a link to the new Downtown Gallery Records finding aid. My search for “Downtown Gallery” instead directed me to what appears to be a MARC record in the Smithsonian Archives, Manuscripts and Photographs catalog. Two versions of the finding aid are linked to from this record – with no indication as to how they are different (it turned out one was an old version – the other the new one which includes links to the digitized content). A bit more experimentation showed me that the new online collection finding aids are not integrated into the search. I will have to remember to try this sort of searching in a few months to see what the search experience is like.

What I was hoping for (in a perfect world) would be highlighting of the search terms and deep linking from the search results directly to the series and folder description pages. I wonder what side effects there will be for the accuracy of search results given that the series/folder detail description page does not include all the other text from the main finding aid (i.e., New Finding Aid vs. New Finding Aid Series Level Page). Oddly enough – the old version of the finding aid for this same collection includes the folder level descriptions on the SAME page (with HTML anchors permitting linking from the side bar Table of Contents to the correct location on the page). So a search for terms that appear in the historical background along with the name of an artist only listed at the folder level WOULD return results (in standard text searching) for the old finding aid but not for the new one. Once the new finding aids are integrated into the search results – it would be very helpful to have an option to only return finding aids that include digitized collections.
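To make that side effect concrete, here is a toy Python sketch of a page-by-page "all terms must appear" text search. It is an assumption about how a standard text search would behave, not a description of the Smithsonian's actual search engine, and the page names, page text and artist name are invented.

```python
def pages_matching_all(pages, terms):
    """Return the names of pages whose text contains every search term."""
    return [
        name
        for name, text in pages.items()
        if all(term.lower() in text.lower() for term in terms)
    ]


# Old style: background and folder list live on one page (hypothetical text).
old_finding_aid = {
    "finding-aid": "Historical background of the gallery. Folder 12: Jane Doe correspondence.",
}

# New style: the same text split across a main page and a series page.
new_finding_aid = {
    "main": "Historical background of the gallery.",
    "series-1": "Folder 12: Jane Doe correspondence.",
}

terms = ["historical background", "Jane Doe"]
old_hits = pages_matching_all(old_finding_aid, terms)
new_hits = pages_matching_all(new_finding_aid, terms)
print(old_hits)  # the single combined page matches
print(new_hits)  # no single page contains both terms
```

Indexing each series page together with the text of its parent finding aid would be one way around this, at the cost of more duplicate hits.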

While exploring the folder level view, I assumed that the order of the images in the folders is the original order in the analog folder. If so, then that is a fabulous and elegant way of communicating the original order of the records to the user of the digital interface. If NOT – then it is quite misleading because a user could easily assume, as I did, that the order in which they are displayed in the folder view is the original order.

Overall, this is exciting work – and shows how well the EAD can function as a framework for the item level digitization of documents. It also points to some interesting questions about how to handle search within this type of framework.

UPDATE: See the comment below for the clarification that the new finding aids based on the work described in this presentation are NOT online yet – but should be at the end of the month (posted: 08/09/2006).