access | Spellbound Blog

Amazing Visual Tools

November 21, 2006 1 Comment

This has been a great week for exploring new visual tools. This post will look at one new tool that can compare photos to find images that have similarities to one another, another that creates 3D photo montages and a third that presents grouped keywords for image browsing.

First there is Riya, a visual search engine still in beta. I was especially intrigued by the Riya Personal Search that will let you use “face and text recognition to auto tag your photos”. Imagine a collection of 10,000 photos that include many combinations of the same 25 people. Instead of going through and hand tagging them all – you should be able to use this feature to select good examples of each of the 25 people’s faces and let Riya’s magic do the busywork for you.

Over on Research Buzz they talk about Riya’s first commercial venture Like.com in Visual Search Engine for Finding Items by Photo. Basically this site lets you find clothing and accessories based on images in celebrity photos. What about buildings? What about automatic tagging of location by recognition of landmarks with distinctive shapes?

That leads us to Photosynth from Microsoft Live Labs. Photos are compared and stitched together into 3D models of space. (Read more about how here.) They claim that the version they have out there now is a ‘tech preview’ or a ‘sneak peek’ at what is being built for real back in the labs. I played with it a bit – and it is very interesting. I love the idea overall – but I wish that there were a mode in which you could see the 3D world with many photos showing at once. Right now it shows you a sort of 3D point plotted version of the physical structures and then ‘hangs’ the individual photos in the proper place to give you a bit of ‘skin’ on the 3D skeleton as you browse through the photos. Take a look and play with it to see what I mean.

My fascination with this (beyond the cool factor) is the idea of feeding in photos taken a long time ago to create a virtual 3D space to walk through. The lower east side of New York City in the early 1900s or New Orleans before Katrina. I also wonder at being able to create 3D environments with a tie into time so that another slider control on the screen could let you alter the time you were viewing such that you could view the same facade change in fast forward.

Finally I wanted to share the approach taken by CSA Images for their Visual Brainstorming Index. They have provided a way to view keywords grouped by topic. Their goal is to inspire their customers by showing them terms that might jump out as being useful. This appears to be a great way to give users a quick grasp of what sorts of information (in this case images) can be found on this site. If a standard tool could show classified sets of keywords for users to explore it could go a long way toward easily communicating the kinds of information available in a digitized archive online – especially if the browsable terms evolved as records were added to the collection. I am sure there are other great ways to communicate the big picture, but I think that a way to browse keywords or tags would give users a handle on the records in a way that a well written finding aid might not manage. Ask 100 users what a finding aid is. Then ask the same 100 users what a keyword or tag is. Moving to a model that users are comfortable with can only increase the use of the records we are working so hard to put online. Of course this assumes that the records in question are assigned keywords, tags or subject terms of some sort – but I think there are many paths to that (including the aforementioned Riya tool).
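To make the idea of browsable, classified keywords a little more concrete, here is a minimal sketch in Python. It is not how CSA Images built their index; the records, the keyword-to-topic mapping and the function name are all invented for illustration. It simply rebuilds the grouped index from whatever records exist, so the browsable terms change as records are added.

```python
from collections import Counter, defaultdict

# Hypothetical records: each digitized item carries a list of keywords or tags.
RECORDS = [
    {"id": "photo-001", "keywords": ["bridge", "river", "steamboat"]},
    {"id": "photo-002", "keywords": ["bridge", "parade"]},
    {"id": "letter-003", "keywords": ["parade", "mayor"]},
]

# Hypothetical classification of keywords into broad topics.
TOPIC_OF = {
    "bridge": "Places and structures",
    "river": "Places and structures",
    "steamboat": "Transportation",
    "parade": "Events",
    "mayor": "People",
}


def grouped_keywords(records, topic_of):
    """Return {topic: [(keyword, record_count), ...]} for browsing."""
    # Count each keyword once per record, even if it was entered twice.
    counts = Counter(kw for rec in records for kw in set(rec["keywords"]))
    groups = defaultdict(list)
    for keyword, count in counts.items():
        groups[topic_of.get(keyword, "Unclassified")].append((keyword, count))
    # Within each topic: most used keywords first, then alphabetical.
    return {
        topic: sorted(pairs, key=lambda pair: (-pair[1], pair[0]))
        for topic, pairs in sorted(groups.items())
    }
```

Calling `grouped_keywords(RECORDS, TOPIC_OF)` again after new records arrive gives an updated index, and any keyword missing from the mapping shows up under "Unclassified" rather than disappearing.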

All three of these creations are interesting steps forward in searching through, processing and understanding images. As archivists make more progress in smoothing out the digitization process (or management of existing digital records), they will finally have more time to consider the wide array of tools that might make accessing those records easier. I hope that it happens sooner rather than later… and that those just starting the process of making archival records available online for research might make their plans with innovative ideas like these in mind.

Archival Context and Description – Taking It to the Next Level

November 14, 2006

In her post Describe, display, explain…, Jill Hurst-Wahl of the fabulous Digitization 101 talks about being more ambitious and committed to describing things well. This resonated very strongly for me, for so much of what I believe about enhancing access to and understanding of archival records online is tied up with describing them well. I don’t mean this solely in the traditional use of the term Description as it is used in the archival arena – but more in terms of the Five Ws.

They are not so very different really. The best finding aids and item level metadata can give you the same type of background information that would make any journalist proud. Who created these records? What are they? Why were they created? Where were they created? When? How?

This gives me a great chance to point people again to the Library of Congress American Memory Browse Collections page. I fell in love with it back when I was doing research for the paper that became my SAA 2006 Poster, but I couldn’t put my finger on how to explain why I liked it so much. I think it goes back to that basic journalistic approach to thinking about things. Am I interested in who created the records and why? Browse by topic. Am I interested in where the records were created? Browse by place. Am I interested in when the records were created? Browse by time period. You get the idea. (In a perfect world, there would be an advanced search option that would let me specify more than one of these interests at a time – but that is a topic for another day.)
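As a rough illustration of that advanced search wish, here is a minimal Python sketch of browsing by more than one facet at once. The records, facet names and the `browse` function are hypothetical and have nothing to do with how American Memory is actually built.

```python
# Hypothetical catalog records carrying the facets discussed above.
RECORDS = [
    {"title": "Canal survey maps", "topic": "Transportation",
     "place": "Ohio", "period": "1800-1849"},
    {"title": "Railroad photographs", "topic": "Transportation",
     "place": "Utah", "period": "1850-1899"},
    {"title": "Suffrage pamphlets", "topic": "Politics",
     "place": "Ohio", "period": "1850-1899"},
]


def browse(records, **facets):
    """Return the records that match every facet given, e.g. place="Ohio"."""
    return [
        rec for rec in records
        if all(rec.get(name) == value for name, value in facets.items())
    ]
```

With this sketch, `browse(RECORDS, place="Ohio")` is the single-facet browse that exists today, while `browse(RECORDS, topic="Transportation", place="Ohio")` narrows by two interests in one step.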

So much of this goes back to some of the basics of web design. If you care about this sort of thing and you haven’t yet read Steve Krug’s Don’t Make Me Think – do yourself a favor and get your hands on a copy. It is an easy, quick and fun read – and it will leave you wishing that you could force it on the designers of every frustrating website you have ever been forced to muddle through when trying to get things done.

What would it be like if users of online archival collections didn’t need to learn new terminology in order to get the most out of the records? What if you never had to hunt to find the background history for a record because it was so obvious where to click? While I love the idea of thorough item level descriptions – I understand why that is not a reasonable expectation for most digitized archival collections. What I want is the context for any record an easy and obvious click away.

Of course if you want the dreamy version, then I would encourage folksonomy tagging and end user descriptions of digitized items. I know this would not work many places – but I suspect some experimentation with this kind of model with the right sort of collections would take the “Do you know who is in this photo” sorts of appeals to the next level.

I remember asking in my first Archives course if the archivists take information from the users of the records and add that information back into the finding aids. The answer was of the “maybe… sometimes” variety – followed up with an explanation of how archivists often have private files about a collection that are not shown to patrons. The thought was that perhaps an archivist might add a note there. My thinking was that since one of the biggest challenges with archival records is that it is possible that any single researcher is looking at a record for the first time since it was created (at least in any intent sort of way) – wouldn’t you want to harness that attention and use it to help others know what that first person found there?

This is already being done at the University of Michigan with the Polar Bear Expedition Digital Collections. They permit entry of comments at the item level and support search of the text in those comments. Every time someone adds a comment they are extending the description of the item. Take a look at this comment on the Silver Parrish diary item (this was listed as the most recent comment just now when I looked at the site).
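Here is a small Python sketch of why searchable comments amount to extended description. It is only an assumption about how such a search could work, not the Polar Bear site's implementation, and the item, comment text and function names are invented.

```python
# Hypothetical items: an archivist's description plus comments left by users.
ITEMS = [
    {
        "id": "diary-01",
        "description": "Diary of a soldier, 1918-1919.",
        "comments": ["My great-uncle is mentioned here as the company cook."],
    },
    {
        "id": "photo-02",
        "description": "Group portrait of officers.",
        "comments": [],
    },
]


def matches(item, term):
    """True if the term appears in the description or in any user comment."""
    term = term.lower()
    texts = [item["description"], *item.get("comments", [])]
    return any(term in text.lower() for text in texts)


def search(items, term):
    """Return the ids of all items whose description or comments match."""
    return [item["id"] for item in items if matches(item, term)]
```

In this example a search for "cook" finds the diary only because a user commented on it; the original description alone would never have surfaced it.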

So thank you Jill for making me think about description again – from a new point of view. The more we do to describe everything fully, make it easy to find that information and let our users add more information to the mix – the more dynamic, usable and alive archival records will become.

Squirl.info – an interesting option for putting collections online

November 7, 2006 1 Comment

I got an interesting email from the folks over at Squirl.info. They wanted me to take a look and see what I thought of their site. I explored it a bit and exchanged emails back and forth with one of the founders (John McGrath).

Okay – remember those big dreams of mine? Specifically relating to A Hosting Service for Digitized Collections and Archival Transcriptions? Well Squirl.info looks like an interesting option to explore with these ideas in mind. I will admit that the current collections are predominately individual collectors showing off what they have accumulated over the years – but you can start to get an idea of how this might work for digitized collections from looking at the Lewis Carroll Postcards and London Hospitals Postcards.

I had lots of questions for John. What about more than one image per item (I want to see the backs of those postcards!) – he says it is on the way. What about the clause in the terms of service about deleting inactive accounts – he says only empty ones.

I wanted to know how customized each set of metadata was based on the type of object you entered, and whether he had considered any of the standards when setting them up. His answer (from an email to me):

In a previous life I wrote content management systems for publishing companies, so I’m aware of classification and taxonomic issues, and I made a conscious effort to play nice with the ones I was aware of when I built our system. Metadata can also be added to all items in the form of tags. We also have a feature that lets you export all your data in comma-separated files, letting you manipulate it however you’d like. Our general approach has been to give users as much flexibility with their data as possible, while trying to keep the site easy to use. And we’re wide open to suggestions.

So far so good. Then we get to the fun parts that the folks at Squirl have already implemented:

RSS: Imagine providing an RSS feed to fans of your institution as an easy way to provide a steady flow of the latest additions to your online collections.
Blog Integration: Check out the Blog Widgets that will let you add a block of up to 10 of your most recently added items from any of your collection to your blog or other web page.
Easy Interface: They have done a nice job with making it easy and obvious how to get your content online.
Inexpensive: Three collections for free – unlimited collections for $10 a year. There is some mention of no ads with the Squirl+plus plan, but that seems to have more to do with your experience when adding content rather than your users’ experiences when viewing your collections.
Integration: Squirl’s collection view pages include links to not only the RSS feed, but also quick and easy icons that let those who view your collection add the link to digg, del.icio.us, reddit and others.
Search: You get the free support of their built in search features.

My wish list is shorter:

an ad free mode for those viewing your collection
a direct URL for collections with a pretty name (maybe just a direct URL for an institution as the homepage for their collections – with a pretty name)
support for transcriptions (i.e., to support the ideas I described in my Archival Transcriptions post)
support for groups adding content – and control of the privileges for those users (John says this one is in the pipeline, but no promises on when).
a way to export your ENTIRE collection (tags, images, metadata – the whole nine yards) – of course, to a certain degree any web archiving software could do this for you

Go take a look at Squirl. See what you think. Create an account and experiment with your three free collections (you can keep them private if you like while you are experimenting). Ask the nice people for things you need – and see what they say. In a world where archivists are wishing for an easy way to get their collections online – this might be an answer for some of them. I suspect that some critical mass of archives using a single tool would help drive further support for features that archivists want and need.

Archival Transcriptions: for the public, by the public

October 12, 2006 11 Comments

There is a recent thread on the archives listserv that talks about transcriptions – specifically for small projects or those that have little financial support. There is even a case in which there is no easy OCR answer due to the state of the digitized microfilm records.
One of the suggestions was to use some combination of human effort to read the documents – either into a program that would transcribe them, or to another human who would do the typing. It made me wonder what it would look like to make a place online where people who wanted to could volunteer their transcription time. In the case where the records are already digitized and viewable, this seems like an interesting approach.

Something like this already exists for the genealogy world over at the USGenWeb Archives Project. They have a long list of different projects listed here. Though the interface is a bit confusing, the spirit of the effort is clear – many hands make light work. Precious genealogical resources can be digitized, transcribed and added to this archive to support the research of many by anyone – anywhere in the world.

Of course in the case of transcribing archival records there are challenges to be overcome. How do you validate what is transcribed? How do you provide guidance and training for people working from anywhere in the world? If I have figured out that a particular shape is a capital S in a specific set of documents, that could help me (or an OCR program) as I progress through the documents, but if I only see one page from a series – I will have to puzzle through that one page without the support of my past experience. Perhaps that would encourage people to keep helping with a specific set of records? Maybe you give people a few sample pages with validated transcriptions to practice with? And many records won’t be that hard to read – easy for a human’s eye but still a challenge for an OCR program.

The optimist in me hopes that it could be a tempting task for those who want to volunteer but don’t have time to come in during the normal working day. Transcribing digitized records can be done in the middle of the night in your pajamas from anywhere in the world. Talk about increasing your pool of possible volunteers! I would think that it could even be an interesting project for high school and college students – a chance to work with primary sources. With careful design, I can even imagine providing an option to select from a preordained set of subjects or tags (or, in a folksonomy-friendly environment, the option to add any tags that the transcriber deems appropriate) – though that may be another topic worthy of its own exploration independent of transcription.

The initial investment for a project like this would come from building a framework to support a distributed group of volunteers. You would need an easy way to serve up a record or group of records to a volunteer and prevent duplication of effort – but this is an old problem with good solutions from the configuration management world of software development and other collaboration work environments.
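One way to picture the "serve up a record and prevent duplication of effort" piece is a check-out with an expiring claim, loosely borrowed from the check-out model of version control. The Python sketch below is only that: the class, method names and one-hour lease are assumptions made for illustration, and a real service would keep this state in a database rather than in memory.

```python
import time


class PageQueue:
    """Hands out pages to volunteers one at a time, with an expiring claim."""

    def __init__(self, page_ids, lease_seconds=3600):
        self.lease_seconds = lease_seconds
        self.pending = list(page_ids)  # pages not yet transcribed
        self.claims = {}               # page_id -> (volunteer, expiry time)
        self.done = {}                 # page_id -> transcription text

    def check_out(self, volunteer, now=None):
        """Return the next free page for this volunteer, or None if none is free."""
        now = time.time() if now is None else now
        for page_id in self.pending:
            claim = self.claims.get(page_id)
            # A page is free if nobody claimed it or the claim has expired.
            if claim is None or claim[1] <= now:
                self.claims[page_id] = (volunteer, now + self.lease_seconds)
                return page_id
        return None

    def submit(self, volunteer, page_id, text):
        """Accept a transcription only from the volunteer holding the claim."""
        claim = self.claims.get(page_id)
        if claim is None or claim[0] != volunteer:
            return False
        self.done[page_id] = text
        self.pending.remove(page_id)
        del self.claims[page_id]
        return True
```

The expiring lease is the design choice that matters here: a volunteer who wanders off at 3am does not block a page forever, and a late submission from someone whose page was handed to another volunteer is refused instead of silently overwriting newer work.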

It makes a nice picture in my mind – a slow, but steady, team effort to transcribe collections like the Colorado River Bed Case (2,125 pages of digitized microfilm at the University of Utah’s J. Willard Marriott Library) – mostly done from people’s homes on their personal computers in the middle of the night. A central website for managing digitized archival transcriptions could give the research community the ability to vote on the next collection that warrants attention. Admit it – you would type a page or two yourself, wouldn’t you?

SAA 2007 Session Proposal: Preserving Context and Original Order in a Digital World

September 28, 2006 1 Comment

Abby Adams, Assistant Access and Outreach Archivist of the Richard B. Russell Library for Political Research and Studies, University of Georgia, and I are putting together a proposal for a session at SAA 2007 in Chicago. She and I found each other via my poster at SAA 2006: Communicating Context in Online Collections. We have been pondering many of the same questions related to the effective communication of context and original order in online digitized collections.

Our proposal is for a traditional 3 presentation panel with the title “Preserving Context and Original Order in a Digital World”. All we need now is a 3rd presenter, the endorsement of an SAA section or roundtable and (of course) the approval of the session selection committee. (And some plane tickets!)

This is the current version of our description for the proposal (mostly composed by Abby):

Now that digitization projects have become more common in archival repositories, users and archivists alike have uncovered problems when it comes to understanding the context of online materials. However, there are various ways to provide more contextual information, thus enhancing the use of digital archives. But, archivists must confront the obstacles surrounding this task by developing best practices and incorporating new software into their digitization projects. In order to simplify the problem, we should return to our traditional archival principles and draw connections to collection arrangement and description in a digital environment. Join three archivists to explore how to improve on “analog” techniques in the communication of context. When done right, the digitization of a collection will not only retain all the same opportunities for communicating context that we are familiar with, it may revolutionize the way that archivists and users interact and understand our records.

The short take on what we want to cover in our session’s presentations is:

What should archivists be doing to not lose context and original order information in the transition from analog records to digitized records?
What can digitization give us the ability to do that we couldn’t do in the analog world?
What tools and standards are out there today to help archivists do both of the above? What information should archivists be capturing to permit them to take advantage of the opportunities to communicate context and original order that these tools and standards offer?

Abby’s part of the session, titled “Where’s the Context? Enhancing Access to Digital Archives”, will examine the need for preserving context and original order when digitizing archival materials – focusing on how it enhances online use and access to archives. How can new systems retain the existing ability to communicate context and original order when moving from “analog” to “digital”?

My portion, “Communicating Context: The Power of Digital Interfaces”, will discuss what archivists can do in the digital world they cannot do (or at least not easily) with analog records to communicate context and original order. I will focus on various innovative methods to do this including the use of GIS, hot-linking for ease of navigation, the ability to ‘collect’ digital surrogates for examination and more. I plan to include a combination of exciting new interfaces doing great things alongside new ideas of what could be done. Keep your fingers crossed for us that there is internet access in the session rooms in Chicago.

We have a vision of a third speaker whose talk would consider what the leading standards and software tools are permitting people to do today. How can archivists leverage the existing and evolving standards (EAD, EAC, TEI and other DTDs) to capture and communicate context and original order in the digital world? In addition, it would provide a high level review of common software packages (Archon, Archivists’ Toolkit, ContentDM, and others) and how they address original order and context. Finally we have a notion of a checklist of what to capture when digitizing to take advantage of what these tools and standards can provide for you.

Are you our mystery 3rd panelist that we are having so much trouble finding? Your first tip is that you have already mapped out 5 powerpoint slides in your head and started scribbling a rough draft of the “Archivists’ Digitization Checklist for Preserving Context” on a scrap of paper near your computer.

Maybe you know someone who would be a great person to pitch this to? Or you have advice for us concerning who to pass our proposal along to in the great hunt for that elusive session endorsement?

The deadline looms large (October 9)! Please contact us either via email (jeanne AT spellboundblog DOT com and adamsabi AT uga DOT edu) or in the comments of this post.

Records Speaking to the Present: Voices Not Silenced

September 27, 2006 1 Comment

When I composed my main essay for my application to University of Maryland’s MLS program, I wrote about why I was drawn to their Archives Program. I told them I revel in hearing the voices of the past speak through records such as those at EllisIsland.org. I love the power that records can wield – especially when they can be accessed digitally from anywhere in the world. It is this sort of power that let me see the ship manifests and the names of the boats on which my grandparents came to this country (such as The Finland).

All this came rushing back to me while reading the September 18th article 2 siblings reunited after being separated in Holocaust. The grandsons of a Holocaust survivor looked up their grandmother in Yad Vashem’s central database of Shoah Victims’ Names – and found an entry stating that she had died during the Holocaust. One thing led to another – and two siblings that thought they had lost each other 65 years earlier were reunited.

The fact that access to records can bring people together across time speaks to me at a very primal level. So now you know – I am a romantic and an optimist (okay, if you have been reading my blog already – this shouldn’t come as any surprise). I want to believe that people who were separated long ago can be reunited – either through words or in person. This isn’t the first story like this – a quick search in Google News turned up others – such as this Holocaust reunion story from 2003.

This led me to do more research into how archival records are being used to find people lost during the Holocaust.

The Red Cross Holocaust Tracing Center has researched 28,000 individuals – and found over 1,000 of them alive since 1990. The FAQ on their website states that they believe there to be over 280,000 Holocaust survivors and family members in the United States alone and that they believe their work may continue for many years. As much as I love the idea of finding a way to provide access to digitized records – it is easy to see why the Tracing Center isn’t going away anytime soon. First of all – consider their main data sources – lots of private information that likely does NOT belong someplace where it can be read by just anyone:

While the American Red Cross has been providing tracing for victims of WWII and the Nazi regime since 1939, impetus for the creation of the center occurred in 1989 with the release of files on 130,000 people detained for forced labor and 46 death books containing 74,000 names from Auschwitz. Microfilm copies released to the International Committee of the Red Cross (ICRC) by the Soviet Union provided the single largest source of information since the end of WWII.

The staff of the center have also forged strong ties with the ICRC’s International Tracing Service in Arolsen, Germany – and get rapid turnaround times for their queries as a result. They have access to many organizations, archives and museums around the world in their hunt for evidence of what happened to individuals. They use all the records they can find to discover the answers to the questions they are asked – to be the detectives that families need to discover what happened to their loved ones. To answer the questions that have never been answered.

The USC Shoah Foundation Institute for Visual History and Education consists of 52,000 testimonies of survivors and other witnesses to the Holocaust collected in 56 countries and 32 languages from 1994 through 2000. These video testimonies document experiences before, during and after the Holocaust. It is the sort of first hand documentation that just could not have existed without the vision and efforts of many. They say on their FAQ page:

Now that this unmatched archive has been amassed, the Shoah Foundation is engaged in a new and equally urgent mission: to overcome prejudice, intolerance, and bigotry – and the suffering they cause – through the educational use of the Foundation’s visual history testimonies… Currently, the Foundation is committed to making these videotaped testimonies accessible to the public as an international educational resource. Simultaneously, an intensive program of cataloguing and indexing the testimonies is underway. This process will eventually enable researchers and the general public to access information about specific people, places, and experiences mentioned in the testimonies in much the same way as an index permits a reader to find specific information in a book.

The testimonies also serve as a basis for a series of educational materials such as interactive web exhibits, documentary films, and classroom videos developed by the Shoah Foundation.

I guess I am not sure where I am going with this – other than to point out a dramatic array of archives that are touching the lives of people right now. Consider this post a fan letter to all the amazing people who have shepherded these collections (and in some cases their digital counterparts) into the twenty-first century where they will continue to help people hear the voices of their ancestors.

I have more ideas brewing on how these records compare and contrast with those about the survivors and those who were lost to 9/11, The Asian Tsunami and Katrina. How do these types of records compare with the Asian Tsunami Web Archive or the Hurricane Digital Memory Bank? Where will the grandchildren of those who lost their homes to Katrina go in 30 years to find out what street the family home used to be on? Who will give witness to the people lost in Asia to the Tsunami? Lots to think about.

My New Daydream: A Hosting Service for Digitized Collections

September 20, 2006 3 Comments

In her post Predictions over on hangingtogether.org, Merrilee asked “Where do you predict that universities, libraries, archives, and museums will be irresistibly drawn to pooling their efforts?” after reading this article.

And I say: what if there were an organization that created a free (or inexpensive fee-based) framework for hosting collections of digitized materials? What I am imagining is a large group of institutions conspiring to no longer be in charge of designing, building, installing, upgrading and supporting the websites that are the vehicle for sharing digital historical or scholarly materials. I am coming at this from the archivist’s perspective (also having just pondered the need for something like this in my recent post: Promise to Put It All Online) – so I am imagining a central repository that would support the upload of digitized records, customizable metadata and a way to manage privacy and security.

The hurdles I imagine this dream solution removing are those that are roughly the same for all archival digitization projects. Lack of time, expertise and ongoing funding are huge challenges to getting a good website up and keeping it running – and that is even before you consider the effort required to digitize and map metadata to records or collections of records. It seems to me that if a central organization of some sort could build a service that everyone could use to publish their content – then the archivists and librarians and other amazing folks of all different titles could focus on the actual work of handling, digitizing and describing the records.

Being the optimist I am, I of course imagine this service as providing easy to use software with the flexibility for building custom DTDs for metadata and security to protect those records that cannot (yet or ever) be available to the public. My background as a software developer drives me to imagine a dream team of talented analysts, designers and programmers building an elegant web based solution that supports everything needed by the archival community. The architecture of deployment and support would be managed by highly skilled technology professionals who would guarantee uptime and redundant storage.

I think the biggest difference between this idea and the wikipedias of the world is that there would be some step required for an institution to ‘join’ such that they could use this service. The service wouldn’t control the content (in fact would need to be super careful about security and the like considering all the issues related to privacy and copyright) – rather it would provide the tools to support the work of others. While I know that some institutions would not be willing to let ‘control’ of their content out of their own IT department and their own hard drives, I think others would heave a huge sigh of relief.

There would still be a place for the Archons and the Archivists’ Toolkits of the world (and any and all other fabulous open-source tools people might be building to support archivists’ interactions with computers), but the manifestation of my dream would be the answer for those who want to digitize their archival collection and provide access easily without being forced to invent a new wheel along the way.

If you read my GIS daydreams post, then you won’t be surprised to know that I would want GIS incorporated from the start so that records could be tied into a single map of the world. The relationships among records related to the same geographic location could be found quickly and easily.
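As a back-of-the-envelope illustration of finding records tied to the same place, here is a Python sketch that uses the haversine formula to pull every record within a given distance of a point. The records and their rough coordinates are invented, and a real service would lean on a proper GIS or spatial index rather than scanning a list.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

# Hypothetical records with approximate, illustrative coordinates.
RECORDS = [
    {"id": "street-photo", "lat": 29.95, "lon": -90.07},
    {"id": "property-deed", "lat": 29.96, "lon": -90.06},
    {"id": "union-ledger", "lat": 41.88, "lon": -87.63},
]


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, using the haversine formula."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def records_near(records, lat, lon, radius_km):
    """Return ids of records located within radius_km of the given point."""
    return [
        rec["id"] for rec in records
        if distance_km(lat, lon, rec["lat"], rec["lon"]) <= radius_km
    ]
```

Asking for everything within 5 km of the first record returns the photo and the deed but not the ledger, which is the kind of relationship among records that a shared map would surface without anyone cataloging it by hand.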

Somehow I feel a connection in these ideas to the work that the Internet Archive is doing with Archive-IT.org. In that case, producers of websites want them archived. They don’t want to figure out how to make that happen. They don’t want to figure out how to make sure that they have enough copies in enough far flung locations with enough bandwidth to support access – they just want it to work. They would rather focus on creating the content they want Archive-It to keep safe and accessible. The first line on Archive-It’s website says it beautifully: “Internet Archive’s new subscription service, Archive-It, allows institutions to build, manage and search their own web archive through a user friendly web application, without requiring any technical expertise.”

So, the tag line for my new dream service would be “DigiCollection’s new subscription service, Digitize-It, allows institutions to upload, manage and search their own digitized collections through a user friendly web application, without requiring any technical expertise.”

Just Promise to Put It All Online

September 19, 2006

As reported in Inside Higher Ed’s article Harming the Historical Record, the NEH Guidelines for Scholarly Editions Grants have been updated. The crucial passage is as follows:

In keeping with the goals of the NEH Digital Humanities Initiative, the Scholarly Editions Program requires that applicants employ digital technology in the preparation, management, and online publication of all critical and documentary editions. Projects that include TEI (Text Encoding Initiative) conformant transcription and offer free online access are encouraged and will be given preference. (emphasis mine)

Offering free online access is encouraged (not required) and the description of the Digital Humanities Initiative does sound inspiring. It includes this sentence:

[The] NEH is interested in fostering the growth of digital humanities and lending support to a wide variety of projects, including those that deploy digital technologies and methods to enhance our understanding of a topic or issue; those that study the impact of digital technology on the humanities–exploring the ways in which it changes how we read, write, think, and learn; and those that digitize important materials thereby increasing the public’s ability to search and access humanities information.

I love a lot of the sites that they list as having been sponsored by the NEH (such as Valley of the Shadow and Maryland Institute for Technology in the Humanities), and I am always one for “increasing the public’s ability to search and access humanities information”, but it is so frustrating that the glamour of digital access to records would cross over into requirements for funding Scholarly Editions. The core goal of these grants is described as “[to] support the preparation by a team of at least two editors and staff of texts and documents that are currently inaccessible or available in inadequate editions.”

I feel strongly that this sort of expectation for digitization is rarely set with a full understanding of all the other elements that need consideration, ongoing support and financial backing.

First of all – what does it mean to be ‘put online’? While I understand that they likely desire online access to some version of the scholarly edition created with the grant funding – the requirement is still very vague. One could easily wonder if we are talking about images of the records? Transcriptions of the text of the records? What sort of supporting data must be provided? It isn’t as if you can just upload 10,000 scans of records and create a single page with links to them and call it a day. Of course no one would think that interface was useful, but it could certainly be considered as being online. Will the grants provide some provision for supporting the online sites in subsequent years? Websites need hardware, bandwidth and support from IT personnel. Unfortunately there is no accepted, open-source, freely hosted solution for serving up digital records. Some institutions have been experimenting with using Flickr as a Digital Collection Host – but that entire topic (and all the issues inherent therein) is fodder for another post in the future.

Next let us consider copyright and privacy issues. Many archival collections are kept, supported and maintained by an archival institution that does NOT in fact retain the copyright to the records. To demand that a project promise to publish all records for free online would unfairly punish collections that do not have the right to publish all the records online even if they have secured the rights to publish records in books. On the privacy side – archivists must often restrict access to certain records or selected series of a collection due to the private information about individuals included in those records or series. This presents yet another challenge to blanket digitization requirements.

The Inside Higher Ed article went on to mention that by requiring free publication of records online, the NEH is removing creative ways for institutions to find additional funding to support their important work. “Virginia is a major player in archival series, publishing — among others — the Papers of George Washington and the Papers of James Madison. Much material from those projects is placed online, free, Kaiserlian said. But Virginia is also selling site licenses to libraries to enable them to have access to everything, while supporting the work that goes into the project.” As the sources of funding for humanities projects such as these are shrinking every day, it is unfortunate that a grant might force institutions to consider the income they would lose if they apply for a grant with such strings attached.

Browsing through the rest of the NEH Digital Humanities Initiative website I did find a lot to be enthusiastic and hopeful about – such as the Digital Humanities Start-Up Grants:

NEH’s Digital Humanities Start-Up Grants will encourage scholars with bright new ideas and provide the funds to get their projects off the ground. Some projects will be practical, others completely blue sky. Some will fail while others will succeed wildly and develop into important projects. But all will incorporate new ways of studying the humanities.

I love it. I want to see what they fund. I want to participate. I want that grant to still exist when I am done with my graduate degree and have more focus to my ideas.

Browsing the sidebar of the main Digital Humanities Initiative page you can see how they are presenting all their grants in the context of being digital in some way. If I want to “create digital humanities tools for analyzing and manipulating humanities data” I should apply for either a Reference Materials Grant or a Research and Development Grant. If I want to “develop a Web site or other digital project for a general public audience” I should apply for a Special Projects Grant. And if I want to “create a digital or online version of a scholarly edition” I should apply for a Scholarly Editions Grant. In some ways it just feels as if they added a ‘digital’ element to all their grants without any other major restructuring, not that I am an expert on the history of NEH grants.

I wonder – if I had arrived at the NEH Digital Humanities Initiative page without prior knowledge of how these grants have been used in the past on existing projects, would I have ended up with a post with similar questions, but less frustration? Less stress over what appears to be perceived (if the quotes in the article at the start of this post are to be believed) as a major change to the structure of a grant that many doing fine and important work have come to depend upon? That said – I still think that the issues of vague online access expectations, the challenges related to privacy and copyright and the lack of ongoing funding to support websites and their patrons are real and worth consideration.

GIS, Access, Archives and Daydreams

September 13, 2006

Today in my Information Structure class, our topic was Entity Relationship Modeling. While this is a technique that I have used frequently over the many years I have been designing Oracle databases, it was interesting to see a slightly different spin on the ideas. The second half of class was an exercise to take a stab (as a class) at coming up with a preliminary data model for a mythical genealogical database system.

While deciding if we should model PLACE as an entity, a woman in our class who is a genealogy specialist told us that only one database she has ever worked with tries to do any validation of location – but that it is virtually impossible due to the scale of the problem. Since the borders and names of places on earth have changed so rapidly over time, and often with little remaining documentation, it is hard to correlate place names from archival records with fixed locations on the planet. Anyone who has waded through the fabulous ship records on the Ellis Island website hunting for information about their grandparents or great-grandparents has struggled with trying to understand how the place names on those records relate to the physical world we live in.

So – now to my daydream. Imagine if we could somehow work towards a consolidated GIS database that included place names and boundary information throughout history. Each GIS layer would relate to specific years or eras in time. Imagine if you could connect any set of archival records that contained location data to this GIS database and not only visualize the records via a map – but visualize the records with the ability to change the layers so you could see how the boundaries and place names changed. You could also view the relationship between records that have different place names on them from different eras – but are actually from the same location.
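To make the daydream a little more concrete, here is a minimal sketch in Python of the name half of the idea: place names that are each valid for a span of years and that point at a stable location key. Everything in it is my own assumption for illustration (the `PlaceName` and `HistoricalGazetteer` classes, the `loc-spb` key, the choice to treat the end year as exclusive); it is not how any of the existing GIS services work, and it ignores boundaries entirely.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PlaceName:
    """One name for a location, valid from start_year up to (not including) end_year."""
    name: str
    location_id: str            # hypothetical stable key for the physical place
    start_year: int
    end_year: Optional[int]     # None means the name is still in use

    def in_use(self, year: int) -> bool:
        return self.start_year <= year and (self.end_year is None or year < self.end_year)


class HistoricalGazetteer:
    """Toy lookup table; a real system would also store boundary geometry per era."""

    def __init__(self, names: List[PlaceName]) -> None:
        self._names = list(names)

    def resolve(self, name: str, year: int) -> List[str]:
        """Return the location ids that carried this name in the given year."""
        wanted = name.lower()
        return sorted({p.location_id for p in self._names
                       if p.name.lower() == wanted and p.in_use(year)})

    def names_for(self, location_id: str) -> List[PlaceName]:
        """Return every recorded name for a location, oldest first."""
        matches = [p for p in self._names if p.location_id == location_id]
        return sorted(matches, key=lambda p: p.start_year)

    def same_place(self, name_a: str, year_a: int, name_b: str, year_b: int) -> bool:
        """True if two (name, year) pairs from different records share a location."""
        return bool(set(self.resolve(name_a, year_a)) & set(self.resolve(name_b, year_b)))


# Illustrative data: the well-known renamings of one Russian city.
gazetteer = HistoricalGazetteer([
    PlaceName("Saint Petersburg", "loc-spb", 1703, 1914),
    PlaceName("Petrograd", "loc-spb", 1914, 1924),
    PlaceName("Leningrad", "loc-spb", 1924, 1991),
    PlaceName("Saint Petersburg", "loc-spb", 1991, None),
])

# A 1917 record marked "Petrograd" and a 1950 record marked "Leningrad"
# resolve to the same location key.
print(gazetteer.same_place("Petrograd", 1917, "Leningrad", 1950))
```

The point of the stable location key is that records never have to agree on a name, only on where they end up once the name and year are resolved.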

I poked around to see what people are already doing – and found all of this:

Digital Earth and its more recently updated counterpart Geospatial Applications and Interoperability (GAI), a working group of the Federal Geographic Data Committee that seems to now exist within the National Geospatial Program Office of the USGS.
GOS – Geospatial One Stop which led me to the fabulous Lewis and Clark GeoSystems
The National Atlas (also found off GOS) that includes a special History Chapter (that starts to head in the direction I am imagining I think)
GEOnet Names Server (GNS) that provides access to the National Geospatial-Intelligence Agency’s (NGA) and the U.S. Board on Geographic Names’ (US BGN) database of foreign geographic feature names (take this and add in a history element, and we are getting even warmer)
GIS for the Humanities – funded by a 2003 NEH Focus Grant, this project’s goal is “designed to create, and train faculty in the use of, mapping modules intended to enhance humanities courses”. I included this one because it gives a slice of the kind of teaching my dream GIS database could fuel.
And two clearinghouses for information: the US National Geospatial Data Clearinghouse and the United Nations Environment Programme / Global Resource Information Database (UNEP/GRID) Spatial Data Clearinghouse

I know it is a daydream – but I believe in my heart of hearts that it will exist someday as computing power increases, the price of storing data decreases and more data sources converge. I do foresee another issue related to the challenges presented by different versions of borders and place names from the same time period – but there are ways to address that too. It could happen – believe with me!

Ideas about Zotero and Digitized Archives

September 9, 2006

Dan Cohen posted recently about the soon to be available, open-source, Firefox plugin, research support software named Zotero. Looking at the quick start guide, I immediately spotted the icon to “add a new collection folder”. As the “archivist-in-training” that I am, my reaction now to the word “collection” is different than it would have been a year ago. Though I strongly suspect it will not be the case (at least not in the first released version) I immediately was daydreaming of browsing a digitized collection online – clicking the “add a new collection folder” icon – and ending up with a copy of the entire collection of records for examination and comparison later.

Of course this would be most useful for the historian digging through and analyzing archival records if Zotero were able to pull down metadata beyond that of a standard citation and retain any hierarchical information or relationships among the records.

Dead Reckoning’s post on Zotero mentions RDFa. I don’t know anything about RDFa beyond what I have read in the last few hours, so it is not clear to me how complicated the metadata can be – perhaps it can support a full digital object XML record of some kind. So maybe the trick isn’t so much getting Zotero to do things it wasn’t designed to do – but rather the slow migration of sites to using the software packages and standards listed here.

I don’t want anyone to think that I am not excited about Zotero and all the neat things it is likely to do. I suspect I will rapidly become a frequent Zotero user verging on a zealot – but it is fun to daydream. I think it is most fun to daydream now, before I start using it and get lost in all the great stuff it CAN do. I definitely will post more after I get a chance to take it for a spin in early October.

Category: access