Pondering Structured Data About Archives: Archives Wiki, Freebase and OCLC’s World Map & WikiD

This week both Dan Cohen’s blog and ArchivesNext posted about the new Archives Wiki sponsored by the American Historical Association (AHA). The AHA blog summarizes the goals of this wiki as:

…we hope that by harnessing this (relatively) new technology for collaboration on the web, we can draw on the collective interests of thousands of researchers and archivists to develop a rich resource for anyone venturing into new archives for the first time.

The AHA post goes on to express the hope that the wiki “will provide a deeper level of information than the rather general information on most archival web sites”. Setting aside the question of whether this wiki will reach critical mass with regard to contributions, the idea of collecting lots of information about archives and their collections got me thinking again about Freebase.com.

My earlier post, Metadata World Building: Freebase.com and OpenLibrary.org, considers the potential of using Freebase to build a set of structured data about archival institutions. I believe in the spirit behind the Archives Wiki, but I wish that the rich set of information that is going to be captured were being stored in discrete attributes rather than free-form wiki text. I know that they have contributor guidelines, but that isn’t enough for me.

Why Structured Data Is So Cool

Why am I so hung up about structured data? This is the sort of thing that needs a good example – and thanks to Level 1 Librarian’s post OCLC maps the world, I found my way to the amazing OCLC WorldMap project. The WorldMap itself is a Flash-based application that lets you explore data about both WorldCat holdings and other related statistics for countries around the world.

Once you get into the application, click on any two countries (I chose Russia and Australia) and then click on the ‘Compare’ button. For those especially interested in Archives, click on the ‘Cultural Institutions’ button (3rd from the bottom). If you move your mouse over each of the bars on the bar graphs you can see the actual numbers driving them. For example, on my Cultural Institutions comparison chart for Russia vs Australia I can see that Russia has 112 Archives while Australia has 42. The data source for both of these numbers is listed as the International Directory of Archives. To see the sources for the data, click on one of the tabs labeled with a country name and then click on any number to see its source. If my instructions are lacking, take a look at the beautiful and thorough Key to the WorldMap.

If people are going to go to all this effort entering data about archives and their collections, I wish it were being collected in such a way that we could then build new and more fabulous tools for accessing, manipulating and exploring the information.

As an example – if we collected the hours of each archives in a structured way we could figure out how many hours a week the Illinois State Archives is open (Monday–Friday, 8:00 a.m.–4:30 p.m.; Saturday, 8:00 a.m.–3:30 p.m. = 50 hours) and contrast that with the weekly hours of the Missouri State Archives (8 a.m. to 5 p.m. Monday through Wednesday and Friday; 8 a.m. to 9 p.m. on Thursday; and 8:30 a.m. to 3:30 p.m. on Saturday = 56 hours). We could figure out how many state archives have evening hours or weekend hours. How about a map that showed the historian visiting a new city which archives were open late on the one night he has off from his conference? This is just a tiny example – but I hope it lights a spark for people about the promise of collecting this simple kind of data in a structured way.
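
To make the hours example concrete, here is a minimal Python sketch of what becomes possible once opening hours are stored as attributes instead of prose. The record layout (a day name mapped to an opening and closing time) and the function names are my own assumptions for illustration; they are not how the Archives Wiki, Freebase or WikiD actually store anything. The hours themselves are the ones quoted above.

```python
from datetime import datetime

# Hypothetical structured records: day -> (opens, closes) in 24-hour time.
# The layout is an assumption; the hours are the ones quoted in the post.
WEEKDAY = ("08:00", "17:00")
ARCHIVE_HOURS = {
    "Illinois State Archives": {
        "Mon": ("08:00", "16:30"), "Tue": ("08:00", "16:30"),
        "Wed": ("08:00", "16:30"), "Thu": ("08:00", "16:30"),
        "Fri": ("08:00", "16:30"), "Sat": ("08:00", "15:30"),
    },
    "Missouri State Archives": {
        "Mon": WEEKDAY, "Tue": WEEKDAY, "Wed": WEEKDAY,
        "Thu": ("08:00", "21:00"),
        "Fri": WEEKDAY, "Sat": ("08:30", "15:30"),
    },
}


def to_minutes(hhmm):
    """Convert an 'HH:MM' string to minutes since midnight."""
    t = datetime.strptime(hhmm, "%H:%M")
    return t.hour * 60 + t.minute


def weekly_hours(schedule):
    """Total hours open per week for one archive's schedule."""
    minutes = sum(to_minutes(closes) - to_minutes(opens)
                  for opens, closes in schedule.values())
    return minutes / 60


def open_after(archives, day, time):
    """Names of archives still open after `time` on `day`."""
    cutoff = to_minutes(time)
    return sorted(name for name, schedule in archives.items()
                  if day in schedule and to_minutes(schedule[day][1]) > cutoff)


def open_on(archives, day):
    """Names of archives with any hours at all on `day`."""
    return sorted(name for name, schedule in archives.items() if day in schedule)


if __name__ == "__main__":
    for name, schedule in ARCHIVE_HOURS.items():
        print(name, weekly_hours(schedule))
    print("Open after 6 p.m. Thursday:", open_after(ARCHIVE_HOURS, "Thu", "18:00"))
    print("Open on Saturday:", open_on(ARCHIVE_HOURS, "Sat"))
```

The weekly totals fall out of the records (50 and 56 hours), and the question about the historian's one free evening turns into a one-line query. None of this is possible when the same hours are typed into a paragraph of wiki text.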

Freebase’s whole point is to build data-sets that can drive interesting applications – like WorldMap. This just makes me want to race back to Freebase and figure out how to capture what I wish were being captured by the Archives Wiki within the confines of Freebase’s model.

OCLC’s WikiD

I was just about to end this post when I realized I ought to check to see if someone has already tackled this problem of adding structured data to a wiki. I found my way to OCLC’s WikiD (Wiki/Data) project. The project’s home page states: “WikiD (Wiki/Data) extends the Wiki model to support multiple WikiCollections containing arbitrary schemas of XML records with minimal additional complexity.” From a brief look around, I am not clear how this would integrate with the more traditional wiki style of the Archives Wiki – nor am I convinced that this project is still moving forward (the most recent dates I see on presentations are from 2006) – but the idea of a wiki that includes structured data is definitely there. Does anyone out there have any more information about WikiD or any other tool that combines wiki-style ease with the ability to capture structured data?

Final Thoughts

Again, I love the vision inherent in the Archives Wiki. I know that even getting a project like this off the ground is a big deal. I found this old AHA blog post from October of 2006 that discusses the original proposal for it and why it should be done. All the reasons are sound. But (you knew there was a but) the database geek in me just goes nuts when I see structured data being typed in free-form.

Posted on 13th February 2008
Under: historical research, information visualization, metadata, outreach, software, virtual collaboration, what if | 6 Comments

6 Responses to “Pondering Structured Data About Archives: Archives Wiki, Freebase and OCLC’s World Map & WikiD”

  1. Jeff Young Says:

    I’m the author of the WikiD project. Thanks for the plug.

    WikiD was a research prototype and has since been re-engineered from the ground up. It is currently being used by OCLC in several production applications such as our institution registry (http://worldcat.org/registry/Institutions) and the GDFR project (http://hul.harvard.edu/gdfr/).

    I haven’t had time to put together an open-source distribution of the new version. That has been our plan all along, but there hasn’t been enough time. In the meantime, I will be happy to answer questions, though.

    Jeff

  2. Jeanne Says:

    Jeff,

    Thank you so much for the WikiD update! It is exciting to know that it is still alive and moving forward.

    Jeanne

  3. Tom Cobbaert .eu » links for 2008-02-20 Says:

    [...] Pondering Structured Data About Archives: Archives Wiki, Freebase and OCLC’s World Map & WikiD… (tags: Archief2.0 wiki) [...]

  4. Sam Wilson Says:

    There is the Semantic MediaWiki, also: http://ontoworld.org/wiki/Semantic_MediaWiki

  5. Jeanne Says:

    Sam,

    Thank you so much for the link. That looks fabulous – and like it has the best of both worlds.

    Jeanne

  6. Recent Links Tagged With "librarytech" - JabberTags Says:

    [...] public links >> librarytech [from alexandrasarkozy] Pondering Structured Data About Archives:… Saved by Pabl1to on Tue 16-9-2008 Say? Saved by Karmataxi on Fri [...]
