<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Spellbound Blog &#187; EAD</title>
	<atom:link href="http://www.spellboundblog.com/category/ead/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.spellboundblog.com</link>
	<description>Archives, Digital Humanities, Cultural Heritage, Technology</description>
	<lastBuildDate>Sat, 14 Aug 2010 03:54:17 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>ArchivesZ Data Challenges: University of Texas at San Antonio</title>
		<link>http://www.spellboundblog.com/2009/05/13/archivesz-data-challenges-university-of-texas-san-antonio/</link>
		<comments>http://www.spellboundblog.com/2009/05/13/archivesz-data-challenges-university-of-texas-san-antonio/#comments</comments>
		<pubDate>Wed, 13 May 2009 06:28:53 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[ArchivesZ]]></category>
		<category><![CDATA[EAD]]></category>
		<category><![CDATA[metadata]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/?p=534</guid>
		<description><![CDATA[Mark Shelstad, head of Archives and Special Collections at University of Texas at San Antonio, sent me a link to the TARO (Texas Archival Resources Online) page for UTSA&#8217;s Archives and Special Collections finding aids in XML format. With the current scripts, these are the fun tag stats: 1,684 total tags extracted 75% (1,266 tags) [...]<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2009/05/13/archivesz-data-challenges-university-of-texas-san-antonio/">ArchivesZ Data Challenges: University of Texas at San Antonio</a></p>
]]></description>
			<content:encoded><![CDATA[<p><a title="UTSA Archives and Special Collections" href="http://www.lib.utexas.edu/taro/browse/browse_utsa1.html"><img class="alignright size-full wp-image-535" title="University of Texas San Antonio Archives and Special Collections" src="http://www.spellboundblog.com/wp-content/uploads/2009/05/logo-utsa.gif" alt="University of Texas San Antonio Archives and Special Collections" width="205" height="101" /></a></p>
<p><a title="Mark Shelstad" href="http://www.linkedin.com/pub/dir/mark/shelstad">Mark Shelstad</a>, head of <a title="Archives and Special Collections at University of Texas at San Antonio" href="http://www.lib.utsa.edu/archives/">Archives and Special Collections at University of Texas at San Antonio</a>, sent me a link to the <a title="TARO: UTSA" href="http://www.lib.utexas.edu/taro/utsa/utsa_xml.html">TARO</a> (Texas Archival Resources Online) page for <a title="UTSA Archives and Special Collections" href="http://www.lib.utexas.edu/taro/browse/browse_utsa1.html">UTSA&#8217;s Archives and Special Collections finding aids</a> in XML format.</p>
<p>With the current scripts, these are the fun tag stats:</p>
<ul>
<li>1,684 total tags extracted</li>
<li>75% (1,266 tags) are associated with only one finding aid</li>
<li>3% (51 tags) are associated with 10 or more finding aids</li>
</ul>
<p><strong>Collection Size</strong></p>
<p>235 out of the 253 collections ended up with a collection size of 0.</p>
<p>Consider the encoding of the collection size in the <a title="A Guide to the Women's Overseas Service League Records, 1910-2007" href="http://www.lib.utexas.edu/taro/utsa/00008/utsa-00008.html">Guide to the Women&#8217;s Overseas Service League Records, 1910-2007</a>:</p>
<pre>&lt;physdesc label="Extent:" encodinganalog="300$a"&gt;
    77 linear feet (approximately 44,000 items)
&lt;/physdesc&gt;</pre>
<p>Contrast this with one of the examples where the size of the collection was extracted properly by the current script:</p>
<pre>&lt;physdesc label="Extent:" encodinganalog="300$a"&gt;
    &lt;extent&gt;8.4 linear feet&lt;/extent&gt;
    (14 boxes)
&lt;/physdesc&gt;</pre>
<p>Sometimes it feels like a game of Where&#8217;s Waldo. In this case we are simply missing the set of &lt;extent&gt; tags  from the first example. Off I went to the EAD tag descriptions to find the <a title="LOC: physdesc tag library description" href="http://www.loc.gov/ead/tglib/elements/physdesc.html">guidelines for use of the &lt;physdesc&gt; tag</a>, where I found this overview of the tag:</p>
<p style="padding-left: 30px;">A wrapper element for bundling information about the appearance or construction   of the described materials, such as their dimensions, a count of their quantity   or statement about the space they occupy, and terms describing their genre,   form, or function, as well as any other aspects of their appearance, such as   color, substance, style, and technique or method of creation. The information   may be presented as plain text, or it may be divided into the &lt;dimension&gt;, &lt;extent&gt;, &lt;genreform&gt;,   and &lt;physfacet&gt; subelements.</p>
<p>Bad news for my script logic &#8211; both versions are valid! This is a great example of how valid encoding can still present challenges. While in this example it seems just as easy to parse the version with the &lt;extent&gt; tags as without, it will only be through examination of a much broader sample of data that we can determine how much of a problem we have on our hands with this scenario of size data included in the &lt;physdesc&gt; tags without enclosing &lt;extent&gt; or &lt;dimension&gt; tags.</p>
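<p>One way to cope with both encodings is to read all of the text inside &lt;physdesc&gt;, whether or not it is wrapped in &lt;extent&gt;, and then look for the size in that text. The sketch below is in Python and is not the ArchivesZ script; the function name and the "linear feet" pattern are assumptions made for illustration, and the two samples are the encodings quoted above.</p>

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern: a number followed by "linear feet" or "linear foot".
LINEAR_FEET = re.compile(r"(\d+(?:\.\d+)?)\s+linear\s+(?:feet|foot)", re.IGNORECASE)

def linear_feet(physdesc_xml):
    """Return the linear-feet size found in a <physdesc> element, or None."""
    physdesc = ET.fromstring(physdesc_xml)
    # itertext() yields the element's own text plus the text of any children,
    # so it does not matter whether the size sits inside <extent> or not.
    text = " ".join("".join(physdesc.itertext()).split())
    match = LINEAR_FEET.search(text)
    return float(match.group(1)) if match else None

plain = '<physdesc label="Extent:" encodinganalog="300$a">77 linear feet (approximately 44,000 items)</physdesc>'
wrapped = '<physdesc label="Extent:" encodinganalog="300$a"><extent>8.4 linear feet</extent> (14 boxes)</physdesc>'

print(linear_feet(plain))
print(linear_feet(wrapped))
```

<p>Sizes given in other units would still need their own patterns, so this only addresses the missing &lt;extent&gt; wrapper.</p>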
<p><strong>Inclusive Dates</strong></p>
<p>Twenty of the UTSA collections came through with no years. When I examined the data, I found an assortment of &lt;unitdate&gt; formats that my current script could not parse properly, including the examples below:</p>
<ul>
<li>1917-1980 (bulk 1920-1945)</li>
<li>1876-1903, 1914-1919, 1940-2002</li>
<li>1940s, 1970s-1990s</li>
</ul>
<p>Another encoding approach that could not be parsed was the one used for the finding aid of the <a title="Church Women United of San Antonio Records" href="http://www.lib.utexas.edu/taro/utsa/00046/utsa-00046.html">Church Women United of San Antonio Records</a>. In this case the &lt;unitdate&gt; tag is within the &lt;unittitle&gt; tag as seen here:</p>
<pre style="padding-left: 30px;">&lt;unittitle label="Title:" encodinganalog="245"&gt;
Church Women United of San Antonio Records,
&lt;unitdate label="Dates:" encodinganalog="245$a"&gt;1961-2005&lt;/unitdate&gt;
&lt;/unittitle&gt;</pre>
<p>Among the finding aids for which I did extract a range of inclusive date years, I also found issues with values like 1950s-1990s. The current script interpreted this to represent 1950 through 1990, but I believe it would be more properly translated as representing 1950 through 1999.</p>
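<p>The date formats listed in this section could be reduced to a single earliest and latest year along these lines. This is a Python sketch under my own assumptions, not the current script: it skips any parenthetical (so bulk dates are ignored) and treats a decade such as 1990s as running through 1999. The function name is made up.</p>

```python
import re

# A four-digit year with an optional trailing "s" marking a decade.
YEAR = re.compile(r"\b(\d{4})(s?)\b")

def year_span(unitdate_text):
    """Return (earliest, latest) year, or None if no year is found."""
    # Ignore a parenthetical such as "(bulk 1920-1945)".
    text = re.sub(r"\([^)]*\)", "", unitdate_text)
    starts, ends = [], []
    for digits, decade in YEAR.findall(text):
        year = int(digits)
        starts.append(year)
        # "1990s" runs through 1999, not 1990.
        ends.append(year + 9 if decade else year)
    if not starts:
        return None
    return min(starts), max(ends)

for sample in ("1917-1980 (bulk 1920-1945)",
               "1876-1903, 1914-1919, 1940-2002",
               "1940s, 1970s-1990s",
               "1950s-1990s"):
    print(sample, year_span(sample))
```

<p>Collapsing several ranges into one span loses the gaps between them, which may or may not matter for the visualization.</p>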
<p><strong>General Code Fixes</strong></p>
<p>The University of Texas at San Antonio’s finding aids have provided additional examples of the following data and encoding issues already identified in earlier data sets:</p>
<ul>
<li>Inconsistent repository titles (26 different variations of &#8220;The University of Texas at San Antonio Library&#8221;)</li>
<li>Titles with embedded and tagged dates</li>
<li>Carriage return and tab characters that need to be removed</li>
<li>Emphasis within a title or abstract added via a tag (such as &lt;emph render="italic"&gt;Storyletters&lt;/emph&gt; seen in <a title="A Guide to the Storyletters Records, 1991-2000" href="http://www.lib.utexas.edu/taro/utsa/00021/utsa-00021.html">A Guide to the Storyletters Records, 1991-2000</a>) which interrupts extraction of text at that point</li>
</ul>
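<p>Two of the issues above, inline formatting tags and stray carriage returns and tabs, can be handled in one pass by collecting every piece of text inside the element and then collapsing whitespace. The following is a rough Python sketch rather than the ArchivesZ script; the function name is hypothetical and the sample abstract is invented to show the effect.</p>

```python
import xml.etree.ElementTree as ET

def flat_text(element_xml):
    """Return all text inside an element, ignoring inline tags such as <emph>."""
    element = ET.fromstring(element_xml)
    # itertext() walks the text before, inside and after every child element.
    raw = "".join(element.itertext())
    # split() with no argument breaks on any run of spaces, tabs,
    # carriage returns or newlines; join() puts single spaces back.
    return " ".join(raw.split())

# Invented sample, not taken from a real finding aid.
sample = '<abstract>Records of\r\n\t<emph render="italic">Storyletters</emph>,\n\ta newsletter.</abstract>'

print(flat_text(sample))
```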
<p><strong>Next Steps</strong></p>
<p>This is the last data set I am analyzing before tackling actual updates to the ArchivesZ data extraction script. My next step is to review and prioritize my long to do list for updates to this script. Most of what I have found in my examination of the data sets are ways in which my script was not smart enough to handle valid variations in encoding and the tabs, carriage returns, formatting tags and special characters found throughout everyone&#8217;s XML. Yes, there are some cases in which the data itself is less than optimal (such as non-standardized repository titles) or the values challenging (so many ways to describe the size of a collection!), but overall I am optimistic about how much more I can improve the extraction script before I have to resort to hand correcting records in the database.</p>
<p>Thanks to everyone for your patience with these data analysis posts. Onward to programming!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2009/05/13/archivesz-data-challenges-university-of-texas-san-antonio/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>ArchivesZ Data Challenges: Forest History Society</title>
		<link>http://www.spellboundblog.com/2009/05/06/archivesz-data-challenges-forest-history-society/</link>
		<comments>http://www.spellboundblog.com/2009/05/06/archivesz-data-challenges-forest-history-society/#comments</comments>
		<pubDate>Wed, 06 May 2009 21:30:48 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[ArchivesZ]]></category>
		<category><![CDATA[EAD]]></category>
		<category><![CDATA[metadata]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/?p=470</guid>
		<description><![CDATA[Amanda Ross, project archivist for the Forest History Society, sent me 57 EAD finding aids to include in the ArchivesZ project. These are the data challenges that the current data extraction script does not address: Titles with embedded tags or punctuation. Generally the script drops anything after it hits either, so rather than a title [...]
]]></description>
			<content:encoded><![CDATA[<p><a title="Forest History Society" href="http://www.foresthistory.org"><img class="alignright size-full wp-image-471" title="The Forest History Society" src="http://www.spellboundblog.com/wp-content/uploads/2009/04/fhs_logo_small.jpg" alt="The Forest History Society" width="82" height="130" /></a><a title="Amanda Ross" href="http://fhsarchives.wordpress.com/author/amandatross/">Amanda Ross</a>, project archivist for the <a title="Forest History Society" href="http://www.foresthistory.org/">Forest History Society</a>, sent me 57 EAD finding aids to include in the ArchivesZ project. These are the data challenges that the current data extraction script does not address:</p>
<ul>
<li>Titles with embedded tags or punctuation. Generally the script drops anything after it hits either, so rather than a title like <a title="William E. Towell Papers, 1941 - 1988" href="http://foresthistory.org/ead/Towell_William_E.html">William E. Towell Papers, 1941 &#8211; 1988</a>, my database ended up with only &#8220;William E Towell Papers,&#8221; based on this encoding: &lt;titleproper&gt;Inventory of the William E. Towell Papers, &lt;date normal="1941/1988"&gt;1941 &#8211; 1988&lt;/date&gt;&lt;/titleproper&gt;</li>
<li>Need to handle a conversion factor for a size of &#8220;1 folder&#8221; (as found in the <a title="Inventory of the Biltmore Forest School Images, 1890 - 1988" href="http://foresthistory.org/ead/Biltmore_Forest_School_Images.html">Inventory of the Biltmore Forest School Images, 1890 &#8211; 1988</a>)</li>
<li>My script chokes on the Inclusive Year format &#8220;1910 and 1931 &#8211; 1937&#8221; (as found in the <a title="Inventory of the Alfred Cunningham Papers, 1910 and 1931 - 1937" href="http://foresthistory.org/ead/Cunningham_Alfred.html">Inventory of the Alfred Cunningham Papers, 1910 and 1931 &#8211; 1937</a>)</li>
<li>The presence of an &lt;lb/&gt; element within the &lt;extent&gt; tag, used to force a line break, is preventing my script from extracting any size information at all (as found in the <a title="Inventory of the DeWitt Nelson Papers, 1940 - 1976" href="http://foresthistory.org/ead/Nelson_DeWitt.html">Inventory of the DeWitt Nelson Papers, 1940 &#8211; 1976</a>)</li>
<li>Within the &lt;abstract&gt; tag, my script drops everything after an &lt;emph render="doublequote"&gt; tag (making for a very short abstract in the case of the <a title="Inventory of the Arthur Bernard Recknagel Auxiliary Photograph Collection, 1911 - 1947" href="http://foresthistory.org/ead/Recknagel_Arthur_Bernard.html">Inventory of the Arthur Bernard Recknagel Auxiliary Photograph Collection, 1911 &#8211; 1947</a>).</li>
</ul>
<p>The most dramatic issue, seen across all the finding aids in this set, is that <strong>no</strong> subject data was extracted from any of the finding aids. My working theory for the moment is that this is due to the use of &lt;list&gt; and &lt;item&gt; tags as shown here:</p>
<pre>&lt;controlaccess&gt;
&lt;head&gt;Subject Headings&lt;/head&gt;
&lt;list type="simple"&gt;
&lt;item&gt;&lt;genreform source="lcnaf" encodinganalog="655"&gt;Audiotapes&lt;/genreform&gt;&lt;/item&gt;
&lt;item&gt;&lt;persname source="lcnaf" encodinganalog="600"&gt;Ainsworth, John H., 1909-&lt;/persname&gt;&lt;/item&gt;
&lt;item&gt;&lt;subject source="lcnaf" encodinganalog="650"&gt;Businessmen -- United States&lt;/subject&gt;&lt;/item&gt;</pre>
<p>This is in contrast with this example of encoding from <a title="ArchivesZ Data Challenges: Syracuse University Special Collections Research Center" href="http://www.spellboundblog.com/2009/03/07/archivesz-data-syracuse-university-archives/">Syracuse University</a>:</p>
<pre>&lt;controlaccess&gt;
&lt;head&gt;Subject and Genre Headings&lt;/head&gt;
&lt;subject encodinganalog="650" source="local"&gt;Adult education&lt;/subject&gt;
&lt;persname encodinganalog="600" source="lcnaf"&gt;Adolphson, L. H.&lt;/persname&gt;
&lt;persname encodinganalog="600" source="lcnaf"&gt;Bradford, Leland Powers, 1905-&lt;/persname&gt;</pre>
<p>Or this sample from <a title="ArchivesZ Data Challenges: Oregon State University Archives" href="http://www.spellboundblog.com/2009/02/22/archivesz-data-challenges-oregon-state-university/">Oregon State University</a>:</p>
<pre>&lt;controlaccess id="a12"&gt;
	 &lt;controlaccess&gt;
		  &lt;persname encodinganalog="600" source="local" rules="aacr2"
		  role="subject"&gt;Aitken, Frances Alva, 1889-1970.&lt;/persname&gt;
	 &lt;/controlaccess&gt;
	 &lt;controlaccess&gt;
		  &lt;corpname encodinganalog="610" source="local" role="subject"
		  rules="aacr2"&gt;Oregon Agricultural College. Class of 1910.&lt;/corpname&gt;
		  &lt;corpname source="lcnaf" encodinganalog="610" role="subject"&gt;Oregon
				Agricultural College--Students.&lt;/corpname&gt;
	 &lt;/controlaccess&gt;
	 &lt;controlaccess&gt;
		  &lt;geogname source="lcsh" role="subject" encodinganalog="651"&gt;Corvallis
				(Or.)&lt;/geogname&gt;
	 &lt;/controlaccess&gt;
	 &lt;controlaccess&gt;
		  &lt;subject encodinganalog="650" source="lcsh"&gt;Student
				activities--Oregon--Corvallis.&lt;/subject&gt;
	 &lt;/controlaccess&gt;</pre>
<p>Both the Syracuse and OSU examples are handled by the current state of the data extraction script.</p>
<p>Amanda pointed me to the <a title="NCEAD Best Practice Guidelines for EAD 2002" href="http://www.ncecho.org/dig/ead2002.shtml">NCEAD Best Practice Guidelines for EAD 2002</a>. Down in <a title="APPENDIX G: HOW DO I ENCODE...?" href="http://www.ncecho.org/dig/ead2002.shtml#appendixG">Appendix G: How Do I Encode&#8230;</a>, the second question down is &#8220;What if I have multi-part scope notes, biographical notes or subject headings?&#8221; followed by exactly the &lt;list&gt; and &lt;item&gt; tag usage as is being done for the Forest History Society finding aids. This format clearly should be handled.</p>
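<p>If the working theory about &lt;list&gt; and &lt;item&gt; is right, one fix is to stop assuming that headings are direct children of &lt;controlaccess&gt; and instead visit every descendant. Below is a Python sketch of that idea, not the Ruby script; the function name and the set of element names treated as headings are my assumptions, and it assumes the EAD file declares no XML namespace. The samples are shortened versions of the encodings shown earlier in this post.</p>

```python
import xml.etree.ElementTree as ET

# Element names treated as subject headings; this set is an assumption.
HEADING_TAGS = {"subject", "persname", "corpname", "geogname", "genreform"}

def headings(controlaccess_xml):
    """Collect (tag, text) pairs from a <controlaccess> block at any depth."""
    root = ET.fromstring(controlaccess_xml)
    found = []
    # iter() visits every descendant, so headings wrapped in <list>/<item>
    # or in nested <controlaccess> elements are all reached.
    for element in root.iter():
        if element.tag in HEADING_TAGS:
            text = " ".join("".join(element.itertext()).split())
            found.append((element.tag, text))
    return found

fhs_style = """<controlaccess>
<head>Subject Headings</head>
<list type="simple">
<item><genreform source="lcnaf" encodinganalog="655">Audiotapes</genreform></item>
<item><subject source="lcnaf" encodinganalog="650">Businessmen -- United States</subject></item>
</list>
</controlaccess>"""

syracuse_style = """<controlaccess>
<head>Subject and Genre Headings</head>
<subject encodinganalog="650" source="local">Adult education</subject>
<persname encodinganalog="600" source="lcnaf">Adolphson, L. H.</persname>
</controlaccess>"""

print(headings(fhs_style))
print(headings(syracuse_style))
```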
<p>So, no fun tag stats for this run &#8211; but I hope to fix my Ruby script so that the Forest History Society finding aids can be incorporated into the data set I use for testing version 2 of ArchivesZ. My Ruby script to-do list is getting quite long!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2009/05/06/archivesz-data-challenges-forest-history-society/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>ArchivesZ Data Challenges: Utah Government Archives &amp; Records Service</title>
		<link>http://www.spellboundblog.com/2009/04/26/archivesz-data-challenges-utah-government-archives-records-service/</link>
		<comments>http://www.spellboundblog.com/2009/04/26/archivesz-data-challenges-utah-government-archives-records-service/#comments</comments>
		<pubDate>Sun, 26 Apr 2009 05:33:17 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[ArchivesZ]]></category>
		<category><![CDATA[EAD]]></category>
		<category><![CDATA[interface design]]></category>
		<category><![CDATA[metadata]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/?p=424</guid>
		<description><![CDATA[Gina Strack of the Utah State Archives and Records Service provided me with access to the XML of 1,196 EAD encoded finding aids. These EAD 2.0 XML files are a product of a grant funded project completed last year to migrate from EAD 1.0 finding aids. Their website includes a detailed account of the EAD [...]
]]></description>
			<content:encoded><![CDATA[<p><a title="Utah State Archives and Records Service" href="http://www.archives.state.ut.us/"><img class="alignright size-full wp-image-425" title="Utah dot Gov Logo" src="http://www.spellboundblog.com/wp-content/uploads/2009/03/utahgovlogoglow.png" alt="Utah dot Gov Logo" width="87" height="66" /></a><a title="Gina Strack" href="http://ginastrack.com/">Gina Strack</a> of the <a title="Utah State Archives and Records Service" href="http://www.archives.state.ut.us/"><span class="il">Utah</span> State Archives and Records Service</a> provided me with access to the XML of 1,196 EAD-encoded finding aids. These EAD 2.0 XML files are a product of a grant-funded project completed last year to migrate from <a title="EAD version 1 finding aids" href="http://historyresearch.utah.gov/inventories/inventories-ac.htm">EAD 1.0 finding aids</a>. Their website includes a <a title="Utah State Archives EAD Project" href="http://archives.utah.gov/research/inventories/ead.html">detailed account of the EAD Project</a>.</p>
<p>These finding aids have helped me identify three types of ArchivesZ data challenges:</p>
<ul>
<li>strange characters</li>
<li>broad composite subjects</li>
<li>determination of accurate collection size</li>
</ul>
<p><strong>Strange and mysterious characters!</strong></p>
<p>These finding aids use a special character in the place of the standard Library of Congress double dash which normally appears between subsections of the subject heading.</p>
<p>An example subject from the Utah Government XML looks like this:</p>
<p style="padding-left: 30px;">Women—Suffrage—Utah.</p>
<p>Viewing the same subject in a pure text editor (such as <a title="Wikipedia: vi" href="http://en.wikipedia.org/wiki/Vi">vi</a>):</p>
<p style="padding-left: 30px;">Women&amp;#8212;Suffrage&amp;#8212;Utah.</p>
<p>By the time it gets into my database and is pulled out via a query in MySQL Query Browser it looks like this:</p>
<p style="padding-left: 30px;">Women√¢‚Ç¨‚ÄùSuffrage√¢‚Ç¨‚ÄùUtah.</p>
<p>Rather than just stripping out all instances of &amp;#8212;,  my plan is to replace them with the standard Library of Congress double dash. This will ensure that the existing code that breaks the subjects down to tags will still work.</p>
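<p>The replacement itself is small. An XML parser resolves &amp;#8212; to the em dash character (U+2014), so the sketch below swaps that character for the double dash before splitting the heading into its subdivisions. It is written in Python rather than taken from the ArchivesZ script; the function names are hypothetical, and dropping the final period is my own assumption.</p>

```python
EM_DASH = "\u2014"  # em dash, decimal code point 8212

def normalize_dashes(subject):
    """Replace each em dash with the Library of Congress double dash."""
    return subject.replace(EM_DASH, "--")

def split_subject(subject):
    """Break a heading into its subdivisions on the double dash."""
    # Dropping the final period is an assumption, not part of the plan above.
    parts = normalize_dashes(subject).rstrip(".").split("--")
    return [part.strip() for part in parts if part.strip()]

print(normalize_dashes("Women\u2014Suffrage\u2014Utah."))
print(split_subject("Women\u2014Suffrage\u2014Utah."))
```

<p>This does not explain the garbled characters coming back from the database; it only keeps the subject-splitting logic working.</p>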
<p><strong>Composite Subjects</strong></p>
<p>When I say &#8220;composite subject&#8221; what I mean is a subject that includes multiple very disparate terms. Rather than the Library of Congress style subjects, all aspects of which relate to the collection in question, these composite subjects cover multiple subjects which are grouped together for convenience.</p>
<p>This is a list of some of the most popular subjects for the Utah Gov collections:</p>
<ul>
<li>Politics, Government, and Law</li>
<li>Business, Industry, Labor, and Commerce</li>
<li>Science, Technology, and Health</li>
<li>Arts, Humanities, and Social Sciences</li>
</ul>
<p>These subjects throw a monkey wrench into my theories about decomposing subjects based on commas. The collections to which these subjects are assigned likely fit in only one of the component themes. For example, the &#8220;Inventory of Publications from Department of Technology Services, 1993-2008&#8221; is assigned the subject &#8220;Science, Technology, and Health&#8221;. If I divide this subject into 3 separate tags, the Science and Health tags would be quite misleading.</p>
<p>So that leaves me a bit trapped. If I want to divide subjects such as &#8220;Art, Cuban, 20th century&#8221;, as I discuss in <a title="ArchivesZ Data Challenges: Syracuse University Special Collections Research Center" href="http://www.spellboundblog.com/2009/03/07/archivesz-data-syracuse-university-archives/">my Syracuse University post</a>, then I end up also dividing these umbrella subjects which separate such very divergent terms with commas.</p>
<p>This issue goes on my list of reasons to add a repository configuration file for use by the data extraction script.</p>
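<p>A repository configuration for this case might be nothing more than a list of subjects that must never be split. The Python sketch below shows the idea only; the repository key, the <code>keep_whole</code> setting and the function name are all invented for illustration.</p>

```python
# Hypothetical per-repository settings; the key and field names are made up.
REPOSITORY_CONFIG = {
    "utah": {
        "keep_whole": {
            "Politics, Government, and Law",
            "Business, Industry, Labor, and Commerce",
            "Science, Technology, and Health",
            "Arts, Humanities, and Social Sciences",
        }
    },
}

def subject_tags(subject, repository):
    """Split a subject on commas unless the repository lists it as a unit."""
    config = REPOSITORY_CONFIG.get(repository, {})
    if subject in config.get("keep_whole", set()):
        return [subject]
    return [part.strip() for part in subject.split(",") if part.strip()]

print(subject_tags("Science, Technology, and Health", "utah"))
print(subject_tags("Art, Cuban, 20th century", "syracuse"))
```

<p>An exact-match list is easy to maintain for a handful of umbrella subjects but would not scale to a repository that uses many of them.</p>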
<p><strong>Accurate Collection Size</strong></p>
<p>In my quest to convert all sizes to linear feet &#8211; sizes such as these are challenging:</p>
<ul>
<li>0.20 cubic foot and 1 microfilm reel</li>
<li>0.35 cubic foot and 2 microfilm reels</li>
</ul>
<p class="label">I also have situations where sizes are specified in multiple sections of the finding aid. The <a title="Inventory of ALERT Foundation records from Governor Bangerter, 1986-1991." href="http://images.archives.utah.gov/cdm4/item_viewer.php?CISOROOT=/ead&amp;CISOPTR=991&amp;CISOBOX=1&amp;REC=1">Inventory of ALERT Foundation records from Governor Bangerter, 1986-1991</a> has a collection level size of &#8220;0.50 cubic foot and 2 microfilm reels&#8221;, but further down in this finding aid I see this:</p>
<p class="label"><em><span class="label">series: </span>ALERT Foundation records </em></p>
<ul>
<li><span class="label">box 1, folder 1: </span>Documentary: &#8220;Letters from our Children,&#8221; Motion picture film reel, 16mm</li>
<li> <span class="label">box 1, folder 2: </span>Documentary: &#8220;Letters from our Children,&#8221; VHS videocassette</li>
<li> <span class="label">box 1, folder 3: </span>Documentary: &#8220;Letters from our Children,&#8221; VHS videocassette</li>
<li> <span class="label">box 1, folder 4: </span>Documentary: &#8220;Letters from our Children,&#8221; VHS videocassette</li>
</ul>
<p>When they said 2 microfilm reels &#8211; do they really mean a 16mm motion picture film reel and a VHS videocassette? Is there 1 VHS videocassette or 3? How sizes are specified in a specific repository&#8217;s finding aids is another possible candidate for a repository level configuration script.</p>
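<p>Before any conversion to linear feet can happen, a composite size has to be broken into its parts. The Python sketch below does only that step: it splits on "and" or commas and returns quantity and unit pairs. The function name and patterns are assumptions, and no conversion factors are implied; those would presumably live in the repository level configuration.</p>

```python
import re

# A quantity followed by a unit phrase, e.g. "0.20 cubic foot".
PART = re.compile(r"(\d+(?:\.\d+)?)\s+([A-Za-z][A-Za-z ]*)")

def extent_parts(extent_text):
    """Split a composite size into (quantity, unit) pairs."""
    parts = []
    # Each chunk between "and" or a comma is expected to hold one quantity.
    for chunk in re.split(r"\s+and\s+|,\s*", extent_text):
        match = PART.search(chunk)
        if match:
            parts.append((float(match.group(1)), match.group(2).strip()))
    return parts

print(extent_parts("0.20 cubic foot and 1 microfilm reel"))
print(extent_parts("0.35 cubic foot and 2 microfilm reels"))
```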
<p><strong>Tagging Statistics</strong></p>
<p>Finally, here are a few tag stats:</p>
<ul>
<li>Only 31 tags (1.5% of all Utah Government tags) are associated with 10 or more collections</li>
<li>1404 tags  (71.5%) are assigned to only a single collection</li>
<li>107 collections have been assigned only 1 tag</li>
<li>10 collections have no subjects</li>
</ul>
<p>Of course these statistics are based on the current incarnation of the data extraction script. After I modify the script, there will be a greater number of tags and (hopefully) more overlap of tags across multiple collections. These types of statistics should help me gauge how well my data extraction logic is working.</p>
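<p>Statistics like these are cheap to recompute after every script change. Here is a small Python sketch of the counting, assuming the extracted data can be loaded as a mapping from collection identifier to a set of tags; that shape, the function name and the sample data are all invented for illustration.</p>

```python
from collections import Counter

def tag_stats(collection_tags):
    """Summarize tag reuse; collection_tags maps collection id -> set of tags."""
    # Count how many collections each distinct tag is attached to.
    counts = Counter(tag for tags in collection_tags.values() for tag in tags)
    return {
        "total_tags": len(counts),
        "single_collection_tags": sum(1 for n in counts.values() if n == 1),
        "tags_on_10_or_more": sum(1 for n in counts.values() if n >= 10),
        "collections_with_one_tag": sum(1 for t in collection_tags.values() if len(t) == 1),
        "collections_without_tags": sum(1 for t in collection_tags.values() if not t),
    }

# Made-up sample: three collections, two distinct tags.
sample = {"a": {"Suffrage", "Utah"}, "b": {"Utah"}, "c": set()}
print(tag_stats(sample))
```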
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2009/04/26/archivesz-data-challenges-utah-government-archives-records-service/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>ArchivesZ Data Challenges: Princeton University</title>
		<link>http://www.spellboundblog.com/2009/03/23/archivesz-data-challenges-princeton-university/</link>
		<comments>http://www.spellboundblog.com/2009/03/23/archivesz-data-challenges-princeton-university/#comments</comments>
		<pubDate>Mon, 23 Mar 2009 04:13:54 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[ArchivesZ]]></category>
		<category><![CDATA[EAD]]></category>
		<category><![CDATA[metadata]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/?p=395</guid>
		<description><![CDATA[I received a zip file of 1,771 EAD encoded finding aids from the kind EAD enthusiasts at the Seeley G. Mudd Manuscript Library. These finding aids came from five divisions within Princeton&#8217;s Library: University Archives Public Policy Papers Manuscript Division Latin American Ephemera Collection Engineering Library So onward to the data issues and what they [...]
]]></description>
			<content:encoded><![CDATA[<p><a title="Princeton University Seeley G. Mudd Manuscript Library" href="http://www.princeton.edu/mudd/"><img class="aligncenter size-full wp-image-406" title="Princeton University Seeley G. Mudd Manuscript Library" src="http://www.spellboundblog.com/wp-content/uploads/2009/03/princeton-mudd.jpg" alt="Princeton University Seeley G. Mudd Manuscript Library" width="501" height="96" /></a><br />
I received a zip file of 1,771 EAD encoded finding aids from the kind EAD enthusiasts at the <a title="Seeley G. Mudd Manuscript Library" href="http://www.princeton.edu/~mudd/">Seeley G. Mudd Manuscript Library</a>. These finding aids came from five divisions within <a title="Princeton University Library" href="http://library.princeton.edu/">Princeton&#8217;s Library</a>:</p>
<ul>
<li><a title="Princeton University Archives" href="http://www.princeton.edu/~mudd/finding_aids/archives.html">University Archives</a></li>
<li><a title="Princeton University: Public Policy Papers" href="http://www.princeton.edu/~mudd/finding_aids/policy.html">Public Policy Papers</a></li>
<li><a title="Princeton Manuscript Division" href="http://www.princeton.edu/~rbsc/department/manuscripts/index.shtml">Manuscript Division</a></li>
<li><a title="Princeton: Latin American Ephemera Collection" href="http://firestone.princeton.edu/latinam/ephemera.php">Latin American Ephemera Collection</a></li>
<li><a title="Princeton Engineering Library" href="http://libblogs.princeton.edu/englib/?s=">Engineering Library</a></li>
</ul>
<p>So onward to the data issues and what they mean for my ever growing &#8216;script fix to-do list&#8217;.</p>
<p><strong>Repository Names</strong></p>
<p>As we saw with the Oregon State University finding aids, the finding aids from Princeton University had a wide range of different values for repository names. In the list below we spot some issues. Some end in periods, some do not. One has extra space (probably a carriage return) in the middle. One does not include Princeton in the repository name. Once we have many repositories&#8217; finding aids in ArchivesZ, a repository name of &#8216;Engineering Library&#8217; does not tell the user enough about where those collections can be found.</p>
<p>Here is the list of repository titles my script extracted:</p>
<ul>
<li><span class="il">Princeton</span> University Library. Department of Rare Books and Special Collections.</li>
<li>Engineering Library</li>
<li><span class="il">Princeton</span> University Library</li>
<li><span class="il">Princeton</span> University Library. Department of Rare                    Books and Special Collections.</li>
<li><span class="il">Princeton</span> University Library.</li>
</ul>
<p>My script can handle the extra period and the extra spaces, but the non-specific name would need to ultimately be fixed on the source side.</p>
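<p>The period and spacing cleanup amounts to two string operations. A Python sketch (the function name is hypothetical, and this is not the ArchivesZ script) applied to the five extracted names:</p>

```python
def normalize_repository(name):
    """Collapse runs of whitespace and drop a trailing period."""
    collapsed = " ".join(name.split())
    return collapsed.rstrip(".")

extracted = [
    "Princeton University Library. Department of Rare Books and Special Collections.",
    "Engineering Library",
    "Princeton University Library",
    "Princeton University Library. Department of Rare                    Books and Special Collections.",
    "Princeton University Library.",
]

# Distinct names after normalization, sorted for a stable listing.
for name in sorted(set(normalize_repository(n) for n in extracted)):
    print(name)
```

<p>The five variants collapse to three distinct names, one of which is still the non-specific &#8216;Engineering Library&#8217;.</p>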
<p><strong>Collection Size</strong></p>
<p>The current script assumes that there is only one extent value specified to express the size of the collection. Princeton&#8217;s finding aids showed me examples of multiple extent values. For example, the <a title="Christina Georgina Rossetti Collection" href="http://diglib.princeton.edu/ead/getEad?eadid=C0222">Christina Georgina Rossetti Collection</a> has both a collection level size of 0.4 linear feet (1 archival box) as well as a 2nd extent specification corresponding to a specific folder with the value of (1 poem, 3 drawings, 1 photo, 1 incomplete article). The script must be modified to only consider the collection level size.</p>
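<p>One way to restrict the search to the collection level is to use a path that only reaches the top-level &lt;did&gt;, so that extents inside the component listing are never seen. The Python sketch below illustrates this on a heavily reduced, made-up document; I have not checked how the Rossetti finding aid is actually structured, and the path assumes an EAD file without an XML namespace.</p>

```python
import xml.etree.ElementTree as ET

def collection_extents(ead_xml):
    """Return extent strings from the top-level <did> only."""
    root = ET.fromstring(ead_xml)
    # The path stops at archdesc/did, so <extent> elements inside
    # <dsc> components (series, boxes, folders) are never visited.
    found = root.findall("./archdesc/did/physdesc/extent")
    return [" ".join("".join(e.itertext()).split()) for e in found]

# Invented skeleton for illustration, not the real finding aid.
sample = """<ead>
  <archdesc level="collection">
    <did>
      <physdesc><extent>0.4 linear feet (1 archival box)</extent></physdesc>
    </did>
    <dsc>
      <c01 level="file">
        <did><physdesc><extent>(1 poem, 3 drawings, 1 photo, 1 incomplete article)</extent></physdesc></did>
      </c01>
    </dsc>
  </archdesc>
</ead>"""

print(collection_extents(sample))
```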
<p><strong>Complicated Titles</strong></p>
<p>The current script logic apparently does not handle what I would call &#8216;complicated collection titles&#8217;. For example, I ended up with &#8220;Edward Livingston Papers, &#8221; as the title for a collection with a full title of <a title="Edward Livingston Papers, 1683-1877 (bulk 1764-1836)" href="http://diglib.princeton.edu/ead/getEad?eadid=C0280">Edward Livingston Papers, 1683-1877 (bulk 1764-1836)</a>. This is the way that this title is encoded:<code><br />
&lt;unittitle encodinganalog="245$a" label="Title and dates: "&gt;Edward Livingston Papers, &lt;unitdate encodinganalog="245$f" normal="1683/1877" type="inclusive"&gt;1683-1877&lt;/unitdate&gt; (bulk &lt;unitdate encodinganalog="245$g" normal="1764/1836" type="bulk"&gt;1764-1836&lt;/unitdate&gt;)&lt;/unittitle&gt;</code></p>
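<p>With this encoding the full title and both date ranges can be recovered in one pass, since each &lt;unitdate&gt; carries a <code>type</code> and a <code>normal</code> attribute. A Python sketch under those assumptions follows; the function name is made up and this is not the current script.</p>

```python
import xml.etree.ElementTree as ET

def title_and_dates(unittitle_xml):
    """Return the full title plus any typed <unitdate> values."""
    unittitle = ET.fromstring(unittitle_xml)
    # Join the text around and inside the nested tags, then tidy whitespace.
    full_title = " ".join("".join(unittitle.itertext()).split())
    # Key each nested <unitdate> by its type attribute ("inclusive", "bulk"),
    # preferring the normalized value and falling back to the displayed text.
    dates = {
        d.get("type", "unspecified"): d.get("normal", d.text)
        for d in unittitle.iter("unitdate")
    }
    return full_title, dates

encoded = (
    '<unittitle encodinganalog="245$a" label="Title and dates: ">'
    'Edward Livingston Papers, '
    '<unitdate encodinganalog="245$f" normal="1683/1877" type="inclusive">1683-1877</unitdate>'
    ' (bulk <unitdate encodinganalog="245$g" normal="1764/1836" type="bulk">1764-1836</unitdate>)'
    '</unittitle>'
)

print(title_and_dates(encoded))
```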
<div id=":1aw" class="ii gt">
<p><strong>Too Many Tags</strong></p>
<p>The Engineering Library&#8217;s <a title="Department of Mechanical and Aerospace Engineering Technical Reports: Finding Aid" href="http://diglib.princeton.edu/ead/getEad?id=ark:/88435/qf85nb33h">Department of Mechanical and Aerospace Engineering Technical Reports: Finding Aid</a> has 522 tags assigned to it! Almost all of these are the names of the authors of the individual reports. This scenario goes on the list of reasons why I might choose to not include (at least for this version) persname subjects. The other option for handling this situation is to only use subjects assigned at the collection level and ignore subjects assigned at lower unit/container levels. Without the author tags, this single collection ends up with this nice, reasonable list of tags:</p>
<ul>
<li>Fluid mechanics</li>
<li>Mechanical engineering</li>
<li>Combustion</li>
<li>Aerospace engineering</li>
<li>Propulsion systems</li>
</ul>
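<p>Both options can be sketched together in Python (again, not the project&#8217;s Ruby script): look only at &lt;controlaccess&gt; blocks that are direct children of &lt;archdesc&gt;, and keep only the element types you ask for. The sample XML, the names in it and the function name are invented for illustration, and an un-namespaced EAD file is assumed.</p>

```python
import xml.etree.ElementTree as ET

# Hypothetical finding aid: subjects and one name at the collection level,
# plus an author name attached to an individual item.
SAMPLE_EAD = """
<ead>
  <archdesc level="collection">
    <controlaccess>
      <subject>Fluid mechanics</subject>
      <subject>Combustion</subject>
      <persname>Doe, Jane</persname>
    </controlaccess>
    <dsc>
      <c01 level="item">
        <controlaccess><persname>Roe, Richard</persname></controlaccess>
      </c01>
    </dsc>
  </archdesc>
</ead>
"""

def collection_tags(ead_xml, wanted=("subject",)):
    """Collect access terms from collection-level <controlaccess> blocks only."""
    root = ET.fromstring(ead_xml.strip())
    tags = []
    # Only <controlaccess> directly under <archdesc> is visited, so terms
    # assigned to components inside <dsc> are ignored.
    for block in root.findall("archdesc/controlaccess"):
        for element in block.iter():
            if element.tag in wanted and element.text:
                tags.append(element.text.strip())
    return tags

print(collection_tags(SAMPLE_EAD))
print(collection_tags(SAMPLE_EAD, wanted=("subject", "persname")))
```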
<p><strong>Year Challenges</strong><br />
I found two different issues related to year ranges:</p>
<ul>
<li><a title="Women in Argentina, VI, 1989-2001: Finding Aid" href="http://diglib.princeton.edu/ead/getEad?id=ark:/88435/2z10wq25w">Women in Argentina, VI, 1989-2001: Finding Aid</a>: The current script does not properly extract the inclusive dates, which are encoded within the titleproper tags; it assumes that they will be encoded using a unitdate tag.</li>
<li>A number of finding aids include subjects that contain year spans. When these subjects are decomposed into tags, we end up with tags like &#8216;1850-1950&#8217;. Since we have the time period communicated via the inclusive dates, I will likely just drop these portions of the subjects rather than create a tag for each unique year span.</li>
</ul>
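<p>Both year issues come down to recognizing a four-digit year span in text. A minimal Python sketch, assuming spans are always written as two four-digit years joined by a hyphen; the function names and the sample tag list are hypothetical:</p>

```python
import re

# Two four-digit years joined by a hyphen, e.g. "1989-2001".
YEAR_SPAN = re.compile(r"\b(\d{4})\s*-\s*(\d{4})\b")

def years_from_title(title):
    """Fallback for finding aids whose dates live only in the title text."""
    match = YEAR_SPAN.search(title)
    return (int(match.group(1)), int(match.group(2))) if match else None

def drop_year_spans(tags):
    """Discard tags that consist of nothing but a year span."""
    return [tag for tag in tags if not YEAR_SPAN.fullmatch(tag.strip())]

print(years_from_title("Women in Argentina, VI, 1989-2001: Finding Aid"))
print(drop_year_spans(["Art", "1850-1950", "History"]))
```

<p>Open-ended or approximate dates would need more patterns than this.</p>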
<p><strong>General Code Fixes</strong></p>
<p>It is reassuring at this point to spot the same issues with data from multiple repositories. Here are data and code logic issues that I have seen elsewhere that are revalidated by Princeton&#8217;s finding aids:</p>
<ul>
<li>Need to strip \n &amp; \t characters</li>
<li>Need to break subjects up based on commas</li>
<li>Need to drop final periods from repository names, subjects and titles</li>
<li>The designation of size in volumes, as in &#8220;793 volumes&#8221;. I need to pick an approach for translating from volumes to linear feet</li>
</ul>
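<p>The whitespace and final-period fixes in this list are small enough to sketch in a few lines of Python (the function name is hypothetical, and this is not the project&#8217;s Ruby script). The volumes-to-linear-feet conversion is left out because no conversion factor has been chosen yet.</p>

```python
def clean_value(raw):
    """Collapse newlines, tabs and repeated spaces, then drop trailing periods."""
    # str.split() with no arguments splits on any run of whitespace,
    # including \n and \t, so joining with one space normalizes everything.
    collapsed = " ".join(raw.split())
    return collapsed.rstrip(". ")

print(clean_value("Art,\n\tAmerican. "))
print(clean_value("  Princeton University Library.\n"))
```

<p>One caveat: a value that legitimately ends in an abbreviation, such as &#8216;U.S.&#8217;, would also lose its final period.</p>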
<p>The script to-do list is still getting longer, but I am not done cycling through new institutions&#8217; XML files to find new issues. Want to share your institution’s EAD finding aids in XML format with the ArchivesZ project? Please drop me a line via <a title="Contact Jeanne" href="../contact/">my contact form</a>.</p>
<p><em>Image Credit: Top image from the <a title="Seeley G. Mudd Manuscript Library" href="http://www.princeton.edu/mudd/">Seeley G. Mudd Manuscript Library homepage</a>.</em></p></div>
<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2009/03/23/archivesz-data-challenges-princeton-university/">ArchivesZ Data Challenges: Princeton University</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2009/03/23/archivesz-data-challenges-princeton-university/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>ArchivesZ Data Challenges: Syracuse University Special Collections Research Center</title>
		<link>http://www.spellboundblog.com/2009/03/07/archivesz-data-syracuse-university-archives/</link>
		<comments>http://www.spellboundblog.com/2009/03/07/archivesz-data-syracuse-university-archives/#comments</comments>
		<pubDate>Sat, 07 Mar 2009 04:48:44 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[ArchivesZ]]></category>
		<category><![CDATA[EAD]]></category>
		<category><![CDATA[metadata]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/?p=353</guid>
		<description><![CDATA[The Syracuse University Special Collections Research Center has also been so kind as to provide the XML source files for their finding aids for use in the ArchivesZ project. I loaded 572 finding aids and no errors were generated during the parsing of the XML files. My scripts extracted 6632 unique &#8216;tags&#8217; from the subjects [...]<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2009/03/07/archivesz-data-syracuse-university-archives/">ArchivesZ Data Challenges: Syracuse University Special Collections Research Center</a></p>
]]></description>
			<content:encoded><![CDATA[<p>The <a title="Syracuse University Special Collections Research Center" href="http://library.syr.edu/information/spcollections/"><img class="alignright size-full wp-image-357" title="Syracuse University" src="http://www.spellboundblog.com/wp-content/uploads/2009/03/syracuse-university.jpg" alt="Syracuse University" width="200" height="200" /></a><a title="Syracuse University Special Collections Research Center" href="http://library.syr.edu/information/spcollections/">Syracuse University Special Collections Research Center</a> has also been so kind as to provide the XML source files for their finding aids for use in the ArchivesZ project. I loaded 572 finding aids and no errors were generated during the parsing of the XML files.</p>
<p>My scripts extracted 6632 unique &#8216;tags&#8217; from the subjects assigned to the finding aids. As part of parsing and loading the data for use in the visualizations, the script divides compound subjects into tags. For example, the subjects assigned to Syracuse University finding aids include these values (the number shown is the number of finding aids to which that subject is assigned):</p>
<ul>
<li>Art &#8212; American &#8212; 20th century (1)</li>
<li>Art &#8212; Cartoonists (68)</li>
<li>Art &#8212; Cartoonists. (3)</li>
<li>Art &#8212; Exhibitions. (1)</li>
<li>Art &#8212; Illustrators (36)</li>
<li>Art &#8212; Illustrators. (1)</li>
<li>Art &#8212; Painters (77)</li>
<li>Art &#8212; Philosophy. (1)</li>
<li>Art &#8212; Sculpture (33)</li>
</ul>
<p>There are also subjects whose components are separated by commas, such as these (the number listed indicates the total finding aids assigned that subject):</p>
<ul>
<li>Art, American (33)</li>
<li>Art, American. (46)</li>
<li>Art, American, 20th century (28)</li>
<li>Art, American, 20th century. (31)</li>
<li>Art, Cuban, 20th century (1)</li>
<li>Art, Modern (1)</li>
<li>Art, French, 20th century. (1)</li>
</ul>
<p>The goal is to capture the core ideas &#8211; to capture the overlap in subject matter among diverse collections. All of the collections with any of these subjects are about Art. With the current script, the tag Art is associated with 179 collections from Syracuse University. You can see from this tiny subset of subjects that other themes would be revealed when these subjects were decomposed more completely &#8211; and this just scratches the surface.</p>
<p>Out of the 6676 subjects, 5658 subjects are assigned to single collections. Out of the 6632 tags the current script extracted from those subjects, 5594 tags are assigned to single collections. Not much improvement with the current state of the script.</p>
<p>While the script currently does a good job with the Library of Congress double dash separation pattern, the Syracuse University data has shown me a number of other standard patterns that need to be handled, which can be seen in the small sampling of art-related subjects shown above. The easy one is removing periods and stripping spaces from the end of subject values. The harder change will be to implement smart separation of subjects into tags based on commas. This would need the code to only break up &lt;subject&gt; values while leaving &lt;persname&gt; and &lt;corpname&gt; alone. I will also need to examine &lt;geogname&gt; values from across various institutions to decide if it is better to break them up or leave them be.</p>
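<p>A naive version of that separation can be sketched in Python (not the project&#8217;s Ruby script). It splits every value on the double dash, splits on commas only for &lt;subject&gt; values, and drops trailing periods. The function name and the persname example are hypothetical, and a plain comma split is only a starting point for the &#8216;smart&#8217; separation described above.</p>

```python
import re

# Element types whose values may also be broken apart on commas.
# <geogname> is left out until more data has been examined.
SPLITTABLE = {"subject"}

def to_tags(value, element="subject"):
    """Break a compound access term into individual tags."""
    parts = re.split(r"\s*--\s*", value.strip())
    if element in SPLITTABLE:
        parts = [piece for part in parts for piece in part.split(",")]
    cleaned = [part.strip().rstrip(".").strip() for part in parts]
    return [part for part in cleaned if part]

print(to_tags("Art -- Cartoonists."))
print(to_tags("Art, American, 20th century."))
print(to_tags("Doe, Jane", element="persname"))
```

<p>With this approach &#8216;Art &#8212; Cartoonists&#8217; and &#8216;Art &#8212; Cartoonists.&#8217; produce the same two tags, so the variants with and without a final period collapse together.</p>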
<p>Other than these subject issues, there are a few other script modifications that I will need to make based on scenarios the data in the Syracuse finding aids have shown me:</p>
<ul>
<li>Syracuse University uses an entity to populate the repository values &#8211; the current script does not handle this at all.</li>
<li>Ensure that single item collections are assigned a size of .25 linear feet</li>
<li>Linear ft must be added as another recognized abbreviation for linear feet</li>
</ul>
<p>All these issues are being added to my master &#8216;to do&#8217; list for updating the EAD parsing script. Onward to the next data set.</p>
<p>Want to share your institution’s EAD finding aids in XML format with the ArchivesZ project? Please drop me a line via <a title="Contact Jeanne" href="../contact/">my contact form</a>.</p>
<p><em>Image Credit: Syracuse University image above from <a title="Syracuse University Special Collections Research Center" href="http://library.syr.edu/information/spcollections/">Syracuse University Special Collections Research Center</a> home page.<br />
</em></p>
<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2009/03/07/archivesz-data-syracuse-university-archives/">ArchivesZ Data Challenges: Syracuse University Special Collections Research Center</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2009/03/07/archivesz-data-syracuse-university-archives/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>ArchivesZ Data Challenges: Oregon State University Archives</title>
		<link>http://www.spellboundblog.com/2009/02/22/archivesz-data-challenges-oregon-state-university/</link>
		<comments>http://www.spellboundblog.com/2009/02/22/archivesz-data-challenges-oregon-state-university/#comments</comments>
		<pubDate>Sun, 22 Feb 2009 07:48:45 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[ArchivesZ]]></category>
		<category><![CDATA[EAD]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[software]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/?p=344</guid>
		<description><![CDATA[The Oregon State University Archives has generously contributed 356 of their finding aids in EAD format for use in the development of version 2 of ArchivesZ. This is my first post in what will likely be a series of looks behind the scenes at the challenges facing a project like ArchivesZ on the data [...]<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2009/02/22/archivesz-data-challenges-oregon-state-university/">ArchivesZ Data Challenges: Oregon State University Archives</a></p>
]]></description>
			<content:encoded><![CDATA[<p><a href="http://osulibrary.oregonstate.edu/archives/"><img class="alignright size-full wp-image-346" title="OSU Archives" src="http://www.spellboundblog.com/wp-content/uploads/2009/02/osu_archives_home1.jpg" alt="OSU Archives" width="233" height="179" /></a>The <a title="Oregon State University Archives" href="http://osulibrary.oregonstate.edu/archives/archive/">Oregon State University Archives</a> has generously contributed 356 of their finding aids in EAD format for use in the development of version 2 of <a title="ArchivesZ" href="http://www.archivesz.com/">ArchivesZ</a>. This is my first post in what will likely be a series of looks behind the scenes at the challenges facing a project like ArchivesZ on the data level.</p>
<p>Version one of ArchivesZ only used finding aids from the University of Maryland and the Library of Congress. This was definitely a case of the path of least resistance. I attend the University of Maryland and the Library of Congress has a very convenient <a title="Library of Congress Finding Aid Source" href="http://lcweb2.loc.gov/faid/source.html">page providing links to all their Finding Aids source XML files</a>. A key aspect of creating version 2 of ArchivesZ is making sure that the scripts that pull data from EAD XML files are robust enough to handle the encoding practices of a very diverse range of institutions.</p>
<p>Please keep in mind that OSU is likely to bear the brunt of many basic data issues that I would have unearthed with whatever data sets I tried first!</p>
<p>There are 3 crucial data elements on which the visualizations of ArchivesZ depend: subject, inclusive dates, and collection size. Each element presents unique challenges. The script parsing issues I am uncovering with the OSU finding aids are currently worst for collection size. In order to make pretty charts which let people compare the quantity of materials in each collection (or record group &#8211; please forgive that I use the term &#8216;collection&#8217; to mean any set of records for which a finding aid has been created), we need to be able to assign a single number to represent the size of each collection. Based on the values used in the LOC and UMD finding aids, we chose to go with linear feet as our standard unit of measurement. So the trick is to translate whatever archivists choose to put into the &lt;physdesc&gt; element of their finding aid into some number of linear feet.</p>
<p>These are the size conversion rules we implemented for version 1 of ArchivesZ:</p>
<ul>
<li> 1 microfilm reel = 1 linear foot</li>
<li> Collections represented only by a number of items will be represented as .25 linear feet</li>
<li> If size only specified in number of boxes, then 1 box = .5 linear feet</li>
<li> When the size is given in some different types of units, they are prioritized in the following order: linear feet &gt; boxes &gt; microfilm reels &gt; items</li>
</ul>
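<p>Read literally, these rules could be sketched as follows in Python. This is an illustration of the rules as listed, not the version 1 Ruby script, and its results may differ from what that script actually produces. The regular expressions and the function name are assumptions.</p>

```python
import re

NUMBER = r"(\d+(?:\.\d+)?|\.\d+)"

# Units in priority order, each with its conversion factor to linear feet.
UNITS = [
    (r"linear\s+(?:feet|foot)\b", 1.0),
    (r"box(?:es)?\b", 0.5),
    (r"microfilm\s+reels?\b", 1.0),
]

def linear_feet(extents):
    """Translate a list of extent strings into a single linear-feet figure."""
    text = " ; ".join(extents)
    for unit, factor in UNITS:
        match = re.search(NUMBER + r"\s+" + unit, text, re.IGNORECASE)
        if match:
            return float(match.group(1)) * factor
    # Collections described only by a count of items get a fixed size.
    if re.search(NUMBER + r"\s+items?\b", text, re.IGNORECASE):
        return 0.25
    return None  # unit not recognized

print(linear_feet(["0.4 linear feet (1 archival box)"]))
print(linear_feet(["9 boxes"]))
print(linear_feet(["12 items"]))
```

<p>Returning None for an unrecognized unit makes it easy to list the extent values that still need a rule.</p>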
<p>This works reasonably well when the physical description values are simple &#8211; it starts to fall apart when what is entered is more complicated. Here are some examples of the physical descriptions in the OSU finding aids:</p>
<p><a title="OSU Archives: Guide to the Phi Kappa Phi-OSU Chapter Records " href="http://nwda-db.wsulibs.wsu.edu/findaid/ark:/80444/xv95428">Guide to the Phi Kappa Phi-OSU Chapter Records</a>: The display in the &#8216;pretty&#8217; version of the finding aid online shows this: 5.5 cubic feet (9 boxes, including 2 oversize boxes) (3 microfilm reels)</p>
<p>The version in the XML file is this:</p>
<pre>&lt;physdesc&gt;
  &lt;extent&gt;5.5 cubic feet&lt;/extent&gt;
  &lt;extent&gt;9 boxes, including 2 oversize boxes&lt;/extent&gt;
  &lt;extent&gt;3 microfilm reels&lt;/extent&gt;
&lt;/physdesc&gt;</pre>
<p>With the current algorithm, this finding aid would be marked as being 3 linear feet in size. At a bare minimum, I must add &#8216;cubic feet&#8217; as another unit to be converted. More difficult to discern is if I should have a value of  5.5 linear feet (assuming 1 cubic foot = 1 linear foot for the purposes of these comparisons) or a value of 8.5 linear feet (5.5 + 3 linear feet for the 3 microfilm reels). There is never going to be a perfect answer here, but clearly my logic needs to be more sophisticated than it is now.</p>
<p><a title="Harvey L. McAlister Collection" href="http://osulibrary.oregonstate.edu/archives/archive/mss/documents/OREmcalister.pdf">Harvey L. McAlister Collection</a>: The display in the pretty version of this finding aid online is this: 1 cubic foot, including 26 photographs (4 boxes, including 2 oversize boxes, and 1 map folder)</p>
<p>The version in the XML file is this:</p>
<pre>&lt;physdesc&gt;
  &lt;extent encodinganalog="300$a"&gt;1 cubic foot, including 26 photographs&lt;/extent&gt;
  &lt;extent encodinganalog="300$a"&gt;4 boxes, including 2 oversize boxes, and 1 map folder&lt;/extent&gt;
&lt;/physdesc&gt;</pre>
<p>With the current algorithm, this finding aid would be marked as being 1 linear foot in size. From looking at these two examples, it would seem that this would be fine and in fact &#8211; for the purposes of calculating a comparable size &#8211; only looking at the first &lt;extent&gt; value might be the way to go &#8211; at least for OSU finding aids.</p>
<p>There are some other simpler issues relating to standardization in the way that certain values are entered. For example, after ingesting 173 finding aids from OSU (the number I got through before my script flat out choked on a size designation), I ended up with five different repositories added to my REPOSITORIES table. I had expected only one. Each of these was entered as repository name &#8212; and I have included the length of each value to show how extra spaces are causing part of the problem:</p>
<ul>
<li>Oregon State University                Libraries &#8211; length 36</li>
<li>Oregon State University    &#8211; length 23</li>
<li>Oregon State UniversityLibraries    &#8211; length 32</li>
<li>Oregon State University             Libraries  &#8211; length 36</li>
<li>Oregon State University Libraries    &#8211; length 33</li>
</ul>
<p>Some of these I can handle by adding smarter trimming of trailing spaces &#8211; but in this case it is clear that typos and inconsistency are also a challenge. I checked, and each of these different &lt;corpname&gt; values within the &lt;repository&gt; element is used by at least 10 finding aids. Perhaps they have been inherited over time from a template?</p>
<p>I have considered creating a repository definition file that could be used when loading finding aids from one repository at a time. This would remove dependence on perfect replication of these sorts of values while still supplying the data needed to let people limit their searches by a named repository.</p>
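<p>A minimal Python sketch of that idea, assuming the definition file boils down to a lookup from a whitespace-free, lowercased key to one canonical name. The table contents, including the decision to fold &#8216;Oregon State University&#8217; into the Libraries entry, are assumptions for illustration only.</p>

```python
# Hypothetical repository definitions: key -> canonical display name.
CANONICAL = {
    "oregonstateuniversitylibraries": "Oregon State University Libraries",
    "oregonstateuniversity": "Oregon State University Libraries",
}

def repository_key(raw):
    """Remove all whitespace and lowercase, so spacing typos do not matter."""
    return "".join(raw.split()).lower()

def normalize_repository(raw):
    """Map a raw repository value to its canonical name when one is defined."""
    # Unknown repositories fall back to simple whitespace cleanup.
    return CANONICAL.get(repository_key(raw), " ".join(raw.split()))

variants = [
    "Oregon State University                Libraries",
    "Oregon State University   ",
    "Oregon State UniversityLibraries",
    "Oregon State University Libraries",
]
print({normalize_repository(v) for v in variants})
```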
<p>The last issue is the most minor. There are many \n and \t characters throughout the XML documents. These I plan to simply strip out as the script parses the XML file.</p>
<p>A big thank you to <a title="Elizabeth Nielsen" href="http://osulibrary.oregonstate.edu/staff/nielseel">Elizabeth Nielsen</a>, Senior Staff Archivist at OSU Archives. Her response to my query about OSU&#8217;s comfort with my taking apart their finding aids in public on my blog was &#8220;Bring it on – we’re tough!&#8221;.</p>
<p>It is fascinating to dig into new finding aids and see how the parsing script handles what it finds. I plan to test the existing script on XML from more sources to see all the things that must be fixed. Then I get to wrap my head around code that someone else wrote (another member of the original ArchivesZ team wrote the version 1 Ruby script). For those of you who are not programmers, you can skim through my <a title="Book Review of Dreaming in Code" href="http://www.spellboundblog.com/2007/05/24/book-review-dreaming-in-code-a-book-about-why-software-is-hard/">Book Review of Dreaming in Code</a> to get a handle on why this can be harder than it sounds like it should be.</p>
<p>Want to share your institution&#8217;s EAD finding aids in XML format with the ArchivesZ project? Please drop me a line via <a title="Contact Jeanne" href="http://www.spellboundblog.com/contact/">my contact form</a>.</p>
<p><em>Image Credit: OSU Archives image above from the <a title="OSU Archives" href="http://osulibrary.oregonstate.edu/archives/">OSU Archives Home Page</a>.</em></p>
<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2009/02/22/archivesz-data-challenges-oregon-state-university/">ArchivesZ Data Challenges: Oregon State University Archives</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2009/02/22/archivesz-data-challenges-oregon-state-university/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Susa 2.0: Max Evans&#8217; Finding Aid Prototype</title>
		<link>http://www.spellboundblog.com/2008/12/08/susa-20-max-evans-finding-aid-prototype/</link>
		<comments>http://www.spellboundblog.com/2008/12/08/susa-20-max-evans-finding-aid-prototype/#comments</comments>
		<pubDate>Mon, 08 Dec 2008 06:44:50 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[EAD]]></category>
		<category><![CDATA[SAA2008]]></category>
		<category><![CDATA[interface design]]></category>
		<category><![CDATA[software]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/2008/12/08/susa-20-max-evans-finding-aid-prototype/</guid>
		<description><![CDATA[As part of his portion of our SAA 2008 panel in San Francisco, Max Evans demonstrated his prototype for a new way to view an EAD finding aid. You can download his presentation from the SAA&#8217;s site: Finding Aids for the 21st Century: The Next Evolution. Max&#8217;s prototype of Susa 2.0 is now online! He [...]<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2008/12/08/susa-20-max-evans-finding-aid-prototype/">Susa 2.0: Max Evans&#8217; Finding Aid Prototype</a></p>
]]></description>
			<content:encoded><![CDATA[<p><img src="http://www.spellboundblog.com/wp-content/uploads/2008/12/gates_susa_young.jpg" alt="Susa Young Gates" align="right" />As part of his portion of our <a href="http://www.ibiblio.org/saawiki/2008/index.php/Session_602:_After_the_Revolution:_Unleashing_the_Power_of_EAD" title="Session 602: After the Revolution: Unleashing the Power of EAD">SAA 2008 panel in San Francisco</a>, Max Evans demonstrated his prototype for a new way to view an EAD finding aid. You can download his presentation from the SAA&#8217;s site: <a href="http://www.archivists.org/conference/sanfrancisco2008/docs/session602-EvansM2.ppt" title="Max Evans: Finding Aids for the 21st Century: The Next Evolution">Finding Aids for the 21st Century: The Next Evolution</a>.</p>
<p>Max&#8217;s prototype of <a href="http://www.spellboundblog.com/susa2/a1.html" title="Susa 2.0">Susa 2.0</a> is now online! He asked that I make sure you know it works best (showing all the intended mouse over text for links) with Internet Explorer version 6.0. The prototype presents the finding aid of <font class="three">the <a href="http://history.utah.gov/findAids/B00095/B0095FF.XML" title="Utah State Historical Society: Susa Young Gates Papers">Susa Young Gates Papers</a> from the <a href="http://history.utah.gov/" title="Utah State Historical Society">Utah State Historical Society</a>. His design tackles the major issues that plague large finding aids normally displayed in traditional single page layouts. Anyone who has looked at a large finding aid online has had the experience of being scrolled down somewhere in the middle and realizing they have no idea what they are looking at. What folder is this item in? What box is this folder in? Am I reading through a list of letters from 1950 or are these the ones from 1970?</font></p>
<p><font class="three">Context is hard to communicate when you are dealing with long lists of folders that stretch longer than the length of the screen. Max&#8217;s design uses a three column approach to provide context from left to right. His design also gives users a way to look at the full list of either items or folders, independent of their originating containers &#8211; each list then sortable in three different ways: &#8216;as arranged&#8217;, alphabetically or by date. I love <a href="http://www.spellboundblog.com/susa2/b3-1-1-2.html" title="Susa 2.0: Scanned Document example">this page</a> which shows how a scanned document might be displayed within the proper context of the collection &#8211; in this case, page 2 of document 1 of the General Correspondence from 1886-1909. All of these ideas get at the heart of giving researchers more control over how to tackle the records in a collection while making sure that they don&#8217;t lose the tools that ordered documents in a folder would provide them in the research room. </font></p>
<p><font class="three">His prototype goes beyond changing how the finding aid itself is presented: it also considers how the work flow of a researcher can be improved while simplifying the record request process. </font><font class="three">The prototype gives the patron the option to request the scanning of specific folders or items. They can also add records to their &#8216;research cart&#8217; to either request the proper boxes be retrieved or to store the records in a personal research area within the archives website &#8211; both possibilities sound useful to me. </font></p>
<p>Max&#8217;s prototype is such a great example of rethinking how people are expected to work with archival records within the confines of the information we already have available in finding aids as they exist today. I highly recommend you give <a href="http://www.spellboundblog.com/susa2/a1.html" title="Susa 2.0">Susa 2.0</a> a look. It is a testament to Max&#8217;s incredible patience that he was able to create this prototype using over 200 separate HTML files &#8211; but it also sets the bar high for what we could be doing with our interface design!</p>
<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2008/12/08/susa-20-max-evans-finding-aid-prototype/">Susa 2.0: Max Evans&#8217; Finding Aid Prototype</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2008/12/08/susa-20-max-evans-finding-aid-prototype/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>NEH Digital Humanities Startup Grant News: Visualizing Archival Collections</title>
		<link>http://www.spellboundblog.com/2008/09/12/neh-digital-humanities-startup-grant-news-visualizing-archival-collections/</link>
		<comments>http://www.spellboundblog.com/2008/09/12/neh-digital-humanities-startup-grant-news-visualizing-archival-collections/#comments</comments>
		<pubDate>Fri, 12 Sep 2008 05:23:28 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[ArchivesZ]]></category>
		<category><![CDATA[EAD]]></category>
		<category><![CDATA[information visualization]]></category>
		<category><![CDATA[software]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/2008/09/12/neh-digital-humanities-startup-grant-news-visualizing-archival-collections/</guid>
		<description><![CDATA[As of August 22nd, 2008 it was official. There is even a blog post over on the NEH Office of Digital Humanities updates page to prove it. The University of Maryland was granted a Level I NEH Digital Humanities Startup Grant to fund work on the &#8216;Visualizing Archival Collections&#8217; project. The official one liner is [...]<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2008/09/12/neh-digital-humanities-startup-grant-news-visualizing-archival-collections/">NEH Digital Humanities Startup Grant News: Visualizing Archival Collections</a></p>
]]></description>
			<content:encoded><![CDATA[
<p style="text-align: center"><a title="ArchivesZ" href="http://www.archivesz.com"><img src="http://www.spellboundblog.com/wp-content/uploads/2008/09/archivesz-ng.jpg" alt="archivesz ng" width="450" height="130" /></a></p>
<p>As of August 22nd, 2008 it was official. There is even a <a title="NEH ODH: Announcement of Awardees" href="http://www.neh.gov/ODH/ODHHome/tabid/36/EntryID/81/Default.aspx">blog post over on the NEH Office of Digital Humanities</a> updates page to prove it. The <a title="University of Maryland" href="http://www.umd.edu">University of Maryland</a> was granted a Level I <a title="NEH Digital Humanities Startup Grant" href="http://www.neh.gov/grants/guidelines/digitalhumanitiesstartup.html">NEH Digital Humanities Startup Grant</a> to fund work on the &#8216;Visualizing Archival Collections&#8217; project. The official one liner is that the project will support &#8220;The development of visualization tools for assessing information contained in electronic archival finding aids created with Encoded Archival Description (EAD)&#8221;. Why did I wait so long to announce this on the blog? I wanted to have something fun to announce at the end of my SAA presentation out in San Francisco!</p>
<p>The project director is <a title="Dr. Jennifer Golbeck" href="http://www.cs.umd.edu/~golbeck/index.shtml">Dr. Jennifer Golbeck</a>. I also have the support of University of Maryland&#8217;s Jennie Levine, <a title="Dr. Bruce Ambacher" href="http://ischool.umd.edu/people/ambacher/">Dr. Bruce Ambacher</a>, and <a title="Dr. Doug Oard" href="http://www.glue.umd.edu/~oard/">Dr. Doug Oard</a>. This amazing set of collaborators should help me stay on the right track and make sure I keep the sometimes competing issues relating to archives, information retrieval and interface design in balance.</p>
<p>I will be collecting EAD encoded finding aids over the next few months. My goal is to gather a broad sample of English language finding aids from a wide range of institutions and work on the script that extracts this data into a database. Once we have the data extracted I get to look at what we have, do some data cleanup and start thinking about what sorts of visualizations might work with our real world data. During the spring term we will design and build a 2nd generation prototype of <a title="ArchivesZ" href="http://www.archivesz.com">ArchivesZ</a>.</p>
<p>Want your data to be part of this? If you would like to contribute EAD finding aids in XML format to the project, please send me the following information:</p>
<ol>
<li>Archives Name</li>
<li>Archives Parent Institution (if applicable)</li>
<li>Archives Location</li>
<li>Contact at Archives for questions about the finding aids (name, email and phone number)</li>
<li>Estimate of # of finding aids being offered</li>
<li>Controlled Vocabulary or Thesaurus used for Subject values (as many as are used)</li>
<li>Method of finding aid delivery (sending me a zip file? pointing me at a directory online? some other way?)</li>
<li>Do I have your permission to post a discussion of the data issues I may find in your finding aids here on Spellbound Blog? (Please see the <a title="OSU ArchivesZ Data Challenges" href="http://www.spellboundblog.com/2009/02/22/archivesz-data-challenges-oregon-state-university/">OSU Archives</a> post as an example of the types of issues I discuss)</li>
</ol>
<p>You can either put this into the form on my <a title="Contact Jeanne" href="http://www.spellboundblog.com/contact/">Contact Page</a> or send email directly to jeanne AT spellboundblog dot com.</p>
<p>Thank you to everyone for their enthusiasm about the ArchivesZ project. It is very exciting to have the opportunity to take all these shiny ideas to the next level.</p>
<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2008/09/12/neh-digital-humanities-startup-grant-news-visualizing-archival-collections/">NEH Digital Humanities Startup Grant News: Visualizing Archival Collections</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2008/09/12/neh-digital-humanities-startup-grant-news-visualizing-archival-collections/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>MIT&#8217;s SIMILE Project: Innovations in Metadata Interaction and Analysis</title>
		<link>http://www.spellboundblog.com/2008/01/13/mits-simile-project-innovations-in-metadata-interaction-and-analysis/</link>
		<comments>http://www.spellboundblog.com/2008/01/13/mits-simile-project-innovations-in-metadata-interaction-and-analysis/#comments</comments>
		<pubDate>Sun, 13 Jan 2008 06:37:16 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[EAD]]></category>
		<category><![CDATA[access]]></category>
		<category><![CDATA[information visualization]]></category>
		<category><![CDATA[interface design]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[software]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/2008/01/13/mits-simile-project-innovations-in-metadata-interaction-and-analysis/</guid>
		<description><![CDATA[Well-formed Data&#8217;s post on Exhibit led me to explore what was available from MIT&#8217;s Semantic Interoperability of Metadata and Information in unLike Environments (SIMILE) project. I took a little time to examine some of the SIMILE project tools with an eye to how they could impact interaction with archival records and metadata, as well as [...]<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2008/01/13/mits-simile-project-innovations-in-metadata-interaction-and-analysis/">MIT&#8217;s SIMILE Project: Innovations in Metadata Interaction and Analysis</a></p>
]]></description>
			<content:encoded><![CDATA[<p><a href="http://simile.mit.edu/" title="MIT Simile Project"><img align="left" src="http://www.spellboundblog.com/wp-content/uploads/2008/01/logo.png" alt="MIT SIMILE project" title="MIT SIMILE project" /></a><a href="http://well-formed-data.net/archives/119/exhibit" title="Well-formed Data: Exhibit">Well-formed Data&#8217;s post on Exhibit</a> led me to explore what was available from <a href="http://mit.edu/" title="Massachusetts Institute of Technology">MIT</a>&#8217;s <a href="http://simile.mit.edu/" title="Semantic Interoperability of Metadata and Information in unLike Environments">Semantic Interoperability of Metadata and Information in unLike Environments</a> (SIMILE) project. I took a little time to examine some of the SIMILE project tools with an eye to how they could impact interaction with archival records and metadata, as well as how they might support the work of archivists. All the tools appear to be available under an open source <a href="http://www.opensource.org/licenses/bsd-license.php" title="Open Source: BSD License">BSD license</a>.</p>
<p><strong>Babel</strong></p>
<p><a href="http://simile.mit.edu/babel/" title="SIMILE: Babel">Babel</a> converts files from one format to another. I tested whether it would convert one of the <a href="http://lcweb2.loc.gov/faid/source.html" title="LOC: EAD Finding Aids in XML format">Library of Congress EAD Finding Aids</a> from XML to some other format, but it gave me an error (&#8216;unqualified attribute &#8216;repositoryencoding&#8217; not allowed&#8217;). I love the idea that I could just point this at an EAD finding aid and get something useful out the other side, but apparently that is wishful thinking, at least for the moment.</p>
<p><strong>Exhibit 2.0</strong></p>
<p><a href="http://simile.mit.edu/exhibit/" title="SIMILE: Exhibit 2.0">Exhibit 2.0</a> is described on the Exhibit homepage as follows:</p>
<blockquote>
<p class="blurb">Exhibit is a <em>three-tier web application framework</em> written in Javascript, which you can include like you would include Google Maps. If you just want to show a few hundred records of data on maps, timelines, scatter plots, interactive tables, etc., why bother learning SQL, ASP, PHP, CGI, or whatever when you can just use Exhibit? To use Exhibit, you write: a simple data file, and an HTML file in which you specify how the data should be shown. Data + Presentation. That&#8217;s all there is to publishing, as it should be.</p>
</blockquote>
<p>Sounds fabulous, doesn&#8217;t it? I wish I had a week to play with this tool. They have a whole slew of <a href="http://simile.mit.edu/wiki/Exhibit/Examples" title="SIMILE: Exhibit Examples">examples</a>, but I think the two I list below do a fine job of showing what you can create (not to mention being fairly thematic for those of you paying attention to the US Presidential Primaries news coverage):</p>
<ul>
<li><a href="http://ryanlee.org/2007/08/decide.html" title="2008 Presidential Election Candidates on the Issues">2008 Presidential Election Candidates on the Issues</a></li>
<li><a href="http://simile.mit.edu/exhibit/examples/presidents/presidents.html" title="US Presidents (in Exhibit)">US Presidents</a></li>
</ul>
<p><strong>Gadget</strong></p>
<p><a href="http://simile.mit.edu/wiki/Gadget" title="SIMILE: Gadget">Gadget</a> is an XML inspector designed to create useful summaries of vast pools of XML data. I didn&#8217;t download and play with this one &#8211; but it sounds like it might be very interesting to feed it a big pile of EAD XML finding aids and see what could be discovered from an aggregate point of view.</p>
<p><strong>Longwell &amp; RDFizers</strong></p>
<p><a href="http://simile.mit.edu/wiki/Longwell" title="SIMILE: Longwell">Longwell</a> is a <a href="http://simile.mit.edu/wiki/Faceted_Browser" title="faceted browser definition">faceted browser</a> for <a href="http://en.wikipedia.org/wiki/Resource_Description_Framework" title="Wikipedia: RDF">RDF</a>-formatted data, while <a href="http://simile.mit.edu/wiki/RDFizers" title="SIMILE: RDFizers">RDFizers</a> is actually a directory of tools that convert other data formats into RDF. No RDFizer currently goes from EAD to RDF, but if one existed, Longwell would become more interesting to archivists.</p>
<p>That said, they already do have both a <a href="http://simile.mit.edu/wiki/MARC/MODS_RDFizer" title="SIMILE: MARC/MODS RDFizer">MARC/MODS RDFizer</a> and an <a href="http://simile.mit.edu/wiki/OAI-PMH_RDFizer" title="SIMILE: OAI-PMH RDFizer">OAI-PMH RDFizer</a>. I suspect that many archivists could put their hands on archival data in one of these two formats &#8211; which makes experimenting with Longwell more plausible in the near term.</p>
<p><strong>Final Thoughts</strong></p>
<p>There are lots of other tools that are part of the SIMILE project (<a href="http://simile.mit.edu/wiki/Solvent" title="SIMILE: Solvent">screen scrapers</a> and <a href="http://simile.mit.edu/timeplot/" title="SIMILE: Timeplot">timeplotters</a> and <a href="http://simile.mit.edu/wiki/Referee" title="SIMILE: Referee">more</a>), but the ones listed above most ignited my imagination. Surely there are geek archivists even now rolling up their sleeves to figure out how to leverage free open source tools like these, both to improve access to records and to increase understanding of what we have and how well it is (or isn&#8217;t) documented.</p>
<p>I hope to find time to play with each of these over the next few months &#8211; but I would love to know if anyone else out there has already tried any of these tools. Have suggestions for likely datasets? Have knowledge of existing archive-related applications using these tools? Please post your comments below or drop me a line via <a href="http://www.spellboundblog.com/contact/" title="Spellbound Blog Contact Form">my contact form</a>!</p>
<p><em>Image Credit: The Simile Project logo displayed above is from MIT&#8217;s <a href="http://simile.mit.edu/" title="MIT Simile Project">Simile Project website</a>. </em></p>
<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2008/01/13/mits-simile-project-innovations-in-metadata-interaction-and-analysis/">MIT&#8217;s SIMILE Project: Innovations in Metadata Interaction and Analysis</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2008/01/13/mits-simile-project-innovations-in-metadata-interaction-and-analysis/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>SAA2008 Here I Come! After the Revolution: Unleashing the Power of EAD</title>
		<link>http://www.spellboundblog.com/2008/01/10/saa2008-here-i-come-after-the-revolution-unleashing-the-power-of-ead/</link>
		<comments>http://www.spellboundblog.com/2008/01/10/saa2008-here-i-come-after-the-revolution-unleashing-the-power-of-ead/#comments</comments>
		<pubDate>Thu, 10 Jan 2008 04:05:17 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[ArchivesZ]]></category>
		<category><![CDATA[EAD]]></category>
		<category><![CDATA[SAA2008]]></category>
		<category><![CDATA[access]]></category>
		<category><![CDATA[information visualization]]></category>
		<category><![CDATA[interface design]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/2008/01/10/saa2008-here-i-come-after-the-revolution-unleashing-the-power-of-ead/</guid>
		<description><![CDATA[I got the word just before the holidays &#8211; the panel proposal of which I was a part has been accepted for SAA 2008 in San Francisco. The title of the panel is &#8216;After the Revolution: Unleashing the Power of EAD&#8217; and the working title for my paper/presentation is &#8216;Visualizing Archival Collections: Leveraging the [...]<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2008/01/10/saa2008-here-i-come-after-the-revolution-unleashing-the-power-of-ead/">SAA2008 Here I Come! After the Revolution: Unleashing the Power of EAD</a></p>
]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.archivists.org/conference/sanfrancisco2008/index.asp" title="ARCHIVES 2008: Archival R/Evolution &amp; Identities"><img src="http://www.spellboundblog.com/wp-content/uploads/2008/01/sanfran2008.jpg" title="SAA2008" alt="SAA2008" align="right" /></a> I got the word just before the holidays &#8211; the panel proposal of which I was a part has been accepted for <a href="http://www.archivists.org/conference/sanfrancisco2008/index.asp" title="SAA 2008 Annual Meeting">SAA 2008 in San Francisco</a>. The title of the panel is &#8216;After the Revolution: Unleashing the Power of EAD&#8217; and the working title for my paper/presentation is &#8216;Visualizing Archival Collections: Leveraging the Power of EAD&#8217;.</p>
<p>My co-presenters are Max Evans (currently of the <a href="http://www.archives.gov/press/press-releases/2008/nr08-01.html" title="Max Evans Retiring from NHPRC">NHPRC</a>, soon to be of the <a href="http://www.lds.org/churchhistory/library" title="LDS: Church History Library &amp; Archives">LDS Church Historical Department</a>) and <a href="http://www.si.umich.edu/people/faculty-detail.htm?sid=247" title="Elizabeth Yakel">Elizabeth Yakel</a> (of the <a href="http://www.si.umich.edu/" title="School of Information, University of Michigan">University of Michigan, School of Information</a>). Jodi Allison-Bunnell of the <a href="http://nwda.wsulibs.wsu.edu/" title="Northwest Digital Archives">Northwest Digital Archives</a>, <a href="http://www.orbiscascade.org/" title="Orbis Cascade Alliance">Orbis Cascade Alliance</a>, is our panel chair.</p>
<p>This is the description of our panel that we submitted with our proposal:</p>
<blockquote><p>Encoded Archival Description (EAD) was created in 1995 to increase uniformity and interoperability of data about archival collections to facilitate discovery. It has yet to realize that goal: most online finding aids merely recreate paper documents. Speakers will demonstrate how the structured, standardized nature of EAD can form the basis of user-friendly interfaces and finding aids that can accommodate multiple perspectives and utilize graphical and visual interfaces&#8211;while faithfully recording and presenting the context, structure, and content of the collection. Panelists will also address the challenges of unleashing the power of EAD, including normalizing XML, the lack of standard values for cross-institutional aggregation of data, and different approaches to subject terms, with a discussion of the technological and practical issues that surround them. The session relates to the SAA strategic priorities of technology and public awareness and engages elemental questions of revolutionary and evolutionary change.</p></blockquote>
<p>My portion of the panel will focus on my <a href="http://www.spellboundblog.com/2007/05/13/archivesz-visualizing-archival-collections/" title="ArchivesZ: Visualizing Archival Collections">ArchivesZ information visualization project</a>. I will discuss both the power of this type of graphical interface to archival collections and the roadblocks to its practical implementation. My plan is to continue the work I started last spring over the course of this spring and summer &#8211; and show off a new version of ArchivesZ in San Francisco (as well as online here of course!).</p>
<p>Here are the descriptions of Max, Elizabeth and Jodi&#8217;s planned contributions (cribbed from our proposal submission):</p>
<ul>
<li>Max Evans will explore the fundamental purposes of finding aids and explore what can be done to leverage EAD&#8217;s structure to render graphical, informative, and elegant finding aids online.</li>
<li>Elizabeth Yakel will discuss usability test findings and how these were incorporated into the EAD-based Polar Bear Expedition Digital Collections to allow communities to engage with collections in new ways.</li>
<li>Jodi Allison-Bunnell brings a lively interest in user-centered presentations of finding aids that emerge from her work as manager of a five-state EAD consortium.</li>
</ul>
<p>I am so pleased and excited. So &#8211; who is planning on going to San Francisco in August? I hope to see you there.</p>
<p><span style="font-style: italic">Image Credit: Society of American Archivists, <a href="http://www.archivists.org/conference/sanfrancisco2008/index.asp" title="ARCHIVES 2008: Archival R/Evolution &amp; Identities">ARCHIVES 2008: Archival R/Evolution &amp; Identities</a> web page.</span></p>
<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2008/01/10/saa2008-here-i-come-after-the-revolution-unleashing-the-power-of-ead/">SAA2008 Here I Come! After the Revolution: Unleashing the Power of EAD</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2008/01/10/saa2008-here-i-come-after-the-revolution-unleashing-the-power-of-ead/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
	</channel>
</rss>
