<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Spellbound Blog &#187; information visualization</title>
	<atom:link href="http://www.spellboundblog.com/category/information-visualization/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.spellboundblog.com</link>
	<description>Archives, Digital Humanities, Cultural Heritage, Technology</description>
	<lastBuildDate>Mon, 06 Feb 2012 14:49:35 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>SXSW Interactive: Data and Revelations</title>
		<link>http://www.spellboundblog.com/2011/03/13/sxsw-interactive-data-revelations/</link>
		<comments>http://www.spellboundblog.com/2011/03/13/sxsw-interactive-data-revelations/#comments</comments>
		<pubDate>Sun, 13 Mar 2011 22:55:57 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[born digital records]]></category>
		<category><![CDATA[digital humanities]]></category>
		<category><![CDATA[information visualization]]></category>
		<category><![CDATA[SXSW]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/?p=1098</guid>
		<description><![CDATA[I am typing on a laptop in the Samsung blogger lounge at SXSW. Given this easy opportunity to blog, I wanted to share the overarching theme of my experience so far (3 days in) at SXSW Interactive. Data. It is all about data. APIs exposing data. People visualizing data. Using data to make business and [...]<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2011/03/13/sxsw-interactive-data-revelations/">SXSW Interactive: Data and Revelations</a></p>
]]></description>
			<content:encoded><![CDATA[<p>I am typing on a laptop in the Samsung blogger lounge at SXSW. Given this easy opportunity to blog, I wanted to share the overarching theme of my experience so far (3 days in) at SXSW Interactive. Data. It is all about data. APIs exposing data. People visualizing data. Using data to make business and policy decisions. Graphing data to keep track of web site and application performance. Privacy of data. Crowdsourcing data. Data about social media behavior. And on and on!</p>
<p>It has been a common thread I have traced from session to session, conversation to conversation. I expect someone with less of a database and metadata fixation might see something else as the overall meme, but I have a purse full of cards pointing me to new data sources and a notebook full of URLs to track down later to defend my view.</p>
<p>I keep catching myself giving mini-lessons on archives and preservation of electronic records like some sort of envoy from another universe. While I feel like a strong overall tech person at an archives conference, I feel like a data and visualization person here. This morning two of my sessions were over in the same hotel that SAA in Austin was hosted in and it was strange to be in that hotel with such a different group of people. I have managed to connect with an assortment of digital humanities folks. Someone even managed to find space for and plan an informal event for tomorrow night: <a title="Innovating and Developing with Libraries, Archives, and Museums" href="http://bit.ly/hWZniW ">Innovating and Developing with Libraries, Archives, and Museums</a>.</p>
<p>My list of tech to learn (HTML5, NoSQL) and projects to contemplate and move forward (mostly ideas for visualizations using all the data everyone is sharing) is getting longer by the hour. It has been a process to figure out how to get the most I can out of SXSW. It is definitely more a space for inspiration than for deep diving into specifics. Letting go of the instinct that I am supposed to &#8216;learn new skills&#8217; at a conference is fabulous!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2011/03/13/sxsw-interactive-data-revelations/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Creative Funding for Text-Mining and Visualization Project</title>
		<link>http://www.spellboundblog.com/2011/01/16/creative-funding-text-mining-visualization/</link>
		<comments>http://www.spellboundblog.com/2011/01/16/creative-funding-text-mining-visualization/#comments</comments>
		<pubDate>Sun, 16 Jan 2011 15:51:29 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[digital humanities]]></category>
		<category><![CDATA[funding]]></category>
		<category><![CDATA[GIS]]></category>
		<category><![CDATA[information visualization]]></category>
		<category><![CDATA[interface design]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[text mining]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/?p=1074</guid>
		<description><![CDATA[The Hip-Hop word count project on Kickstarter.com caught my eye because it seems to be a really interesting new model for funding a digital humanities project. You can watch the video below &#8211; but the core of the project tackles assorted metadata from 40,000 rap songs from 1979 to the present including stats about each [...]
]]></description>
			<content:encoded><![CDATA[<p><iframe frameborder="0" height="380px" align="right" src="https://www.kickstarter.com/projects/1801076626/the-hip-hop-word-count-a-searchable-rap-almanac/widget/card.html" width="220px"></iframe>The <a href="http://kck.st/g3M9lv">Hip-Hop word count project</a> on <a href="http://www.kickstarter.com">Kickstarter.com</a> caught my eye because it seems to be a really interesting new model for funding a digital humanities project. You can watch the video below &#8211; but the core of the project tackles assorted metadata from 40,000 rap songs from 1979 to the present including stats about each song (word count, syllables, education level, etc.), individual words, artist location and date. This information aims to become a public online almanac fueled by visualizations.</p>
<p>I am a backer of this project, and you can be too. As of the original writing of this post, they are 47% funded with twenty-eight days left before their deadline. For those of you not familiar with <a href="http://www.kickstarter.com">Kickstarter</a>, people can post <a href="https://www.kickstarter.com/help/faq#WhoCanFundTheiProjOnKick">creative projects</a> and provide rewards for their funders. The funding only goes through if they reach their goal within the time limit &#8211; otherwise nothing happens, a model they call &#8216;all-or-nothing funding&#8217;.</p>
<p>What will the money be spent on?</p>
<ul>
<li>45% for PHP programmers who have been coding the custom web interface</li>
<li>35% for interface designers</li>
<li>10% for data acquisition &amp; data clean up</li>
<li>10% for hosting bills</li>
</ul>
<p>They aim for a five month time-line to move from their existing functional prototype to something viable to release to the public.</p>
<p>I am also intrigued by ways that the work on this project might be leveraged in the future to support similar text-mining projects that tie in location and date. How about doing the same thing with civil war letters? How about mining the lyrics from Broadway musical songs? </p>
<p>If this all sounds interesting, take a look at the video below and read more on the <a href="http://kck.st/g3M9lv">Hip-Hop Word Count Kickstarter home page</a>. If half the people who follow my RSS feed pitch in $10, this project would be funded. Take a look and consider pitching in. If this project doesn&#8217;t speak to you &#8211; take a look around <a href="http://www.kickstarter.com">Kickstarter</a> for something else you might want to support.</p>
<p><center><iframe frameborder="0" height="410px" src="https://www.kickstarter.com/projects/1801076626/the-hip-hop-word-count-a-searchable-rap-almanac/widget/video.html" width="480px"></iframe></center></p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2011/01/16/creative-funding-text-mining-visualization/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Gridworks: Super Data Cleanup and Exploration Tool</title>
		<link>http://www.spellboundblog.com/2010/05/29/gridworks-data-cleanup-exploration-tool/</link>
		<comments>http://www.spellboundblog.com/2010/05/29/gridworks-data-cleanup-exploration-tool/#comments</comments>
		<pubDate>Sat, 29 May 2010 06:26:31 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[electronic records]]></category>
		<category><![CDATA[information visualization]]></category>
		<category><![CDATA[learning technology]]></category>
		<category><![CDATA[MARAC]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[software]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/?p=987</guid>
		<description><![CDATA[In my presentation at the Spring 2010 Mid-Atlantic Regional Archives Conference (MARAC), Whirlwind Tour of Visualization-Land,  I showed some screenshots of a tool called Gridworks. At the time, Gridworks was not available to the general public. The good news is that earlier this month Gridworks 1.0 was officially released and you can get Gridworks right [...]
]]></description>
			<content:encoded><![CDATA[<p style="text-align: center;"><a href="http://code.google.com/p/freebase-gridworks/"><img class="size-full wp-image-988  aligncenter" title="Gridworks" src="http://www.spellboundblog.com/wp-content/uploads/2010/05/gridworks.jpg" alt="" width="400" height="100" /></a></p>
<p>In my presentation at the Spring 2010 <a title="MARAC" href="http://www.marac.info">Mid-Atlantic Regional Archives Conference</a> (MARAC), <a title="Whirlwind Tour of Visualization-Land" href="http://www.slideshare.net/JKramerSmyth/marac-2010-visualization">Whirlwind Tour of Visualization-Land</a>, I showed some screenshots of a tool called Gridworks. At the time, Gridworks was not available to the general public. The good news is that earlier this month <a title="Gridworks 1.0 Announcement" href="http://blog.freebase.com/2010/05/10/announcing-the-release-of-freebase-gridworks-1-0/">Gridworks 1.0 was officially released</a> and you can <a title="Gridworks on Google Code" href="http://code.google.com/p/freebase-gridworks/">get Gridworks right now</a>.</p>
<p>For those of you who didn&#8217;t see my presentation, Gridworks is a tool you run locally on your computer via a web browser. It permits you to load &#8216;grid-shaped data&#8217; for examination, filtering and data cleanup. That makes it sound so much less exciting than it is. The best way to get a sense of what you can do is to watch the <a title="Gridworks Videos" href="http://vimeo.com/groups/gridworks/videos">Gridworks Videos</a>.</p>
<p>What sort of data do I think there is in archives to be pumped into Gridworks? How about collection descriptive data and electronic record datasets? Since all the data is kept locally, you don&#8217;t need to worry about uploading your data to some anonymous server in order to work with it. It all stays safely on your local computer the whole time.</p>
<p>A quick list of things that Gridworks can do:</p>
<ul>
<li>Cluster data to find values that are almost the same so you can normalize your data (for example &#8211; NYC vs N.Y.C.)</li>
<li>Create instant faceted browsing based on any column in your data</li>
<li>Provide scatterplots of the values from any two numeric columns as well as a way to spot the most interesting combinations across many possible columns</li>
<li>Reconcile and validate values based on data from within <a title="Freebase.com" href="http://www.freebase.com/">Freebase.com</a></li>
<li>Pull data from Freebase.com based on a matched column &#8211; such as the population of a country, if you have a column in your dataset with country specified</li>
<li>Split data within a cell based on a specified delimiter</li>
<li>Apply <a title="Wikipedia: Regular Expressions" href="http://en.wikipedia.org/wiki/Regular_expression">regular expressions</a> and other simple code to data to create new columns</li>
</ul>
<p>This list just scratches the surface, but it should give you a decent idea of the power of Gridworks. Even if the only feature you ever use is the one which lets you cluster and update your data to remove the &#8216;almost the same&#8217; values, Gridworks can save you hours of painstaking data cleanup.</p>
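<p>To make the clustering idea concrete, here is a minimal Python sketch of one way to find &#8216;almost the same&#8217; values: reduce each value to a normalized key and group the values whose keys collide. It illustrates the general technique only and is not Gridworks&#8217; own code. The <code>fingerprint</code> and <code>cluster</code> helpers and the sample place names are invented for the example.</p>

```python
import string
from collections import defaultdict


def fingerprint(value):
    """Reduce a value to a normalized key (hypothetical helper)."""
    # Lowercase, then drop punctuation so "N.Y.C." becomes "nyc".
    cleaned = value.strip().lower().translate(
        str.maketrans("", "", string.punctuation)
    )
    # Sort and de-duplicate tokens so word order does not matter.
    return " ".join(sorted(set(cleaned.split())))


def cluster(values):
    """Group values whose fingerprints collide."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    # Keep only keys shared by two or more distinct spellings.
    return [sorted(set(g)) for g in groups.values() if len(set(g)) != 1]


# Invented sample data for illustration.
places = ["NYC", "N.Y.C.", "nyc ", "Boston", "New York City", "City, New York"]
clusters = cluster(places)
print(clusters)
```

<p>With this sample list, the three spellings of NYC land in one cluster and the two orderings of New York City land in another, while Boston is left alone. Note that a simple key like this will not match NYC against New York City; that would need a lookup table or a fuzzier comparison.</p>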
<p>Why is data cleanup exciting? Because once you have nice clean data with all the attributes that are useful to have for your data set &#8211; then you can start playing with the data in visualization tools! So go watch some <a title="Gridworks Videos" href="http://vimeo.com/groups/gridworks/videos">Gridworks Videos</a>, <a title="Gridworks on Google Code" href="http://code.google.com/p/freebase-gridworks/">get Gridworks for yourself</a> and start playing with data. It is free and it makes working with data fun!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2010/05/29/gridworks-data-cleanup-exploration-tool/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google Tackles Magazine Archives</title>
		<link>http://www.spellboundblog.com/2008/12/10/google-tackles-magazine-archives/</link>
		<comments>http://www.spellboundblog.com/2008/12/10/google-tackles-magazine-archives/#comments</comments>
		<pubDate>Wed, 10 Dec 2008 06:13:51 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[access]]></category>
		<category><![CDATA[digitization]]></category>
		<category><![CDATA[future-proofing]]></category>
		<category><![CDATA[historical research]]></category>
		<category><![CDATA[information visualization]]></category>
		<category><![CDATA[interface design]]></category>
		<category><![CDATA[journalism]]></category>
		<category><![CDATA[search]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/2008/12/10/google-tackles-magazine-archives/</guid>
		<description><![CDATA[As has been reported around the web today, Google is now digitizing and adding magazines to Google Book Search. This follows on the tails of the recent Google Life Photo archive announcement. I took a look around to see what I could see. I was intrigued by the fact that I couldn&#8217;t see a list [...]
]]></description>
			<content:encoded><![CDATA[<p><a title="Google Book Search: Popular Mechanics Jan 1905" href="http://books.google.com/books?id=S98DAAAAMBAJ&amp;printsec=frontcover&amp;source=gbs_summary_r&amp;cad=0_0"><img src="http://www.spellboundblog.com/wp-content/uploads/2008/12/popmech.JPG" alt="Google Book Search: Popular Mechanics Jan 1905 Cover Image" width="275" height="395" align="right" /></a>As has been reported around the web today, Google is now digitizing and adding magazines to <a title="Google Book Search" href="http://books.google.com/books">Google Book Search</a>. This follows on the heels of the recent <a title="Google Life Photo Archive Blog Post" href="http://www.spellboundblog.com/2008/11/22/life-photo-archive-digitized-and-put-online-by-google/">Google Life Photo archive</a> announcement.</p>
<p>I took a look around to see what I could see. I was intrigued by the fact that I couldn&#8217;t see a list of all the magazines in their collection. So I went after the information the hard way and kept reloading the Google Book Search home page until I didn&#8217;t see any new titles displayed in their highlighted magazine section. This is what I came up with, roughly grouped by general topic.</p>
<p>Science and technology:</p>
<ul>
<li><a title="The Bulletin of Atomic Scientists Archive" href="http://books.google.com/books?id=swoAAAAAMBAJ">The Bulletin of the Atomic Scientists</a>: December 1945, when it started out as the Bulletin of the Atomic Scientists of Chicago, through November 1998</li>
<li><a title="CIO Magazine Archives" href="http://books.google.com/books?id=jwsAAAAAMBAJ">CIO: The Magazine for Information Executives</a>: back to Volume 1, Number 1 from Sept/Oct 1987</li>
<li><a title="Maximum PC Magazine Archives" href="http://books.google.com/books?id=cAIAAAAAMBAJ">Maximum PC</a>: October 1998 through the present</li>
<li><a title="Popular Science Magazine Archive" href="http://books.google.com/books?id=NkmBuPFIfaMC">Popular Science</a>: stretching back to an issue for March of 1872 when it was known as Popular Science Monthly through to February 2008</li>
<li><a title="Popular Mechanics Magazine Archive" href="http://books.google.com/books?id=UtMDAAAAMBAJ">Popular Mechanics</a>: January 1905 through November 2005</li>
</ul>
<p>Lifestyle and city themed:</p>
<ul>
<li><a title="New York Magazine Archive" href="http://books.google.com/books?id=hBgAAAAAMBAJ">New York Magazine</a>: April 1968 through December 1997. Fascinating that some of the magazines still have the original mailing label on them (see this example from <a title="New York Magazine July 1969 Cover" href="http://books.google.com/books?id=jMcDAAAAMBAJ&amp;printsec=frontcover">a July 1969 issue of New York</a>)</li>
<li><a title="Cincinnati Magazine Archive" href="http://books.google.com/books?id=QB8DAAAAMBAJ">Cincinnati Magazine</a>: January 1971 through December 2005, at which point it seems to switch to being an annual city guide titled Cincinnati USA</li>
<li><a title="Atlanta Magazine Archive" href="http://books.google.com/books?id=ng8AAAAAMBAJ">Atlanta</a>: January 2003 through August 2008 &#8211; and mis-titled &#8216;Atlants&#8217;</li>
<li><a title="Indianapolis Monthly Archive" href="http://books.google.com/books?id=POsCAAAAMBAJ">Indianapolis Monthly</a>: January 1995 to the present</li>
<li><a title="Cruise Travel Magazine Archive" href="http://books.google.com/books?id=jjEDAAAAMBAJ">Cruise Travel</a>: June 1979 through December 2007</li>
</ul>
<p>African American:</p>
<ul>
<li><a title="Ebony Jr! Magazine Archive" href="http://books.google.com/books?id=wr4DAAAAMBAJ">Ebony Jr!</a>: May 1973 through October 1985</li>
<li><a title="Jet Magazine Archive" href="http://books.google.com/books?id=87MDAAAAMBAJ">Jet</a>: November 1961 through October 2008</li>
<li><a title="Black Digest Magazine Archive" href="http://books.google.com/books?id=MbIDAAAAMBAJ">Black Digest</a>: Named &#8216;Negro Digest&#8217; from November 1961 through April 1970, then Black Digest from May 1970 through April 1976.</li>
</ul>
<p>Health, nutrition and organic:</p>
<ul>
<li><a title="Women's Health Magazine Archive" href="http://books.google.com/books?id=wMUDAAAAMBAJ">Women&#8217;s Health</a> and <a title="Men's Health Magazine Archive" href="http://books.google.com/books?id=McgDAAAAMBAJ">Men&#8217;s Health</a>: January 2006 through present. I found it very amusing to be able to scan the covers of all the issues so easily &#8211; true for all of these magazines of course, but funny to see cover after cover of almost identically clad men and women exercising.</li>
<li><a title="Prevention Magazine Archive" href="http://books.google.com/books?id=YccDAAAAMBAJ">Prevention</a>: January 2006 through the present</li>
<li><a title="Better Nutrition Magazine Archive" href="http://books.google.com/books?id=lAUAAAAAMBAJ">Better Nutrition</a>: January 1999 through December 2004</li>
<li><a title="Organic Gardening Magazine Archive" href="http://books.google.com/books?id=esMDAAAAMBAJ">Organic Gardening</a>: November 2005 to the present</li>
<li><a title="Vegetarian Times Magazine Archive" href="http://books.google.com/books?id=FQQAAAAAMBAJ">Vegetarian Times</a>: March 1981 through November 2004</li>
</ul>
<p>Sports and the outdoors:</p>
<ul>
<li><a title="Baseball Digest Archive" href="http://books.google.com/books?id=8SsDAAAAMBAJ">Baseball Digest</a>: July 1945 through October 2007</li>
<li><a title="American Cowboy Magazine Archive" href="http://books.google.com/books?id=XeoCAAAAMBAJ">American Cowboy</a>: May 1994 through August 2008</li>
<li><a title="Bicycling Magazine Archive" href="http://books.google.com/books?id=rMUDAAAAMBAJ">Bicycling</a>, <a title="Mountain Bike Magazine" href="http://books.google.com/books?id=ZcQDAAAAMBAJ">Mountain Bike</a> and <a title="Runner's World Magazine Archive" href="http://books.google.com/books?id=FMgDAAAAMBAJ">Runner&#8217;s World</a>: January 2006 through present</li>
</ul>
<p>They of course promise more magazines on the way, so if you are reading this long after mid-December 2008, I would assume there are more magazines and more issues available now. I hope that they make it easier to browse just magazines. Once they have a broader array of titles &#8211; how neat would it be to build a virtual news stand for a specific week in history? Shouldn&#8217;t be hard &#8211; they have all the metadata and cover images they need.</p>
<p>I love being able to read the magazine &#8211; advertising and all. They display the covers in batches by decade or 5 year period depending on the number of issues. I also like the Google map provided on each magazine&#8217;s &#8216;about&#8217; page that shows &#8216;Places mentioned in this magazine&#8217; and easily links you directly to the article that mentions the location marked on the map.</p>
<p>I think it is interesting that Google went with more of a PDF single scrolling model rather than an interface that mimics turning pages. In many issues (maybe all?) they have hot-linked the table of contents so that you can scroll down to that section instantly. You can also search within the magazine, though from my short experiments it seems that only the articles are text indexed and the advertisements are not.</p>
<p>Google&#8217;s current model for search is to return results for magazines mixed in with books in Google Book Search results &#8211; but they do let you limit your results to only magazines from their <a title="Advanced Google Book Search" href="http://books.google.com/advanced_book_search">Advanced Search page within Google Book Search</a>. See these results for a quick <a title="Google Book Search: sunscreen in magazines" href="http://books.google.com/books?as_q=sunscreen&amp;num=10&amp;btnG=Google+Search&amp;as_epq=&amp;as_oq=&amp;as_eq=&amp;as_brr=0&amp;as_pt=MAGAZINES&amp;lr=&amp;as_vt=&amp;as_auth=&amp;as_pub=&amp;as_sub=&amp;as_drrb=c&amp;as_miny=&amp;as_maxy=&amp;as_isbn=&amp;as_issn=">search on sunscreen in magazines</a>.</p>
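<p>The sunscreen link above carries the search term in an <code>as_q</code> parameter and the magazine restriction in <code>as_pt=MAGAZINES</code>. Here is a minimal Python sketch that builds the same kind of link for any term. The <code>magazine_search_url</code> helper is a made-up name, and it is an assumption that the many empty parameters in the original link can be left out.</p>

```python
from urllib.parse import urlencode

BASE_URL = "http://books.google.com/books"


def magazine_search_url(term):
    """Build a Book Search URL limited to magazines (hypothetical helper)."""
    # as_q carries the search term and as_pt=MAGAZINES restricts the
    # publication type; both are copied from the sunscreen link above.
    params = {"as_q": term, "as_pt": "MAGAZINES"}
    # urlencode escapes spaces and other special characters in the term.
    return BASE_URL + "?" + urlencode(params)


url = magazine_search_url("sunscreen")
print(url)
```

<p>Swapping in a different term is all it takes to point the same search at another topic, assuming the parameters keep the meaning they had in December 2008.</p>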
<p>Overall I mark this as a really nice step forward in access to old magazines. As with many visualizations, seeing the about page for any of these magazines made me ask myself new questions. It will be interesting to see how many magazines sign on to be included and how the interface evolves.</p>
<p>To read more about Google&#8217;s foray into magazine digitization and search take a look at:</p>
<ul>
<li><a title="Tech Crunch: Google Adds Print Magazines To Book Search" href="http://www.techcrunch.com/2008/12/09/google-adds-print-magazines-to-book-search/">Tech Crunch: Google Adds Print Magazines To Book Search</a></li>
<li><a title="Official Google Blog: Search and Find Magazines on Google" href="http://googleblog.blogspot.com/2008/12/search-and-find-magazines-on-google.html">Official Google Blog: Search and Find Magazines on Google</a></li>
<li><a title="Venture Beat Digital Media: Google Book Search: now with magazines!" href="http://venturebeat.com/2008/12/09/google-book-search-now-with-magazines/">Venture Beat Digital Media: Google Book Search: now with magazines!</a></li>
<li><a title="Washington Post/AP: Google updates search index with old magazines" href="http://www.washingtonpost.com/wp-dyn/content/article/2008/12/10/AR2008121000908.html?sub=AR">Washington Post/AP: Google updates search index with old magazines</a></li>
</ul>
<p>For a really nice analysis of the information that Google provides on the magazine pages see <a title="Search Engine Land: Google Book Search Puts Magazines Online" href="http://searchengineland.com/google-book-search-puts-magazines-online-15762.php">Search Engine Land&#8217;s Google Book Search Puts Magazines Online</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2008/12/10/google-tackles-magazine-archives/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>NEH Digital Humanities Startup Grant News: Visualizing Archival Collections</title>
		<link>http://www.spellboundblog.com/2008/09/12/neh-digital-humanities-startup-grant-news-visualizing-archival-collections/</link>
		<comments>http://www.spellboundblog.com/2008/09/12/neh-digital-humanities-startup-grant-news-visualizing-archival-collections/#comments</comments>
		<pubDate>Fri, 12 Sep 2008 05:23:28 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[ArchivesZ]]></category>
		<category><![CDATA[EAD]]></category>
		<category><![CDATA[information visualization]]></category>
		<category><![CDATA[software]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/2008/09/12/neh-digital-humanities-startup-grant-news-visualizing-archival-collections/</guid>
		<description><![CDATA[As of August 22nd, 2008 it was official. There is even a blog post over on the NEH Office of Digital Humanities updates page to prove it. The University of Maryland was granted a Level I NEH Digital Humanities Startup Grant to fund work on the &#8216;Visualizing Archival Collections&#8217; project. The official one liner is [...]
]]></description>
			<content:encoded><![CDATA[
<p style="text-align: center"><a title="ArchivesZ" href="http://www.archivesz.com"><img src="http://www.spellboundblog.com/wp-content/uploads/2008/09/archivesz-ng.jpg" alt="archivesz ng" width="450" height="130" /></a></p>
<p>As of August 22nd, 2008 it was official. There is even a <a title="NEH ODH: Announcement of Awardees" href="http://www.neh.gov/ODH/ODHHome/tabid/36/EntryID/81/Default.aspx">blog post over on the NEH Office of Digital Humanities</a> updates page to prove it. The <a title="University of Maryland" href="http://www.umd.edu">University of Maryland</a> was granted a Level I <a title="NEH Digital Humanities Startup Grant" href="http://www.neh.gov/grants/guidelines/digitalhumanitiesstartup.html">NEH Digital Humanities Startup Grant</a> to fund work on the &#8216;Visualizing Archival Collections&#8217; project. The official one liner is that the project will support &#8220;The development of visualization tools for assessing information contained in electronic archival finding aids created with Encoded Archival Description (EAD)&#8221;. Why did I wait so long to announce this on the blog? I wanted to have something fun to announce at the end of my SAA presentation out in San Francisco!</p>
<p>The project director is <a title="Dr. Jennifer Golbeck" href="http://www.cs.umd.edu/~golbeck/index.shtml">Dr. Jennifer Golbeck</a>. I also have the support of University of Maryland&#8217;s Jennie Levine, <a title="Dr. Bruce Ambacher" href="http://ischool.umd.edu/people/ambacher/">Dr. Bruce Ambacher</a>, and <a title="Dr. Doug Oard" href="http://www.glue.umd.edu/~oard/">Dr. Doug Oard</a>. This amazing set of collaborators should help me stay on the right track and make sure I keep the sometimes competing issues relating to archives, information retrieval and interface design in balance.</p>
<p>I will be collecting EAD encoded finding aids over the next few months. My goal is to gather a broad sample of English language finding aids from a wide range of institutions and work on the script that extracts this data into a database. Once we have the data extracted I get to look at what we have, do some data cleanup and start thinking about what sorts of visualizations might work with our real world data. During the spring term we will design and build a 2nd generation prototype of <a title="ArchivesZ" href="http://www.archivesz.com">ArchivesZ</a>.</p>
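<p>The extraction step described above could look something like the following Python sketch, which copies a collection title, date and subject terms from one finding aid into a small SQLite database. It is an illustration under stated assumptions and not the project&#8217;s actual script. The element names (<code>archdesc</code>, <code>did</code>, <code>unittitle</code>, <code>unitdate</code>, <code>controlaccess</code>, <code>subject</code>) are standard EAD tags, but the two-table layout, the helper names and the sample finding aid are all invented, and the sketch assumes files that declare no XML namespace.</p>

```python
import sqlite3
import xml.etree.ElementTree as ET


def sample_finding_aid():
    """Build a tiny stand-in EAD tree (invented data, no namespace)."""
    ead = ET.Element("ead")
    archdesc = ET.SubElement(ead, "archdesc", level="collection")
    did = ET.SubElement(archdesc, "did")
    ET.SubElement(did, "unittitle").text = "Example Family Papers"
    ET.SubElement(did, "unitdate").text = "1861-1865"
    access = ET.SubElement(archdesc, "controlaccess")
    ET.SubElement(access, "subject").text = "Letters"
    ET.SubElement(access, "subject").text = "Agriculture"
    return ead


def load_finding_aid(conn, root):
    """Copy the title, date and subjects of one finding aid into SQLite."""
    title = root.findtext("archdesc/did/unittitle", default="")
    date = root.findtext("archdesc/did/unitdate", default="")
    cur = conn.execute(
        "INSERT INTO collection (title, unitdate) VALUES (?, ?)", (title, date)
    )
    collection_id = cur.lastrowid
    # Subjects can sit at any depth under archdesc, hence the "//" search.
    for subject in root.findall("archdesc//subject"):
        conn.execute(
            "INSERT INTO subject (collection_id, term) VALUES (?, ?)",
            (collection_id, subject.text),
        )
    return collection_id


# Hypothetical two-table schema, kept in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE collection (id INTEGER PRIMARY KEY, title TEXT, unitdate TEXT)"
)
conn.execute("CREATE TABLE subject (collection_id INTEGER, term TEXT)")

collection_id = load_finding_aid(conn, sample_finding_aid())
rows = conn.execute("SELECT term FROM subject ORDER BY term").fetchall()
print(rows)
```

<p>For real files, <code>ET.parse(path).getroot()</code> would replace the stand-in tree. Finding aids that do declare a namespace would need it included in each path, and real world data will almost certainly need more cleanup than this sketch allows for.</p>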
<p>Want your data to be part of this? If you would like to contribute EAD finding aids in XML format to the project, please send me the following information:</p>
<ol>
<li>Archives Name</li>
<li>Archives Parent Institution (if applicable)</li>
<li>Archives Location</li>
<li>Contact at Archives for questions about the finding aids (name, email and phone number)</li>
<li>Estimate of # of finding aids being offered</li>
<li>Controlled Vocabulary or Thesaurus used for Subject values (as many as are used)</li>
<li>Method of finding aid delivery (sending me a zip file? pointing me at a directory online? some other way?)</li>
<li>Do I have your permission to post a discussion of the data issues I may find in your finding aids here on Spellbound Blog? (Please see the <a title="OSU ArchivesZ Data Challenges" href="http://www.spellboundblog.com/2009/02/22/archivesz-data-challenges-oregon-state-university/">OSU Archives</a> post as an example of the types of issues I discuss)</li>
</ol>
<p>You can either put this into the form on my <a title="Contact Jeanne" href="http://www.spellboundblog.com/contact/">Contact Page</a> or send email directly to jeanne AT spellboundblog dot com.</p>
<p>Thank you to everyone for their enthusiasm about the ArchivesZ project. It is very exciting to have the opportunity to take all these shiny ideas to the next level.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2008/09/12/neh-digital-humanities-startup-grant-news-visualizing-archival-collections/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Freebase Parallax Search Interface: Exploring Olympic Games Facts</title>
		<link>http://www.spellboundblog.com/2008/08/16/freebase-parallax-search-olympic-games-facts/</link>
		<comments>http://www.spellboundblog.com/2008/08/16/freebase-parallax-search-olympic-games-facts/#comments</comments>
		<pubDate>Sat, 16 Aug 2008 05:05:50 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[access]]></category>
		<category><![CDATA[information visualization]]></category>
		<category><![CDATA[interface design]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[software]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/2008/08/16/freebase-parallax-search-olympic-games-facts/</guid>
		<description><![CDATA[Well-formed data posted about a new Freebase project named Parallax. This new search interface takes faceted browsing another step &#8211; in this case making it easy to jump sideways from one dataset to another related dataset. Parallax still includes filters on the left side &#8211; but the twist comes from the opportunity to select what [...]
]]></description>
			<content:encoded><![CDATA[<p><a href="http://well-formed-data.net/archives/153/parallax" title="Well-formed data post">Well-formed data posted</a> about a new <a href="http://www.freebase.com/" title="Freebase">Freebase</a> project named <a href="http://mqlx.com/~david/parallax/index.html" title="Parallax">Parallax</a>. This new search interface takes <a href="http://en.wikipedia.org/wiki/Faceted_browser" title="Wikipedia: Faceted Browser">faceted browsing</a> another step &#8211; in this case making it easy to jump sideways from one dataset to another related dataset. Parallax still includes filters on the left side &#8211; but the twist comes from the opportunity to select what are called &#8216;Connections&#8217; from the list in the upper right hand corner of the search results page.</p>
<p>This sort of thing makes the most sense when you can see examples. The creator of Parallax has published a great little <a href="http://www.vimeo.com/1513562" title="Vimeo: Freebase Parallax">video tour</a>, but I also wanted to show you some neat data sets that were very easy to discover and embed in my blog. Since so many people are thinking about the Olympics right now, I thought I would start by exploring the <a href="http://mqlx.com/~david/parallax/browse.html?state=!((d:(t:/user/jg/default_domain/olympic_games),s:(v:!((c:ThumbnailView,s:())),vi:0)))" title="Parallax: Olympic Games Collection">Olympic Games Collection</a> from Freebase. Below I have two data sets. On the left you will see a list of Olympic Games &#8211; and on the right you will see a list of Olympic event venues. <em>(NOTE: to those reading this through a feed reader &#8211; you will likely have to click through to view the lists)</em></p>
<p><iframe src="http://mqlx.com/~david/parallax/thumbnail-view-embed-thumbnails.html?thumbsize=80&amp;query=%7B%22id%22%3Anull%2C%22limit%22%3A100%2C%22type%22%3A%22%2Fuser%2Fjg%2Fdefault_domain%2Folympic_games%22%7D" width="48%" height="300"></iframe><iframe src="http://mqlx.com/~david/parallax/thumbnail-view-embed-thumbnails.html?thumbsize=80&amp;query=%7B%22id%22%3Anull%2C%22limit%22%3A100%2C%22!%2Folympics%2Folympic_games%2Fvenues%22%3A%5B%7B%22type%22%3A%22%2Fuser%2Fjg%2Fdefault_domain%2Folympic_games%22%2C%22id%22%3Anull%2C%22name%22%3Anull%7D%5D%7D" width="48%" height="300"></iframe></p>
<p>Now let&#8217;s take a real sidestep and pull up a list of sports teams that use a former Olympic facility as a venue. This is the sort of question that you could answer on your own, but it would be a pain in the neck to do by hand. The list on the left below took only as long to create as it took me to spot that Team (venue) was on the list of &#8216;more connections&#8217; while my list of Olympic Venues was being displayed. The frame on the right below displays the one Olympic Venue that Freebase knows to have won an award (in this case the <a href="http://www.freebase.com/view/guid/9202a8c04000641f80000000086e5a54">Structural Special Award</a>).</p>
<p><iframe src="http://mqlx.com/~david/parallax/thumbnail-view-embed-tiles.html?thumbsize=50&amp;query=%7B%22id%22%3Anull%2C%22limit%22%3A100%2C%22!%2Fsports%2Fsports_facility%2Fteams%22%3A%5B%7B%22!%2Folympics%2Folympic_games%2Fvenues%22%3A%5B%7B%22type%22%3A%22%2Fuser%2Fjg%2Fdefault_domain%2Folympic_games%22%7D%5D%2C%22id%22%3Anull%2C%22name%22%3Anull%7D%5D%7D" width="48%" height="300"></iframe><iframe src="http://mqlx.com/~david/parallax/thumbnail-view-embed-thumbnails.html?thumbsize=80&amp;query=%7B%22id%22%3Anull%2C%22limit%22%3A100%2C%22!%2Faward%2Faward_honor%2Fhonored_for%22%3A%5B%7B%22!%2Faward%2Faward_winning_work%2Fawards_won%22%3A%5B%7B%22!%2Folympics%2Folympic_games%2Fvenues%22%3A%5B%7B%22type%22%3A%22%2Fuser%2Fjg%2Fdefault_domain%2Folympic_games%22%7D%5D%2C%22id%22%3Anull%2C%22name%22%3Anull%7D%5D%7D%5D%7D" width="48%" height="300"></iframe></p>
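<p>For the curious: each embedded frame above carries its query as URL-encoded JSON in the &#8216;query&#8217; parameter of the frame address. The sketch below decodes the query from the first Olympic Games frame and rebuilds it using only the Python standard library. The helper names are hypothetical, and nothing here is a claim about how Parallax itself builds these addresses.</p>

```python
import json
from urllib.parse import quote, unquote

# Query string value copied from the first embedded frame in this post.
ENCODED = (
    "%7B%22id%22%3Anull%2C%22limit%22%3A100%2C%22type%22%3A"
    "%22%2Fuser%2Fjg%2Fdefault_domain%2Folympic_games%22%7D"
)


def decode_query(encoded):
    """Turn the URL-encoded 'query' value back into a Python dict."""
    return json.loads(unquote(encoded))


def encode_query(query):
    """Serialize a dict compactly and percent-encode every reserved character."""
    compact = json.dumps(query, separators=(",", ":"))
    return quote(compact, safe="")


query = decode_query(ENCODED)
print(query)
print(encode_query(query) == ENCODED)
```

<p>Decoded, the query simply asks for up to 100 items of the Olympic Games type. The later frames nest further properties inside the same structure, which is how each sideways jump is expressed.</p>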
<p>Of course the lists above are only as good as the data behind them, but you can see how interesting it could be to use Parallax to explore connected information. Now take this idea to the world of archives and libraries, <a href="http://en.wikipedia.org/wiki/OPAC" title="Wikipedia: OPAC">OPACs</a> and finding aids and imagine the sorts of questions you can start asking. Yes &#8211; it does depend on the data being connected, but that is happening more and more all the time. The promise of the semantic web is structured data everywhere we turn.</p>
<p>Go play with Parallax. Look at <a href="http://mqlx.com/~david/parallax/browse.html?state=!((d:(t:/venture_capital/venture_funded_company),s:(v:!((c:ThumbnailView,s:())),vi:0)))" title="Parallax: Venture Funded Companies">Venture Funded Companies</a> and then look at all the <a href="http://mqlx.com/~david/parallax/browse.html?state=!((d:(t:/venture_capital/venture_funded_company),s:(v:!((c:ThumbnailView,s:())),vi:0)),(d:(l:'Games%20Published',p:!((f:!t,p:/cvg/cvg_publisher/games_published))),s:(v:!((c:ThumbnailView,s:())),vi:0)))" title="Parallax: Games Developed by Venture Funded Companies">Games Developed by those companies</a>. Examine the list of  <a href="http://mqlx.com/~david/parallax/browse.html?state=!((d:(t:/user/joshuamclark/default_domain/bird),s:(v:!((c:ThumbnailView,s:())),vi:0)))" title="Parallax: Bird Species">Bird Species</a> and then see what <a href="http://mqlx.com/~david/parallax/browse.html?state=!((d:(t:/user/joshuamclark/default_domain/bird),s:(v:!((c:ThumbnailView,s:())),vi:0)),(d:(l:School,p:!((f:!t,p:/education/school_mascot/school))),s:(v:!((c:ThumbnailView,s:())),vi:0)))" title="Parallax: Schools with Bird Mascots">schools have bird mascots</a>&#8230; and THEN see a <a href="http://mqlx.com/~david/parallax/browse.html?state=!((d:(t:/user/joshuamclark/default_domain/bird),s:(v:!((c:ThumbnailView,s:())),vi:0)),(d:(l:School,p:!((f:!t,p:/education/school_mascot/school))),s:(v:!((c:ThumbnailView,s:())),vi:0)),(d:(l:Person,p:!((f:!t,p:/business/employer/employees),(f:!t,p:/business/employment_tenure/person))),s:(v:!((c:ThumbnailView,s:())),vi:0)))" title="Parallax: People who Attended Schools with Bird Mascots">list of famous people who went to schools that have bird mascots</a>.</p>
<p>Put in your own search from the <a href="http://mqlx.com/~david/parallax/" title="Parallax">Parallax homepage</a> and play with the available connections.  Map and timeline views are also available &#8211; though they only work if your data includes location and temporal data, respectively. If you find a great sequence of data sets &#8211; please share them!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2008/08/16/freebase-parallax-search-olympic-games-facts/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Dipity: Easy Hosted Timelines</title>
		<link>http://www.spellboundblog.com/2008/07/20/dipity-easy-hosted-timelines/</link>
		<comments>http://www.spellboundblog.com/2008/07/20/dipity-easy-hosted-timelines/#comments</comments>
		<pubDate>Mon, 21 Jul 2008 03:41:33 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[access]]></category>
		<category><![CDATA[GIS]]></category>
		<category><![CDATA[information visualization]]></category>
		<category><![CDATA[interface design]]></category>
		<category><![CDATA[learning technology]]></category>
		<category><![CDATA[virtual collaboration]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/2008/07/20/dipity-easy-hosted-timelines/</guid>
		<description><![CDATA[I discovered Dipity via the Reuters article An open-source timeline of the virtual world. The article discusses the creation of a Virtual Worlds Timeline on the Dipity website. Dipity lets anyone create an account and start building timelines. In the case of the Virtual Worlds Timeline, the creator chose to permit others to collaborate on [...]
]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.dipity.com/" title="Dipity"><img src="http://www.spellboundblog.com/wp-content/uploads/2008/07/dipity_teaser2.png" alt="Dipity Logo" align="right" /></a>I discovered <a href="http://www.dipity.com/" title="Dipity">Dipity</a> via the Reuters article <a href="http://secondlife.reuters.com/stories/2008/07/08/an-open-source-timeline-of-the-virtual-world/" title="Reuters: An open-source timeline of the virtual world">An open-source timeline of the virtual world</a>. The article discusses the creation of a <a href="http://www.dipity.com/user/xantherus/timeline/Virtual_Worlds" title="Dipity: Virtual Worlds Timeline">Virtual Worlds Timeline</a> on the Dipity website. Dipity lets anyone create an account and start building timelines. In the case of the Virtual Worlds Timeline, the creator chose to permit others to collaborate on the timeline. Dipity also provides four ways of viewing any timeline: a classic left to right scrolling view, a flipbook, a list and a map.</p>
<p>I chose to experiment by <a href="http://www.dipity.com/user/jkramersmyth/timeline/Spellbound_Blog" title="Dipity: Spellbound Blog Timeline">creating a timeline for Spellbound Blog</a>. Dipity made this very easy &#8211; I just selected WordPress and provided my blog&#8217;s URL. This was supposed to grab my 20 most recent posts &#8211; but it seems to have taken 10 instead. I tried to provide a username/password so that Dipity could pull &#8216;more&#8217; of my posts (they didn&#8217;t say how many &#8211; maybe all of them?). I couldn&#8217;t get it to work as of this writing &#8211; but if I figure it out you will see many more than 10 posts.</p>
<p>I particularly like the way they use the images I include in my posts in the various views. I also appreciate that you can read the full posts in-place without leaving the timeline interface. I assume this is because I publish my full articles to my RSS feed. It was also interesting to note that posts that mentioned a specific location put a marker on a map &#8211; both within the single post &#8216;event&#8217; as well as the full map view.</p>
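<p>As a rough illustration of why a full-text feed makes this possible, here is a sketch of how any service could turn RSS items into timeline events. It is not Dipity&#8217;s implementation. The sample feed and the function name are hypothetical, and it assumes a WordPress-style feed that publishes full articles in &#8216;content:encoded&#8217;.</p>

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Namespace used by the content:encoded element in WordPress feeds.
NS = {"content": "http://purl.org/rss/1.0/modules/content/"}

# Tiny hypothetical feed; a real one would be downloaded from the blog.
SAMPLE_FEED = """<rss version="2.0"
    xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <item>
      <title>Second post</title>
      <link>http://example.com/second</link>
      <pubDate>Mon, 21 Jul 2008 03:41:33 +0000</pubDate>
      <content:encoded>Full text of the second post.</content:encoded>
    </item>
    <item>
      <title>First post</title>
      <link>http://example.com/first</link>
      <pubDate>Sun, 01 Jun 2008 04:58:24 +0000</pubDate>
      <content:encoded>Full text of the first post.</content:encoded>
    </item>
  </channel>
</rss>"""


def feed_to_events(feed_xml):
    """Return timeline events (oldest first) built from RSS items."""
    root = ET.fromstring(feed_xml)
    events = []
    for item in root.iter("item"):
        events.append({
            "when": parsedate_to_datetime(item.findtext("pubDate")),
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            # Present only when the feed publishes full articles.
            "body": item.findtext("content:encoded", default="", namespaces=NS),
        })
    return sorted(events, key=lambda event: event["when"])


events = feed_to_events(SAMPLE_FEED)
for event in events:
    print(event["when"].date(), event["title"])
```

<p>A feed that publishes only excerpts would leave the &#8216;body&#8217; field empty, and the timeline could then show no more than a title and a link.</p>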
<p>Dipity also supports the streamlined addition of many other sources such as Flickr, Picasa, YouTube, Vimeo, Blogger, Tumblr, Pandora, Twitter and any RSS feed. They have also created some neat mashups. <a href="http://www.dipity.com/mashups/timetube" title="TimeTube: Dipity + YouTube">TimeTube</a> uses your supplied phrase to query YouTube and generates a timeline based on the video creation dates. <a href="http://www.dipity.com/mashups/tickr/" title="Tickr: Dipity + Flickr">Tickr</a> lets you generate an interactive timeline based on a keyword or user search of Flickr.</p>
<p>Why should archivists care? I always perk up anytime a new web service appears that makes it easy to present time and location sensitive information. I wrote a while ago about <a href="http://www.spellboundblog.com/2008/01/13/mits-simile-project-innovations-in-metadata-interaction-and-analysis/" title="Spellbound Blog: MIT’s SIMILE Project: Innovations in Metadata Interaction and Analysis">MIT&#8217;s SIMILE</a> project and I like their <a href="http://simile.mit.edu/timeline/" title="Simile: Timeline">Timeline</a> software, but in some ways hosted services like Dipity cast a wider net. I particularly appreciate the opportunity for virtual collaboration that Dipity provides. Imagine if every online archives exhibit included a Dipity timeline. Dipity provides embed code for all the timelines. This means that it should be easy to both feature the timeline within an online exhibit and use the timeline as a way to attract a broader audience to your website.</p>
<p>There has been discussion in the past about creating custom Google Maps to show off archival records in a new and different way. During THATCamp there was a lot of enthusiasm for timelines and maps as two of the most accessible types of visualizations. Anchoring information in time and/or location gives people a predictable way to approach new information.</p>
<p>Most of my initial thoughts about how archives could use Dipity related to individual collections and exhibits &#8211; but what if an archive created one of these timelines and added an entry for every one of its collections? The map could be used if individual collections were from a single location. The timeline could let users see at a glance what time periods were the focus of collections within that archives. A link could be provided in each entry pointing to the online finding aid for each collection or record group.</p>
<p>Dipity is still working out the kinks in some of their services, but if this sounds at all interesting I encourage you to go take a look at a few fun examples:</p>
<ul>
<li><a href="http://www.dipity.com/user/mad14/timeline/Top_100_most_influential_figures_in_America" title="Dipity: 100 Most Influential Americans">The 100 Most Influential Americans</a>: The Atlantic recently asked ten historians to compose their own lists of the <a href="http://www.theatlantic.com/doc/200612/influentials-main" title="Atlantic: 100 Most Influential Americans">100 most influential Americans</a>.</li>
<li><a href="http://www.dipity.com/user/cortex/timeline/Johnny_Cash_Appearances" title="Dipity: Johnny Cash Appearances">Johnny Cash Recorded Appearances</a>: Click on a few of these and you will see the amount of detail that has been added is amazing &#8211; video clips, map locations and set lists are included for most of these</li>
<li><a href="http://www.dipity.com/user/mtaftmtaft/timeline/The_Civil_Rights_Movement_Period_3" title="Dipity: Civil Rights Movement">Civil Rights Movement</a> &#8211; apparently created by students in &#8220;Taft&#8217;s thrilling third period American history class at USM&#8221;</li>
</ul>
<p>And finally I have embedded the <a href="http://www.dipity.com/user/tatercakes/timeline/Internet_Memes" title="Dipity: Internet Memes">Internet Memes timeline</a> below to give you a feel of what this looks like. Try clicking on any of the events that include a little film icon at the bottom edge and see how you can view the video right in place:</p>
<p><iframe src="http://www.dipity.com/user/tatercakes/timeline/Internet_Memes/embed_tl" style="border: 1px solid #cccccc" height="400" width="600"></iframe></p>
<p><em>Image Credit:  I found and &#8216;borrowed&#8217; the Dipity image above from <a href="http://www.dipity.com/about" title="Dipity: About">Dipity&#8217;s About page</a>.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2008/07/20/dipity-easy-hosted-timelines/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>THATCamp 2008: Day 1 Dork Short Lightning Talks</title>
		<link>http://www.spellboundblog.com/2008/06/14/thatcamp-2008-day-1-dork-short-lightening-talks/</link>
		<comments>http://www.spellboundblog.com/2008/06/14/thatcamp-2008-day-1-dork-short-lightening-talks/#comments</comments>
		<pubDate>Sun, 15 Jun 2008 03:09:28 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[information visualization]]></category>
		<category><![CDATA[interface design]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[THATCamp2008]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/2008/06/14/thatcamp-2008-day-1-dork-short-lightening-talks/</guid>
		<description><![CDATA[During lunch on the first day of THATCamp people volunteered to give lightning talks they called &#8216;Dork Shorts&#8217;. As we ate our lunch, a steady stream of folks paraded up to the podium and gave an elevator pitch length demo. These are the projects about which I managed to type URLs and some other info [...]
]]></description>
			<content:encoded><![CDATA[<p><a href="http://flickr.com/photos/thenss/2443187542/" title="Lightning by thenss (Christopher Cacho) via flickr"><img src="http://www.spellboundblog.com/wp-content/uploads/2008/06/2443187542_af1a3fe851.jpg" alt="lightning" align="right" height="247" width="330" /></a>During lunch on the first day of THATCamp people volunteered to give <a href="http://en.wikipedia.org/wiki/Lightning_Talk" title="Wikipedia: lightning talk">lightning talks</a> they called &#8216;Dork Shorts&#8217;. As we ate our lunch, a steady stream of folks paraded up to the podium and gave an elevator pitch length demo. These are the projects about which I managed to type URLs and some other info into my laptop. If you are looking for examples of inspirational and innovative work at the intersection of technology and the humanities &#8211; these are a great place to start!</p>
<ul>
<li><a href="http://www.worlddigitallibrary.org/project/english/index.html" title="World Digital Library">World Digital Library</a> (<a href="http://www.loc.gov" title="Library of Congress">Library of Congress</a>)</li>
<li><a href="http://www.piclens.com/" title="PicLens">PicLens</a> + Firefox + any search results page from the <a href="http://digitalgallery.nypl.org/nypldigital/index.cfm" title="NYPL Digital Gallery">New York Public Library Digital Gallery</a> = a 3D experience of ALL the photos at one time. PicLens uses the RSS feed to retrieve the full set of images along with their captions and will work with any RSS feed of images &#8211; such as RSS image feeds from <a href="http://flickr.com/" title="Flickr">Flickr</a> or <a href="http://smugmug.com/" title="Smugmug">Smugmug</a>.</li>
<li><a href="http://historywired.si.edu/" title="HistoryWired">HistoryWired</a> (<a href="http://americanhistory.si.edu/" title="National Museum of American History">National Museum of American History</a>): A new spin on a <a href="http://www.cs.umd.edu/hcil/treemap/" title="about treemaps">treemap</a> visualization built on top of museum metadata. One box is displayed per item and the box size is based on popularity. The rest of its innovations are just easier to experience than describe.</li>
<li><a href="http://objectofhistory.org/" title="The Object of History">The Object of History</a> (<a href="http://americanhistory.si.edu/" title="National Museum of American History">National Museum of American History</a> + <a href="http://chnm.gmu.edu/" title="CHNM">CHNM</a>)</li>
<li><a href="http://omeka.org/" title="Omeka">Omeka</a> (<a href="http://chnm.gmu.edu/" title="CHNM">CHNM</a>)</li>
<li><a href="http://exhibitions.nypl.org/eminent/" title="Eminent Domain">Eminent Domain</a> (<a href="http://www.nypl.org/" title="New York Public Library">NYPL</a> Online Exhibition): built on Omeka</li>
<li><a href="http://nocoma.grainger.uiuc.edu/" title="American Social History Online">American Social History Online</a> (<a href="http://www.diglib.org" title="Digital Library Federation">Digital Library Federation</a>): Zotero enabled. They are on the <a href="http://wiki.dlib.indiana.edu/confluence/display/DLFAquifer/Collection+Submission" title="Collection Submission Guidelines">hunt for more MODS records</a>. Built on Ruby on Rails (RoR); it will be released as open source software within a couple of months.</li>
<li><a href="http://www4.ncsu.edu/~dmrieder/typographia/" title="Typographia">Typographia</a> (David Rieder, NC State University)</li>
</ul>
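<p>For readers who have not met treemaps before, the core idea mentioned in the HistoryWired entry above (box size driven by popularity) can be sketched in a few lines. This is the simplest possible layout, a single &#8216;slice&#8217; pass that cuts one rectangle into strips. It is not HistoryWired&#8217;s layout, and the function name and sample scores are hypothetical.</p>

```python
def slice_layout(items, width, height):
    """Split a width x height rectangle into one vertical strip per item.

    items is a list of (name, popularity) pairs; each strip's area is
    proportional to its popularity. Returns (name, x, y, w, h) tuples.
    """
    total = sum(score for _, score in items)
    boxes = []
    x = 0.0
    for name, score in items:
        strip_width = width * score / total
        boxes.append((name, x, 0.0, strip_width, height))
        x += strip_width
    return boxes


# Hypothetical popularity scores for three museum objects.
boxes = slice_layout([("compass", 1), ("quilt", 3), ("telegraph", 4)], 800, 400)
for box in boxes:
    print(box)
```

<p>Fuller treemap layouts apply the same proportional split recursively, alternating direction at each level of a hierarchy.</p>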
<p>Know of projects I missed? Please add links to them in the comments below.</p>
<p><em>Image credit: <a href="http://flickr.com/photos/thenss/2443187542/" title="Lightning by thenss (Christopher Cacho) via flickr">Lightning</a> by <a href="http://flickr.com/people/thenss/" title="Flickr: thenss">thenss</a> (Christopher Cacho) via flickr</em></p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2008/06/14/thatcamp-2008-day-1-dork-short-lightening-talks/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>THATCamp 2008: Text Mining and the Persian Carpet Effect</title>
		<link>http://www.spellboundblog.com/2008/06/01/thatcamp-2008-text-mining-and-the-persian-carpet-effect/</link>
		<comments>http://www.spellboundblog.com/2008/06/01/thatcamp-2008-text-mining-and-the-persian-carpet-effect/#comments</comments>
		<pubDate>Sun, 01 Jun 2008 04:58:24 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[digitization]]></category>
		<category><![CDATA[historical research]]></category>
		<category><![CDATA[information visualization]]></category>
		<category><![CDATA[learning technology]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[text mining]]></category>
		<category><![CDATA[THATCamp2008]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/2008/06/01/thatcamp-2008-text-mining-and-the-persian-carpet-effect/</guid>
		<description><![CDATA[I attended a THATCamp session on Text Mining. There were between 15 and 20 people in attendance. I have done my best to attribute ideas to their originators wherever possible &#8211; but please forgive the fact that I did not catch the names of everyone who was part of this session. What Is Text Mining? [...]
]]></description>
			<content:encoded><![CDATA[<p><a href="http://flickr.com/photos/alarch/308587800/" title="Drift of Harrachov Mine by alarch via flickr"><img src="http://www.spellboundblog.com/wp-content/uploads/2008/06/308587800_c8d0417f1e.jpg" alt="alarch: Drift of Harrachov mine (Flickr)" align="right" height="225" width="300" /></a>I attended a <a href="http://www.thatcamp.org" title="THATCamp">THATCamp</a> session on Text Mining. There were between 15 and 20 people in attendance. I have done my best to attribute ideas to their originators wherever possible &#8211; but please forgive the fact that I did not catch the names of everyone who was part of this session.</p>
<p><strong>What Is Text Mining?</strong></p>
<p>Text mining is an umbrella phrase that covers many different techniques and types of tools.</p>
<p>The <a href="http://chnm.gmu.edu/" title="CHNM">CHNM</a> NEH-funded text mining initiative defined text mining as needing to support these three research functions:</p>
<ul>
<li>Locating or finding: improving on search</li>
<li>Extraction: once you find a set of interesting documents, how do you extract information in new (and hopefully faster) ways? How do you pull data from unstructured bulk into structured sets?</li>
<li>Analysis: support analyzing the data, discovery of patterns, answering questions</li>
</ul>
<p>The group discussed that there were both macro and micro aspects to text mining. Sometimes you are trying to explore a collection. Sometimes you are trying to examine a single document in great detail. Still other situations call for using text mining to generate automated classification of content using established vocabularies. Different kinds of tools will be important during different phases of research.</p>
<p><strong>Projects, Tools, Examples &amp; Cool Ideas</strong></p>
<p><a href="http://thatcamp.org/camper/aeastmanmullins/" title="Andrea Eastman-Mullins">Andrea Eastman-Mullins</a>, from <a href="http://www.alexanderstreet.com" title="Alexander Street Press">Alexander Street Press</a>, mentioned the <a href="http://humanities.uchicago.edu./orgs/ARTFL/" title="University of Chicago: ARTFL Project">University of Chicago&#8217;s ARTFL Project</a> and these two tools:</p>
<ul>
<li><a href="http://philologic.uchicago.edu/" title="PhiloLogic">PhiloLogic</a>: An XML/SGML based full-text search, retrieval and analysis tool</li>
<li><a href="http://philologic.uchicago.edu/philomine/" title="PhiloMine">PhiloMine</a>: an extension being developed for PhiloLogic to provide support for &#8220;a variety of machine learning, text mining, and document clustering tasks&#8221;.</li>
</ul>
<p><a href="http://www.dancohen.org" title="Dan Cohen">Dan Cohen</a> directed us to his post about <a href="http://www.dancohen.org/2006/08/08/mapping-what-americans-did-on-september-11/" title="Mapping What Americans Did on September 11">Mapping What Americans Did on September 11</a> and to <a href="http://twistori.com" title="Twistori">Twistori</a> which text mines Twitter.</p>
<p>Other Projects &amp; Examples:</p>
<ul>
<li><a href="http://www.monkproject.org/" title="MONK Project">MONK project</a> (Metadata Offer New Knowledge)</li>
<li><a href="http://www.opencontentalliance.org/" title="Open Content Alliance">Open Content Alliance</a> (OCA)</li>
<li>Library of Congress <a href="http://www.loc.gov/chroniclingamerica/" title="Library of Congress: Chronicling America">Chronicling America</a> &#8211; newspaper pages from 1897-1910</li>
<li>Tanya Clement&#8217;s project <a href="http://www.mith2.umd.edu/events/911-digital-dialogue-tanya-clement-using-digital-tools-to-not-read-gertrude-steins-the-making-of-americans" title="Using Digital Tools to Not-Read Gertrude Stein’s The Making of Americans">&#8220;Using Digital Tools to Not-Read Gertrude Stein’s The Making of Americans&#8221;</a> at University of Maryland, College Park</li>
<li>Two other University of Maryland, College Park projects that were not mentioned during the session, but may be of interest are <a href="http://www.cs.umd.edu/hcil/textvis/featurelens/" title="FeatureLens">FeatureLens</a> and <a href="http://www.cs.umd.edu/hcil/textvis/basketlens/" title="BasketLens">BasketLens</a></li>
<li><a href="http://docs.google.com/" title="Google Docs">Google Docs</a> now includes <a href="http://en.wikipedia.org/wiki/Flesch-Kincaid_Readability_Test" title="Wikipedia: Flesch-Kincaid Readability Test">Flesch-Kincaid Readability Tests</a> and <a href="http://en.wikipedia.org/wiki/Automated_Readability_Index" title="Wikipedia: Automated Readability Index">Automated Readability Index</a> in the same window in which it shows you your Word Count</li>
<li><a href="http://en.wikipedia.org/wiki/Spam_filter" title="Wikipedia: Spam Filters">Spam filters</a> &#8211; such as <a href="http://en.wikipedia.org/wiki/Bayesian_spam_filtering" title="Wikipedia: Bayesian Spam Filtering">Bayesian spam filtering</a>, which use text mining to identify spam e-mails</li>
<li>Clustering &#8211; see my post on this: <a href="http://www.spellboundblog.com/2008/05/14/clustering-data-generating-organization-from-the-ground-up/" title="Clustering Data: Generating Organization from the Ground Up">Clustering Data: Generating Organization from the Ground Up</a> and also take a look at <a href="http://clusty.com/" title="Clusty.com">Clusty.com</a> and their &#8216;remix clusters&#8217; option.</li>
</ul>
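<p>The readability scores mentioned in the list above are simple arithmetic over character, word and sentence counts. As an illustration, here is a sketch of the Automated Readability Index. The formula is the standard one, but the tokenizing rules are deliberately naive assumptions, so the result will not necessarily match what Google Docs reports.</p>

```python
import re


def automated_readability_index(text):
    """Compute ARI = 4.71*(chars/words) + 0.5*(words/sentences) - 21.43."""
    words = re.findall(r"[A-Za-z0-9']+", text)
    # Naive assumption: every run of . ! or ? ends one sentence.
    sentences = re.findall(r"[.!?]+", text)
    if not words or not sentences:
        raise ValueError("text needs at least one word and one sentence")
    # Count only letters and digits, ignoring apostrophes.
    characters = sum(len(re.sub(r"[^A-Za-z0-9]", "", word)) for word in words)
    return 4.71 * characters / len(words) + 0.5 * len(words) / len(sentences) - 21.43


score = automated_readability_index("The cat sat. The dog ran.")
print(round(score, 2))
```

<p>Short words and short sentences push the score down, which is why the toy sentence above scores below zero while dense academic prose scores well into the teens.</p>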
<p>Some neat ideas that were mentioned for ways text mining could be used (lots of other great ideas were discussed &#8211; these are the two that made it into my notes):</p>
<ul>
<li>Train a tool with collections of content from individual time periods, then use the tool to help identify the originating time period of new documents. The same setup could also be used to identify shifts in textual patterns by comparing large data sets from specific date ranges.</li>
<li>If you have a tool that has learned how to classify certain types of content well, watch for when it breaks &#8211; the failures can point you toward interesting things to investigate.</li>
</ul>
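<p>The first idea above can be sketched with a small naive Bayes classifier, the same family of technique used in Bayesian spam filtering. This is only an illustration under loud assumptions: the class name is hypothetical, the training texts are toy examples invented for this sketch, and a real project would need large corpora for each period and far better tokenizing.</p>

```python
import math
from collections import Counter, defaultdict


class PeriodClassifier:
    """Multinomial naive Bayes with add-one smoothing over word counts."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # period -> word -> count
        self.doc_counts = Counter()              # period -> number of documents
        self.vocabulary = set()

    def train(self, period, text):
        words = text.lower().split()
        self.word_counts[period].update(words)
        self.doc_counts[period] += 1
        self.vocabulary.update(words)

    def scores(self, text):
        """Return the log-probability score of each known period for text."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        result = {}
        for period, counts in self.word_counts.items():
            total_words = sum(counts.values())
            score = math.log(self.doc_counts[period] / total_docs)
            for word in text.lower().split():
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            result[period] = score
        return result

    def classify(self, text):
        scores = self.scores(text)
        return max(scores, key=scores.get)


# Hypothetical toy training data; real use would need large period corpora.
clf = PeriodClassifier()
clf.train("1900s", "telegraph steamship carriage telegram")
clf.train("1900s", "carriage gaslight telegraph")
clf.train("2000s", "email website blog download")
clf.train("2000s", "blog tweet website wiki")

print(clf.classify("a telegram sent by telegraph"))
print(clf.classify("she posted it on her blog"))
```

<p>The second idea fits the same sketch: when the scores for two periods come out nearly equal, or a document with a known date is classified into the wrong period, that is the kind of breakage worth investigating.</p>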
<p><strong>Barriers to Text Mining</strong></p>
<p>All of the following were touched upon as being barriers or challenges to text mining:</p>
<ul>
<li>access to raw text in gated collections (i.e., collections that require payment for access to resources) such as <a href="http://www.jstor.org/" title="JSTOR">JSTOR</a>, <a href="http://muse.jhu.edu/" title="Project MUSE">Project MUSE</a> and others</li>
<li>tools that are too difficult for non-programmers to use</li>
<li>questions relating to the validity of text mining as a technique for drawing legitimate conclusions</li>
</ul>
<p><strong>Next Steps</strong></p>
<p>These ideas were put forward as important for advancing the field of text mining in the humanities:</p>
<ul>
<li>develop and share best practices for use when cultural heritage institutions make digitization and transcription deals with corporate entities</li>
<li>create frameworks that enable individuals to reproduce the work of others and provide transparency into the assumptions behind the research</li>
<li>create tools and techniques that smooth the path from digitization to transcription</li>
<li>develop focused, easy-to-use tools that bridge the gap between computer programmers and humanities researchers</li>
</ul>
<p><strong>My thoughts<br />
</strong>During the session I drew a parallel to archaeology, where patterns that cannot be seen from the ground become visible from the air. I discovered that this has a name:</p>
<blockquote><p>&#8220;Archaeologists call it the <strong>Persian carpet effect</strong>. Imagine you&#8217;re a mouse running across an elaborately decorated rug. The ground would merely be a blur of shapes and colors. You could spend your life going back and forth, studying an inch at a time, and never see the patterns. Like a mouse on a carpet, an archaeologist painstakingly excavating a site might easily miss the whole for the parts.&#8221; <em>from Airborne Archaeology, Smithsonian magazine, December 2005 (emphasis mine)</em></p></blockquote>
<p>While I don&#8217;t see any coffee table books in the near future of text mining (such as <a href="http://www.amazon.com/gp/product/0892368756?ie=UTF8&amp;tag=spellboundblog-20&amp;linkCode=as2&amp;camp=1789&amp;creative=9325&amp;creativeASIN=0892368756">The Past from Above: Aerial Photographs of Archaeological Sites</a><img src="http://www.assoc-amazon.com/e/ir?t=spellboundblog-20&amp;l=as2&amp;o=1&amp;a=0892368756" style="border: medium none ; margin: 0px" border="0" height="1" width="1" />), I do think that this idea captures the promise that we have before us in the form of text mining tools. Everyone in our session seemed to agree that these tools will empower people to do things that no individual could have done in a lifetime by hand. The digital world is producing <a href="http://en.wikipedia.org/wiki/Terabyte" title="Wikipedia: Terabyte">terabytes</a> of text. We will need text mining tools just to find our way in this blizzard of content. It is all well and good to know that each snowflake is unique &#8211; but tell that to the 21st century historian soon to be buried under the weight of blogs, tweets, wikis and all other manner of web content.</p>
<p><em>Image credit: <a href="http://flickr.com/photos/alarch/308587800/" title="Drift of Harrachov Mine by alarch via flickr">Drift of Harrachov Mine by alarch via flickr</a></em></p>
<p><em>As is the case with all my session summaries from THATCamp 2008, please accept my apologies in advance for any cases in which I misquote, oversimplify or miss points altogether in the post above. These sessions move fast and my main goal is to capture the core of the ideas presented and exchanged. Feel free to contact me about corrections to my summary either via comments on this post or via</em> <a href="http://www.spellboundblog.com/contact/" title="contact Jeanne Kramer-Smyth"><em>my contact form</em></a><em>.</em></p>
<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2008/06/01/thatcamp-2008-text-mining-and-the-persian-carpet-effect/">THATCamp 2008: Text Mining and the Persian Carpet Effect</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2008/06/01/thatcamp-2008-text-mining-and-the-persian-carpet-effect/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Clustering Data: Generating Organization from the Ground Up</title>
		<link>http://www.spellboundblog.com/2008/05/14/clustering-data-generating-organization-from-the-ground-up/</link>
		<comments>http://www.spellboundblog.com/2008/05/14/clustering-data-generating-organization-from-the-ground-up/#comments</comments>
		<pubDate>Wed, 14 May 2008 05:37:47 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[access]]></category>
		<category><![CDATA[information visualization]]></category>
		<category><![CDATA[interface design]]></category>
		<category><![CDATA[photography]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[virtual collaboration]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/2008/05/14/clustering-data-generating-organization-from-the-ground-up/</guid>
		<description><![CDATA[My trip to the 2008 Information Architecture Summit (IA Summit) down in Miami has me thinking a lot about helping people find information. In this post I am going to examine clustering data. Flickr Tag Clusters Tag clusters are not new on Flickr &#8211; they were announced way back in August of 2005. The best [...]<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2008/05/14/clustering-data-generating-organization-from-the-ground-up/">Clustering Data: Generating Organization from the Ground Up</a></p>
]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.flickr.com/photos/tags/water/clusters/" title="Flickr: water tag clusters"><img src="http://www.spellboundblog.com/wp-content/uploads/2008/05/watercluster.JPG" alt="Flickr: water tag clusters" align="right" /></a>My trip to the <a href="http://www.iasummit.org/2008/" title="IA Summit 2008">2008 Information Architecture Summit</a> (IA Summit) down in Miami has me thinking a lot about helping people find information. In this post I am going to examine clustering data.</p>
<p><strong>Flickr Tag Clusters</strong><br />
Tag clusters are not new on <a href="http://www.flickr.com/" title="Flickr">Flickr</a> &#8211; they were <a href="http://blog.flickr.net/en/2005/08/01/the-new-new-things/" title="Flickr Blog: The New New Things">announced way back in August of 2005</a>. The best way to understand tag clusters is to look at a few. Some of my favorites are the <a href="http://www.flickr.com/photos/tags/water/clusters/" title="Flickr: Water Tag Clusters">water clusters</a> (shown in the image above). From this page you can view the <a href="http://www.flickr.com/photos/tags/water/clusters/reflection-nature-green/" title="Flickr Water Cluster: Reflection, Nature, Green">reflection/nature/green</a> cluster, the <a href="http://www.flickr.com/photos/tags/water/clusters/sky-lake-river/" title="Flickr Water Cluster: Sky, Lake, River">sky/lake/river</a> cluster, the <a href="http://www.flickr.com/photos/tags/water/clusters/blue-beach-sun/" title="Flickr Water Cluster: Blue, Beach, Sun">blue/beach/sun</a> cluster or the <a href="http://www.flickr.com/photos/tags/water/clusters/sea-sand-waves/" title="Flickr Water Cluster: Sea, Sand, Waves">sea/sand/waves</a> cluster.</p>
<p>So what is going on here? Basically Flickr is analyzing groupings of tags assigned to Flickr images and identifying common clusters of tags. In our water example above &#8211; they found four different sets of tags that occurred together and distinctly apart from other sets of tags. The proof is in the pudding &#8211; the groupings make sense. They get at very subtle differences even though the mass of data being analyzed is from many different individuals with many different perspectives.</p>
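<p>To make the idea concrete, here is a minimal sketch of clustering by tag co-occurrence. It is not Flickr&#8217;s actual algorithm, which is not described here; it assumes a simple rule (two tags belong together when they appear on enough of the same photos) and uses a handful of made-up photos. The function name <code>tag_clusters</code> and the <code>min_count</code> threshold are hypothetical.</p>

```python
from collections import Counter
from itertools import combinations


def tag_clusters(photos, seed, min_count=2):
    """Group tags that co-occur on photos carrying the seed tag.

    Illustrative only: an assumed co-occurrence rule, not Flickr's method.
    """
    # Count how often each pair of tags appears together on a seed-tagged photo.
    pair_counts = Counter()
    for tags in photos:
        if seed not in tags:
            continue
        others = sorted(set(tags) - {seed})
        pair_counts.update(combinations(others, 2))

    # Link two tags when they share at least min_count photos.
    neighbors = {}
    for (a, b), count in pair_counts.items():
        if count >= min_count:
            neighbors.setdefault(a, set()).add(b)
            neighbors.setdefault(b, set()).add(a)

    # Each connected group of linked tags is reported as one cluster.
    clusters, seen = [], set()
    for start in sorted(neighbors):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            tag = stack.pop()
            if tag in component:
                continue
            component.add(tag)
            stack.extend(neighbors[tag] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Made-up sample data: each set is the tags on one photo.
photos = [
    {"water", "sea", "sand", "waves"},
    {"water", "sea", "waves"},
    {"water", "sand", "sea"},
    {"water", "sky", "lake", "river"},
    {"water", "lake", "sky"},
    {"water", "river", "lake"},
    {"sky", "sea"},  # no seed tag, so this photo is ignored
]

print(tag_clusters(photos, "water"))
# prints [['lake', 'river', 'sky'], ['sand', 'sea', 'waves']]
```

<p>Even with six photos the sea/sand/waves and sky/lake/river groupings fall out on their own. A real system would need far more data and a smarter measure than a raw count, but the principle is the same.</p>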
<p>Tag clusters are very powerful and quite different from <a href="http://en.wikipedia.org/wiki/Tag_cloud" title="Wikipedia: Tag Cloud">tag clouds</a>. Tag clouds, by their nature, are a blunt instrument. They only show you the most popular tags. Take a look at the <a href="http://www.flickr.com/photos/library_of_congress/tags/" title="Flickr: Library of Congress Tag Cloud">tag cloud for the Library of Congress photostream on Flickr</a>. I do learn something from this. I get a sense of the broad brush topics, time periods and locations. But if you look at the <a href="http://www.flickr.com/photos/library_of_congress/alltags/" title="Flickr: Library of Congress All Tags">full list of Library of Congress Flickr tags</a> you see what a small percentage the top 150 really are (and yes, that page does take a while to load). Who else is now itching to ask Flickr to generate clusters within the LOC tag set?</p>
<p><strong>Steve.Museum</strong><br />
Another example of cultural heritage images being tagged is the <a href="http://steve.museum" title="Steve">Steve Museum</a> Art Museum Social Tagging Project which lets individuals tag objects from museums via <a href="http://tagger.steve.museum" title="Steve Tagger">Steve Tagger</a>. It resembles the Library of Congress on Flickr project in that it includes existing metadata with each image and permits users to add any tags they deem appropriate. I think it would be fascinating to contrast the traffic of image taggers on Steve.Museum vs Flickr for a common set of images. Is it better to build a custom interface that users must seek out but where you have complete control over the user experience and collected data? Or is it better to put images in the already existing path of users familiar with tagging images? I have no answers of course. All I know is I wish I could see the tag clusters one could generate off the Steve.Museum tag database. Perhaps someday we will!</p>
<p><strong>Del.icio.us Tags</strong><br />
<a href="http://del.icio.us/tag/archives" title="Del.icio.us: archives related tags"><img src="http://www.spellboundblog.com/wp-content/uploads/2008/05/delicious_archives_tag.JPG" alt="del.icio.us related tags" align="right" />Del.icio.us</a>, a web service for storing and tagging your bookmarks online, supports what they call &#8216;related tags&#8217; and &#8216;tag bundles&#8217;. If you view the <a href="http://del.icio.us/tag/archives" title="Del.icio.us tag: archives">page for the tag &#8216;archives&#8217;</a> &#8211; you will see to the far right a list of related tags like those shown in the image here. What is interesting is that if I look at my own personal tag page for archives I see a much longer list of related tags (big surprise that I have a lot of links tagged archives!) and I am given the option of selecting additional tags to filter my list of links via a combination of tags.</p>
<p>Del.icio.us&#8217;s &#8216;tag bundles&#8217; let me create my own named groupings of tags &#8211; but I must assemble these groups manually rather than have them generated or suggested. On the plus side, Del.icio.us is very open about publishing its data via APIs and therefore supporting <a href="http://del.icio.us/help/thirdpartytools" title="Del.icio.us third party tools">third party tools</a>. I think my favorite off that list for now has to be <a href="http://code.google.com/p/mysqlicious/" title="MySQLicious">MySQLicious</a> which mirrors your del.icio.us bookmarks into a MySQL database. Once those tags are in a database, all I need are the right queries to generate the clusters I want to see.</p>
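<p>As a rough sketch of what one of those queries might look like, the example below finds the tags that most often share a bookmark with a given tag. Everything about the data is an assumption: the <code>bookmark_tags</code> table, its two columns and the sample URLs are hypothetical and are not the schema MySQLicious actually creates, and an in-memory SQLite database stands in for MySQL so the example runs on its own.</p>

```python
import sqlite3

# In-memory SQLite stands in for the MySQL mirror (an assumption for runnability).
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per (bookmark URL, tag) pair.
conn.execute("CREATE TABLE bookmark_tags (url TEXT, tag TEXT)")
conn.executemany(
    "INSERT INTO bookmark_tags VALUES (?, ?)",
    [
        ("http://example.org/a", "archives"),
        ("http://example.org/a", "digitization"),
        ("http://example.org/a", "metadata"),
        ("http://example.org/b", "archives"),
        ("http://example.org/b", "metadata"),
        ("http://example.org/c", "archives"),
        ("http://example.org/c", "digitization"),
        ("http://example.org/c", "metadata"),
        ("http://example.org/d", "photography"),
        ("http://example.org/d", "metadata"),
    ],
)


def related_tags(conn, tag):
    """Return (tag, shared bookmark count) pairs, most frequent first."""
    query = """
        SELECT other.tag, COUNT(*) AS shared
        FROM bookmark_tags AS mine
        JOIN bookmark_tags AS other
          ON other.url = mine.url AND other.tag != mine.tag
        WHERE mine.tag = ?
        GROUP BY other.tag
        ORDER BY shared DESC, other.tag ASC
    """
    return conn.execute(query, (tag,)).fetchall()


print(related_tags(conn, "archives"))
# prints [('metadata', 3), ('digitization', 2)]
```

<p>The self-join pairs every tag on a bookmark with every other tag on the same bookmark, which is the raw material any co-occurrence clustering would start from.</p>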
<p><strong>Clusty: Clustered Search Results</strong><br />
<a href="http://www.clusty.com" title="Clusty.com"><img src="http://www.spellboundblog.com/wp-content/uploads/2008/05/clusty_archives.JPG" alt="Clusty: clusters screen shot" align="right" /></a>An example of what this might look like for search results can be seen via the search engine <a href="http://clusty.com/" title="Clusty.com">Clusty.com</a> from the folks over at <a href="http://vivisimo.com/" title="Vivisimo">Vivisimo</a>. For example &#8211; try a search on the term <a href="http://clusty.com/search?input-form=clusty-simple&amp;query=archives" title="Clusty Search: archives">archives</a>. This is one of those search terms for which general web searching is usually just infuriating. Clusty starts us with the same top 2 results as a <a href="http://www.google.com/search?q=archives" title="Google Search: archives">search for archives on Google</a> does, but it also gives us a list of clusters on the left sidebar. You can click on any of those clusters to filter the search results.</p>
<p>Those groups don&#8217;t look good to you? Click the &#8216;remix&#8217; link in the upper right hand corner of the cluster list and you get a new list of clusters. In a blog post titled <a href="http://searchdoneright.com/2008/01/introducing-clustering-2.0/" title="Search Done Right: Introducing Clustering 2.0">Introducing Clustering 2.0</a> Vivisimo CEO Raul Valdes-Perez explains what happens when you click remix:</p>
<blockquote><p>With a single click, remix clustering answers the question: What other, subtler topics are there? It works by clustering again the same search results, but with an added input: ignore the topics that the user just saw. Typically, the user will then see new major topics that didn’t quite make the final cut at the last round, but may still be interesting.</p></blockquote>
<p>I played for a while, clicking remix over and over. It was as if it was slicing and dicing the facets for me &#8211; picking new common threads to highlight. I liked that I wasn&#8217;t stuck with what someone else thought was the right way to group things. It gave me the control to explore other groupings.</p>
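<p>The remix behavior described in the quote (cluster the same results again, but ignore the topics just shown) can be sketched in a few lines. This is only an illustration of that one idea, not Vivisimo&#8217;s method: it assumes that counting words in result titles is a stand-in for real clustering, and the function name <code>top_topics</code> and the sample titles are made up.</p>

```python
from collections import Counter


def top_topics(titles, ignore=(), k=2):
    """Pick the k most common title words, skipping any ignored words.

    Word counting is an assumed stand-in for a real clustering engine.
    """
    ignored = {word.lower() for word in ignore}
    counts = Counter()
    for title in titles:
        # Count each word at most once per result.
        words = {word.lower() for word in title.split()}
        counts.update(words - ignored)
    # Sort by frequency, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]


# Made-up search result titles for the query "archives".
titles = [
    "National Archives photographs",
    "State Archives genealogy",
    "National Archives genealogy records",
    "University Archives photographs",
    "State Archives records",
    "National Archives maps",
]

# First pass: ignore only the query term itself.
first = top_topics(titles, ignore=["archives"])
# Remix: same results, but also ignore the topics the user just saw.
remix = top_topics(titles, ignore=["archives"] + first)

print(first)  # prints ['national', 'genealogy']
print(remix)  # prints ['photographs', 'records']
```

<p>The search results never change between passes; only the list of topics to ignore grows, which is why each remix surfaces threads that narrowly missed the previous cut.</p>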
<p><strong>Ontology is Overrated</strong><br />
Clay Shirky&#8217;s talk <a href="http://www.shirky.com/writings/ontology_overrated.html" title="Clay Shirky's Talk: Ontology is Overrated: Categories, Links and Tags">Ontology is Overrated: Categories, Links and Tags</a> from the spring of 2005 ties a lot of these ideas together in a way that makes a lot of sense to me. I highly recommend you go read it through &#8211; but I am going to give away the conclusion here:</p>
<blockquote><p>It&#8217;s all dependent on human context. This is what we&#8217;re starting to see with del.icio.us, with Flickr, with systems that are allowing for and aggregating tags. The signal benefit of these systems is that they don&#8217;t recreate the structured, hierarchical categorization so often forced onto us by our physical systems. Instead, we&#8217;re dealing with a significant break &#8212; by letting users tag URLs and then aggregating those tags, we&#8217;re going to be able to build alternate organizational systems, systems that, like the Web itself, do a better job of letting individuals create value for one another, often without realizing it.</p></blockquote>
<p>I currently spend my days working with controlled vocabularies for websites, so please don&#8217;t think I am suggesting we throw it all away. And yes, you do need a lot of information to reach the critical mass needed to support the generation of useful clusters. But there is something here that can have a real and positive impact on users of cultural heritage materials actually finding and exploring information. We can&#8217;t know how everyone will approach our records. We can&#8217;t know what aspects of them they will find interesting.</p>
<p><strong>There Is No Box</strong><br />
Archivists already know that much of the value of records is in the picture they paint as a group. A group of records shares a context that gives the individual records meaning. Librarians and catalogers have long lived in a world of shelves. A book must be assigned a single physical location. Much has been made (both in the <a href="http://www.shirky.com/writings/ontology_overrated.html" title="Ontology is Overrated: Categories, Links and Tags">Clay Shirky talk</a> and <a href="http://www.youtube.com/watch?v=-4CV05HyAbM" title="YouTube: Information R/evolution">elsewhere</a>) of the fact that on the web there is no shelf.</p>
<p>What if we take the analogy a step further and say that for an online archives there is no box? Of course, just as with books, we still need our metadata telling us who created this record originally (and when and why and which record comes before it and after it) &#8211; but picture a world where a single record can be virtually grouped many times over. Computer programs are only going to get better at generating clusters, be they of user assigned tags or search results or other metadata. From where I sit, the opportunity for leveraging clustering to do interesting things with archival records seems very high indeed.</p>
<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2008/05/14/clustering-data-generating-organization-from-the-ground-up/">Clustering Data: Generating Organization from the Ground Up</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2008/05/14/clustering-data-generating-organization-from-the-ground-up/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

