<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Spellbound Blog &#187; original order</title>
	<atom:link href="http://www.spellboundblog.com/category/original-order/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.spellboundblog.com</link>
	<description>Archives, Digital Humanities, Cultural Heritage, Technology</description>
	<lastBuildDate>Mon, 06 Feb 2012 14:49:35 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Concertina History Online Features Virtual Collaboration and Digitization</title>
		<link>http://www.spellboundblog.com/2010/01/10/concertinas-virtual-collaboration-digitization/</link>
		<comments>http://www.spellboundblog.com/2010/01/10/concertinas-virtual-collaboration-digitization/#comments</comments>
		<pubDate>Sun, 10 Jan 2010 04:56:01 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[audio]]></category>
		<category><![CDATA[digitization]]></category>
		<category><![CDATA[historical research]]></category>
		<category><![CDATA[learning technology]]></category>
		<category><![CDATA[original order]]></category>
		<category><![CDATA[virtual collaboration]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/?p=792</guid>
		<description><![CDATA[In the early 1960s, my father bought a Wheatstone concertina in London. He tells how he visited the factory where it was made to pick one out and recalls the ledger book in which details about the concertinas were recorded. After a recent retelling of this family classic, I was inspired to see what might [...]<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2010/01/10/concertinas-virtual-collaboration-digitization/">Concertina History Online Features Virtual Collaboration and Digitization</a></p>
]]></description>
			<content:encoded><![CDATA[<p style="text-align: center;"><a title="Flickr: Concertinas by user rocketlass" href="http://www.flickr.com/photos/rocketlass/470547134/"><img class="size-full wp-image-796  aligncenter" title="concertinas" src="http://www.spellboundblog.com/wp-content/uploads/2010/01/concertinas.jpg" alt="" width="508" height="380" /></a></p>
<p>In the early 1960s, my father bought a Wheatstone concertina in London. He tells how he visited the factory where it was made to pick one out and recalls the ledger book in which details about the concertinas were recorded. After a recent retelling of this family classic, I was inspired to see what might be online related to concertinas. I was amazed!</p>
<p>First I found the <a title="Concertina.com" href="http://www.concertina.com">Concertina Library</a> which presents itself as a &#8216;Digital Reference Collection for Concertinas&#8217;. With <a title="contributing authors to the concertina library" href="http://www.concertina.com/contributors/index.htm">fourteen contributing authors</a>, the site includes in depth articles on concertina <a title="Concertina History" href="http://www.concertina.com/history/index.htm">history</a>, <a title="Concertina Technology" href="http://www.concertina.com/technology/index.htm">technology</a>, <a title="Concertina Music" href="http://www.concertina.com/music/index.htm">music</a>, <a title="Concertina Research" href="http://www.concertina.com/research/index.htm">research</a> and a wide range of <a title="Concertina Systems" href="http://www.concertina.com/concertina-systems/index.htm">concertina systems</a>.</p>
<p>I particularly appreciate the reasons that Robert Gaskins, site creator, lists for the creation of the site on the <a title="About the Concertina Library" href="http://www.concertina.com/about/index.htm">about page</a>:</p>
<blockquote><p>(1) Almost all of the historical material about concertinas has been held in research libraries where access is limited, or in private collections where access may be non-existent.  The reason for this is not that the material is so valuable, but that in the past there was no way to make material of limited interest available to  everyone, so it stayed safely in archives.  The web has provided a way to make this material widely available—partly by the libraries themselves, and partly in collections such as this.</p>
<p>(2) There seems to be a growing number of people working again on the history of concertinas, perhaps in part because research materials are becoming available on the web.  These people are widely scattered, so they don&#8217;t get to meet and discuss their work in person.  But again the web has provided an answer, allowing people to work collaboratively and exchange information across miles and timezones,  and for the resulting articles the web offers worldwide publication at almost no cost.</p></blockquote>
<p>What an eloquent testimonial for the power of the internet to both provide access to once-inaccessible materials and support virtual collaboration within a geographically dispersed community.</p>
<p>Next, I found the <a title="Wheatstone Concertina Ledgers" href="http://horniman.info">Wheatstone Concertina Ledgers</a>. This site features business records (in the form of ledgers) of C. Wheatstone &amp; Co. stretching from 1830 through 1974 (with some gaps). The originals are held at the Library of the <a title="Horniman Museum" href="http://www.horniman.ac.uk/">Horniman Museum</a> in London. It is a great reference website with a nice interface for paging through the ledgers. Armed with the serial number from my father&#8217;s concertina (36461), I found my way to <a title="Page 88: featuring my father's concertina" href="http://horniman.info/DKNSARC/SD03/PAGES/D3P0880S.HTM">page 88 of a Wheatstone Production Journal</a> from the Dickinson Archives. If I am reading that line properly, his concertina is a 3E model and was made (or maybe sold?) April 25, 1960. I wish that there were documentation online to explain how to read the ledgers. For example, I would love to know what &#8216;Bulletin 3052&#8217; means.</p>
<p>I liked the way that they retained the sense of turning pages in a ledger. Every page of each ledger is included, including front and back end pages and blank pages. I have total confidence that I am seeing the pages in the same order as I would in person.</p>
<p>You can read the <a title="Introduction to the Wheatstone Ledger Digitization Project" href="http://horniman.info/DOCUMNTS/INTRO.HTM">overview and introduction to the project</a>, but what intrigued me more was the very detailed narrative of how this digitization effort was accomplished. In <a title="How the Wheatstone Concertina Ledgers were Digitized" href="http://horniman.info/DOCUMNTS/HOWTO.HTM">How The Wheatstone Concertina Ledgers Were Digitized</a>, we find Robert Gaskins of  the <a title="Concertina.com" href="http://www.concertina.com/">Concertina Library</a> explaining how, with an older model IBM ThinkPad, a consumer grade scanner, and his existing software (Microsoft Office and Macromedia Fireworks), he created a website with 4,500 images and clean, simple navigation. From where I sit, this is a great success story &#8211; a single person&#8217;s dedication can yield fantastic results. You don&#8217;t need the latest and greatest technology to run a successful digitization project. One individual can go a long way through sheer determination and the clever leveraging of what they have on hand.</p>
<p>Back on the <a title="Concertina.com" href="http://www.concertina.com/">Concertina Library</a>&#8217;s about page we find &#8220;There is still a lot of material relevant to the study of concertinas and their history which should be digitized and placed on the web, but has not been so far. Ideas for additional contributors, items, and collections are very welcome.&#8221; If I am following the dates correctly, the Concertina Library has articles dating back to February of 2001, shortly before Mr. Gaskins started planning the ledger digitization project. At the same time as he was collaborating with other concertina enthusiasts to build the Concertina Library, he was scanning ledgers and creating the <a title="Wheatstone Concertina Ledgers" href="http://horniman.info/">Wheatstone Concertina Ledgers</a> website. Three cheers to Mr. Gaskins for his obvious personal enthusiasm and dedication to virtual collaboration, digitization and well-built websites! Another three cheers for all those who joined the cause and collaborated to create great online resources to support ongoing concertina research from anywhere in the world.</p>
<p>All this started because my father owns a beautiful old concertina. I love it when an innocent web search leads me to find a wealth of online archival materials. Do you have a favorite online archival resource that you stumbled across while doing similar research for family or friends? Please share them in the comments below!</p>
<p><em>Image Credit: </em><a rel="cc:attributionURL" href="http://www.flickr.com/photos/rocketlass/">http://www.flickr.com/photos/rocketlass/</a> / <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/2.0/">CC BY-NC-SA 2.0</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2010/01/10/concertinas-virtual-collaboration-digitization/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SAA2008: Preservation and Experimentation with Analog/Digital Hybrid Literary Collections (Session 203)</title>
		<link>http://www.spellboundblog.com/2008/09/06/saa2008-preservation-and-experimentation-with-analogdigital-hybrid-literary-collections-session-203/</link>
		<comments>http://www.spellboundblog.com/2008/09/06/saa2008-preservation-and-experimentation-with-analogdigital-hybrid-literary-collections-session-203/#comments</comments>
		<pubDate>Sun, 07 Sep 2008 02:27:24 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[appraisal]]></category>
		<category><![CDATA[at risk records]]></category>
		<category><![CDATA[born digital records]]></category>
		<category><![CDATA[e-mail]]></category>
		<category><![CDATA[electronic records]]></category>
		<category><![CDATA[original order]]></category>
		<category><![CDATA[preservation]]></category>
		<category><![CDATA[SAA2008]]></category>
		<category><![CDATA[software]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/2008/09/06/saa2008-preservation-and-experimentation-with-analogdigital-hybrid-literary-collections-session-203/</guid>
		<description><![CDATA[The official title of Session 203 was Getting Our Hands Dirty (and Liking It): Case Studies in Archiving Digital Manuscripts. The session chair, Catherine Stollar Peters from the New York State Archives and Records Administration, opened the session with a high level discussion of the &#8220;Theoretical Foundations of Archiving Digital Manuscripts&#8221;. The focus of this [...]<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2008/09/06/saa2008-preservation-and-experimentation-with-analogdigital-hybrid-literary-collections-session-203/">SAA2008: Preservation and Experimentation with Analog/Digital Hybrid Literary Collections (Session 203)</a></p>
]]></description>
			<content:encoded><![CDATA[<p style="text-align: center"><a title="Flickr: oh messy disks by blude" href="http://flickr.com/photos/blude/2665916336/in/photostream"><img src="http://www.spellboundblog.com/wp-content/uploads/2008/09/floppy_photo.jpg" alt="floppy disks" width="337" height="253" /></a></p>
<p>The official title of Session 203 was <a title="Session 203: Getting Our Hands Dirty (and Liking It): Case Studies in Archiving Digital Manuscripts" href="http://www.ibiblio.org/saawiki/2008/index.php/Session_203:_Getting_Our_Hands_Dirty_(and_Liking_It):_Case_Studies_in_Archiving_Digital_Manuscripts">Getting Our Hands Dirty (and Liking It): Case Studies in Archiving Digital Manuscripts</a>. The session chair, Catherine Stollar Peters from the <a title="New York State Archives and Records Administration" href="http://www.archives.nysed.gov/aindex.shtml">New York State Archives and Records Administration</a>, opened the session with a high level discussion of the &#8220;Theoretical Foundations of Archiving Digital Manuscripts&#8221;. The focus of this panel was preserving hybrid collections of born digital and paper based literary records. The goal was to review new ways to apply archival techniques to digital records. The presenters were all archivists without IT backgrounds who are building on others&#8217; work &#8230; and experimenting. She also mentioned that this work impacts researchers, historians, and journalists. For each of the presenters, I have listed below the top challenges and recommendations. If you attended the session, you can skip forward to <a title="my thoughts" href="http://www.spellboundblog.com/2008/09/07/saa2008-preservation-and-experimentation-with-analogdigital-hybrid-literary-collections-session-203#mythoughts">my thoughts</a>.</p>
<p><strong>Norman Mailer&#8217;s Electronic Records</strong></p>
<ul>
<li>Speaker: Gabriela Redwine from University of Texas at Austin&#8217;s <a title="Harry Ransom Center" href="http://www.hrc.utexas.edu/">Harry Ransom Center</a></li>
<li>Featured Collection: <a title="Norman Mailer Papers Finding Aid" href="http://www.hrc.utexas.edu/research/fa/mailer.hp.html">Norman Mailer Papers</a></li>
</ul>
<p><em>Challenges &amp; Questions:</em></p>
<ul>
<li>3 laptops and nearly 400 disks of correspondence</li>
<li>While the letters might have been dictated or drafted by Mailer, all the typing, organization and revisions done on the computer were done by his assistant Judith McNally. This brings into question issues of who should be identified as the record creator. How do they represent the interaction between Mailer &amp; McNally? Who is the creator? Co-Creators?</li>
<li>All the laptops and disks were held by Judith McNally. When she died all of her possessions were seized by county officials. All the disks from her apartment were eventually recovered over a year later &#8211; but it causes issues of provenance. There is no way to know who might have viewed/changed the records.</li>
</ul>
<p><em>Revelations and Recommendations:</em></p>
<p><em><span style="font-style: normal">What is accessioning and processing when dealing with electronic records? What needs to be done?</span></em></p>
<ul>
<li><em><span style="font-style: normal">gain custody</span></em></li>
<li><em><span style="font-style: normal">gather information about creator&#8217;s (or creators&#8217;) use of the electronic records. In March 2007 they interviewed Mailer to understand the process of how they worked together. They learned that the computers were entirely McNally&#8217;s domain.</span></em></li>
<li><em><span style="font-style: normal">number disks, computers (given letters), other digital media</span></em></li>
<li><em><span style="font-style: normal">create disk catalog &#8211; to reflect physical information of the disk. Include color of ink, underlining, etc. At this point the disk has never been put into a computer. This captures visual &amp; spatial information</span></em></li>
<li><em><span style="font-style: normal">gather this info from each disk: file types, directory structure &amp; file names</span></em></li>
</ul>
<p><em><span style="font-style: normal">The ideal for future collections of this type is archivist involvement earlier &#8211; the earlier the better.<br />
</span></em></p>
<p><strong>Papers of Peter Ganick<br />
</strong></p>
<ul>
<li>Speaker: Melissa Watterworth</li>
<li>Featured Collection: Papers of Writer and Small Press Publisher Peter Ganick, <a title="Thomas J Dodd Research Center" href="http://www.lib.uconn.edu/online/research/speclib/ASC/">Thomas J Dodd Research Center</a>, University of Connecticut</li>
</ul>
<p><em>Challenges &amp; Questions:</em></p>
<ul>
<li><em><span style="font-style: normal">What are the primary sources of our modern world?</span></em></li>
<li><em><span style="font-style: normal">How do we acquire and preserve born digital records as trusted custodians?<br />
</span></em></li>
<li><em><span style="font-style: normal">How do we preserve participatory media &#8211; maybe we can learn from those who work on performance art?</span></em></li>
<li><em><span style="font-style: normal">How do we incrementally build our collections of electronic records? Should we be preserving the tools?</span></em></li>
<li><em><span style="font-style: normal">Timing of acquisition: How actively should we be pursuing personal archives? How can we build trust with creators and get them to understand the challenges?</span></em></li>
<li><em><span style="font-style: normal">Personal papers are very contextual &#8211; order matters. Does this hold true for born digital personal archives? What does the networking aspect of electronic records mean &#8211; how does it impact the idea of order?</span></em></li>
<li><em><span style="font-style: normal">First attempt to accession one of Peter Ganick&#8217;s laptops and the archivist found nothing she could identify as files.. she found fragments of text &#8211; hypertext work and lots of files that had questionable provenance (downloaded from a mailing list? his creations?). She had to sit down next to him and learn about how he worked.<br />
</span></em></li>
<li><em><span style="font-style: normal">He didn&#8217;t understand at first what her challenges were. He could get his head around the idea of metadata and issues of authenticity. He had trouble understanding what she was trying to collect.<br />
</span></em></li>
<li><em><span style="font-style: normal">How do we arrange and keep context in an online environment?</span></em></li>
<li><em><span style="font-style: normal">Biggest tech challenge: are we holding on for too long to ideas of original order and context?</span></em></li>
<li>Is there a greater challenge in collecting earlier in the cycle? What if the creator puts restrictions on groupings or chooses to withdraw them?</li>
<li>Do we want to create contracts with donors? Is that practical?</li>
</ul>
<p><em>Revelations and Recommendations:<br />
</em></p>
<ul>
<li><span style="font-style: normal">Collect materials that had high value as born digital works but were at a high risk of loss.<br />
</span></li>
<li><span style="font-style: normal">Build infrastructure to support preservation of born digital records.</span></li>
<li><span style="font-style: normal">Go back to the record creator to learn more about his creative process. They used to acquire records from Ganick every few years.. that wasn&#8217;t frequent enough. He was changing the tools he used and how he worked very quickly. She made sure to communicate that the past 30 years of policy wasn&#8217;t going to work anymore. It was going to have to evolve.</span></li>
<li><span style="font-style: normal">Created a &#8216;submission agreement&#8217; about what kinds of records should be sent to the archive. He submitted them in groupings that made sense to him. She reviewed the records to make sure she understood what she was getting.</span></li>
<li><span style="font-style: normal">Considering using PDF/A to capture snapshot of virtual texts.</span></li>
<li><span style="font-style: normal">Looked to model of &#8216;self archiving&#8217; &#8211; common in the world of professors to do ongoing accruals.</span></li>
<li><span style="font-style: normal">What about &#8216;embedded archivists&#8217;? There is a history of this in the performing arts and NGOs and it might be happening more and more.</span></li>
</ul>
<p><strong>George Whitmore Papers</strong></p>
<ul>
<li><strong><span style="font-weight: normal">Speaker: Michael Forstrom: <a title="Beinecke Rare Book and Manuscript Library" href="http://www.library.yale.edu/beinecke/">Beinecke Rare Book and Manuscript Library</a>, Yale University</span></strong></li>
<li><strong><span style="font-weight: normal">Featured Collection: <a title="Beinecke: George Whitmore Papers" href="http://webtext.library.yale.edu/xml2html/beinecke.whitmore.nav.html">George Whitmore Papers</a></span></strong></li>
</ul>
<p><em>Challenges &amp; Questions:</em></p>
<ul>
<li>How do you establish identity in a way that is complete and uncorrupted? How do you know it is authentic? How do you make an authentic copy? Are these requirements unreasonable and unachievable?</li>
</ul>
<p><em>Revelations and Recommendations:<br />
</em></p>
<ul>
<li>Refresh and replicate files on a regular schedule.</li>
<li>They have had good success using <a title="Quick View Plus" href="http://www.avantstar.com/Products/Quick_View_Plus/QuickViewPlusOverview">Quick View Plus</a> to enable access to many common file formats. On the downside, it doesn&#8217;t support everything and since it is proprietary software there are no long term guarantees.</li>
<li>In some cases they had to send <a title="Wikipedia: CP/M" href="http://en.wikipedia.org/wiki/CP/M">CP/M</a> files to a 3rd party to have them converted into WordStar and have the ASCII normalized.</li>
<li>Varied acquisition notes.. and accession records.. loan form with the 3rd party who did the conversion that summarized the request.. they did NOT provide information about what software was used to convert from CP/M to DOS. This would be good information to capture in the future.</li>
<li>Proposed an expansion of the standards to include how electronic records were migrated in the &lt;processinfo&gt; processing notes.</li>
</ul>
<p><strong>Questions &amp; Answers</strong></p>
<p><strong>Question:</strong> As part of a writers community, what do we tell people who want to know what they can DO about their records? They want technical information.. they want to know what to keep. Current writers are aware they are creating their legacy.</p>
<p><strong>Answer:</strong> <em>Michael:</em> The single best resource is the <a title="interPARES Creator Guidelines" href="http://www.interpares.org/display_file.cfm?doc=ip2(pub)creator_guidelines_booklet.pdf">interPARES 2 Creator Guidelines</a>. The Beinecke has adapted them to distribute to authors. <em>Melissa:</em> Go back to your collection development policies and make sure to include functions you are trying to document (like process.. distribution networks). Also communities of practice (acid free bits) are talking about formats and guidelines like that. <em>Gabriela:</em> People often want to address &#8216;value&#8217;. Right now we don&#8217;t know how to evaluate the value of electronic drafts &#8211; it is up to authors.</p>
<p><strong>Question:</strong> <em>Cal Lee:</em> Not a question so much as an idea: the world of digital forensics and security and the &#8216;order of volatility&#8217; dictate that everyone should always be making a full disk copy bit by bit before doing anything else.</p>
<p><strong>Comment:</strong> On digital forensic tools &#8211; there is a lot of revision and editing history of documents retained in the software&#8230; also deleted files are still there.</p>
<p><strong>Question:</strong> Have you seen examples of materials that are coming into the archive where the digital materials are working drafts for a final paper version? This is in contrast to others that are electronic experiments.</p>
<p><strong>Answer:</strong> Yes, they do think about this. It can affect arrangement and how the records are described. The formats also impact how things are preserved.</p>
<p><strong>Question:</strong> Access issues? Are you letting people link to them from the finding aids? How is the documents&#8217; authenticity protected?</p>
<p><strong>Answer:</strong> DSpace gives you a new version anytime you want it (the original bitstream) .. lots of cross linking supports people finding things from more than one path. In some cases documents (even electronic) can only be accessed from within the on site reading room.</p>
<p><strong>Question:</strong> What is your relationship like with your IT folks?</p>
<p><strong>Answer:</strong> <em>Gabriela:</em> Our staff has been very helpful. We use &#8216;legacy&#8217; machines to access our content. They build us computers. They are also not archivists, so there is a little divide about priorities and the kind of information that I am interested in.. but it has been a very productive conversation.</p>
<p><strong>Question:</strong> (For Melissa) Why didn&#8217;t you accept Peter&#8217;s email (Melissa had said they refused a submission of email from Peter because it didn&#8217;t have research value)?</p>
<p><strong>Answer:</strong> The emails that included personal medical emails were rejected. The agreement with Peter didn&#8217;t include an option to selectively accept (or weed) what was given.</p>
<p><strong>Question:</strong> In terms of gathering information from the creators.. do you recommend a formal/recorded interview? Or a more informal arrangement in which you can contact them anytime on an ongoing basis?</p>
<p><strong>Answer:</strong> <em>Melissa:</em> We do have more formal methods &#8211; &#8216;documentation study&#8217; style approaches. We might do literature reviews.. Ultimately the submission agreement is the most formal document we have. <em>Gabriela:</em> It depends on what the author is open to.. formal documentation is best.. but if they aren&#8217;t willing to be recorded, then you take what you can get!</p>
<h2 id="mythoughts">My Thoughts</h2>
<p><strong><span style="font-weight: normal">I am very curious to see how best practices evolve in this arena. I wonder how stories written using something like <a title="Google Docs" href="http://docs.google.com">Google Documents</a>, which auto-saves and preserves all versions for future examination, will impact how scholars choose to evaluate the evolution of documents. There have already been interesting examinations of the evolution of collaborative documents. Consider this <a title="Wikipedia Updates to Sarah Palin page" href="http://www.dancohen.org/wp/wp-content/uploads/2008/09/sarah_palin_wikipedia.pdf">visual overview of the updates to the Wikipedia entry for Sarah Palin</a> created by Dan Cohen and discussed in his blog post <a title="Dan Cohen: Sarah Palin, Crowdsourced" href="http://www.dancohen.org/2008/09/02/sarah-palin-crowdsourced/">Sarah Palin, Crowdsourced</a>. Another great example of this type of visual experience of a document being modified was linked to in the comments of that post: <a title="Heavy Metal Umlaut: The Movie" href="http://weblog.infoworld.com/udell/2005/01/22.html">Heavy Metal Umlaut: The Movie</a>. If you haven&#8217;t seen this before &#8211; take a few minutes to click through and watch the <a title="Heavy Metal Umlaut Screencast" href="http://weblog.infoworld.com/udell/gems/umlaut.html">screencast</a> which actually lets you watch as a Wikipedia page is modified over time.</span></strong></p>
<p><strong><span style="font-weight: normal">While I can imagine that there will be many things to sort out if we try to start keeping these incredibly frequent snapshot save logs (disk space? quantity of versions? authenticity? author preferences to protect the unpolished versions of their work?) &#8211; I still think that being able to watch the creative process this way will be valuable in some situations. I also believe that over time new tools will be created to automate the generation of document evolution visualizations and movies (like the two I link to above) that make it easy for researchers to harness this sort of information.</span></strong></p>
<p><strong><span style="font-weight: normal">Perhaps there will be ways for archivists to keep only certain parts of the auto-save versioning. I can imagine an author who does not want anyone to see early drafts of their writing (as is apparently also the case with architects and early drafts of their designs) &#8211; but who might be willing for the frequency of updates to be stored. This would let researchers at least understand the rhythm of the writing &#8211; if not the low level details of what was being changed.</span></strong></p>
<p>I love the photo I found for the top of this post. I admit to still having stacks of 3 1/2 inch floppy disks. I have email from the early days of <a title="Wikipedia: BITNET" href="http://en.wikipedia.org/wiki/BITNET">BITNET</a>. I have poems, unfinished stories, old resumes and SQL scripts. For the moment my disks live in a box on the shelf labeled &#8216;Old Media&#8217;. Lucky me &#8211; I at least still have a computer with a floppy drive that can read them!</p>
<p><em>Image Credit: <a title="Flickr: oh messy disks by blude" href="http://flickr.com/photos/blude/2665916336/in/photostream">oh messy disks</a> by <a title="Flickr: Blude" href="http://flickr.com/people/blude/">Blude</a> via flickr.<br />
</em></p>
<p><strong><span style="font-weight: normal"><em>As is the case with all my session summaries from SAA2008, please accept my apologies in advance for any cases in which I misquote, overly simplify or miss points altogether in the post above. These sessions move fast and my main goal is to capture the core of the ideas presented and exchanged. Feel free to contact me about corrections to my summary either via comments on this post or via my <a title="Contact Jeanne" href="http://www.spellboundblog.com/contact/">contact form</a>.</em></span><br />
</strong></p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2008/09/06/saa2008-preservation-and-experimentation-with-analogdigital-hybrid-literary-collections-session-203/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Of Pirates, Treasure Chests and Keys: Improving Access to Digitized Materials</title>
		<link>http://www.spellboundblog.com/2008/04/23/of-pirates-treasure-chests-and-keys-improving-access-to-digitized-materials/</link>
		<comments>http://www.spellboundblog.com/2008/04/23/of-pirates-treasure-chests-and-keys-improving-access-to-digitized-materials/#comments</comments>
		<pubDate>Thu, 24 Apr 2008 03:18:19 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[access]]></category>
		<category><![CDATA[context]]></category>
		<category><![CDATA[digitization]]></category>
		<category><![CDATA[historical research]]></category>
		<category><![CDATA[interface design]]></category>
		<category><![CDATA[learning technology]]></category>
		<category><![CDATA[original order]]></category>
		<category><![CDATA[search]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/2008/04/23/of-pirates-treasure-chests-and-keys-improving-access-to-digitized-materials/</guid>
		<description><![CDATA[Dan Cohen posted yesterday about what he calls The Pirate Problem. Basically the Pirate Problem can be summed up as &#8220;there are ways of acting and thinking that we can’t understand or anticipate.&#8221; Why is that a &#8216;Pirate Problem&#8217;? Because a pirate pub opened near his home and rather than folding shortly thereafter due to [...]<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2008/04/23/of-pirates-treasure-chests-and-keys-improving-access-to-digitized-materials/">Of Pirates, Treasure Chests and Keys: Improving Access to Digitized Materials</a></p>
]]></description>
			<content:encoded><![CDATA[<p><a href="http://flickr.com/photos/stokerstudios/2309630004/"><img src="http://www.spellboundblog.com/wp-content/uploads/2008/04/2309630004_921015b168_m.jpg" alt="Key to Anything by Stoker Studios (flickr)" align="right" /></a>Dan Cohen posted yesterday about what he calls <a href="http://www.dancohen.org/2008/04/22/the-pirate-problem/" title="The Pirate Problem">The Pirate Problem</a>. Basically the Pirate Problem can be summed up as &#8220;there are ways of acting and thinking that we can’t understand or anticipate.&#8221; Why is that a &#8216;Pirate Problem&#8217;? Because a pirate pub opened near his home and rather than folding shortly thereafter due to lack of interest from the &#8216;very serious professionals&#8217; who populate DC suburbs &#8211; the pub was a rousing success due to the pirate aficionados who came out of the woodwork to sing <a href="http://www.emusic.com/album/Various-Artists-Saydisc-Sea-Songs-Shanties-MP3-Download/10603415.html" title="go hear some sea shanties">sea shanties</a> and drink grog. This surprising turn of events highlighted for him the fact that there are many ways of acting and thinking (some people even know all the words to sea shanties without needing sheet music).</p>
<p>Dan recently delivered the keynote speech at a workshop at the <a href="http://www.unc.edu/" title="University of North Carolina at Chapel Hill">University of North Carolina at Chapel Hill</a>. The workshop brought together dozens of historians to talk about how the 16 million archival documents of the <a href="http://www.lib.unc.edu/mss/shc/index.html" title="Southern Historical Collection">Southern Historical Collection</a> (SHC) should be put online. He devoted his keynote &#8220;to prodding the attendees into recognizing that the future of archives and research might not be like the past&#8221; and goes on in his post to explain:</p>
<blockquote><p>The most memorable response from the audience was from an award-winning historian I know from my graduate school years, who said that during my talk she felt like “a crab being lowered into the warm water of the pot.” Behind the humor was the difficult fact that I was saying that her way of approaching an archive and understanding the past was about to be replaced by techniques that were new, unknown, and slightly scary.</p>
<p>This resistance to thinking in new ways about digital archives and research was reflected in the pre-workshop survey of historians. Extremely tellingly, the historians surveyed wanted the online version of the SHC to be simply a digital reproduction of the physical SHC.</p></blockquote>
<p>Much of the stress of Dan&#8217;s article is on fear of new techniques of analysis. The choppy waters of text mining and pattern recognition threaten to wash away traditional methods of actually reading individual pages and &#8220;most historians just want to do their research they way they’ve always done it, by taking one letter out of the box at a time&#8221;.</p>
<p>I certainly like the idea of new technologically based ways of analyzing large sets of cultural heritage materials, but I also believe that reading individual letters will always be important. The trick is finding the right letter!</p>
<p>And of course &#8211; we still need the context. It isn&#8217;t as if when we digitize major collections like the SHC that we are going to scan and <a href="http://en.wikipedia.org/wiki/Optical_character_recognition" title="Wikipedia: Optical Character Recognition">OCR</a> each page without regard to which box it came out of. We can&#8217;t slice and dice archival records and manuscripts into their component parts to feed into text analysis with no way back to the originals.</p>
<p>I like to imagine the combination of all the new technology (be it digitization, cross collection searching, text mining or pattern recognition) as creating keys to different treasure chests. Humanities scholars are treasure hunters. Some will find their gems through careful reading of individual passages. Others will discover patterns spread across materials now co-existing virtually that before digitization would have been widely separated by space and time. Both methods will benefit from the digitization of materials and the creation of innovative search and text analysis tools. Both still require an understanding of a material&#8217;s origin. The importance of context isn&#8217;t going anywhere &#8211; we still need to know which box the letter came from (and in a perfect world, which page came before and which came after). I want scholars to still be able to read one page from the box &#8211; I just want them to be able to do it from home in the middle of the night if they are so inclined with their travel budget no worse for wear.</p>
<p>Dan ties his post together by pointing out that:</p>
<blockquote><p>&#8230; in Chapel Hill I was the pirate with the strange garb and ways of behaving, and this is a good lesson for all boosters of digital methods within the humanities. We need to recognize that the digital humanities represent a scary, rule-breaking, swashbuckling movement for many historians and other scholars.</p></blockquote>
<p>In my opinion, the core message should be that we just found more locked treasure chests &#8211; and for those who are interested, we have some new keys that just might open those locks. I enjoyed the Pirate metaphor (obviously) and I appreciate that there are real issues here relating to strong discomfort with the fast changing landscape of technology, but I have to believe that if we do something that <em>prevents</em> historians from being able to read one letter at a time we are abandoning the treasure chests that are already open for the new ones for which we haven&#8217;t yet found the right keys. I am greedy. I want all the treasure!</p>
<p><em>Image credit: <a href="http://flickr.com/photos/stokerstudios/2309630004/in/set-72157604056406420" title="Key to Anything by Stoker Studios (flickr)">key to anything by Stoker Studios via flickr</a></em></p>
<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2008/04/23/of-pirates-treasure-chests-and-keys-improving-access-to-digitized-materials/">Of Pirates, Treasure Chests and Keys: Improving Access to Digitized Materials</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2008/04/23/of-pirates-treasure-chests-and-keys-improving-access-to-digitized-materials/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>ISSUU: Interesting Platform for Online Publishing</title>
		<link>http://www.spellboundblog.com/2008/03/02/issuu-interesting-platform-for-online-publishing/</link>
		<comments>http://www.spellboundblog.com/2008/03/02/issuu-interesting-platform-for-online-publishing/#comments</comments>
		<pubDate>Sun, 02 Mar 2008 04:27:10 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[access]]></category>
		<category><![CDATA[copyright]]></category>
		<category><![CDATA[digitization]]></category>
		<category><![CDATA[original order]]></category>
		<category><![CDATA[software]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/2008/03/02/issuu-interesting-platform-for-online-publishing/</guid>
		<description><![CDATA[Issuu, with the tag line &#8216;Read the world. Publish the world.&#8217; and pronounced &#8216;issue&#8217;, gives anyone the ability to upload a PDF document and publish it as an online magazine. I am intrigued by the possibilities of using this service to publish digitized archival records &#8211; especially those that would lend themselves to a &#8216;book&#8217; [...]<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2008/03/02/issuu-interesting-platform-for-online-publishing/">ISSUU: Interesting Platform for Online Publishing</a></p>
]]></description>
			<content:encoded><![CDATA[<p><a href="http://issuu.com/" title="Issuu">Issuu</a>, with the tag line &#8216;Read the world. Publish the world.&#8217; and pronounced &#8216;issue&#8217;, gives anyone the ability to upload a PDF document and publish it as an online magazine. I am intrigued by the possibilities of using this service to publish digitized archival records &#8211; especially those that would lend themselves to a &#8216;book&#8217; style presentation (thinking here of a ledger or equivalent).</p>
<p>I am not sure I totally understand the implications of the <a href="http://issuu.com/about/terms" title="Issuu Terms">Issuu Terms of service</a>&#8230; especially this part:</p>
<blockquote><p>By distributing or disseminating Uploader Submissions through the Issuu Service, you hereby grant to Issuu a worldwide, non-exclusive, transferable, assignable, fully paid-up, royalty-free, license to host, transfer, display, perform, reproduce, distribute, and otherwise exploit your Uploader Submissions, in any media forms or formats, and through any media channels, now known or hereafter devised, including without limitation, RSS feeds, embeddable functionality, and syndication arrangements in order to distribute, promote or advertise your Uploader Submissions through the Issuu Service.</p></blockquote>
<p>If I am following that properly, all the rights you are granting to the Issuu Service are only for the purposes of their distribution of your uploaded PDF.</p>
<p>Issuu has a special <a href="http://issuu.com/about/copyright" title="Issuu Copyright FAQ">Copyright FAQ</a>, which in combination with <a href="http://www.loc.gov/section108/hirtle.html" title="Peter B. Hirtle's Biography">Peter Hirtle</a>&#8216;s page on <a href="http://www.copyright.cornell.edu/public_domain/" title="Copyright Term and the Public Domain in the United States">Copyright Term and the Public Domain in the United States</a>, should support those trying to figure out if they can upload what they want to upload without getting into copyright related hot water.</p>
<p>So how is it different from a plain old PDF? Take a look at the embedded Issuu viewer below showing a 1908 copy of <em>The Colonial Book of The Towle Manufacturing Company Silversmiths</em>.<br />
<center><object style="width: 425px; height: 301px"><param name="movie" value="http://static.issuu.com/webembed/viewers/style1/v1/IssuuViewer.swf?mode=preview&amp;previewLayout=white&amp;documentId=080209020757-9d92725889a94edea86a31464a8a8fbd"></param><param name="wmode" value="transparent"></param><param name="allowScriptAccess" value="always"></param><embed src="http://static.issuu.com/webembed/viewers/style1/v1/IssuuViewer.swf" type="application/x-shockwave-flash" allowscriptaccess="always" wmode="transparent" style="width: 425px; height: 301px" flashvars="mode=preview&amp;previewLayout=white&amp;documentId=080209020757-9d92725889a94edea86a31464a8a8fbd"></embed></object></p>
<p style="width: 425px; text-align: left"><a href="http://issuu.com" target="_blank"><img src="http://static.issuu.com/webembed/previewers/style1/v1/m1.gif" ismap="ismap" border="0" /></a><a href="http://issuu.com/viewer?mode=embed&amp;documentId=080209020757-9d92725889a94edea86a31464a8a8fbd" target="_blank"><img src="http://static.issuu.com/webembed/previewers/style1/v1/m2.gif" ismap="ismap" border="0" /></a><a href="http://issuu.com/embed/guide?documentId=080209020757-9d92725889a94edea86a31464a8a8fbd&amp;width=425&amp;height=301" target="_blank"><img src="http://static.issuu.com/webembed/previewers/style1/v1/m3.gif" ismap="ismap" border="0" /></a></p>
<p></center> I don&#8217;t think this would ever be the way you would want to give online access to digitized records in general &#8211; but I do think that this could be a great way to highlight a particularly impressive set or volume of documents. If an archives featured one of these a month on their homepage &#8211; would people subscribe to their RSS feed just to see the new one? On the <a href="http://issuu.com/silverlibrary/docs/towle_old_colonial/8" title="Issuu: Silver Library Towle Old Colonial">actual page on which I found the above document</a>, Issuu makes it easy to subscribe to the <a href="http://search.issuu.com/silverlibrary/docs/recent.rss" title="RSS Subscribe to silverlibrary on Issuu">RSS feed for the Issuu author &#8216;silverlibrary&#8217;</a>.</p>
<p>I don&#8217;t know why Issuu has decided that I must create an account before I may view document author <a href="http://issuu.com/silverlibrary" title="Issuu User Profile: silverlibrary">silverlibrary&#8217;s user profile</a>. I would hope that there was an elegant way for visitors to see a group of Issuu documents created by the same author without having to create an account first (or ever).</p>
<p>Want to know what others think? Take a look at <a href="http://www.techcrunch.com/2008/02/06/finally-a-web-based-pdf-viewer-that-does-not-suck-issuu/#comments" title="TechCrunch: Finally, a Web-based PDF Viewer That Does Not Suck (Issuu)">Finally, a Web-based PDF Viewer That Does Not Suck (Issuu)</a> over on <a href="http://www.techcrunch.com/" title="TechCrunch">TechCrunch</a>. One interesting tidbit I picked up from that review is that Issuu is based in Denmark. I wonder what impact that has on which copyright rules apply to the documents uploaded into Issuu.</p>
<p>Want to read more about their vision? Of course they have a press release in the form of an Issuu publication and I have embedded it below. I think my favorite line is that Issuu is intended to be &#8216;YouTube for Publications&#8217;.<br />
<center><object style="width: 425px; height: 301px"><param name="movie" value="http://static.issuu.com/webembed/viewers/style1/v1/IssuuViewer.swf?mode=preview&amp;previewLayout=white&amp;documentId=080206192117-aa46380ffd1b4a85a6561f1c7e139694"></param><param name="wmode" value="transparent"></param><param name="allowScriptAccess" value="always"></param><embed src="http://static.issuu.com/webembed/viewers/style1/v1/IssuuViewer.swf" type="application/x-shockwave-flash" allowscriptaccess="always" wmode="transparent" style="width: 425px; height: 301px" flashvars="mode=preview&amp;previewLayout=white&amp;documentId=080206192117-aa46380ffd1b4a85a6561f1c7e139694"></embed></object></p>
<p style="width: 425px; text-align: left"><a href="http://issuu.com" target="_blank"><img src="http://static.issuu.com/webembed/previewers/style1/v1/m1.gif" ismap="ismap" border="0" /></a><a href="http://issuu.com/viewer?mode=embed&amp;documentId=080206192117-aa46380ffd1b4a85a6561f1c7e139694" target="_blank"><img src="http://static.issuu.com/webembed/previewers/style1/v1/m2.gif" ismap="ismap" border="0" /></a><a href="http://issuu.com/embed/guide?documentId=080206192117-aa46380ffd1b4a85a6561f1c7e139694&amp;width=425&amp;height=301" target="_blank"><img src="http://static.issuu.com/webembed/previewers/style1/v1/m3.gif" ismap="ismap" border="0" /></a></p>
<p></center> I would love to see a highlighted section created for &#8216;cultural heritage materials&#8217; (or something like that anyway). Take a look around <a href="http://www.issuu.com">Issuu</a> and let me know what you think. Is this a viable tool for an archives or manuscript collection to use to highlight parts of their collection?</p>
<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2008/03/02/issuu-interesting-platform-for-online-publishing/">ISSUU: Interesting Platform for Online Publishing</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2008/03/02/issuu-interesting-platform-for-online-publishing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Capa&#8217;s Found Images and Thoughts on Digital Photographers&#8217; Sketchbooks</title>
		<link>http://www.spellboundblog.com/2008/02/01/capas-found-images-and-thoughts-on-digital-photographers-sketchbooks/</link>
		<comments>http://www.spellboundblog.com/2008/02/01/capas-found-images-and-thoughts-on-digital-photographers-sketchbooks/#comments</comments>
		<pubDate>Fri, 01 Feb 2008 19:59:36 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[at risk records]]></category>
		<category><![CDATA[born digital records]]></category>
		<category><![CDATA[future-proofing]]></category>
		<category><![CDATA[historical research]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[original order]]></category>
		<category><![CDATA[photography]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/2008/02/01/capas-found-images-and-thoughts-on-digital-photographers-sketchbooks/</guid>
		<description><![CDATA[In the Washington Post article There Are No Black-and-White Answers in War &#8212; Then Lost Negatives Turn Up (February 1, 2008), we learn that three cardboard boxes of negatives were recently delivered to the International Center of Photography (ICP) &#8211; possibly including as many as 4,000 images. This collection of black-and-white film, consisting predominately of [...]<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2008/02/01/capas-found-images-and-thoughts-on-digital-photographers-sketchbooks/">Capa&#8217;s Found Images and Thoughts on Digital Photographers&#8217; Sketchbooks</a></p>
]]></description>
			<content:encoded><![CDATA[<p><a title="Wikimedia Commons: Robert Capa by Gerda Taro (May 1937)" href="http://commons.wikimedia.org/wiki/Image:RobertCaprabyGerdaTaro.jpg"><img title="Robert Capa by Gerda Taro (May 1937)" src="http://www.spellboundblog.com/wp-content/uploads/2008/02/280px-robertcaprabygerdataro.jpg" alt="Robert Capa by Gerda Taro (May 1937)" align="right" /></a>In the Washington Post article <a title="Washington Post: There Are No Black-and-White Answers in War -- Then Lost Negatives Turn Up" href="http://www.washingtonpost.com/wp-dyn/content/article/2008/01/31/AR2008013103452.html">There Are No Black-and-White Answers in War &#8212; Then Lost Negatives Turn Up</a> (February 1, 2008), we learn that three cardboard boxes of negatives were recently delivered to the <a title="International Center of Photography" href="http://www.icp.org/">International Center of Photography</a> (ICP) &#8211; possibly including as many as 4,000 images. This collection of black-and-white film, consisting predominately of photos shot by <a title="Wikipedia: Robert Capa" href="http://en.wikipedia.org/wiki/Robert_Capa">Robert Capa</a> during the <a title="Wikipedia: Spanish Civil War" href="http://en.wikipedia.org/wiki/Spanish_Civil_War">Spanish Civil War</a>, was long thought lost during World War II and will join the already existing <a title="ICP: Robert Capa Archive" href="http://www.icp.org/site/c.dnJGKJNsFqG/b.871855/k.BB9C/Robert_Capa_Archive.htm">Robert Capa Archives</a>. The boxes also contain negatives from two other famed photographers associated with Capa, <a title="ICP: Gerda Taro" href="http://www.icp.org/site/c.dnJGKJNsFqG/b.2876511/k.1E74/Gerda_Taro.htm">Gerda Taro</a> and <a title="Wikipedia: David Seymour" href="http://en.wikipedia.org/wiki/David_Seymour">David Seymour</a> (known by the pseudonym Chim &#8211; pronounced shim).</p>
<p>There are many reasons these boxes are exciting for historians and Capa researchers. They hold the promise of answering some long-standing questions. Were certain famous photos staged? Are the current credits given for various photos correct? But what caught my eye in this article was the following quote from ICP curator <a href="http://www.amazon.com/gp/search?ie=UTF8&amp;keywords=brian%20wallis&amp;tag=csectionrecov-20&amp;index=books&amp;linkCode=ur2&amp;camp=1789&amp;creative=9325">Brian Wallis</a><img style="border: medium none ; margin: 0px" src="http://www.assoc-amazon.com/e/ir?t=csectionrecov-20&amp;l=ur2&amp;o=1" border="0" alt="" width="1" height="1" />:</p>
<blockquote><p>&#8220;Capa was really adept at creating a whole story in one day: Here are the characters, here is the beginning, the action shots, the end, and the effect on civilians. If you look at his work not as great individual shots, but as stories, you get a completely different picture of him, and I think a more accurate and valuable picture.</p>
<p>&#8220;These negatives will further amplify that story, not just a few stories but dozens of stories that went out. It is like a sketchbook &#8212; he was trying out various ideas, and some worked and some didn&#8217;t.&#8221;</p></blockquote>
<p><strong>What About Digital Photographer&#8217;s Sketchbooks?</strong></p>
<p>If you have ever used a digital camera, you have almost certainly enjoyed the instant gratification of being able to preview your photos on the tiny screen. The next temptation is to click the delete button. Sometimes you delete because the photo is clearly not what you were after &#8211; other times you delete to make room for some much more crucial photo.</p>
<p>I don&#8217;t know what the standard best practices are for professional photographers. Part of me hopes that they keep everything &#8211; at least until they can view the images on a big screen. But there is clearly a much easier path to deleting the ideas that didn&#8217;t work out. It leaves me wondering what the scholars of the future will be missing by not being able to see the failed experiments. The &#8216;sketchbooks&#8217; of digital photographers could easily be perceived as at risk records. That said, many creative individuals (artists, architects, photographers&#8230; etc) do not care to share their failed experiments with the outside world. One of the issues facing those preserving digital records of the design community is the strong desire of designers to not share their work in progress and <em>only </em>share the final product (see my post <a title="SAA2007: Preserving Born Digital Records of the Design Community (Session 106)" href="http://www.spellboundblog.com/2007/09/08/saa2007-preserving-born-digital-records-of-the-design-community-session-106/">SAA2007: Preserving Born Digital Records of the Design Community (Session 106) </a>for more thoughts on this).</p>
<p><strong>Image Overload</strong></p>
<p>Of course there will be the photographers who <em>do</em> keep everything. Hard drive space is getting cheaper with every passing day. Perhaps my fears are misplaced and instead we should be worrying more about the flood of photographs that will overwhelm archivists and researchers. The time needed to discover the &#8216;good&#8217; and &#8216;important&#8217; photographs in a collection of thousands of images could be extreme.</p>
<p>I shoot all my photos digitally now. I no longer live in a world where there are only 36 shots on a single roll &#8211; I don&#8217;t need to choose each photo carefully. I cheerfully tell my friends &#8220;Photos are free!&#8221;. Even that doesn&#8217;t stop me from deleting the ones that I really dislike. Sometimes the 2 GB card in my camera gets full before an event is done, so on-the-spot weeding of photos occurs as well. But when I compare the number of &#8216;good&#8217; photos that I have uploaded to share online (currently 5,000+ and counting) with the number of photos I have on my hard drive (20,000+) it is clear to me that I am keeping plenty of &#8216;sketch photos&#8217;. It is also interesting to note that I will often realize that there are photos I really like now that I didn&#8217;t appreciate immediately after they were taken. While something at the time made me NOT include it as a photo to share, now I see something in the image that catches my eye in a new way. The more this happens, the less I delete as I download, organize and tag my photos.</p>
<p><strong>Metadata and the Exchangeable Image File Format (EXIF)</strong></p>
<p>Of course the situation with digital photographs is not all bad. When digital cameras record a photo, they also record a set of metadata in the <a title="Wikipedia: Exchangeable image file format" href="http://en.wikipedia.org/wiki/Exif">exchangeable image file format</a> (EXIF). The metadata recorded usually includes camera make and model, date, time, and camera settings. Some cameras can even record GPS-generated location information. Because there is no way to know the time zone (at least without location information), the value of the time setting is more useful for relating photos from within a set to one another than in establishing the actual time a photo was taken.</p>
<p><a title="Adobe" href="http://www.adobe.com/">Adobe</a> has contributed their own proprietary metadata format called <a title="Wikipedia: Extensible Metadata Platform (XMP)" href="http://en.wikipedia.org/wiki/Extensible_Metadata_Platform">Extensible Metadata Platform</a> (XMP).</p>
<blockquote><p>The most common metadata tags recorded in XMP data are those from the <a title="Wikipedia: Dublin Core" href="http://en.wikipedia.org/wiki/Dublin_Core">Dublin Core Metadata Initiative</a>, which include things like title, description, creator, and so on. The standard is designed to be extensible, allowing users to add their own custom types of metadata into the XMP data. (Wikipedia Entry: <a title="Wikipedia: Extensible Metadata Platform (XMP)" href="http://en.wikipedia.org/wiki/Extensible_Metadata_Platform">Extensible Metadata Platform</a> )</p></blockquote>
<p>The magic of both XMP and EXIF is that the metadata is embedded in the file itself. There is no chance of losing the connection between a photo and the information about it &#8211; it is akin to writing on the back of an analog photograph. Embedded metadata provides the greatest tool for rediscovering the original order in which a series of photographs were taken, as well as providing access to metadata entered by the photographer at the image level.</p>
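<p>To make the ordering idea concrete, here is a minimal sketch in Python. It assumes the capture timestamps have already been read out of each file&#8217;s EXIF data by some other tool. The function name <code>capture_order</code> and the sample file names and times are hypothetical, invented purely for illustration. EXIF writes its basic date and time as text in the form YYYY:MM:DD HH:MM:SS with no time zone, so the resulting order is only meaningful within a set of photos from the same camera.</p>

```python
from datetime import datetime

# EXIF stores its basic timestamp as "YYYY:MM:DD HH:MM:SS" with no time zone.
EXIF_TIME_FORMAT = "%Y:%m:%d %H:%M:%S"


def capture_order(timestamps):
    """Return file names sorted by capture time, then by name for ties.

    `timestamps` maps a file name to its EXIF date/time string
    (hypothetical input: assumed to be extracted beforehand).
    """
    def sort_key(item):
        name, raw = item
        return (datetime.strptime(raw, EXIF_TIME_FORMAT), name)

    return [name for name, _ in sorted(timestamps.items(), key=sort_key)]


# Hypothetical values, as if already read from each photo's embedded metadata.
example = {
    "IMG_0003.JPG": "2008:02:01 14:05:10",
    "IMG_0001.JPG": "2008:02:01 14:03:59",
    "IMG_0002.JPG": "2008:02:01 14:03:59",
}
print(capture_order(example))
```

<p>Breaking ties by file name is just one possible choice. Two frames shot within the same second carry identical timestamps, so the embedded time alone cannot say which came first.</p>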
<p>The archivist of today accessioning born digital images must be comfortable with tools for viewing and updating embedded metadata. I mention updating because any information that is currently known about an image that could be added to the embedded metadata is more information that cannot later become accidentally separated from the images in question. This of course assumes that we will still have the proper technology in the future with which to access all this embedded metadata.</p>
<p>Embedded metadata can be updated before it reaches the controlled environment of an archive. Data found as embedded metadata must be evaluated in the same manner that any information about photographs would be evaluated. For example, it would be a lot easier to modify metadata on digital photos to make the images appear to have been taken in a different order than it would be to do the same change with a strip of analog negatives. If this in fact was done &#8211; the fact that it was would likely be as interesting to researchers as the original order (assuming of course that you could ever figure out that such a modification had been made!).</p>
<p>Not all methods of organizing photos result in embedded metadata, so there is plenty of room for the standard challenges of old software that you can&#8217;t get to run but that holds the key to information about a hard drive of thousands of images. Sophisticated photograph management tools often now include workflow features that could also provide insight into the decision making and processing steps taken by a photographer. Much of this type of information is very unlikely to be embedded in the photos themselves &#8211; but still would represent interesting digital records related to the everyday work that a professional photographer performs.</p>
<p><strong>Final Thoughts</strong></p>
<p>I do feel that something is being lost via the ease with which one may delete experimental &#8216;sketchbook&#8217; photos, but I suspect that the lure of virtually infinite hard drive space, image organization/tagging software tools and the clues provided by embedded metadata will balance the scales. Those who study photographers and their work will certainly have more to say about this far in the future. There will be hard choices over the next decades &#8211; what can we do to guarantee access in that distant time to the full digital bodies of work of the Capas of today? I think the answers start with building strong lines of communication between prominent digital photographers and archivists. I know that this is just a special case of the challenges we see with digital records across professions, but each field adds its own special issues that must be sorted through and figured out one at a time. So, are there archivists out there working with professional digital photographers?</p>
<p><em>For more images related to this story, see the New York Times&#8217;</em> <a title="New York Times: Robert Capa's Lost Negatives Slideshow" href="http://www.nytimes.com/slideshow/2008/01/27/arts/20080127_KENN_SLIDESHOW_index.html"><em>slideshow about Robert Capa&#8217;s Lost Negatives</em></a><em> [UPDATE: New images available in the slideshow <a title="Slideshow: Inside the Mexican Suitcase" href="http://www.nytimes.com/slideshow/2009/04/29/arts/20090429_SUITCASE_SLIDESHOW_index.html">Inside the Mexican suitcase</a>, posted April 29, 2009]<br />
</em></p>
<p><em>Image credit: Photographer Robert Capa during the Spanish civil war, May 1937. Photo by Gerda Taro. If the logic on <a title="Wikimedia Commons: Robert Capa by Gerda Taro" href="http://commons.wikimedia.org/wiki/Image:RobertCaprabyGerdaTaro.jpg">this Wikimedia Commons page</a> is to be believed, this photo is in the public domain in the United States because the photographer died in 1937 (i.e., more than 70 years ago).</em></p>
<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2008/02/01/capas-found-images-and-thoughts-on-digital-photographers-sketchbooks/">Capa&#8217;s Found Images and Thoughts on Digital Photographers&#8217; Sketchbooks</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2008/02/01/capas-found-images-and-thoughts-on-digital-photographers-sketchbooks/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>SAA 2007 Session Proposal: Preserving Context and Original Order in a Digital World</title>
		<link>http://www.spellboundblog.com/2006/09/28/saa-2007-session-proposal-preserving-context-and-original-order-in-a-digital-world/</link>
		<comments>http://www.spellboundblog.com/2006/09/28/saa-2007-session-proposal-preserving-context-and-original-order-in-a-digital-world/#comments</comments>
		<pubDate>Fri, 29 Sep 2006 01:40:05 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[access]]></category>
		<category><![CDATA[context]]></category>
		<category><![CDATA[digitization]]></category>
		<category><![CDATA[EAD]]></category>
		<category><![CDATA[GIS]]></category>
		<category><![CDATA[interface design]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[original order]]></category>
		<category><![CDATA[SAA2006]]></category>
		<category><![CDATA[SAA2007]]></category>
		<category><![CDATA[software]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/2006/09/28/saa-2007-session-proposal-preserving-context-and-original-order-in-a-digital-world/</guid>
		<description><![CDATA[Abby Adams, Assistant Access and Outreach Archivist of the Richard B. Russell Library for Political Research and Studies, University of Georgia, and I are putting together a proposal for a session at SAA 2007 in Chicago. She and I found each other via my poster at SAA 2006: Communicating Context in Online Collections. We have [...]<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2006/09/28/saa-2007-session-proposal-preserving-context-and-original-order-in-a-digital-world/">SAA 2007 Session Proposal: Preserving Context and Original Order in a Digital World</a></p>
]]></description>
			<content:encoded><![CDATA[<p>Abby Adams, Assistant Access and Outreach Archivist of the <a href="http://www.libs.uga.edu/russell/" title="Richard B Russell Library">Richard B. Russell Library for Political Research and Studies, University of Georgia</a>, and I are putting together a proposal for a session at <a href="http://www.archivists.org/conference/chicago2007/index.asp" title="SAA 2007">SAA 2007</a> in Chicago. She and I found each other via my poster at SAA 2006: <a href="http://www.spellboundblog.com/poster/" title="Communicating Context in Online Collections">Communicating Context in Online Collections</a>. We have been pondering many of the same questions related to the effective communication of context and original order in online digitized collections.</p>
<p>Our proposal is for a traditional three-presentation panel with the title &#8220;Preserving Context and Original Order in a Digital World&#8221;. All we need now is a third presenter, the endorsement of an SAA section or roundtable and (of course) the approval of the session selection committee. (And some plane tickets!)</p>
<p>This is the current version of our description for the proposal (mostly composed by Abby):</p>
<blockquote><p>Now that digitization projects have become more common in archival repositories, users and archivists alike have uncovered problems when it comes to understanding the context of online materials. However, there are various ways to provide more contextual information, thus enhancing the use of digital archives. But archivists must confront the obstacles surrounding this task by developing best practices and incorporating new software into their digitization projects. In order to simplify the problem, we should return to our traditional archival principles and draw connections to collection arrangement and description in a digital environment. Join three archivists to explore how to improve on &#8220;analog&#8221; techniques in the communication of context. When done right, the digitization of a collection will not only retain all the same opportunities for communicating context that we are familiar with, it may revolutionize the way that archivists and users interact with and understand our records.</p></blockquote>
<p>The short take on what we want to cover in our session&#8217;s presentations is:</p>
<ul>
<li>What should archivists be doing to avoid losing context and original order information in the transition from analog records to digitized records?</li>
<li>What can digitization give us the ability to do that we couldn&#8217;t do in the analog world?</li>
<li>What tools and standards are out there today to help archivists do both of the above? What information should archivists be capturing to permit them to take advantage of the opportunities to communicate context and original order that these tools and standards offer?</li>
</ul>
<p>Abby&#8217;s part of the session, titled &#8220;Where&#8217;s the Context? Enhancing Access to Digital Archives&#8221;, will examine the need for preserving context and original order when digitizing archival materials &#8211; focusing on how it enhances online use and access to archives.  How can new systems retain the existing ability to communicate context and original order when moving from “analog” to “digital”?</p>
<p>My portion, &#8220;Communicating Context: The Power of Digital Interfaces&#8221;, will discuss what archivists can do in the digital world that they cannot do (or at least not easily) with analog records to communicate context and original order. I will focus on various innovative methods to do this, including the use of GIS, hot-linking for ease of navigation, the ability to &#8216;collect&#8217; digital surrogates for examination and more. I plan to include a combination of exciting new interfaces doing great things alongside new ideas of what could be done. Keep your fingers crossed for us that there is internet access in the session rooms in Chicago.</p>
<p>We have a vision of a third speaker whose talk would consider what the leading standards and software tools are permitting people to do today. How can archivists leverage the existing and evolving standards (<a href="http://www.loc.gov/ead/" title="Encoded Archival Description (EAD)">EAD</a>, <a href="http://www.library.yale.edu/eac/" title="Encoded Archival Context (EAC)">EAC</a>, <a href="http://www.tei-c.org/" title="Text Encoding Initiative (TEI)">TEI</a> and other <a href="http://en.wikipedia.org/wiki/Document_Type_Definition" title="Document Type Definition (DTD)">DTDs</a>) to capture and communicate context and original order in the digital world? In addition, the talk would provide a high-level review of common software packages (<a href="http://www.archon.org/" title="Archon">Archon</a>, <a href="http://archiviststoolkit.org/" title="Archivists' Toolkit">Archivists&#8217; Toolkit</a>, <a href="http://www.dimema.com/" title="ContentDM">ContentDM</a>, and others) and how they address original order and context. Finally, we have a notion of a checklist of what to capture when digitizing to take advantage of what these tools and standards can provide for you.</p>
<p>Are you our mystery 3rd panelist that we are having so much trouble finding? Your first tip is that you have already mapped out 5 PowerPoint slides in your head and started scribbling a rough draft of the &#8220;Archivists&#8217; Digitization Checklist for Preserving Context&#8221; on a scrap of paper near your computer.</p>
<p>Maybe you know someone who would be a great person to pitch this to? Or you have advice for us concerning who to pass our proposal along to in the great hunt for that elusive session endorsement?</p>
<p>The deadline looms large (October 9)! Please contact us either via email (jeanne AT spellboundblog DOT com and adamsabi AT uga DOT edu) or in the comments of this post.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2006/09/28/saa-2007-session-proposal-preserving-context-and-original-order-in-a-digital-world/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Ideas about Zotero and Digitized Archives</title>
		<link>http://www.spellboundblog.com/2006/09/09/ideas-about-zotero-and-digitized-archives/</link>
		<comments>http://www.spellboundblog.com/2006/09/09/ideas-about-zotero-and-digitized-archives/#comments</comments>
		<pubDate>Sat, 09 Sep 2006 04:04:42 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[access]]></category>
		<category><![CDATA[historical research]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[original order]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[software]]></category>
		<category><![CDATA[what if]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/2006/09/09/ideas-about-zotero-and-digitized-archives/</guid>
		<description><![CDATA[Dan Cohen posted recently about the soon to be available, open-source, Firefox plugin, research support software named Zotero. Looking at the quick start guide, I immediately spotted the icon to &#8220;add a new collection folder&#8221;. As the &#8220;archivist-in-training&#8221; that I am, my reaction now to the word &#8220;collection&#8221; is different than it would have [...]
]]></description>
			<content:encoded><![CDATA[<p>Dan Cohen <a title="Dan Cohen's post about Zotero" href="http://www.dancohen.org/blog/posts/introducing_zotero">posted recently</a> about the soon to be available, open-source, Firefox plugin, research support software named <a title="Zotero" href="http://www.zotero.org/">Zotero</a>. Looking at the <a title="Zotero Quick Start Guide" href="http://www.zotero.org/docs/quick_start_guide">quick start guide</a>, I immediately spotted the icon to &#8220;add a new collection folder&#8221;. As the &#8220;archivist-in-training&#8221; that I am, my reaction now to the word &#8220;collection&#8221; is different than it would have been a year ago. Though I strongly suspect it will not be the case (at least not in the first released version), I was immediately daydreaming of browsing a digitized collection online &#8211; clicking the &#8220;add a new collection folder&#8221; icon &#8211; and ending up with a copy of the entire collection of records for examination and comparison later.</p>
<p>Of course, this would be most useful for the historian digging through and analyzing archival records if Zotero were able to pull down metadata beyond that of a standard citation and retain any hierarchical information or relationships among the records.</p>
<p><a title="Dead Reckoning" href="http://zero2180.net/deadreckoning/">Dead Reckoning</a>&#8217;s post on <a title="Zotero Post" href="http://zero2180.net/deadreckoning/2006/09/05/zotero/">Zotero</a> mentions RDFa. I don&#8217;t know anything about <a title="RDFa" href="http://wiki.creativecommons.org/RDFa">RDFa</a> beyond what I have read in the last few hours, so it is not clear to me how complicated the metadata can be &#8211; perhaps it can support a full digital object XML record of some kind. So maybe the trick isn&#8217;t so much getting Zotero to do things it wasn&#8217;t designed to do &#8211; but rather the slow migration of sites to using the software packages and standards listed <a title="Zotero Supported Sites" href="http://www.zotero.org/docs/sites.php">here</a>.</p>
<p>I don&#8217;t want anyone to think that I am not excited about Zotero and all the neat things it is likely to do. I suspect I will rapidly become a frequent Zotero user verging on a zealot &#8211; but it is fun to daydream. I think it is most fun to daydream now, before I start using it and get lost in all the great stuff it CAN do. I definitely will post more after I get a chance to take it for a spin in early October.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2006/09/09/ideas-about-zotero-and-digitized-archives/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>SAA 2006 Poster: Communicating Context in Online Collections</title>
		<link>http://www.spellboundblog.com/2006/08/15/saa-2006-poster-communicating-context-in-online-collections/</link>
		<comments>http://www.spellboundblog.com/2006/08/15/saa-2006-poster-communicating-context-in-online-collections/#comments</comments>
		<pubDate>Tue, 15 Aug 2006 04:49:44 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[access]]></category>
		<category><![CDATA[context]]></category>
		<category><![CDATA[interface design]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[original order]]></category>
		<category><![CDATA[SAA2006]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/2006/08/15/saa-2006-poster-communicating-context-in-online-collections/</guid>
		<description><![CDATA[I promised a number of people I spoke with at the SAA 2006 conference that I would post information from my poster. I have finally added it on a page here on the blog. For those of you who didn&#8217;t make the conference (or didn&#8217;t make it to my mini talk in front of my [...]
]]></description>
			<content:encoded><![CDATA[<p>I promised a number of people I spoke with at the SAA 2006 conference that I would post information from my poster. I have finally added it <a title="SAA 2006: Poster: Communicating Context in Online Collections" href="http://www.spellboundblog.com/poster/" target="_blank">on a page</a> here on the blog.</p>
<p>For those of you who didn&#8217;t make the conference (or didn&#8217;t make it to my mini talk in front of my poster on Friday morning), my poster showed the results of my research into how the web interfaces to various digitized archival collections handled the issues of original order and communication of context. I was very interested to see to what degree websites for digitized collections were doing a good job helping the user understand the relationships between the records as well as the context of the records.</p>
<p>Most people asked me which was my &#8216;favorite&#8217; &#8211; and my answer was always that I liked something about each of the sites I showed on my poster. A perfect site would have the collection overview that the <a title="LOC American Memory: Browse Collections" href="http://memory.loc.gov/ammem/browse/index.html" target="_blank">Library of Congress American Memory – Browse Collections</a> page shows, the convenient search result re-sorting option shown on the <a title="Greene &amp; Greene Virtual Archives" href="http://cwis.usc.edu/dept/architecture/greeneandgreene/" target="_blank">Greene &amp; Greene Virtual Archives</a> search result page, the item details display option provided on the <a title="Irene Kaufman Settlement Photograph Collection" href="http://images.library.pitt.edu/cgi-bin/i/image/image-idx?sid=f9133b38664a9ff1fbd724d7464df385;page=index;c=iks;g=imls" target="_blank">Irene Kaufman Settlement Photograph Collection</a> &#8216;images with full record&#8217; search results page, the clear communication of hierarchy shown in the Yoshiko Uchida <a title="GenView Examples" href="http://sunsite.berkeley.edu/moa2/sampleobjs.html" target="_blank">example of the GenView MOA2 document viewer</a> and a rich use of audio, images and in-place historical context as is done on the <a title="Gilder Lehrman Wartime Love Letters" href="http://www.gilderlehrman.org/collection/battlelines/chapter3/chapter3_1a.html" target="_blank">Gilder Lehrman Wartime Love Letters</a> site. The main lesson I drew from all of this was that planning ahead is key. If you keep metadata related to the order of the records being digitized, it gives you the opportunity to do good things with that information when building your interface.</p>
<p>On the <a title="SAA 2006: Poster - Communicating Context in Online Collections" href="http://www.spellboundblog.com/poster/" target="_blank">&#8216;Poster page&#8217;</a> I have included a list of links to the websites I used as my examples, my key points and a thumbnail of the poster with a link to download a BIG version (you will need to scroll around a good bit &#8211; but you should be able to read it in the large version).</p>
<p>If you have questions &#8211; just let me know. I can always be reached via email at jeanne AT spellboundblog.com.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2006/08/15/saa-2006-poster-communicating-context-in-online-collections/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Thoughts on Archiving Web Sites</title>
		<link>http://www.spellboundblog.com/2006/07/26/thoughts-on-archiving-web-sites/</link>
		<comments>http://www.spellboundblog.com/2006/07/26/thoughts-on-archiving-web-sites/#comments</comments>
		<pubDate>Wed, 26 Jul 2006 22:47:45 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[access]]></category>
		<category><![CDATA[born digital records]]></category>
		<category><![CDATA[context]]></category>
		<category><![CDATA[future-proofing]]></category>
		<category><![CDATA[internet archiving]]></category>
		<category><![CDATA[original order]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/2006/07/26/thoughts-on-archiving-web-sites/</guid>
		<description><![CDATA[Shortly after my last post, a thread surfaced on the Archives Listserv asking the best way to crawl and record the top few layers of a website. This led to many posts suggesting all sorts of software geared toward this purpose. This post shares some of my thinking on the subject. Adobe Acrobat can capture [...]
]]></description>
			<content:encoded><![CDATA[<p>Shortly after my last post, <a title="Capturing Websites Thread on ARCHIVES listserv" target="_blank" href="http://listserv.muohio.edu/scripts/wa.exe?A1=ind0607d&#038;L=archives&#038;O=T&#038;H=0&#038;D=0&#038;T=1#2">a thread</a> surfaced on the <a title="ARCHIVES ListServ" target="_blank" href="http://listserv.muohio.edu/scripts/wa.exe?A0=archives&#038;D=0&#038;H=0&#038;O=T&#038;T=1">Archives Listserv</a> asking the best way to crawl and record the top few layers of a website. This led to many posts suggesting all sorts of software geared toward this purpose. This post shares some of my thinking on the subject.</p>
<p>Adobe Acrobat can capture a website and convert it into a PDF. As pointed out in the thread above, that would lose the original source HTML &#8211; yet there are more issues than that alone. It would also lose any interaction other than links to other pages. It is not clear to me what would happen to a video or Flash interface on a site being &#8216;captured&#8217; by Acrobat. Quoting <a target="_blank" title="Acrobat7 Working with the Web lesson" href="http://www.adobe.com/education/pdf/acrobat_curriculum7/acrobat7_lesson08.pdf">a lesson for Acrobat7 titled Working with the Web</a>: &#8220;Acrobat can download HTML pages, JPEG, PNG, SWF, and GIF graphics (including the last frame of animated GIFs), text files, image maps and form fields. HTML pages can include tables, links, frames, background colors, text colors, and forms. Cascading Stylesheets are supported. HTML links are turned into Web links, and HTML forms are turned into PDF forms.&#8221;</p>
<p>I looked at a few website HTML capture programs such as <a title="Heritrix" target="_blank" href="http://archive-crawler.sourceforge.net/">Heritrix</a>, <a title="Teleport Pro" target="_blank" href="http://www.tenmax.com/teleport/pro/home.htm">Teleport Pro</a>, <a title="HTTrack Web" target="_blank" href="http://www.httrack.com/">HTTrack Web</a> and the related <a title="Proxy Track" target="_blank" href="http://www.httrack.com/proxytrack/">ProxyTrack</a>. I hope to take the time to compare these options and discover what each does when confronted with something more complicated than HTML, images or cascading style sheets. It also got me thinking about HTML and versions of browsers. I think it is safe to say that most people who browse the internet with any regularity have had the experience of viewing a page that just didn&#8217;t look right. Not looking right might be anything from strange alignment or odd fonts all the way to a page that is completely illegible. If you are a bit of a geek (like me) you might have gotten clever and tried another browser to see if it looked any better. Sometimes it does &#8211; sometimes it doesn&#8217;t. Some sites make you install something special (Flash or some other type of plugin or local program).</p>
<p>Where does this leave us when archiving websites? A website is much more than just its text. If the text were all we worried about, I am sure you could crawl and record (or screen scrape) just the text and links and call it a day, being fairly confident that text stored as a plain ASCII file (with some special notation for links) would continue to be readable even if browsers disappeared from the world. While keeping the words is useful, it also loses a lot of the intended meaning. Have you read full text journal articles online that don&#8217;t have the images? I have &#8211; and I hate it. I am a very visually oriented person. It doesn&#8217;t help me to know there WAS a diagram after the 3rd paragraph if I can&#8217;t actually see it. Keeping all the information on a webpage is clearly important. The full range of content (all the audio, video, images and text on a page) is important to viewing the information in its original context.</p>
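<p>To make the text-only option concrete, here is a minimal sketch in Python of the &#8220;text and links&#8221; capture described above. It is not how any of the capture programs mentioned in this post work; it uses only Python&#8217;s standard html.parser module, and the names TextAndLinks and page_to_plain_text, as well as the bracketed [link: ...] notation, are invented for illustration. It makes no attempt to skip script or style content or to fetch pages.</p>

```python
from html.parser import HTMLParser


class TextAndLinks(HTMLParser):
    """Collect text content, writing each link as: label [link: URL]."""

    def __init__(self):
        super().__init__()
        self.parts = []    # pieces of output text, in page order
        self.href = None   # target of the link currently open, if any

    def handle_starttag(self, tag, attrs):
        # Remember where an anchor points; every other tag is ignored.
        if tag == "a":
            self.href = dict(attrs).get("href")

    def handle_data(self, data):
        text = " ".join(data.split())   # collapse runs of whitespace
        if text:
            self.parts.append(text)

    def handle_endtag(self, tag):
        # Emit the hypothetical link notation right after the link label.
        if tag == "a" and self.href:
            self.parts.append("[link: " + self.href + "]")
            self.href = None


def page_to_plain_text(html):
    """Return the text of an HTML page with links kept in bracket notation."""
    parser = TextAndLinks()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)
```

<p>Feeding it a paragraph that contains one link and one image returns the sentence with the link target in brackets, while the image disappears without a trace, which is exactly the loss of meaning discussed above.</p>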
<p>Archivists who work with non-print media records that require equipment for access are already in the practice of saving old machines hoping to ensure access to their film, video and audio records. I know there are recommendations for retaining older computers and software to ensure access to data &#8216;trapped&#8217; in &#8216;dead&#8217; programs (I will define a dead program here as one which is no longer sold, supported or upgraded &#8211; often one that is only guaranteed to run on a dead operating system). My fear is for the websites that ran beautifully on specific old browsers. Are we keeping copies of old browsers? Will the old browsers even run on newer operating systems? The internet and its content are constantly changing &#8211; even just keeping the HTML may not be enough. What about those plugins &#8211; what about the streaming video or audio? Do the crawlers pull and store that data as well?</p>
<p>One of the most interesting things about reading old newspapers can be the ads. What was being advertised at the time? How much was the sale price for laundry detergent in 1948? With the internet customizing itself to individuals or simply generating random ads, how would that sort of snapshot of products and prices be captured? I wonder if there is a place for advertising statistics as archival records. What Google ads were most popular on a specific day? Google already has <a target="_blank" title="Google Trends: katrina vs hurricane" href="http://www.google.com/trends?q=katrina%2C+hurricane&#038;ctab=0&#038;geo=all&#038;date=all">interesting graphs</a> to show the correspondence between specific keyword searches and news stories that Google perceives as related to the event. The <a target="_blank" title="Internet Archive" href="http://www.archive.org/">Internet Archive</a> (IA) could be another interesting source for statistical analysis of advertising for those sites that permit crawling.</p>
<p>What about customization? Only I (or someone looking over my shoulder) can see my MyYahoo page. And it changes each time I view it. It is a conglomeration of the latest travel discounts, my favorite comics, what is on my favorite TV and cable channels tonight, the headlines of the newspapers/blogs I follow and a snapshot of my stock portfolio. Take even a corporate portal inside an intranet. Often a slightly less moving target &#8211; but still customizable to the individual. Is there a practical way to archive these customized pages &#8211; even if only for a specific user of interest? Would it be worthwhile to be archiving the personalized portal pages of an &#8216;important&#8217; or &#8216;interesting&#8217; person on a daily basis &#8211;  such that their &#8216;view&#8217; of the world via a customized portal could be examined by researchers later?</p>
<p>A wealth of information can be found on the website for the <a title="Joint Workshop on Future-proofing Institutional Websites" target="_blank" href="http://www.dcc.ac.uk/events/fpw-2006/">Joint Workshop on Future-proofing Institutional Websites</a> from January 2006. The one thing most of these presentations agree upon is that &#8216;future-proofing&#8217; is something that institutions should think about at the time of website design and creation. <a title="Powerpoint: Standards for creating future-proof websites" target="_blank" href="http://www.dcc.ac.uk/events/fpw-2006/fpw_2006_rachel_andrew.ppt">Standards for creating future-proof websites</a> directs website creators to use and validate against open standards. <a href="http://www.dcc.ac.uk/events/fpw-2006/fpw_2006_NARA.ppt">Preservation Strategies for institutional website content</a> offers insight into <a target="_blank" title="U.S. National Archives and Records Administration" href="http://www.archives.gov/">NARA</a>&#8217;s approach for archiving US government sites, the results of which can be viewed at <a title="NARA WebHarvest" target="_blank" href="http://www.webharvest.gov/">http://www.webharvest.gov/</a>. A summary of the issues they found can be read in the tidy 11-page <a target="_blank" title="web survey" href="http://www.netpreserve.org/publications/iipc-r-001.pdf">web harvesting survey</a>.</p>
<p>I definitely have more work ahead of me to read through all the information available from the <a target="_blank" title="International Internet Preservation Consortium" href="http://www.netpreserve.org/">International Internet Preservation Consortium</a> and the National Library of Australia&#8217;s <a target="_blank" title="Preserving Access to Digital Information" href="http://www.nla.gov.au/padi/">Preserving Access to Digital Information (PADI)</a>. More posts on this topic as I have time to read through their rich resources.</p>
<p>All around, a lot to think about. Interesting challenges for researchers in the future. The choices archivists face today often will depend on the type of site they are archiving. Best practices are evolving both for &#8216;future-proofing&#8217; sites and for harvesting sites for archiving. Unfortunately, not everyone building a website that may be worth archiving is particularly concerned with validating their sites against open standards. Institutions that KNOW that they want to archive their sites are definitely a step ahead. They can make choices in their design and development to ensure success in archiving at a later date. It is the wild west fringe of the internet that is likely to present the greatest challenge for archivists and researchers.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2006/07/26/thoughts-on-archiving-web-sites/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Paper Calendars, Palm Pilots and Google Calendar</title>
		<link>http://www.spellboundblog.com/2006/07/20/paper-calendars-palm-pilots-and-google-calendar/</link>
		<comments>http://www.spellboundblog.com/2006/07/20/paper-calendars-palm-pilots-and-google-calendar/#comments</comments>
		<pubDate>Fri, 21 Jul 2006 01:04:55 +0000</pubDate>
		<dc:creator>Jeanne</dc:creator>
				<category><![CDATA[access]]></category>
		<category><![CDATA[born digital records]]></category>
		<category><![CDATA[context]]></category>
		<category><![CDATA[original order]]></category>
		<category><![CDATA[preservation]]></category>

		<guid isPermaLink="false">http://www.spellboundblog.com/2006/07/20/paper-calendars-palm-pilots-and-google-calendar/</guid>
		<description><![CDATA[In my intro archives class (LBSC 605 Archival Principles, Practices, and Programs), one of the first ideas that made a light bulb go on over my head related to the theory that archivists want to retain the original order of records. For example, if someone chose to put a series of 10 letters together in [...]
]]></description>
			<content:encoded><![CDATA[<p>In my intro archives class (<a target="_blank" href="http://www.clis.umd.edu/courses/course_descriptions.shtml#c605">LBSC 605 Archival Principles, Practices, and Programs</a>), one of the first ideas that made a light bulb go on over my head related to the theory that archivists want to retain the original order of records. For example, if someone chose to put a series of 10 letters together in a file &#8211; then they should be kept that way. A researcher may be able to glean more information from these letters when he/she sees them grouped that way &#8211; organized as the person who originally <em>used</em> them organized them.</p>
<p>Our professor went on to explain that seeing what the person who used the records saw was crucial to understanding the original purpose and usage of those records. That took my mind quickly to the world of calendars. Years ago, a CEO of some important organization would have a calendar or datebook of some sort &#8211; likely managed by an assistant. Ink or pencil was used to write on paper. Perhaps fresh daily schedules would be typed.</p>
<p>Fast forward to now and the universe of the <a target="_blank" href="http://en.wikipedia.org/wiki/Palm_Pilot">Palm Pilot</a> and other such handy-dandy handheld and totally customizable devices. If you have one (or have seen those of a friend) you know that how I choose to look at my schedule may be radically different from the way you choose to see your schedule. Mine might have my to-do list shown on the bottom half of the screen. Yours might have little colored icons to show you when you have a conference call. The archivist asked to preserve a born digital calendar will have a lot of hard choices to make.</p>
<p>These days I actually use <a target="_blank" href="http://www.google.com/calendar/">Google Calendar</a> more often than my Palm. While it has more of a fixed layout (for the moment) &#8211; I have the option of including many external calendars (see examples at <a target="_blank" href="http://www.icalshare.com/">iCalShare</a>). Right now I have listings of when new movies come out as well as the concert schedule for <a target="_blank" href="http://www.wolftrap.org/performances/schedule.html">summer 2006</a> for the <a target="_blank" href="http://www.wolftrap.org/">Wolf Trap National Park for the Performing Arts</a>. In the old style paper calendar, a researcher would be able to see related events that the user of the calendar cared about because they would be written down right there. If someone wanted to include my Google calendar in an archive someday (or that of someone much more important!), I suspect they would be left with JUST the records I had added myself into my calendar. When I choose to display the Wolf Trap summer schedule, Google calendar asks me to wait while it loads &#8211; presumably from an externally published iCalendar or other public Google calendar source.</p>
<p>This has many implications for the archivist tasked with preserving the records in that Palm Pilot or Google calendar (or any of a laundry list of scheduling applications). This post can do nothing other than list interesting questions at this stage (both &#8216;this stage&#8217; of my archival education as well as &#8216;this stage&#8217; of consideration of born digital records in the archival field).</p>
<ul>
<li>How important is it to preserve the appearance of the interface used by the digital calendar user?</li>
<li>Might printing or screen capturing a statistical sample (an entire month? an entire year?) help researchers in the future understand HOW the record creator in question interacted with their calendar &#8211; what sorts of information they were likely to use in making choices in their scheduling?</li>
<li>Could there be a place for preserving publicly shared calendars (like the ones you can choose to access on Google Calendar or <a target="_blank" href="http://www.apple.com/macosx/features/ical/">Apple&#8217;s iCal</a>) such that they would be available to researchers later? What organization would most likely be capable of taking this sort of task on?</li>
<li>Could <a target="_blank" href="http://en.wikipedia.org/wiki/Emulator">emulators</a> be used to permit easy access to centrally stored born digital calendars? At least one <a target="_blank" href="http://www.palmos.com/dev/tools/emulator/">PalmOS Emulator</a> already exists. It was created mainly for those developing software for hardware that runs the Palm operating system, and it mimics how the tested software would run in the real world. Should archivists be keeping copies of this sort of software as they look to the future of retaining the best access possible to these sorts of records?</li>
<li>How can the standard <a target="_blank" href="http://en.wikipedia.org/wiki/ICalendar">iCalendar</a> format be leveraged by archivists working to preserve born digital calendars?</li>
<li>To what degree are the schedules of people whose records will be of interest to archivists someday moving out of private offices (and even out of personally owned computers and handheld devices) and into the centralized storage of web applications such as Google Calendar?</li>
</ul>
<p>I know that this is just a tiny bite of the kinds of issues being grappled with by archivists around the world as they begin to accept born digital records into archives. Each type of application (scheduling vs accounting vs business systems) will pose issues similar to those described above &#8211; along with special challenges unique to each type. Perhaps if each of the most common classes of applications (such as scheduling) is tackled one by one by a designated team, we can save individual archivists the pain of reinventing the wheel. Is this already happening?</p>
<p>This post is from: <a href="http://www.spellboundblog.com">Spellbound Blog</a>.<br/><br/><a href="http://www.spellboundblog.com/2006/07/20/paper-calendars-palm-pilots-and-google-calendar/">Paper Calendars, Palm Pilots and Google Calendar</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.spellboundblog.com/2006/07/20/paper-calendars-palm-pilots-and-google-calendar/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
	</channel>
</rss>

