learning technology | Spellbound Blog

SAA2008: The Wiki is Online

June 18, 2008

As you may have heard elsewhere, the wiki to support the 2008 annual meeting of the Society of American Archivists is now online and waiting for your contributions.

Check out (or add to) the pages with Maps of San Francisco, hotel information and details about public transport. Look for a roommate or a rideshare. Learn about or organize an unofficial event.

New to wikis? Well, there is a page just for you!

New to SAA Conferences? Check out the SAA First-Timer Tips. Been a million times? Well then go make sure that the First-Timer Tips page includes everything it should!

What I mention above just scratches the surface of what is on the wiki… and remember, the goal isn’t only to read but also to update, add to and correct the wiki. Because a full history of every page is kept, there is nothing you can do wrong that we cannot easily roll back to a prior version. I am also offering help for anyone new and nervous with wikis. Either post a question on my profile page on the wiki or send me a message via my contact page.

THATCamp 2008: Text Mining and the Persian Carpet Effect

June 1, 2008 5 Comments

I attended a THATCamp session on Text Mining. There were between 15 and 20 people in attendance. I have done my best to attribute ideas to their originators wherever possible – but please forgive the fact that I did not catch the names of everyone who was part of this session.

What Is Text Mining?

Text mining is an umbrella phrase that covers many different techniques and types of tools.

The CHNM NEH-funded text mining initiative defined text mining as needing to support these three research functions:

Locating or finding: improving on search
Extraction: once you find a set of interesting documents, how do you extract information in new (and hopefully faster) ways? How do you pull data from unstructured bulk into structured sets?
Analysis: support analyzing the data, discovery of patterns, answering questions

The group discussed how text mining has both macro and micro aspects. Sometimes you are trying to explore a collection. Sometimes you are trying to examine a single document in great detail. Still other situations call for using text mining to generate automated classification of content using established vocabularies. Different kinds of tools will be important during different phases of research.

Projects, Tools, Examples & Cool Ideas

Andrea Eastman-Mullins, from Alexander Street Press, mentioned the University of Chicago’s ARTFL Project and these two tools:

PhiloLogic: An XML/SGML based full-text search, retrieval and analysis tool
PhiloMine: an extension being developed for PhiloLogic to provide support for “a variety of machine learning, text mining, and document clustering tasks”.

Dan Cohen directed us to his post about Mapping What Americans Did on September 11 and to Twistori which text mines Twitter.

Other Projects & Examples:

MONK project (Metadata Offer New Knowledge)
Open Content Alliance (OCA)
Library of Congress Chronicling America – newspaper pages from 1897-1910
Tanya Clement’s project “Using Digital Tools to Not-Read Gertrude Stein’s The Making of Americans” at University of Maryland, College Park
Two other University of Maryland, College Park projects that were not mentioned during the session, but may be of interest are FeatureLens and BasketLens
Google Docs now includes Flesch-Kincaid Readability Tests and Automated Readability Index in the same window in which it shows you your Word Count
Spam filters – such as Bayesian spam filtering, which uses text mining to identify spam e-mails
Clustering – see my post on this: Clustering Data: Generating Organization from the Ground Up and also take a look at Clusty.com and their ‘remix clusters’ option.
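
To make the readability item above concrete: the Automated Readability Index is a simple formula over counts of characters, words and sentences. What follows is a minimal Python sketch of that formula, not a description of how Google Docs computes it. The function name and the naive regex-based word and sentence splitting are illustrative assumptions.

```python
import re


def automated_readability_index(text: str) -> float:
    """Approximate ARI: 4.71 * (chars / words) + 0.5 * (words / sentences) - 21.43."""
    # Assumption: a word is a run of ASCII letters/digits, optionally with
    # internal apostrophes (e.g. "don't"). Real tools tokenize more carefully.
    words = re.findall(r"[A-Za-z0-9]+(?:'[A-Za-z0-9]+)*", text)
    # Assumption: sentences end at '.', '!' or '?'. Abbreviations will confuse this.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    # ARI counts letters and digits only, so drop apostrophes from the tally.
    chars = sum(len(w.replace("'", "")) for w in words)
    return 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43


if __name__ == "__main__":
    simple = "The cat sat. The dog ran."
    dense = "Institutional repositories facilitate longitudinal preservation."
    print(round(automated_readability_index(simple), 2))
    print(round(automated_readability_index(dense), 2))
```

Longer words and longer sentences push the score up. The score is only as good as the tokenizer behind it, so treat the numbers as rough indicators.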

Some neat ideas that were mentioned for ways text mining could be used (lots of other great ideas were discussed – these are the two that made it into my notes):

Train a tool with collections of content from individual time periods, then use the tool to assist in identification of originating time period for new documents. Also could use this same setup to identify shifts in patterns in text by comparing large data sets from specific date ranges
If you have a tool that has learned how to classify certain types of content well… then watch for when it breaks – this can give you interesting trails to things to investigate.
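
The first idea above can be sketched with the same naive Bayes technique that Bayesian spam filters use, with time periods as the labels in place of spam and not-spam. This is a minimal Python illustration under stated assumptions, not a tool discussed in the session. The class name, the tokenizer and the two tiny training texts are all invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase runs of letters/apostrophes are good enough as "words".
    return re.findall(r"[a-z']+", text.lower())


class PeriodClassifier:
    """Multinomial naive Bayes with add-one smoothing; labels are time periods."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocab = set()           # every word seen during training

    def train(self, label, text):
        tokens = tokenize(text)
        self.word_counts.setdefault(label, Counter()).update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def scores(self, text):
        """Return a log-probability score per label (higher means more likely)."""
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        result = {}
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            logp = math.log(self.doc_counts[label] / total_docs)
            for tok in tokens:
                # Add-one smoothing so unseen words do not zero out the score.
                logp += math.log((counts[tok] + 1) / (total_words + len(self.vocab)))
            result[label] = logp
        return result

    def classify(self, text):
        scores = self.scores(text)
        return max(scores, key=scores.get)


if __name__ == "__main__":
    clf = PeriodClassifier()
    # Invented toy training data; a real project would use large dated collections.
    clf.train("1860s", "the telegraph brought news of the regiment by rail")
    clf.train("1990s", "send me an email or check the web site for the modem settings")
    print(clf.classify("news arrived by telegraph"))
    print(clf.scores("check the modem"))
```

The per-label scores also hint at the second idea. When the top two scores are close, or a confident prediction turns out to be wrong, that document may be one of the interesting trails worth investigating by hand.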

Barriers to Text Mining

All of the following were touched upon as being barriers or challenges to text mining:

access to raw text in gated collections (i.e., collections that require payment for access to resources) such as JSTOR, Project MUSE and others
tools that are too difficult for non-programmers to use
questions relating to the validity of text mining as a technique for drawing legitimate conclusions

Next Steps

These ideas were put forward as important for advancing the field of text mining in the humanities:

develop and share best practices for use when cultural heritage institutions make digitization and transcription deals with corporate entities
create frameworks that enable individuals to reproduce the work of others and provide transparency into the assumptions behind the research
create tools and techniques that smooth the path from digitization to transcription
develop focused, easy-to-use tools that bridge the gap between computer programmers and humanities researchers

My thoughts
During the session I drew a parallel between text mining and aerial archaeology, where information can be gleaned from the air that cannot be seen from the ground. I discovered that this has a name:

“Archaeologists call it the Persian carpet effect. Imagine you’re a mouse running across an elaborately decorated rug. The ground would merely be a blur of shapes and colors. You could spend your life going back and forth, studying an inch at a time, and never see the patterns. Like a mouse on a carpet, an archaeologist painstakingly excavating a site might easily miss the whole for the parts.” from Airborne Archaeology, Smithsonian magazine, December 2005 (emphasis mine)

While I don’t see any coffee table books in the near future of text mining (such as The Past from Above: Aerial Photographs of Archaeological Sites), I do think that this idea captures the promise that we have before us in the form of the text mining tools. Everyone in our session seemed to agree that these tools will empower people to do things that no individual could have done in a lifetime by hand. The digital world is producing terabytes of text. We will need text mining tools just to find our way in this blizzard of content. It is all well and good to know that each snowflake is unique – but tell that to the 21st century historian soon to be buried under the weight of blogs, tweets, wikis and all other manner of web content.

Image credit: Drift of Harrachov Mine by alarch via flickr

As is the case with all my session summaries from THATCamp 2008, please accept my apologies in advance for any cases in which I misquote, overly simplify or miss points altogether in the post above. These sessions move fast and my main goal is to capture the core of the ideas presented and exchanged. Feel free to contact me about corrections to my summary either via comments on this post or via my contact form.

Learn About Wikis on Second Life (May 25th, 2008)

May 23, 2008 3 Comments

In case you always wondered how wikis can help archivists, this Sunday (May 25th, 2008) will see archivists gathering in Second Life to answer this question.

When: Sunday May 25th, 9pm-10:30pm GMT (5pm-6:30pm EDT)
Where: Open Air Auditorium at Cybrary City, Second Life

This sounds like a great way to kill two birds with one stone. If you have been looking for a reason to explore Second Life or you have been wondering about how wikis are being used to benefit archives and special collections (or both!) – this looks like a great combination.

Learn more about this event via the Second Life Library Project post How on Virtual Earth can Wikis Help Archivists?.

In the interest of full disclosure – I admit that I won’t be there. The first (and last) time I tried to explore Second Life I got motion sick after about 15 minutes. I understand that this is not very common – but since I am one of those people who get motion sick watching others play 3D video games I wasn’t too surprised. I have a theory about trying again one day with a Second Life expert at my side to help me tweak my settings to the least ‘hand held camera’ version of the Second Life experience – I just haven’t gotten there yet. Any tips from Second Life gurus welcome!

Of Pirates, Treasure Chests and Keys: Improving Access to Digitized Materials

April 23, 2008 1 Comment

Dan Cohen posted yesterday about what he calls The Pirate Problem. Basically the Pirate Problem can be summed up as “there are ways of acting and thinking that we can’t understand or anticipate.” Why is that a ‘Pirate Problem’? Because a pirate pub opened near his home and rather than folding shortly thereafter due to lack of interest from the ‘very serious professionals’ who populate DC suburbs – the pub was a rousing success due to the pirate aficionados who came out of the woodwork to sing sea shanties and drink grog. This surprising turn of events highlighted for him the fact that there are many ways of acting and thinking (some people even know all the words to sea shanties without needing sheet music).

Dan recently delivered the keynote speech at a workshop at the University of North Carolina at Chapel Hill. The workshop brought together dozens of historians to talk about how the 16 million archival documents of the Southern Historical Collection (SHC) should be put online. He devoted his keynote “to prodding the attendees into recognizing that the future of archives and research might not be like the past” and goes on in his post to explain:

The most memorable response from the audience was from an award-winning historian I know from my graduate school years, who said that during my talk she felt like “a crab being lowered into the warm water of the pot.” Behind the humor was the difficult fact that I was saying that her way of approaching an archive and understanding the past was about to be replaced by techniques that were new, unknown, and slightly scary.

This resistance to thinking in new ways about digital archives and research was reflected in the pre-workshop survey of historians. Extremely tellingly, the historians surveyed wanted the online version of the SHC to be simply a digital reproduction of the physical SHC.

Much of the stress of Dan’s article is on fear of new techniques of analysis. The choppy waters of text mining and pattern recognition threaten to wash away traditional methods of actually reading individual pages and “most historians just want to do their research they way they’ve always done it, by taking one letter out of the box at a time”.

I certainly like the idea of new technologically based ways of analyzing large sets of cultural heritage materials, but I also believe that reading individual letters will always be important. The trick is finding the right letter!

And of course – we still need the context. It isn’t as if when we digitize major collections like the SHC that we are going to scan and OCR each page without regard to which box it came out of. We can’t slice and dice archival records and manuscripts into their component parts to feed into text analysis with no way back to the originals.

I like to imagine the combination of all the new technology (be it digitization, cross collection searching, text mining or pattern recognition) as creating keys to different treasure chests. Humanities scholars are treasure hunters. Some will find their gems through careful reading of individual passages. Others will discover patterns spread across materials now co-existing virtually that before digitization would have been widely separated by space and time. Both methods will benefit from the digitization of materials and the creation of innovative search and text analysis tools. Both still require an understanding of a material’s origin. The importance of context isn’t going anywhere – we still need to know which box the letter came from (and in a perfect world, which page came before and which came after). I want scholars to still be able to read one page from the box – I just want them to be able to do it from home in the middle of the night if they are so inclined with their travel budget no worse for wear.

Dan ties his post together by pointing out that:

… in Chapel Hill I was the pirate with the strange garb and ways of behaving, and this is a good lesson for all boosters of digital methods within the humanities. We need to recognize that the digital humanities represent a scary, rule-breaking, swashbuckling movement for many historians and other scholars.

In my opinion, the core message should be that we just found more locked treasure chests – and for those who are interested, we have some new keys that just might open those locks. I enjoyed the Pirate metaphor (obviously) and I appreciate that there are real issues here relating to strong discomfort with the fast changing landscape of technology, but I have to believe that if we do something that prevents historians from being able to read one letter at a time we are abandoning the treasure chests that are already open for the new ones for which we haven’t yet found the right keys. I am greedy. I want all the treasure!

Image credit: key to anything by Stoker Studios via flickr

SAA2008: PDFs of Conference Presentations

March 23, 2008

I recently found another reason to be excited about the progress of SAA’s online presence. Buried in the ARCHIVES 2008: Archival R/Evolution & Identities Checklist for Presenters are the first tidbits of a plan to provide access to PDF versions of conference presentations on the SAA website.

Send an Electronic Copy of Your Presentation to SAA. The conference organizers would like to offer meeting attendees the opportunity to view presentations after the conference on the SAA 2008 Annual Meeting website (www.archivists.org). If you’ll supply a copy of your presentation, we’ll convert it to a PDF and post it. Please note that by sending SAA a copy of your presentation in electronic format, you grant permission for your presentation to be viewed by all SAA 2008 Annual Meeting attendees.

I am so pleased! I have always wanted access to the presentations – both for those sessions I attend and those I cannot. I have often been that person hovering at the edge of the stage after a panel, waiting to request a soft copy of the presentation.

I do wonder what they mean when they say that the presentations will be “viewable by meeting attendees”. In my heart of hearts I hope they go a step further and let the speakers sign off on these presentations being shared with the world (or at least with all of SAA). I haven’t gone through every Session Page on the SAA 2007 Un-Official Wiki, but I believe that not very many presenters took the opportunity to provide links to soft copies of their presentations. I hope that SAA is more successful on this front.

No matter the choices made relating to immediate access – I see this as a big step forward in the commitment to using technology. I think one of the best ways to learn is through getting your hands dirty. Technology is listed as one of SAA’s strategic priorities. Every choice that SAA makes that encourages their membership to become more tech-savvy is a step towards supporting that priority.

New Skills for a Digital Era: Official Proceedings Now Available

February 26, 2008 1 Comment

From May 31st through June 2nd of 2006, The National Archives, the Arizona State Library and Archives, and the Society of American Archivists hosted a colloquium to consider the question “What are the practical, technical skills that all library and records professionals must have to work with e-books, electronic records, and other digital materials?” The website for the New Skills for a Digital Era colloquium already includes links to the eleven case studies considered over the course of the three days of discussion as well as a list of additional suggested readings. As mentioned over on The Ten Thousand Year Blog, the pre-print of the proceedings has been available since August, 2007.

As announced in SAA’s online newsletter, the Official Proceedings of the New Skills for a Digital Era Colloquium, edited by Richard Pearce-Moses and Susan E. Davis, is now available for free download. Published under Creative Commons Attribution, this document is 143 pages long and includes all the original case studies. I have a lot of reading to do!

The meat of the proceedings consists of a 32 page ‘Knowledge and Skills Inventory’ and a page and a half of reflections – both co-authored by Richard Pearce-Moses and Susan E. Davis. The Keynote Address by Margaret Hedstrom titled ‘Are We Ready for New Skills Yet?’ is also included.

I am very pleased with how much access has been provided to these materials. These topics are clearly of interest to many beyond the 60 individuals who were able to take part in the original gathering. As an archival studies student it has often been a great source of frustration that so few of the archives related conferences publish proceedings of any kind. It is part of what has driven me to attempt to assemble exhaustive session summaries for those sessions I have personally attended at the past two SAA Annual meetings (see SAA2006 and SAA2007). I think that the Unofficial Conference Wiki for SAA2007 was also a big step in the right direction and I hope it will continue to evolve and improve for the upcoming SAA2008 annual meeting in San Francisco.

The course I elected to take this term is dedicated to studying Communities of Practice. This announcement about the New Skills for a Digital Era’s proceedings has me thinking about the community of practice that seems to currently be taking form across the library, archives and records management communities. I will share more thoughts on this as I sort through them myself.

Finally, a question for anyone reading this post who attended the colloquium: Are you still discussing the case studies with others from that session two years ago? If not, do you wish you were?

Image Credit: The image at the top of this post is from the New Skills for a Digital Era website.

Nurturing Fearlessness in the Face of Computers

September 26, 2007 8 Comments

I have spent a fair amount of time thinking about what makes me different from those around me who proclaim themselves “not techie”. It isn’t that I know more about specific programs. I think it is that I am not afraid.

My friends and family always ask me to find them information online (or the best price for a new camera or the best airfare for their next trip) – and I really don’t think that I know better places to look online before I start. I think that I just don’t give up after the first two tries don’t work!

We talked about some of these ideas in the ‘Archives and Web 2.0 Technologies’ discussion at SAA2007 in Chicago. How do you get folks more comfortable with new technology? How to nurture fearlessness?

Dorothea Salo’s recent post, Training-wheels culture, got me thinking about this again.

She says:

Librarians are a timorous breed, fearful of ignorance and failure. We believe knowledge is power, which taken to an unhealthy extreme can mean that we do not do anything until we think we understand everything. We do not learn by doing, because learning by doing invariably means failure. So a librarian just won’t sit down with AACR2, Connexions, and the AUTOCAT mailing-list archive and work out how to catalogue a novel item. Nor she won’t sit down at the computer and beat software with rocks until it works.

She’ll sit passively, hands in lap, and ask for training, feeling guilty the whole time for displaying ignorance.

I know I do have more experience with some things. Yes, I have spent years designing databases and years building software, but that doesn’t mean that I know how to use a database with which I have never had contact before. The difference between myself and many others is that I am not afraid to just try. I back up often and I give myself permission to make mistakes. There is nothing I can do (short of wielding a hammer) that will break my computer like the one in the image I included above.

So how do we get more archivists and librarians (those in school and those already on the job) comfortable with trying things when they sit down in front of a computer? I have a few ideas.

Full Immersion

This approach would be akin to learning a new language by full immersion, and perhaps it might work well for the same reasons. What if a person was put in front of a computer with 3 programs they don’t know how to use, but have always wished they could get ‘trained’ on? And what if that computer had a handy button for the instructor to instantly reset the student back to a clean slate at any time? I really think a day (or even an afternoon) with the opportunity to play and know that you can’t break anything permanently could be very empowering.

There are self defense training courses that put a ‘mock attacker’ in a big padded suit and coach the student in how to attack them. Because of the big suit they can hit and kick and go full force but know that they won’t really hurt this person. At the same time they teach their body what hitting that hard would feel like. They teach their body how to react to defend itself when they need it. It may seem extreme, but I think that some people need a safe environment to try anything. Go ahead, see what happens if you try! Yeah – it won’t work sometimes.. but other times it will.

The class could start with a lesson in hunting for computer software “how to” answers online. Add some knowledgeable floaters to the room to prevent total frustration from taking its toll (but with strict instructions to avoid too much hand holding) and we might have something.

Scavenger Hunt

People know what a scavenger hunt is – so what if we harnessed that understanding and made them look for stuff online. What if they HAD to use some Web 2.0 style sites to find (and bookmark? and screen capture?) items on their list? This sort of idea could be implemented remotely with new hunts launching on a specific date and time and with a firm deadline so people are urged to stick with it long enough to start feeling comfortable.

I think one of the saddest things about the whole ‘Web 2.0’ label is that it scares people away. A scavenger hunt that happened to push you into the brier patch hunting for a Creative Commons licensed photo of a purple duck could have you using those tools before you remembered they were supposed to be threatening.

How Did You Do That?

I have noticed a pattern among those who don’t feel at home on a computer. They will often find one way to do something and then do it that way every time – forever. While I can understand the “if it ain’t broke, don’t fix it” mentality, I also cringe when I see someone take 10 keystrokes to do something that could take 2.

Now, take those same folks and make them watch me do something an easier way – either on their computer or at least with programs they are familiar with on my own. Things I don’t even think twice about can be a revelation. “How did you do that?” is something I frequently hear when I work in front of someone who takes pains to stay within their comfort zone when they use their own computer.

There are a number of different ways this could be implemented in the real world. I imagine a group of tech savvy volunteers willing to be a mentor to someone who works nearby. They could get together once a week for even just an hour – the idea being that the mentor drives and does WHAT the mentee asks to be done, but does it using shortcuts or hunts for the BEST way to do that activity through trial and error.

Film At Eleven

Screencasts are easy to make these days – so what if a different set of mentors were willing to make top five shortcut videos on a per task basis? I can imagine one on ‘top five shortcuts when surfing the web’ and another for ‘top five shortcuts to use on your Windows desktop’. There are certainly plenty of videos that come back on YouTube when I search on windows tips, but it would be more appealing if the videos were made using examples from the point of view of ‘our’ audience – libraries and archives. People can get their heads around new ideas more easily if the context in which they are learning is familiar.

The “Break Me” Challenge

I have a running joke with some of my friends (and many of those around me who write software) that given enough time, I can find a bug in any software I use. I expect this. I know (I was a developer, remember?) that all software has bugs. I know that the only thing a software company who makes something you need has to do is make their software less buggy than the other guy.

What if more people felt a new software tool was a challenge? I must admit, I do feel that way. For me, hunting for obvious bugs is like finding the edges of the world before I start exploring. What are the limits on what I can do here? I want to lift all the rocks and see where the bugs are wiggling so that when I actually must depend on the tool to do something for me I will know where not to step.

Could we create a workshop where people are taught this way of approaching software? I am not talking about teaching people to be pessimistic, rather to make them realize that even when they find a bug it won’t be the end of the world. Trying new things when you first start out rewards you with the knowledge of what your tool can really do. What do most people who buy a new kitchen gadget do when they get home? Find something to cook that requires the gadget. There is no guarantee that it won’t end up in the bottom of your drawer a month later – but at least people are willing to experiment with (and read the instructions for) cooking tools!

Core Education

What if all of this were part of an MLS student’s training? I know that there is a basic Information Technology course that is required in the MLS program at my school. I will admit that I didn’t take it (I took an Information Visualization course in the Computer Science department instead) – but I suspect that trying to break software and learn computer shortcuts wasn’t part of the training. No matter who teaches it, the syllabus for the core class always includes a sentence like ‘Become familiar with common information management tools’ in its list of goals. I think there are a lot of assumptions about what people already know when they get to graduate school. I think we need to acknowledge that not every graduate student is equally comfortable with computers (even those young students right out of college). School is about learning – this should be the easiest place to spread these skills.

Inspiration

Andrea Mercado (of LibraryTechtonics fame) has blogged about both her Geek Out Don’t Freak Out classes and NetGuides program at Reading Public Library. The Geek Out classes are geared toward patrons who have some technology they don’t really know how to use – the digital camera with 50 settings when they only use 2.. or the PDA that they don’t actually know how to sync. The NetGuides are described on the RPL site as “students trained at the library to provide patrons with one-on-one technology answers and personalized instruction”. The topics they cover include basic computing, basic Internet, MP3 players and more (check out all the topics).

Both these examples are real world implementations for the ‘How Did You Do That?’ category above. It is so great to see examples of what can exist. I bet in some institutions there are staff members who need this support as much as their patrons.

Five Weeks to a Social Library was described as “the first free, grassroots, completely online course devoted to teaching librarians about social software and how to use it in their libraries.” I read blog posts from some of the librarians who went through this program (out of the 40 total who took it the first time around) – and they seemed to enjoy their experiences.

Learning 2.0 is an “online self-discovery program that encourages the exploration of web 2.0 tools and new technologies, specifically 23 Things“. I loved seeing that the original creators gave out prizes to their local staff at PLCMC who completed all 23 things (USB MP3 players). From the list on the right side of that Learning 2.0 page, it seems like the idea has spread to other libraries. The PLCMC folks also have created Learning 2.1 – complete with the fabulous tag line “Mashing up 21st century skills with lifelong learning”. They state that the site was created to support on-going learning beyond what was done with Learning 2.0.

Would You Volunteer? Would You Attend?

So.. lots of ideas. There are clearly a number of great programs already out there. Any folks out there want to chat about if they think my ideas would be helpful? Or doable? Or too intimidating? Or overly optimistic?

How can we build on the success of existing programs? Tell me encouraging tales of programs like NetGuides. Point me to other initiatives like Five Weeks and Learning 2.1. Is any of this making it onto the radar of archivists?

Book Review: Dreaming in Code (a book about why software is hard)

May 24, 2007 1 Comment

Dreaming in Code: Two Dozen Programmers, Three Years, 4,732 Bugs, and One Quest for Transcendent Software
(or “A book about why software is hard”) by Scott Rosenberg

Before I dive into my review of this book – I have to come clean. I must admit that I have lived and breathed the world of software development for years. I have, in fact, dreamt in code. That is NOT to say that I was programming in my dream, rather that the logic of the dream itself was rooted in the logic of the programming language I was learning at the time (they didn’t call it Oracle Bootcamp for nothing).

With that out of the way I can say that I loved this book. This book was so good that I somehow managed to read it cover to cover while taking two graduate school courses and working full time. Looking back, I am not sure when I managed to fit in all 416 pages of it (ok, there are some appendices and such at the end that I merely skimmed).

Rosenberg reports on the creation of an open source software tool named Chandler. He got permission to report on the project much as an embedded journalist does for a military unit. He went to meetings. He interviewed team members. He documented the ups and downs and real-world challenges of building a complex software tool based on a vision.

If you have even a shred of interest in the software systems that are generating records that archivists will need to preserve in the future – read this book. It is well written – and it might just scare you. If there is that much chaos in the creation of these software systems (and such frequent failure in the process), what does that mean for the archivist charged with the preservation of the data locked up inside these systems?

I have written about some of this before (see Understanding Born Digital Records: Journalists and Archivists with Parallel Challenges), but it bears repeating: If you think preserving records originating from standardized packages of off-the-shelf software is hard, then please consider that really understanding the meaning of all the data (and business rules surrounding its creation) in custom built software systems is harder still by a factor of 10 (or 100).

It is interesting for me to feel so pessimistic about finding (or rebuilding) appropriate contextual information for electronic records. I am usually such an optimist. I suspect it is a case of knowing too much for my own good. I also think that so many attempts at preservation of archival electronic records are in their earliest stages – perhaps in that phase in which you think you have all the pieces of the puzzle. I am sure there are others who have gotten further down the path only to discover that their map to the data does not bear any resemblance to the actual records they find themselves in charge of describing and arranging. I know that in some cases everything is fine. The records being accessioned are well documented and thoroughly understood.

My fear that in many cases we won’t know we are missing pieces needed to decipher the data until many years down the road leads me to an even darker place. While I may sound alarmist, I don’t think I am overstating the situation. This comes from my first hand experience in working with large custom built databases. Often (back in my life as a software consultant) I would be assigned to fix or add on to a program I had not written myself. This often feels like trying to crawl into someone else’s brain.

Imagine being told you must finish a 20 page paper tonight – but you don’t get to start from scratch and you have no access to the original author. You are provided a theoretically almost complete 18 page paper and piles of books with scraps of paper stuck in them. The citations are only partly done. The original assignment leaves room for original ideas – so you must discern the topic chosen by the original author by reading the paper itself. You decide that writing from scratch is foolish – but are then faced with figuring out what the original writer was trying to say. You find half-finished sentences here and there. It seems clear they meant to add entire paragraphs in some sections. The final thorn in your side is being forced to write in a voice that matches that of the original author – one that is likely odd sounding and awkward for you. About halfway through the evening you start wishing you had started from scratch – but now it is too late to start over, you just have to get it done.

So back to the archivist tasked with ensuring that future generations can make use of the electronic records in their care. The challenges are great. This sort of thing is hard even when you have the people who wrote the code sitting next to you available to answer questions and a working program with which to experiment. It just makes my head hurt to imagine piecing together the meaning of data in custom built databases long after the working software and programmers are well beyond reach.

Does this sound interesting or scary or relevant to your world? Dreaming in Code is really a great read. The people are interesting. The issues are interesting. The author does a good job of explaining the inner workings of the software world by following one real world example and grounding it in the landscape of the history of software creation. And he manages to include great analogies to explain things to those looking in curiously from outside of the software world. I hope you enjoy it as much as I did.
