
Year: 2009

DH2009: Digital Lives and Personal Digital Archives

Session Title: Digital Lives: How people create, manipulate and store their personal digital archives
Speaker: Peter Williams, UCL

Digital Lives is a joint project of UCL, the British Library and the University of Bristol.

What? We need a better understanding of how people manage digital collections on their laptops, PDAs and home computers. This is important due to the transition from paper-based personal collections to digital collections. The hope is to help people manage their digital archives before the content gets to the archives.

How? Talk to people through in-depth narrative interviews. Ask people about their very first memories of information technology. When did they first use a computer? Do they have anything from that computer? How did they move the content from that computer? People enjoyed giving this narrative digital history of their lives.

Who? 25 interviewees – both established and emerging people whose works would or might be of interest to repositories of the future.

Findings?

  • They created a detailed flowchart of users’ reported process of document manipulation.
  • Common patterns in use of email showed that people used email across all these platforms and environments. Preserving email is not just a case of saving one account’s messages:
    • work email
    • Gmail/Yahoo
    • mails via Facebook
    • Twitter
  • Documented personal information styles that relate a skills dimension to a data security dimension.

The one question I caught was from someone who asked if they thought people would stop using folders to organize emails and digital files with the advent of easy search across documents. The speaker answered by mentioning the revelations in the paper Don’t Take My Folders Away!. People like folders.

My Thoughts

This session got me to think again about the SAA2008 session that discussed the challenges that various archivists are facing with hybrid literary collections. Matthew Kirschenbaum also pointed me to MITH’s white paper: Approaches to Managing and Collecting Born-Digital Literary Materials for Scholarly Use.

I am very interested to see how ideas about preserving personal digital records evolve. For example, what happens to the idea of a ‘draft’ in a world that auto-saves and versions documents every few minutes such as Google Documents does?

With born digital photos we run into all sorts of issues. Photos are simultaneously kept on cameras, hard drives, web-based repositories (Flickr, SmugMug, etc.) and off-site backup (like mozy.com). Images are deleted and edited differently across environments as well. A while back I wrote a post considering the impact of digital photography on the idea of photographic negatives as the ‘photographers’ sketchbooks’: Capa’s Found Images and Thoughts on Digital Photographers’ Sketchbooks.

I really liked the approach of this project in that it looked at general patterns of behavior rather than attempting to extrapolate from experiences of archivists with individual collections. This sort of research takes a lot of energy, but I am hopeful that creating these general user profiles will lead to best practices for preserving personal digital collections that can be applied easily as needed.

As is the case with all my session summaries from DH2009, please accept my apologies in advance for any cases in which I misquote, overly simplify or miss points altogether in the post above. These sessions move fast and my main goal is to capture the core of the ideas presented and exchanged. Feel free to contact me about corrections to my summary either via comments on this post or via my contact form.

Yahoo & Google’s Search for Reusable Images and the Flickr Commons

When I read about Yahoo Image Search’s recent addition of a filter to return only Creative Commons Flickr images, I got all excited about what this might mean for images in the Flickr Commons. So I raced off to the Yahoo Image Search page to see how it works. The short answer is that the new special rights setting of ‘no known copyright restrictions’ that they created for members of the Flickr Commons apparently doesn’t count.

For my test I searched for an exact match on “Ticket with portrait of George Washington”. This returns one result – the one image in Flickr with the same name, from The Field Museum in Flickr Commons. If you click on the ‘More Filters’ link, you will see other ways to filter your results – including the option to restrict your results to only include images whose creators permit reuse.

Next I clicked on ‘Creator allows reuse’ and my one result disappeared! Quite disappointing in my book.

Google is also getting onto the ‘make it easy to search for reusable images’ bandwagon. Search Engine Land reported that Google Images Quietly Adds Creative Commons Filter. That post pointed me to Google Operating System’s search interface that lets you play with the options that Google has available. After clicking through to some of the images returned by a Google Image Search for Creative Commons images of archives, I concluded that the Google model appears to work by looking for Creative Commons badges or links on the page with the image. I even found Flickr Creative Commons images, but when I tried to find my Flickr Commons image of the ticket used above for my Yahoo image search experiment it wasn’t returned by Google either.

So if an archives (or museum or library) posts images on a page that indicates that the content is licensed under creative commons, it seems those images will then appear in Google’s image search as reusable. That is good news! Another way to get users to find your public domain images.
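To make that guess concrete, here is a minimal sketch of what ‘looking for Creative Commons links on the page’ could mean. It is only my illustration, not Google’s actual method, which I have not seen documented. The function name find_cc_license_links and both sample pages are made up for this example; the only assumption about the real world is that Creative Commons license links point at creativecommons.org/licenses/ (or /publicdomain/).

```python
from html.parser import HTMLParser

# URL fragments that identify a Creative Commons license or public domain mark.
CC_MARKERS = ("creativecommons.org/licenses/", "creativecommons.org/publicdomain/")


class _LicenseLinkFinder(HTMLParser):
    """Collect href values of <a> and <link> tags that point at Creative Commons."""

    def __init__(self):
        super().__init__()
        self.license_urls = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        href = dict(attrs).get("href") or ""
        if any(marker in href for marker in CC_MARKERS):
            self.license_urls.append(href)


def find_cc_license_links(html):
    """Return every Creative Commons license URL linked from the page (hypothetical helper)."""
    finder = _LicenseLinkFinder()
    finder.feed(html)
    return finder.license_urls


# Made-up page that links to a Creative Commons license next to its image.
CC_PAGE = """
<img src="reading-room.jpg" alt="Reading room">
<a rel="license" href="https://creativecommons.org/licenses/by/2.0/">Some rights reserved</a>
"""

# Made-up page that carries a 'no known copyright restrictions' statement instead.
COMMONS_PAGE = """
<img src="ticket.jpg" alt="Ticket with portrait of George Washington">
<a href="https://www.flickr.com/commons/usage/">No known copyright restrictions</a>
"""

print(find_cc_license_links(CC_PAGE))
print(find_cc_license_links(COMMONS_PAGE))
```

A check like this finds the license link on the first page and nothing on the second, which is one plausible reason an image marked only as ‘no known copyright restrictions’ would be left out of reusable-image results.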

The question I am left with is how to resolve the gap between Flickr Commons’ ‘no known copyright restrictions’ rights statement and both Google and Yahoo’s definition of reusable content.

Archivists and New Technology: When Do The Records Matter?

Navigating the rapidly changing landscape of new technology is a major challenge for archivists. As quickly as new technologies come to market, people adopt them and use them to generate records. Businesses, non-profits and academic institutions constantly strive to find ways to be more efficient and to cut their budgets. New technology often offers the promise of cost reductions. In this age of constantly evolving software and technological innovation, how do archivists know when a new technology is important or established enough to take note of? When do the records generated by the latest and greatest technology matter enough to save?

Below I have included two diagrams that seek to illustrate the process of adopting new technology. I think they are both useful in aiding our thinking on this topic.

The first is the “Hype Cycle”, as proposed by analyst Jackie Fenn at Gartner Group. It breaks down the phases that new technologies move through as they progress from their initial concept through to broad acceptance in the marketplace. The generic version of the Hype Cycle diagram below is from the Wikipedia entry on hype cycle.

Gartner Hype Cycle (Wikipedia)

Each summer, Gartner comes out with a new update on Where Are We In The Hype Cycle?. Last summer, microblogging was just entering the ‘Peak of Inflated Expectations’, public virtual worlds were sliding down into the ‘Trough of Disillusionment’ and location aware applications were climbing back up the ‘Slope of Enlightenment’. There is even a book about it: Mastering the Hype Cycle: How to Choose the Right Innovation at the Right Time.

The other diagram is the Technology Adoption Lifecycle from Geoffrey Moore’s Crossing the Chasm. This view of the technology cycle comes from the perspective of bringing new technology to market. How do you cross the chasm between early adopters and the general population?

Technology Adoption Lifecycle (Wikipedia)

Archivists need to consider new technology from two different perspectives: when to use it to further their own goals as archivists and when to address the need to preserve records being generated by new technology. A fair bit of attention has been focused on figuring out how to get archivists up to speed on new web technology. In August 2008, ArchivesNext posted about hunting for Web 2.0 related sessions at SAA2008 and Friends Told Me I Needed A Blog posted about SAA and the Hype Cycle shortly thereafter.

But how do we know when a technology is ‘important enough’ to start worrying about the records it generates? Do we focus our energy on technology that has crossed the chasm and been adopted by the ‘early majority’? Do we watch for signs of adoption by our target record creators?

I expect that the answer (such as there can be one answer!) will be community specific. As I learned in the 2007 SAA session about preserving digital records of the design community, waiting for a single clear technology or software leader to appear can lead to lost or inaccessible records. Archivists working with similar records already come together to support one another through round tables, mailing lists and conference sessions. I have noticed that I often find the most interesting presentations are those that discuss the challenges a specific user community is facing in preserving their digital records. The 2008 SAA session about hybrid analog/digital literary collections discussed issues related to digital records from authors. Those who worry about records captured in geographic information systems (GIS) were trying to sort out how to define a single GIS electronic record when last I dipped my toes into their corner of the world in the Fall of 2006.

It is not feasible to imagine archivists staying ahead of every new type of technology and attempting to design a method for archiving every possible type of digital records being created. What we can do is make it a priority for a designated archivist within every ‘vertical’ community (government, literary, architecture… etc) to keep their ear to the ground about the use of technology within that community. This could be a community of practice of its own. A group that shares info about the latest trends they are seeing while sharing their best practices for handling the latest types of records being seen.

The good news is that archivists aren’t the only ones who want to be able to preserve access to born digital records. Consider Twitter, which only provides easy access to recent tweets. A whole raft of third-party tools built to archive data from Twitter are already out there, answering the demand for a way to back up people’s tweets.

I don’t think archivists always have the luxury of waiting for technology to be adopted by the majority of people and to reach the ‘Plateau of Productivity’. If you are an archivist who works with a community that uses cutting edge technology, you owe it to your community to stay in the loop with how they do their work now. Just because most people don’t use a specific technology doesn’t mean that an individual community won’t pick it up and use it to the exclusion of more common tools.

The design community mentioned above spoke of working with those creating the tools for their community to ensure easy archiving down the line. In our fast paced world of innovation, a subset of archivists need to stay involved with the current business practices of each vertical being archived. This group can work together to identify challenges, brainstorm solutions, build relationships with the technology communities and then disseminate best practices throughout the archives community. I did find a web page for the SAA’s Technology Best Practices Task Force and its document Managing Electronic Records and Assets: A Working Bibliography, but I think that I am imagining something more ongoing, more nimble and more tied into each of the major communities that archivists must support. Am I describing something that already exists?

University of Maryland: Benefits of Blogging Workshop (May 6, 2009)

There are still spaces available in a workshop I am giving May 6, 2009 at the University of Maryland’s iSchool. The workshop, titled Benefits of Blogging: Why you should start a blog today!, is free and open to anyone in the University of Maryland community.

This is the workshop description:

Blogging is an easy way to build your professional network, improve your writing and get your ideas out there. Information professionals need to understand how to take advantage of the promise of blogs, both to support their careers as well as a tool for institutions. This workshop will be led by an active blogger who has found great success in becoming part of a broader community via her blog. Learn about free tools, things to keep in mind and why you should start a blog today.

When: 5pm Wednesday May 6, 2009

Where: iSchool Student Lab, Hornbake South room 2108

Registration: Maryland iSchool Workshop Registration

Are you interested in this session, but not affiliated with the University of Maryland? Please let me know, either via my contact form or a comment below, and I will see what I can do about putting together another session off-campus.

ArchivesZ Poster Wins 2nd Place at GRID 2009

The title says it all. I won 2nd place in the “Smart Computers and Computing” section of the University of Maryland’s Graduate Research Interaction Day (GRID) for my poster ArchivesZ: Visualizing Archival Collections (what is in all those boxes?).

1st place in “Smart Computers and Computing” went to the fabulous Dave Levin for his presentation on TrInc: Small Trusted Hardware for Large Distributed Systems.

Overall, it was a great experience. I wish I could have been in multiple rooms at the same time so I could have seen more posters and presentations. I also wish I had understood that I could have presented with either a poster or a PowerPoint deck. That was not entirely clear ahead of time. The downside of my choice was being tied to my poster, but the upside is that I still have the poster that can be examined by readers like you. Obviously it all worked out in the end.

A big thanks to everyone in the Graduate Student Government who worked so hard to bring this event together.

Warner Brothers Archive DVDs: Classic Movies On-Demand

In the latest example of a media company finding a way to profit from its archives, Warner Brothers has launched the Warner Brothers Archive. Nestled neatly within the WBshop.com website, among the TV shows and promotional merchandise, the movies from the archives include everything customers have come to expect from an online shop. There are user reviews, video clips and ways to share links. You can browse by genre or decade. They are currently holding a vote to see what title should be added to the inventory next.

One of the films available from the archives is the 1975 action feature Doc Savage: The Man of Bronze. Embedded below is a 30-second clip showing Doc Savage entering his “Fortress of Solitude”. They could have made it easier for me to embed this (I had to go figure out how to embed FLV files into this blog post) – but I am happy that they let me embed it at all. If you don’t see a video below, you probably need to install Adobe’s Shockwave. You can always go watch the clip on the Doc Savage page (click on Video Trailers & Clips).

Each film page carefully notes “This film has been manufactured from the best-quality video master currently available and has not been remastered or restored specifically for this DVD and On Demand release.” and then directs the customer to view the preview clip to evaluate the film’s quality.

The details come out when we dig into the Warner Archive FAQ. It is here that we learn that the DVDs we can purchase for $19.95 are produced “on-demand”. How are they different from the DVDs you buy at the store?

DVD’s produced on-demand are similar to, but not quite same as, DVD’s you’d buy at the local video store. DVD movies you buy at the local video outlet are manufactured from a mold via a stamping process whereas on-demand DVDs are “burned”. Each carries information read by the DVD player, but the physical properties of the two are different.

Most DVD players are compatible with both commercial DVD-Video and one or more of the “recordable” DVD formats. Our on-demand DVD’s are manufactured using the most widely accepted format, DVD-R.

They also answer this question about copying the DVDs:

Q: I’m trying to make a few extra copies of my DVD, for “safe keeping” and for a surprise present to my mom. When I copied the disc it was un-playable. Why is that? And what can I do about it?

A: This DVD on-demand disc was recorded using CSS encryption. CSS is designed to prevent unauthorized reproduction of the DVD. We’re delighted that you’d like to surprise your mother with the gift of a Warner Bros classic movie. May we suggest she’d like an officially produced and packaged DVD even more? As such we welcome your visit back to the Warner.com classic store at any time.

In addition to the DVD-Rs with CSS encryption, many of the archive’s films offer a download option. Archive movie downloads appear to cost $14.95. The Digital Products FAQ explains the details, but these are the highlights of what comes along with that $5 in savings:

  • Downloads are protected by DRM
  • Downloads only play on MS Windows boxes – no Mac or Linux support
  • You can burn the movie to a CD or DVD, but they “are Digital Rights Management (DRM) protected, so you will only be able to watch the video on the computer or device on which it was originally purchased.”

I give a big thumbs up to Warner Brothers for coming up with a way to leverage their archives. I am less impressed with the non-open format and DRM restrictions they are placing on both the DVD-Rs and downloads. A model that states that a purchased download can be played as often as I want – but requires a specific operating system and only permits play on the same machine from which I made the purchase seems untenable. If I were to buy one of these films, I would spend the extra $5 and get the DVD-R which at least can be played on multiple machines, even if it can never be copied!