Category: archival community

New Skills for a Digital Era: Official Proceedings Now Available

From May 31st through June 2nd of 2006, The National Archives, the Arizona State Library and Archives, and the Society of American Archivists hosted a colloquium to consider the question “What are the practical, technical skills that all library and records professionals must have to work with e-books, electronic records, and other digital materials?” The website for the New Skills for a Digital Era colloquium already includes links to the eleven case studies considered over the course of the three days of discussion as well as a list of additional suggested readings. As mentioned over on The Ten Thousand Year Blog, the pre-print of the proceedings has been available since August 2007.

As announced in SAA’s online newsletter, the Official Proceedings of the New Skills for a Digital Era Colloquium, edited by Richard Pearce-Moses and Susan E. Davis, is now available for free download. Published under Creative Commons Attribution, this document is 143 pages long and includes all the original case studies. I have a lot of reading to do!

The meat of the proceedings consists of a 32 page ‘Knowledge and Skills Inventory’ and a page and a half of reflections – both co-authored by Richard Pearce-Moses and Susan E. Davis. The Keynote Address by Margaret Hedstrom titled ‘Are We Ready for New Skills Yet?’ is also included.

I am very pleased with how much access has been provided to these materials. These topics are clearly of interest to many beyond the 60 individuals who were able to take part in the original gathering. As an archival studies student it has often been a great source of frustration that so few of the archives related conferences publish proceedings of any kind. It is part of what has driven me to attempt to assemble exhaustive session summaries for those sessions I have personally attended at the past two SAA Annual meetings (see SAA2006 and SAA2007). I think that the Unofficial Conference Wiki for SAA2007 was also a big step in the right direction and I hope it will continue to evolve and improve for the upcoming SAA2008 annual meeting in San Francisco.

The course I elected to take this term is dedicated to studying Communities of Practice. This announcement about the New Skills for a Digital Era’s proceedings has me thinking about the community of practice that seems to currently be taking form across the library, archives and records management communities. I will share more thoughts on this as I sort through them myself.

Finally, a question for anyone reading this post who attended the colloquium: Are you still discussing the case studies with others from that session two years ago? If not, do you wish you were?

Image Credit: The image at the top of this post is from the New Skills for a Digital Era website.

Chapters and Loose Papers: A Newsletter for Students of Archival Science

Volume 2, Issue 1 of the student publication Chapters and Loose Papers is now online. Quoting the publication’s About Page: “Chapters and Loose Papers is the official SAA newsletter for students of Archival Science.”

Congratulations to the full editorial board listed in the current issue: Walter Butler (UCLA), Maureen Callahan, and Andrea Medina-Smith (Simmons College). It is a nice mix of reports from student SAA chapters, book reviews and short essays on a variety of topics. The essays included cover archives in the news, special projects and technology topics. On a personal note, I was pleased to see abbreviated versions of two of my blog posts officially ‘in print’.

For those of you who are students, Chapters and Loose Papers is looking for submissions for Volume 2, Issue 2. The deadline is March 1, 2008 and you can e-mail your writing directly to walterb333@aol.com. The official theme for this issue is Community Service. Submissions are welcome from Student SAA Chapters as well as individuals.

Topics of interest listed for this issue are:

  • Student Chapter Happenings
  • Student Projects:
    • Papers
    • Research Pursuits
    • Community Involvement
  • Internship Experiences
  • Technology
  • Archives in The World:
    • Current Events
    • Pop Culture
    • Literature Reviews

So what are you waiting for? Go read the current issue and consider submitting content for the next one. Writing is good for you – and the more interesting stuff we all submit, the more fabulous each issue of Chapters and Loose Papers will become.

Caring for Special Collections: Exploring the Connecting to Collections Bookshelf

I subscribe to the RSS feed from the Institute of Museum and Library Services (IMLS), and so saw a press release encouraging institutions to apply for the free IMLS Connecting to Collections Bookshelf.

The IMLS Connecting to Collections Bookshelf is intended to provide small and medium-sized libraries and museums with essential resources needed to improve the condition of their collections. The Bookshelf includes books, DVDs, and other collections resources, as well as a Guide to Online Resources and a User’s Guide to all of the materials. It addresses such topics as the philosophy and ethics of collecting, collections management and planning, emergency preparedness, and culturally specific conservation issues.

Heritage Preservation has created both a 48 page Bookshelf User’s Guide, with a page dedicated to each resource selected for the bookshelf, and a Guide to Online Resources to be used as a companion to the bookshelf. The Bookshelf User’s Guide has a brilliant section at the end giving you pointers to specific sections of the various Bookshelf resources to answer special questions – such as ‘Where can we find information on raising funds for collections care?’ and ‘How can I prioritize the needs of our collections?’.

What is interesting is that it took me a while to realize that each of the institutions that is awarded The Bookshelf will actually receive the books. My past experience with O’Reilly’s Safari Books Online made me assume that the books would be only accessed online. The Safari Books Online site requires a paid membership, but then provides access to an ever growing electronic reference library. The total number of resources is listed as currently over 5,000. One level of membership, Safari Library, provides unlimited access to all the resources (currently listed as $42.99 a month or $472.89 per year) while the less expensive membership level, Safari Bookshelf (currently listed as $22.99 a month or $252.99 a year), provides access to up to ten titles at a time.
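To make the price comparison concrete, here is a minimal Python sketch that works out how much each Safari membership saves when paid yearly rather than monthly. It uses only the four prices quoted above (valid as of January 27th, 2008); the names `PLANS`, `yearly_savings_cents` and `format_dollars` are my own and purely illustrative.

```python
# Prices quoted in the post, stored in cents to avoid floating point rounding.
PLANS = {
    "Safari Library": {"monthly": 4299, "yearly": 47289},
    "Safari Bookshelf": {"monthly": 2299, "yearly": 25299},
}


def yearly_savings_cents(plan):
    """Return how much cheaper the yearly price is than twelve monthly payments."""
    return plan["monthly"] * 12 - plan["yearly"]


def format_dollars(cents):
    """Format an integer number of cents as a dollar string."""
    return f"${cents // 100}.{cents % 100:02d}"


for name, plan in PLANS.items():
    twelve_months = plan["monthly"] * 12
    savings = yearly_savings_cents(plan)
    print(f"{name}: 12 monthly payments = {format_dollars(twelve_months)}, "
          f"yearly plan saves {format_dollars(savings)}")
```

By this arithmetic the yearly Safari Library price works out to exactly one month free, and the yearly Safari Bookshelf price to just under one month free.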

Seeing those prices got me wondering: what will the recipients of this bookshelf be getting, and what would its total cost be? I found my way to a list of the books and resources that will be included. Between the Internet and the 48 page guide to the Bookshelf I found the following information about each element of the Bookshelf. IMLS has broken the bookshelf down into three subsections as shown below:

Bookshelf: The Core Collection

Bookshelf: Nonliving Collections

Bookshelf: Living Collections

Grand Total

The maximum cost (with no membership discounts) to purchase all the components of The Bookshelf would be $951.87. Add in the cost of shipping and printing your own copies from the free downloads and we can probably talk about the monetary value of the Bookshelf being approximately $1000!

Online Access

While researching all of this I came across a new option on Amazon.com – something they are calling Amazon Upgrade. For an additional fee above and beyond the price you pay for the physical book, you can have immediate and permanent online access to the content of that book. Take a look at the offering explained on the Amazon page for The National Trust Manual of Housekeeping: The Care of Collections in Historic Houses Open to the Public. I assume that they plan to increase the number of titles for which this is an option. If so, I can envision building an online reference shelf of one’s own – one title at a time. Rather than deciding that something like O’Reilly’s Safari Books Online has enough books to make it worthwhile for you, you will create your own custom online reference shelf.

The other half of the online access story is of course the number of resources that are posted online for free download (or as living HTML documents being updated over time). These are all the resources from the list above that can be downloaded for free:

What if all the resources that those who care for collections need were available via an online bookshelf? Now that would be an amazing resource for which many would be happy to pay an annual fee. Perhaps it could be provided as part of the membership fee for one or more of the appropriate professional organizations. An additional benefit to an online collection is the opportunity to receive automatic updates and new editions. I will also keep an eye on the Amazon Upgrade option to see how easy it is for someone to build their own online reference shelf – but I think a purposeful online collection designed for cultural heritage institutions would be even more compelling.

Getting the Bookshelf

A lot of organizations have already received the Bookshelf, but the press release that got me looking at all this mentioned that the next (final?) application period will be from March 1 through April 30, 2008. Recipients will be announced in July of 2008.

If you are considering applying you can find more details about the application process and review the questions you must answer online. But even for those that don’t qualify (federally operated and for-profit institutions are not eligible) – the Bookshelf User’s Guide, the Guide to Online Resources and those resources that may be downloaded for free provide a powerful combination of materials to support institutions and individuals as they care for collections of all shapes and sizes.

Note: All prices quoted in this post were valid as of January 27th, 2008. Image shown above from IMLS Connecting to Collections Bookshelf page.

SAA2007: Archives and E-Commerce, Three Case Studies (Session 404)

Diane Kaplan, of Yale University Library’s Manuscripts and Archives unit, started off Session 404 (officially titled Exploring the Headwaters of the Revenue Stream) by thanking everyone for showing up for the last session of the day. This was a one hour session that examined ways to generate new funds through e-commerce. Three different e-commerce case studies were presented, followed by a short question and answer period.

University of Wyoming’s American Heritage Center

Mark Shelstad’s presentation, “Show Me the Money: Or: How Do We Pay for This?”, detailed the approach taken by the University of Wyoming’s American Heritage Center (AHC) to find alternate revenue streams. After completing a digitization project in the fall of 2004, the AHC had to figure out how to continue their project after their original grant money ran out.

Since they didn’t have a lot of in-house resources, they chose Zazzle.com for their effort to profit from their existing high resolution images. They can earn up to 17% from the sales through a combination of affiliate sales and profits from the sale of products featuring American Heritage Center images.

They had a lot of good reasons for choosing Zazzle.com. Zazzle.com already had an existing ‘special collections’ area, meaning that their images would have a better chance of being found by those interested in their offerings (for example – take a look at the Library of Congress Vintage Photos store). Zazzle.com also did not require an exclusive license to the images. The American Heritage Center Zazzle on-line store opened in 2005.

Currently they are making about $30 a month in royalties from 200 images. Mark pointed out that everyone needs to keep in mind that the major photo provider, Corbis, has yet to turn a profit in online photo sales. He also mentioned a website called Cogteeth.com that lets you click on any image and use it on t-shirts, mugs, etc.

Near the end of his talk, Mark shared an amazing idea to create a non-profit that would be a joint organization for featuring and selling products using archival images. I love it! It is easy to see that many archives are small and don’t have the infrastructure to create and run their own e-commerce websites. At the same time, general sites that let anyone set up a store to sell items with custom images on them threaten to lose the special nature of historical images in the shuffle. Even the special collections section of Zazzle lumps the American Heritage Center and the Library of Congress collections with Disney and Star Wars. I would love to see this idea grow!

Minnesota Historical Society

Kathryn Otto of the Minnesota Historical Society (MHS) spoke next. She first gave an overview of traditional services provided by MHS for a fee, such as photocopies, reader-printer copies, microfilm sales, media sales, inter-library loan fees, classes and photograph sales. MHS also earned income via standard use fees and research services.

The first e-commerce initiative at MHS was the sale of Minnesota State Death Certificates from 1904 – 2001. These are made available via the Minnesota Death Certificate Index, which provides the same data as Ancestry.com but with a better search interface. They have had users tell them that they couldn’t find something on Ancestry.com – but that they were able to find what they needed on the MHS site.

To their existing Visual Resources Database, MHS also added a buy button for most images. Extra steps were added into the standard buy process to deal with the addition of a use fee depending on how the purchaser claims the image will ultimately be used. One approach that did not work for them was to offer expensively printed pre-selected images. The historical society sells classes online and can handle member vs non-member rates. The Veterans Graves Registration Index is a tiny database that was created by reusing the interface used for the death certificates.

The Birth Certificate Index provides “single, non-certified copies of individual birth certificates reproduced from the originals” via the website, while “[o]fficial, certified copies of these birth certificates are available through the Minnesota Department of Health.” The MHS site provides much faster and easier service than the Department of Health, as can be seen from this page detailing how to order a non-certified copy of a birth record from the DOH – which requires printing, filling out and either faxing or snail mailing a form.

Features to keep in mind as you branch into e-commerce:

  • Statistics – Consider the types of statistics you want. Their system just gave them info about orders – not how much they made.
  • Sales tax – Figure out how it is handled.
  • Postage/Handling fees – Look at the details! The MHS Library-Archives was stuck with the Museum Store’s postage rates because the e-commerce system could not handle different fees for different types of objects.
  • Can’t afford credit card fees? Consider PayPal.
  • Advertise what you are selling on your own website.

Godfrey Memorial Library, Middletown, CT

The final panelist was Richard Black, Director of the Godfrey Memorial Library in Middletown, Connecticut. The Godfrey is a small, non-profit, genealogical research library with approximately 120,000 genealogical items. They currently have 5 full time staff and 60 volunteers.

Services they provide:

About 3 years ago they had exhausted all of their endowment money and faced the strong possibility of closing the doors. They were down to one full time librarian and a few volunteers and were dependent mostly on donations and some minor income from other sources/services.

They had only a few options open to them:

  • find more money from other sources
  • merge with another library
  • close the doors
  • sell some of the content
  • others??

The first approach to raising funds was to create a subscription website. The Godfrey acquired Heritage Quest census records and added other databases as resources allowed. Subscriptions were sold for $35 a year. The board thought they might be lucky to get 100 subscriptions, but they actually got approximately 14,000!

Now the portal provides access to sites for which a premium has been paid (so that subscribers don’t have to pay), sites that are available free on the Internet (but made easier to find) and sites unique to Godfrey, including digitized material in the library and other material that has been made available to them. They just added 95,000 Jewish grave-sites – brought to them by a local rabbi. Another recent addition was a set of transcriptions of a grave-site made as an Eagle Scout project. They also negotiated to have their books digitized for them for free. The company performing the digitization will pay a royalty to Godfrey as the books are used.

The costs to acquire data for the portal include $60,000 a year for access to premium sites, the cost to digitize and transcribe unique content (there are opportunities to partner and reduce costs) and the cost to acquire patrons. The efforts of the Godfrey staff and volunteers are ‘free’ – but cost time.
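For a rough sense of scale, the subscription figures above can be run through a small Python sketch. It assumes (and this is my assumption, not something stated in the session) that every one of the approximately 14,000 subscribers paid the full $35 within a single year, and it only subtracts the one cost that was quantified, the $60,000 for premium sites. Digitization, transcription and patron acquisition costs are left out because no figures were given. All names in the code are illustrative.

```python
# Figures quoted in the session summary; all amounts in whole US dollars.
SUBSCRIPTION_PRICE = 35      # price of one annual subscription
EXPECTED_SUBSCRIBERS = 100   # what the board thought they might be lucky to get
ACTUAL_SUBSCRIBERS = 14_000  # approximate figure reported in the session
PREMIUM_SITE_COST = 60_000   # yearly cost of access to premium sites


def gross_revenue(subscribers, price=SUBSCRIPTION_PRICE):
    """Gross yearly subscription income, assuming everyone pays full price."""
    return subscribers * price


def margin_after_premium_sites(subscribers):
    """Income left after paying for premium site access (other costs ignored)."""
    return gross_revenue(subscribers) - PREMIUM_SITE_COST


def break_even_subscribers(cost=PREMIUM_SITE_COST, price=SUBSCRIPTION_PRICE):
    """Smallest number of subscribers whose fees cover the premium site cost."""
    return -(-cost // price)  # ceiling division using integers only


print("Hoped-for income:", gross_revenue(EXPECTED_SUBSCRIBERS))
print("Income at ~14,000 subscribers:", gross_revenue(ACTUAL_SUBSCRIBERS))
print("Left after premium sites:", margin_after_premium_sites(ACTUAL_SUBSCRIBERS))
print("Subscribers needed to cover premium sites:", break_even_subscribers())
```

Under those assumptions, the board’s hoped-for 100 subscriptions would not have come close to covering the premium site access alone, which makes the actual response all the more striking.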

The Godfrey subsequently lost access to the Heritage Quest material. This was like taking the anchor store out of the corner of a mall. It forced them to diversify their revenue streams and watch for new opportunities.

Current revenue source distribution:

  • online portal 45%
  • annual appeal 10%
  • patron requests 5%
  • contract services 35% (OCLC analytical cataloging that they do)
  • misc 5%

The endowment funds have been restored and the Godfrey’s staff is now growing again.

Questions

Question: Did you meet resistance in your institutions?
Answer: No. Minnesota said they had such success that the 2 questions they hear now are A) What do we put online next? B) How long can they protect their income from the rest of the institution?

Question: (From someone from a NJ archives) Is there a way to do e-commerce with government records and not have the money ‘stolen’ from them?
Answer: Minnesota – The department of health was happy for death and birth certificates business to go away? They do worry about the future when they might try to make a marriage index – because that territory is already ‘owned’ by a group that wants to keep that income.

Question: When you charge for use fees – are there people who don’t pay them?
Answer: Minnesota: Probably – no way to really know.
Mark (American Heritage Center): Our images are public domain – they can do what they like with them.

Question: Do you brand your images?
Answer: Mark: Yes, a logo and URL go with the images.

My Thoughts

I was particularly impressed by how much information was conveyed in the course of the 1 hour session. My personal highlights were:

  • As I mentioned above, I want Mark’s idea for a non-profit to sell co-located products based on archival images to gain support and momentum.
  • I was pleased by the point that the MHS makes money from their Minnesota Death Certificate Index partly due to their improved and powerful search interface. The data is available elsewhere – but they made it easier to find information, so they will become the destination of choice for that information.
  • The Godfrey’s story is inspirational. In an age when we hear more and more often about archives and libraries being forced to cut back services due to funding shortfalls, it is great to hear about a small archives that pulled themselves back from the brink of disaster by brave experimentation.

These three case studies gave a great glimpse of some of the ways that archives can get on the e-commerce bandwagon. There is no magic here – just the willingness to dig in, figure out what can be done and try it. That said, there is definitely lots of room to learn from others’ successes and mistakes. The more real world success and failure stories archives share with the archival community about how to ‘do’ e-commerce, the easier it will be for each subsequent project to be a success.

As is the case with all my session summaries from SAA2007, please accept my apologies in advance for any cases in which I misquote, overly simplify or miss points altogether in the post above. These sessions move fast and my main goal is to capture the core of the ideas presented and exchanged. Feel free to contact me about corrections to my summary either via comments on this post or via my contact form.

October is American Archives Month

With barely more than a week to go, I am finally getting my act together to mention American Archives Month. To check if there are activities somewhere near you, go to the very thorough Council of State Archivists listing of activities for American Archives Month 2007. If you love looking at posters, take a look at their awesome Archives Week/Month Poster Gallery.

The Society of American Archivists came out with a great array of resources in support of the celebration this year. I especially like the How To Know If Something Is Newsworthy and Tips for Media Interviews fliers – but if you download only one document to look through – make it the American Archives Month Public Relations Kit. They get to the heart of one of my favorite sentiments – keep archival records in front of the eyes of the everyday person. I don’t mean that in a sensational way… I don’t want archives in the news for screwing up (even if they do say that any publicity is good publicity). I want every news story that could have the support of archival records to use them and acknowledge them. I want every middle school kid who lives in a town with an archives to know that it exists and to have some idea why they should care. I want every teacher who has an archives with an enthusiastic archivist in it near them to KNOW about that enthusiastic archivist and use the available resources to make their lessons richer.

American Archives Month is a great vehicle for reaching out and pulling people in the doors. It could also be used as an opportunity for archives to try new programs and judge their popularity before rolling out sessions to be held throughout the year.

Finally – another way to get a sense of what is happening locally is to keep an eye on what other blogs are posting about American Archives Month 2007.

Reflections on SAA2007 and Ten Tips for an Optimal Conference Experience

I want to start off by saying that I really enjoyed SAA 2007. I met amazing people. I went to sessions that made me think. I gave my first SAA conference presentation. I handed out dozens of cards for the unofficial SAA 2007 wiki and for this blog. I brainstormed ideas for sessions, workshops, books and articles. I have seeds for more projects than a single person could start (let alone finish) in a year.

I will be posting session summaries for a number of the sessions I attended over the course of the next week. I have also added a link to my presentation slides on the new Presentations page (note the optimistic use of plural in the page’s name).

My brain is still buzzing from the whirlwind that was SAA 2007 for me, but I have created a list of the top 10 basic conference attending tips that I (re)discovered during the conference and hope to remember for SAA 2008 (and any other conference I attend):

10: Eat more often. Eat real food. Hors d’oeuvres don’t count.

9: Going full throttle without any breaks for more than one day is impossible. At some point my brain won’t take in new information and all I want to do is sit and think about a session I went to yesterday.

8: You never know which sessions will be your favorites. It always happens that at least one session I wasn’t so sure about knocks my socks off — while another that I was so excited about drives me back out the door after 10 minutes.

7: Always bring an extra jacket.

6: Make new friends. Cultivate your inner extrovert. Be bold and introduce yourself. Never assume that everyone around you knows each other – do the kind thing and initiate introductions. This gets easier the more you practice. And don’t worry – everyone forgets names, that is part of the reason they give us those snazzy name tags and insist we wear them.

5: Bring twice as many business cards as you think you need.

4: Don’t have cards? Make them! I have used both VistaPrint and GotPrint. VistaPrint has a set of designs that they will print for free (with their logo on the back). GotPrint makes super lush, shiny cards on nice heavy stock. Both include online tools to create your card – but will also let you upload a PDF if you want to use Photoshop to do something more graphically inspired. If you ended up with either my Spellbound Blog card or the 2007 Wiki card in your stack of cards, you have a sample of what GotPrint can create.

3: Bring the big book they send you in the mail that describes all the sessions. The on site booklet only has the session titles – and often that isn’t enough information to make your choices.

2: Do the fun stuff! It is a good way to force your brain to take a break. It also gives you a chance to meet new people (see tip #6 above).

1: Be flexible. Plans change, and opportunities for networking, brainstorming and being exposed to new ideas are around every corner. The choice to NOT attend a session you meant to go to almost always means it will be replaced by something else – likely better than what you had planned to do anyway.

Now.. if I can just remember to look at this before I head out to SAA 2008!

Thank you again to everyone who made this conference open and welcoming. I enjoyed meeting so many fabulous new people and I hope to stay in touch with you all (and remember all your names).

SAA2007: Opening Plenary Session Ponders Diversity

In his introduction, Bruce Bruemmer began with a disarming “Thank you disembodied voice” – and merrily rolled along through a short, cheery and heartfelt introduction for SAA president Elizabeth W. Adkins. He saved time (and likely vocal stress) by prerecording a YouTube video enumerating Adkins’s accomplishments. He led rounds of applause for Adkins’s father, aunt, uncle and husband. Bruemmer claims her only fault is that she is too serious. That she did not perceive the inherent humor of Velveeta and Miracle Whip concerned him.

He finally found the chink in her armor when he broke down laughing at the apparently often repeated J. L. Kraft quote “What we do, we do do” – and at this she finally admitted that it was ‘a little funny’.

Elizabeth Adkins’s Plenary Speech

Adkins began her talk by leading the hall in applauding the program committee, the host committee, the sponsors, past presidents, international visitors, and council members – each in turn.

She then made an exciting announcement – American Archivist is being made available online! If you are onsite at the conference, there will be a peek at the beta version on display on Friday in the Embassy Room. Issues from 2000 forward will be available online and they are still working on the digitization of all back issues. SAA will still print the journal. Access to the digital version will be available via a link off the SAA homepage. All but the 6 most recent issues will be available freely to anyone. More work will need to be done to improve visibility through indexing services and complete the digitization of back issues.

After this, she launched into her main speech “Our Journey Toward Diversity – And a Call to (More) Action”. I will do my best to include as many points as I managed to fully capture in my notes. If this topic interests you, I encourage you to watch for publication of the full original. Please forgive me any misquotes, omissions and oversights. I have also included a few additional details on points that were in the presentation.

Our Journey Toward Diversity – And a Call to (More) Action

Adkins first contemplated the diversity of the presidents of SAA by considering how long it had been since a corporate archivist had been SAA president. The answer was William Overman in 1957 – and Overman is the only other corporate archivist to ever be selected as president. Adkins is also one of only 16 women to have been SAA President.

What does SAA Mean by Diversity? Why do we care? Adkins reviewed the 2004 census of the profession known as A*CENSUS. With its 5,620 responses it was much more extensive than the surveys done in 1956 and 1982.

Gender Imbalance

From A First Look at A*CENSUS Results (published in August of 2004):

The archival profession has experienced a significant shift in gender in the last half century. The A*CENSUS survey indicates that the ratio of women to men is now approximately 2:1. This is almost a mirror image of the gender distribution reported in Ernst Posner’s 1956 survey of SAA members, in which 67% were men and 33% were women.

Adkins stated that the current gender imbalance is an issue for two reasons:

  • we need men’s perspective and input
  • since women are still generally paid less than men – having a gender imbalance is likely driving down salaries

Libraries and museums are seeing this same gender imbalance, while the gender imbalance is flipped in the IT industry.

Race and Ethnic Diversity

According to A*CENSUS 2004, only 7% of the SAA membership is non-white while the general US population is 25% non-white (with an even greater number of non-whites in kindergarten classes today).

Why should we care?

  • “It’s the right thing to do”
  • Completeness of the documentary record
  • It’s good business
  • Competition with other professions and career paths

Dr. Harold T. Pinkett (1914-2001) was the first African American at NARA – named an SAA fellow in 1962, editor of American Archivist 1968-1971 and council member from 1971-1972.

SAA’s first diversity efforts launched in the 1970s

From 1936-1972, women in SAA made up only 28-33% of SAA members. The 1970s brought lots of progress for women’s representation and activity in SAA.

Work on racial and ethnic diversity started in 1978. More work was supported from 1981 to 1987: some efforts were supported, while others (such as the desire for a fellowship to support study) were not.

The Archivists and Archives of Color Roundtable (AACR) was founded in 1987 and took on this name in 1994 (?). The Harold T. Pinkett Award was established in 1993 “to encourage minority students to consider careers in the archival profession and promote minority participation in SAA”.

In 1997 SAA created a Diversity Task Force and a final report was submitted in 1999. SAA Council accepted the final report and moved forward in an ad hoc manner. In 2002 members of the task force were frustrated by the lack of progress and passed a resolution asking for info on progress. The crux of the answer was “not a lot”.

In May 2003 the SAA council created a ‘diversity committee’… council is now actually talking about diversity and actually putting things in motion.

Focus on Students

There has been huge growth in Student Chapters. The concept was approved by the SAA council in 1993. There has been a growth from 3 chapters to nearly 30. Currently 20% of all members, and more than 10% of attendees at this meeting, are students. Adkins hopes the students will help bring more diversity into SAA and asked for a round of applause for the students attending the meeting.

Where are we now?

In 2005, SAA launched a new strategic planning effort and Diversity was identified as one of the three highest priorities (with Technology and Public Awareness being the other 2).

What is the state of diversity today? Lots of talk – but how much actual action?

What is done?

  • position statement
  • census completed
  • monitoring progress
  • education for non-archivists who serve underrepresented groups
  • experimentation with the idea of a Diversity Fair

Next actions?

  • outreach on college and university campuses
  • provide other “entry points” into the archival profession
  • Archival education

The Task Force recommendations included improvement of the SAA website, providing financial aid for minorities and underrepresented communities, and working on SAA’s new member development.

Adkins presented an interesting idea of reaching out to kids age 10-15 so that we might influence their future career choices. She also suggested that SAA emulate the ALA model of the Spectrum Scholarship. Established in 1997, the Spectrum Scholarship program granted over 60 $5,000 scholarships this year alone. While SAA does not have the money to support a scholarship at this level – Adkins announced that a new SAA Minority Scholarship has been approved by the SAA council (this led to the first spontaneous applause of the speech). She also made a point of highlighting the Midwest Archives Conference’s Archie Motley Memorial Scholarship for Minority Students and saying that they should get credit as leaders in the area of minority scholarships.

“Diversity starts with a commitment to inclusion”

Addressing diversity concerns is hard work, but diversity will improve SAA in ways we can’t grasp now. She compared future progress to past efforts that now seem obvious (provision of childcare, the membership committee, etc.).

Adkins concluded that we need to build on a foundation of inclusion. A ‘welcoming respectful attitude’ will help us move forward. But we need to move forward with not just words – but also with actions.

The hall gave her a standing ovation. Confronted with this, Adkins remarked that she had made it through so far but now she was getting all verklempt.

Final Count Down to SAA2007

The final count down to the annual conference of the Society of American Archivists, this year convening in Chicago, is well under way. Many of you might already be confirming your flights and packing your bags. I won’t be on site until Wednesday night – but thought I would try and catch as many of you as I could before you head away from your regular blog reading rhythms.

Are you attending?

Over 115 registered users (37 of them have introduced themselves) have been adding tons of content to the Unofficial Conference Wiki. If you haven’t visited recently (or at all) take a quick browse through all the great info that has been added.

If you are interested in trying your hand at posting session summaries – I say go for it! You don’t need to have a blog to do this. The wiki is open for anyone’s contributions. If you have any questions about how to post about a session on the wiki, feel free to contact me and I will do whatever I can to help.

Are you a presenter?

Take a look at the page for your session on the wiki and consider what you might add to tell attendees more about what you will talk about. Upload your handouts (and let me know if you have problems with this). Add links to related information or supporting websites, before or after your talk.

Are you in charge of a group meeting?

Consider adding detailed agendas (and thanks to all of you who already have!) to your page linked off the Group Meetings page. If you welcome those who are not members of your round table or section, add a friendly ‘everyone welcome’ note.

Watching from afar?

If you are not attending, please consider participating from wherever you are. If there is a session you would kill to have attended – then go to the Session Coverage page (or the session specific page for the session in question) and put a note next to it asking for someone to post a summary. This might also encourage presenters to add more of their materials to the wiki after the fact.

At the Conference

I hope to meet as many of you on-site as I can. I will be presenting as part of Session 804 Preserving Context and Original Order in a Digital World, Saturday at 1pm. I also plan to attend the Blogger Get-Together if I possibly can (once they decide when and where it will be). I will do my best to update both the Session Coverage page and my user page on the wiki with the sessions I plan to attend. If last year is any indication of how I will blog – I will take notes while offline and then post session summaries (with additional thoughts) after the fact. I discovered that I do not enjoy posting stream of consciousness style, on-the-spot posts. All my posts for the conference will be classified as SAA2007. I will also link to them from the session pages on the wiki. Finally, my posts (and everyone else’s if they are tagged SAA2007) should be available if you go to the Technorati page for SAA2007. Want to reach me? Use my contact form or post a comment here.

Thoughts on Digital Preservation, Validation and Community

The preservation of digital records is on the mind of the average person more with each passing day. Consider the video below from the recent BBC article Warning of data ticking time bomb.


Microsoft UK Managing Director Gordon Frazer running Windows 3.1 on a Vista PC
(Watch video in the BBC News Player)

The video discusses Microsoft’s Virtual PC program that permits you to run multiple operating systems via a Virtual Console. This is an example of the emulation approach to ensuring access to old digital objects – and it seems to be done in a way that the average user can get their head around. Since a big part of digital preservation is ensuring you can do something beyond reading the 1s and 0s – it is a promising step. It also pleased me that they specifically mention the UK National Archives and how important it is to them that they can view documents as they originally appeared – not ‘converted’ in any way.

Dorothea Salo of Caveat Lector recently posted Hello? Is it me you’re looking for? She has a lot to say about digital curation, IR (which I took to stand for Information Repositories rather than Information Retrieval) and librarianship. Coming, as I do, from the software development and database corners of the world I was pleased to find someone else who sees a gap between the standard assumed roles of librarians and archivists and the reality of how well suited librarians’ and archivists’ skills are to “long-term preservation of information for use” – be it digital or analog.

I skimmed through the 65 page Joint Information Systems Committee (JISC) report Dorothea mentioned (Dealing with data: Roles, rights, responsibilities and relationships). A search on the term ‘archives’ took me to this passage on page 22:

There is a view that so-called “dark archives” (archives that are either completely inaccessible to users or have very limited user access), are not ideal because if data are corrupted over time, this is not realised until point of use. (emphasis added)

For those acquainted with software development, the term regression testing should be familiar. It involves the creation of automated suites of test programs that ensure that as new features are added to software, the features you believe are complete keep on working. This was the first idea that came to my mind when reading the passage above. How do you do regression testing on a dark archive? And thinking about regression testing, digital preservation and dark archives fueled a fresh curiosity about what existing projects are doing to automate the validation of digital preservation.
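To make the analogy concrete, here is a minimal sketch of what a "regression test" for stored files might look like: record a checksum for every file at ingest, then re-run the comparison on a schedule so corruption is noticed long before the point of use. This is my own illustration and not taken from any of the projects discussed in this post; the function names and the sample files are hypothetical, and a checksum only proves the bits are unchanged, not that anyone can still render them.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Record a digest for every file under root (the 'known good' state)."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def audit(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or no longer match."""
    problems = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel_path)
    return problems


def demo() -> tuple[list[str], list[str]]:
    """Ingest two sample files, audit, silently damage one, audit again."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "report.txt").write_text("sample text record")
        (root / "photo.bin").write_bytes(bytes(range(256)))
        manifest = build_manifest(root)
        before = audit(root, manifest)
        # Simulate bit rot: same length, different content.
        (root / "photo.bin").write_bytes(b"\xff" * 256)
        after = audit(root, manifest)
    return before, after
```

Calling `demo()` returns the audit results before and after the simulated damage. In a real dark archive the manifest would need to live somewhere safer than the files it describes, which this sketch ignores.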

A bit of Googling found me the UK National Archives requirements document for The Seamless Flow Preservation and Maintenance Project. They list regression testing as a ‘desirable’ requirement in the Statement of Requirements for Preservation and Maintenance Project Digital Object Store (defined as “those that should be included, but possibly as part of a later phase of development”). Of course it is very hard to tell if this regression testing is for the software tools they are building or for access to the data itself. I would bet the former.

Next I found my way to the website for LOCKSS (Lots of Copies Keep Stuff Safe). While their goals relate to the preservation of ‘electronically published scholarly assets’ on the web, their approach to ensuring the validity of their data over time should be interesting to anyone thinking about long term digital preservation.

In the paper Preserving Peer Replicas By Rate-Limited Sampled Voting they share details of how they manage validation and repair of the data they store in their peer-to-peer architecture. I was bemused by the categories and subject descriptors assigned to the paper itself: H.3.7 [Information Storage and Retrieval]: Digital Libraries; D.4.5 [Operating Systems]: Reliability. Nothing about preservation or archives.

It is also interesting to note that you can view most of the original presentation at the 19th ACM Symposium on Operating Systems Principles (SOSP 2003) from a video archive of webcasts of the conference. The presentation of the LOCKSS paper begins about halfway through the 2nd video on the video archive page.

The start of the section on design principles explains:

Digital preservation systems have some unusual features. First, such systems must be very cheap to build and maintain, which precludes high-performance hardware such as RAID, or complicated administration. Second, they need not operate quickly. Their purpose is to prevent rather than expedite change to data. Third, they must function properly for decades, without central control and despite possible interference from attackers or catastrophic failures of storage media such as fire or theft.

Later they declare the core of their approach as “..replicate all persistent storage across peers, audit replicas regularly and repair any damage they find.” The paper itself has lots of details about HOW they do this – but for the purpose of this post I was more interested in their general philosophy on how to maintain the information in their care.
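The general philosophy of "replicate, audit, repair" can be sketched in a few lines. To be clear, this is a toy of my own and not the LOCKSS protocol: the paper describes rate-limited sampled voting among peers that do not fully trust one another, while the sketch below simply takes a strict majority vote over in-memory copies. The peer names and the `audit_and_repair` function are hypothetical.

```python
import hashlib
from collections import Counter


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a replica's content."""
    return hashlib.sha256(data).hexdigest()


def audit_and_repair(replicas: dict[str, bytes]) -> list[str]:
    """Overwrite replicas that disagree with the majority.

    Returns the names of the peers that were repaired. Raises ValueError
    when no strict majority exists, since a repair would then be a guess.
    """
    votes = Counter(digest(content) for content in replicas.values())
    winning_digest, count = votes.most_common(1)[0]
    if count * 2 <= len(replicas):
        raise ValueError("no majority agreement; cannot repair safely")

    good_copy = next(c for c in replicas.values() if digest(c) == winning_digest)
    repaired = []
    for peer, content in list(replicas.items()):
        if digest(content) != winning_digest:
            replicas[peer] = good_copy
            repaired.append(peer)
    return repaired
```

With three peers holding `b"article text"`, `b"article text"` and a damaged `b"article tex\x00"`, the damaged copy is replaced and a second audit finds nothing to do. The simplification worth noticing is trust: a plain majority vote is easy to subvert if an attacker controls enough peers, and defending against that kind of interference is one of the design goals the paper sets out.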

DAITSS (Dark Archive in the Sunshine State) was built by the Florida Center for Library Automation (FCLA) to support their own needs when creating the Florida Center for Library Automation Digital Archive (Florida Digital Archive or FDA). In mid May of 2007, FCLA announced the release of DAITSS as open source software under the GPL license.

In the document The Florida Digital Archive and DAITSS: A Working Preservation Repository Based on Format Migration I found:

… the [Florida Digital Archive] is configured to write three copies of each file in the [Archival Information Package] to tape. Two copies are written locally to a robotic tape unit, and one copy is written in real time over the Internet to a similar tape unit in Tallahassee, about 130 miles away. The software is written in such a way that all three writes must complete before processing can continue.

Similar to LOCKSS, DAITSS relies on what they term ‘multiple masters’. There is no concept of a single master. Since all three are written virtually simultaneously they are all equal in authority. I think it is very interesting that they rely on writing to tapes. There was a mention that it is cheaper – yet due to many issues they might still switch to hard drives.
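The "all three writes must complete before processing can continue" rule is easy to picture in code. The sketch below is only my reading of that one sentence, not DAITSS code: it uses three ordinary directories to stand in for the two local tape units and the remote one, and the function names, file names and read-back verification step are all my own assumptions.

```python
import hashlib
import tempfile
from pathlib import Path


def write_all_copies(name: str, data: bytes, stores: list[Path]) -> str:
    """Write data to every store and read each copy back to verify it.

    Raises OSError if any write or verification fails.
    """
    expected = hashlib.sha256(data).hexdigest()
    for store in stores:
        target = store / name
        target.write_bytes(data)
        if hashlib.sha256(target.read_bytes()).hexdigest() != expected:
            raise OSError(f"verification failed for {target}")
    return expected


def ingest(name: str, data: bytes, stores: list[Path], processed: list[str]) -> None:
    """Mark a package as processed only after every copy has been written."""
    write_all_copies(name, data, stores)  # raises before the next line on failure
    processed.append(name)


def demo() -> tuple[list[str], bool]:
    """Ingest one package successfully, then one with an unreachable store."""
    processed: list[str] = []
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        stores = [root / "local_1", root / "local_2", root / "remote"]
        for store in stores:
            store.mkdir()
        ingest("aip_0001.tar", b"package bytes", stores, processed)

        # Simulate the remote unit being unreachable (directory does not exist).
        broken = stores[:2] + [root / "unreachable"]
        try:
            ingest("aip_0002.tar", b"more bytes", broken, processed)
            halted = False
        except OSError:
            halted = True
    return processed, halted
```

Running `demo()` shows the second package never reaching the processed list because its third write failed. Note that the two copies already written for that package are left in place; whether a real system should clean those up or retry is a design decision this sketch does not attempt to answer.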

With regard to formats and ensuring accessibility, the same document quoted above states on page 2:

Since most content was expected to be documentary (image, text, audio and video) as opposed to executable (software, games, learning modules), FCLA decided to implement preservation strategies based on reformatting rather than emulation….Full preservation treatment is available for twelve different file formats: AIFF, AVI, JPEG, JP2, JPX, PDF, plain text, QuickTime, TIFF, WAVE, XML and XML DTD.

The design of DAITSS was based on the Reference Model for an Open Archival Information System (OAIS). I love this paragraph from page 10 of the formal specifications for OAIS adopted as ISO 14721:2003:

The information being maintained has been deemed to need Long Term Preservation, even if the OAIS itself is not permanent. Long Term is long enough to be concerned with the impacts of changing technologies, including support for new media and data formats, or with a changing user community. Long Term may extend indefinitely. (emphasis added)

Another project implementing the OAIS reference model is CASPAR – Cultural, Artistic and Scientific knowledge for Preservation, Access and Retrieval. This project appears much greater in scale than DAITSS. It started a bit more than 1 year ago (April 1, 2006) with a projected duration of 42 months, 17 partners and a projected budget of 16 million Euros (roughly 22 million US Dollars at the time of writing). Their publications section looks like it could sidetrack me for weeks! On page 25 of the CASPAR Description of Work, in a section labeled Validation, a distinction is made between “here and now validation” and “the more fundamental validation techniques on behalf of the ‘not yet born’”. What eloquent turns of phrase!

Page 7 found me another great tidbit in a list of digital preservation metrics that are expected:

2) Provide a practical demonstration by means of what may be regarded as “accelerated lifetime” tests. These should involve demonstrating the ability of the Framework and digital information to survive:
a. environment (including software, hardware) changes: Demonstration to the External Review Committee of usability of a variety of digitally encoded information despite changes in hardware and software of user systems, and such processes as format migration for, for example, digital science data, documents and music
b. changes in the Designated Communities and their Knowledge Bases: Demonstration to the External Review Committee of usability of a variety of digitally encoded information by users of different disciplines

Here we see thought given not only to the technicalities of how users may access the objects in the future, but also to users who might not have the frame of reference or understanding of the original community responsible for creating the object. I haven’t seen any explicit discussion of this notion before – at least not beyond the basic idea of needing good documentation and contextual background to support understanding of data sets in the future. I love the phrase ‘accelerated lifetime’ but I wonder how good a job we can do at creating tests for technology that does not yet exist (consider the Ladies’ Home Journal predictions for the year 2000 published in 1900).

What I love about LOCKSS, DAITSS and CASPAR (and no, it isn’t their fabulous acronyms) is the very diverse groups of enthusiastic people trying to do the right thing. I see many technical and research oriented organizations listed as members of the CASPAR Consortium – but I also see the Università degli studi di Urbino (noted as “created in 1998 to co-ordinate all the research and educational activities within the University of Urbino in the area of archival and library heritage, with specific reference to the creation, access, and preservation of the documentary heritage”) and the Humanities Advanced Technology and Information Institute, University of Glasgow (noted as having “developed a cutting edge research programme in humanities computing, digitisation, digital curation and preservation, and archives and records management”). LOCKSS and DAITSS have both evolved in library settings.

Questions relating to digital archives, preservation and validation are hard ones. New problems and new tools (like Microsoft’s Virtual PC shown in the video above) are appearing all the time. Developing best practices to support real world solutions will require the combined attention of those with the skills of librarians, archivists, technologists, subject matter specialists and others whose help we haven’t yet realized we need. The challenge will be to find those who have experience in multiple areas and pull them into the mix. Rather than assuming that one group or another is the best choice to solve digital preservation problems, we need to remember there are scores of problems – most of which we haven’t even confronted yet. I vote for cross pollination of knowledge and ideas rather than territorialism. I vote for doing your best to solve the problems you find in your corner of the world. There are more than enough hard questions to answer to keep everyone who has the slightest inclination to work on these issues busy for years. I would hate to think that any of those who want to contribute might have to spend energy to convince people that they have the ‘right’ skills. Worse still – many who have unique viewpoints might not be asked to share their perspectives because of general assumptions about the ‘kind’ of people needed to solve these problems. Projects like CASPAR give me hope that there are more examples of great teamwork than there are of people being left out of the action.

There is so much more to read, process and understand. Know of a digital preservation project with a unique approach to validation that I missed? Please contact me or post a comment below.