SEO Evaluation of an Archival Website: Looking at UMBC’s Digital Collections

[Image: Flickr Commons: Do-it-yourself-woman]

Each week brings announcements of archives launching new websites. Today both my email and Twitter told me about the University of Maryland, Baltimore County’s new Digital Collections site. Who can resist peeking at new materials available online?

I have spent much of the past year learning the details of Search Engine Optimization. Usually shortened to SEO, this simply refers to the use of techniques which improve the traffic sent to a website via organic search. Want your webpage to show up at the top of the list for a specific search in Google? You want to work on your SEO.

So when I look at a new archives website, I can’t help but keep an eye open for how well the site is optimized for search engines.

I hope that UMBC will forgive me for nitpicking their new site. A lot of their choices are great for SEO, but they also have room for improvement.

Things Done Well for SEO

  • Home Page Title & Description: The site’s home page has a good meta description. This is the text displayed below the link on a search results page – as shown below: [Image: UMBC Digital Collection Google Result]
  • Unique Page Titles At Collection Level: Each photography collection homepage has a unique page title and a nice block of explanatory text. Google can only read words – so the more unique text on a page, the better the job Google can do in figuring out what your page is about. Example: Ardsley Park Album
  • Good anchor text: (also known as link text) The words used in anchor text tell search engines about the destination page. For example, the blue text below is anchor text. [Image: UMBC Anchor Text Example]

Areas for SEO Improvement

  • Unique Page Titles At Item Level: Individual images and documents all use a generic page title such as ‘UMBC | Digital Archive | Document Viewer’. Document example: Accidental Death of an Anarchist. Image example: 10 year old Bootblack.
  • H1 Tags: In the HTML of each page, the dominant heading of the page should use the <h1> tag. This helps Google know the phrase you are targeting with this page. It is your second-best place to emphasize your content after the page title. In the case of the item pages, there often seems to be a headline-type title at the top of the page – but it is currently not demarcated with an <h1> tag.
  • Think About Search Results and Indexing: Pages displaying results of internal searches on your site are not likely to be useful as indexed pages in Google. The thinking here is that they can dilute the focus on the item and collection level pages on your site if Google also has many search results pages in the index. If UMBC wanted their search pages to be indexed, then those pages’ URLs should be simplified and the search results pages would need a page title that somehow includes the search criteria. There are two ways that I know of to disable this indexing – blocking via the site’s robots.txt file or via a robots meta tag in the header of the search results page. The robots.txt file tells obliging search engines not to crawl certain parts of your site, while the robots meta tag tells them not to index a page they have crawled.
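
To make the checks above concrete, here is a minimal Python sketch using only the standard library. The robots.txt rules, the `/search` path, the example.org URLs and the sample page markup are all made up for illustration; they are not taken from UMBC's actual site. The sketch tests whether a URL is blocked by robots.txt, and pulls the page title, any `<h1>` text and the robots meta directive out of a page. One caveat: a crawler can only see a robots meta tag on a page it is allowed to crawl, so a site would normally pick one of the two approaches for a given page.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking crawlers to stay out of internal search results.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""

# Hypothetical item page: generic title, a robots meta tag,
# and a headline that is not marked up as an <h1>.
ITEM_PAGE = """\
<html><head>
<title>UMBC | Digital Archive | Document Viewer</title>
<meta name="robots" content="noindex, follow">
</head><body>
<div class="headline">Accidental Death of an Anarchist</div>
</body></html>
"""


class SeoAudit(HTMLParser):
    """Collect the page title, any <h1> text, and the robots meta directive."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.h1 = []
        self.robots = None
        self._current = None  # tag whose text we are currently collecting

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("title", "h1"):
            self._current = tag
            if tag == "h1":
                self.h1.append("")
        elif tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            self.robots = attrs.get("content")

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "h1":
            self.h1[-1] += data


def audit(html):
    """Return the SEO-relevant pieces found in an HTML string."""
    parser = SeoAudit()
    parser.feed(html)
    return {
        "title": parser.title.strip(),
        "h1": [text.strip() for text in parser.h1],
        "robots": parser.robots,
    }


def may_crawl(robots_txt, url, agent="*"):
    """Check a URL against robots.txt rules supplied as a string."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(agent, url)


if __name__ == "__main__":
    print(may_crawl(ROBOTS_TXT, "http://example.org/search?q=theater"))
    print(may_crawl(ROBOTS_TXT, "http://example.org/collections/ardsley-park"))
    print(audit(ITEM_PAGE))
```

Running an audit like this over a sample of item pages would quickly show how many share the same generic title or lack an `<h1>`.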

Final Thoughts

There are plenty of other things that UMBC could do to support this new website. They could create an XML sitemap of all their pages and submit it to Google (maybe they already have). They might re-title some of their pages based on using a tool like Google Insight to see which variations of a phrase are searched on most frequently. My goal here was to give you a taste of the sorts of things that catch my eye. Also, SEO is still more of an art than a science – so you will sometimes notice that what one SEO expert recommends is the opposite of what the next expert would tell you.
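
For anyone curious what an XML sitemap involves, here is a rough Python sketch that builds one from a list of page URLs. The `build_sitemap` helper and the example.org URLs are my own illustrations, not anything UMBC uses; the namespace is the one defined by the sitemaps.org protocol. A real sitemap would be generated from the site's actual list of collection and item pages.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per page URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical page URLs, for illustration only.
    pages = [
        "http://example.org/collections/ardsley-park",
        "http://example.org/items/bootblack",
    ]
    print(build_sitemap(pages))
```

The resulting file is typically saved as sitemap.xml at the root of the site and then submitted through the search engine's webmaster tools.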

In many cases, changes such as the Unique Page Title at the Item Level mentioned above may not even be possible due to software or programmer resource limitations. The trick is to take advantage of every option that is available. There are also trade-offs to be made. UMBC’s site provides some very slick interfaces for viewing the details of a group of documents, such as theater programs and other materials related to a theatrical production. The implementation elegantly handles the situation of multiple scanned images which relate to a coherent set of documents. Sometimes you can’t have both your innovative UI and perfect SEO. Then it gets down to what your goals are for your website. Are you trying to make a specific community of existing users happy by providing them with tools they can use? Or does your mission focus more on reaching out to a broader audience?

There is no silver bullet to search engine optimization. It just takes knowledge of the available tools and techniques combined with a willingness to keep learning and experimenting. Like the ‘Do-It-Yourself-Woman’ pictured above in the Nationaal Archief’s photo I found out on the Flickr Commons, you too can learn the basics and do-it-yourself. A great starting point is Google’s free SEO Guide. Also, please remember that the best time to plan your SEO strategy is before you have built your site in the first place!

I would love to do research on how much progress archives websites can make in their organic search traffic after SEO improvements. My thinking is to take a snapshot of a month of analytics (the statistics that tell you how many people are visiting your website) and then apply some SEO inspired changes. After a suitable delay (it takes some time for SEO to do its job) we consider another month of analytics to determine any change in organic traffic.
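
The before-and-after comparison described above comes down to a simple percentage change. A minimal sketch in Python, where the visit counts are invented placeholders rather than real analytics data:

```python
def percent_change(before, after):
    """Percentage change from `before` to `after` (positive means growth)."""
    if before == 0:
        raise ValueError("cannot compute a percentage change from zero visits")
    return (after - before) / before * 100.0


if __name__ == "__main__":
    # Made-up organic search visits for the month before and the month after the changes.
    baseline_month = 1200
    followup_month = 1500
    change = percent_change(baseline_month, followup_month)
    print(f"Organic traffic change: {change:+.1f}%")
```

Comparing the same calendar month across years, or checking the change in organic visits against the change in total visits, would help separate the effect of the SEO work from ordinary seasonal swings.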

Do you want me to do a quick review of your archives website to see if there is room for SEO improvement? Please contact me or add a comment to this post. I feel like there is a conference presentation in all this if we can find a good set of websites to optimize.

Finally, thank you to unsuspecting UMBC – your new website really is beautiful.

Image credit: Doe-het-zelf vrouw /Do-it-yourself-woman from Nationaal Archief on Flickr Commons.

A History of Our Own, Representing Communities and Identities on the Web (SAA09: Session 202)

[Image: LOC Flickr Commons: Sylvia Sweets Tea Room]

Andrew Flinn, University College London (UCL), was the second speaker during SAA09’s Session 202 with his presentation ‘A History of Our Own, Representing Communities and Identities on the Web’. Flinn began with the idea that archives are “a place for creating and re-working memory”. While independent community archives are constituted around many purposes, Flinn’s main interest is in communities focused on absences and mis-representation of a group or event in history. These are communities in which there is cultural, political, or artistic activism. Some of these communities may be considered ‘movements’.

How should/can archivists support local archiving activities?

Part of the challenge of online communities is the need to capture the interactions in order to not lose the full picture. The National Listing of Community Archives in the UK’s website states that they “seek to document the history of all manner of local, occupations, ethnic, faith and other diverse communities”.

The UCL’s International Centre for Archives and Records Management Research and User Studies (ICARUS) “brings together researchers in user access and description, community archives and identity, concepts and contexts of records and archives, and information policy”. Flinn is the Principal Investigator on the ICARUS project Community archives and identities which focuses on in depth interviews of 4 institutions which are “documenting and sustaining community heritage”.

These are some example online community sites:

Main Findings

  • proceed from a position that ‘knowing your own history’ is beneficial to their communities as well as to the public at large
  • the quality of the work is driven by individual passion and voluntary sacrifice
  • there is ambivalence to/about the mainstream archives sector — keen to work with mainstream archives, but scarred by past bad experiences
  • good practices now could lead to partnerships in the future
  • these are living archives – not static, but still alive and growing
  • these ideas prompt re-evaluation of conventional archives thinking
  • lots of access to digital objects – perhaps movement to online existence

We need to understand that these communities evolve and are fluid. They have a broad variety of structures, sizes and methods of working. What are the patterns in participation & ownership?

The site urban 75 has hosted extended discussions about recent UK history. Efforts include identification of places and people in uploaded photos. The site connects people around issues of housing and local services – it is very practical but it has also evolved to include this historical documentation. One example post from the Brixton Forum shows a discussion about an Old shop front revealed on Atlantic Road.

A Short Aside

Next Flinn apologized for taking his talk slightly off script. Setting his papers aside, he spoke to the audience about the eXHulme website which he had discovered the evening before while finishing his presentation. Having lived in Hulme, Manchester himself, he felt a great impact from looking through the site. He spent 4 hours looking at it – including photos such as the travellers living in their buses parked – otteburn close 1996 seen at the bottom of this page. His discovery and exploration of this site gave him a greater personal understanding of the impact of these types of community documentation projects. I felt he would have been happy to keep talking about this site and the directions it had sent his thoughts — but he then got back to his papers and continued.

Building Community Online

Interactions online are the historic record of the community itself. Archives evolve and change as the community builds and edits their online content. These heritage and archive sites work to shift from the idea of visitors to engaging users in interaction — they need users of the website to feel part of the community.

Examples of sites building community online:

How do you successfully encourage participation (rather than a large number of passive observers), which is crucial to the success of these types of initiatives? Lurking without contributing is easy – even if joining requires action. The rate of uptake may correspond with the sense of ownership. Heritage projects might encourage and sustain such participation. See Elisa Giaccardi & Leysia Palen’s article – The Social Production of Heritage through Cross-media Interaction: Making Place for Place-making.

Suggestions

  • encourage conversation and treat all stories as having value – value every account
  • promote a sense of ownership once a story has been shared
  • allow for multiple ways to engage with and share content and memories
  • recognize and let users shift from observer to active member

Flinn’s Conclusions

  • What are the challenges and perils facing community archives? Lack of resources. People are doing these things in unsustainable ways.
  • Why should we sustain independent community archives? Benefit to individuals, communities and broader society.
  • What can professional archivists do? Support and partnership with groups seeking this sort of partnership.

My Thoughts

The image I included above is from the Library of Congress’s Flickr Commons project. If you read through the comments on this photo you can see a diverse group of individuals come together to document the history of Sylvia Sweets Tea Room. This is just another example of the process of documentation being as interesting as the original image itself.

There is still so much to learn in the arena of building productive online communities. Archivists working through how to archive what online communities create will need to understand how the process of creation is documented via various software tools. As the techniques for encouraging participation evolve – archivists will need to evolve right along with them. I think it is interesting to envision archivists working in this space and supporting these types of communities — becoming as much the champions of the community itself as preservers of a community’s collaborative creations.

Image Credit: Flickr Commons Library of Congress: Sylvia Sweets Tea Room, corner of School and Main streets, Brockton, Mass

As is the case with all my session summaries from SAA2009, please accept my apologies in advance for any cases in which I misquote, overly simplify or miss points altogether in the post above. These sessions move fast and my main goal is to capture the core of the ideas presented and exchanged. Feel free to contact me about corrections to my summary either via comments on this post or via my contact form.

Archival Collections Online: Reaching Audiences Beyond The Edge of Campus (SAA09: Session 405)

[Image: The Archivist's Life, 23 May 1954]

Expanding Your Local and Global Audiences (Session 405, SAA 2009) shared how three institutions of higher education are using the web to reach out to new audiences. While the general public may still hold close the stereotype of archives as rooms full of boxes of paper (not so different from this Duke image on Flickr: “Mattie Russell, curator of manuscripts, and Jay Luvaas, director of the Flowers Collection, examine the papers of Senator Willis Smith in the library vault.”), the presenters in this session are focused on expanding people’s experience of archives beyond boxes of papers locked away in a vault. They are using the web as a tool to reach beyond the walls of their reading rooms and the edges of their campuses.

Duke University Rare Books, Manuscript & Special Collections Library (RBMSCL) : Lynn Eaton (Reference Archivist)

While I didn’t find my way into this session until the start of the next speaker’s presentation, Lynn was kind enough to share with me her personal printout of her presentation slides. The links below and any associated commentary are based solely on my own interpretation of the various screen-shots included.

University of Nevada Las Vegas (UNLV) Digital Collections: Tom Sommer (University and Technical Services Archivist)

UNLV has experimented with new technologies as they appear. Tom made a point of saying that when they started seeing others provide a feature on their websites, UNLV would find a way to try it out. A great example of this is the addition of a tag cloud and Google map to The Boomtown Years collection listed below.

Marist College Archives and Special Collections: John Ansley (Head, Archives and Special Collections)

Marist first launched their website in 2001 to raise awareness of their collections. They also used listservs and the on-campus newspaper. Ultimately their best tactic was working one-on-one with professors whose interests intersected with their collections. This led to contact with special interest groups. Working with the special interest groups led to new tag and metadata values for their collections.

My Thoughts

The archivists at all three of these educational institutions have tried new things and worked hard to share their materials with people beyond the traditional range of a reading room. The promise of the web, and all the tools and techniques it supports, is still being uncovered. It will be up to innovative archivists to keep discovering ways to push the envelope and welcome new audiences from all the corners of the globe.

Image Credit: http://www.flickr.com/photos/dukeyearlook/ / CC BY-NC-SA 2.0

SAA09: My Session on Online Communities (Session 101)

Thank you to everyone who came to our session this morning (Building, Managing, and Participating in Online Communities: Avoiding Culture Shock Online). Word on the street is that we had about 150 people in the audience.

As I mentioned during our talk – here is the Online Communities Comparison Chart. Please let me know if you have any issues accessing this document and feel free to share it with anyone you like.

If you had questions you were unable to ask during the session – please feel free to post them as comments below or send me a message via my Contact Form. I will be sure to pass questions along to all the members of our panel. I also plan to update this post with links to everyone’s slides as they appear online.

Slides from our talk:

SAA has posted video of our presentation on Facebook. The one I have linked to is the first of 7 segments. To view each in order, keep clicking ‘previous’ to view the next video.

Blog L’Archivista has a great post about our session.

THATCamp Austin 2009: Now Accepting Applications

[Image: THATCamp Austin 2009]

THATCamp Austin 2009 will be the first regional THATCamp. Slated for Tuesday evening, August 11th, 2009 in Austin, Texas, it will be held on the campus of the University of Texas, Austin. ‘THAT’ stands for The Humanities and Technology, while the Camp portion refers to the fact that it is an unconference.

What is an ‘unconference’ you ask? It is an attendee organized gathering focused on a common theme – in this case digital humanities. In the days leading up to the camp, attendees will post their ideas for discussion topics – but the final schedule will be sorted out on the ground during the gathering itself.

The original THATCamp event, organized by the Center for History and New Media (CHNM) at George Mason University, was a full two day weekend event. THATCamp Austin 2009 will be held on a single evening during the same week that the Annual Meeting of the Society of American Archivists is being held in Austin (and has the blessing of the CHNM).

I had an amazing time at the first THATCamp at CHNM in 2008 and wrote 3 posts about various presentations and discussions. Since I was unable to attend THATCamp 2009 I am especially pleased to be lending a hand in organizing this first regional THATCamp while I will be in Texas for SAA. If you can get yourself to Austin on Tuesday night August 11th and have a passion for the digital humanities — take a look at the what/when/where details over on the THATCamp Austin 2009 About Page.

A few details hijacked from the THATCamp Austin website:

How do I sign up?
Unfortunately, we only have space for 60-70 participants, so we’ll have to do some vetting. To apply for a spot, simply send email to thatcamp.austin.2009@gmail.com, telling us what you’d like to present, and what you think you will get out of the experience. Please don’t send full proposals. We’re talking about an informal note of around 250 words, max. Please include your T-shirt size and an email address you can check from public places so that we can register you with the University of Texas wi-fi system.

How much?
THATCamp Austin is free to all attendees, but a $25 donation towards T-shirts and pizza will be very much appreciated.

Don’t be afraid to take a step into the less-structured unconference world. What I experienced at the first THATCamp was a group of very enthusiastic individuals who were so pleased to find like-minded people with whom to talk – regardless of our very varied backgrounds. Folks have reported coming away from both of the THATCamps at CHNM feeling energized and rededicated to their projects – as well as having found new collaborators and opportunities for cross-pollination across all the diverse members of the digital humanities community.

DH2009: Digital Lives and Personal Digital Archives

Session Title: Digital Lives: How people create, manipulate and store their personal digital archives
Speaker: Peter Williams, UCL

Digital Lives is a joint project of UCL, the British Library and the University of Bristol.

What? We need a better understanding of how people manage digital collections on their laptops, PDAs and home computers. This is important due to the transition from paper-based personal collections to digital collections. The hope is to help people manage their digital archives before the content gets to the archives.

How? Talk to people through in-depth narrative interviews. Ask people about their very first memories of information technology. When did they first use a computer? Do they have anything from that computer? How did they move the content from that computer? People enjoyed giving this narrative digital history of their lives.

Who? 25 interviewees – both established and emerging people whose works would or might be of interest to repositories of the future.

Findings?

  • They created a detailed flowchart of users’ reported process of document manipulation.
  • Common patterns in the use of email showed that people used email across all of the following platforms and environments. Preserving email is not just a case of saving one account’s messages:
    • work email
    • Gmail/Yahoo
    • mails via Facebook
    • Twitter
  • Documented personal information styles that relate skills dimension to data security dimension.

The one question I caught was from someone who asked if they thought people would stop using folders to organize emails and digital files with the advent of easy search across documents. The speaker answered by mentioning the revelations in the paper Don’t Take My Folders Away!. People like folders.

My Thoughts

This session got me to think again about the SAA2008 session that discussed the challenges that various archivists are facing with hybrid literary collections. Matthew Kirschenbaum also pointed me to MITH’s white paper: Approaches to Managing and Collecting Born-Digital Literary Materials for Scholarly Use.

I am very interested to see how ideas about preserving personal digital records evolve. For example, what happens to the idea of a ‘draft’ in a world that auto-saves and versions documents every few minutes, as Google Documents does?

With born digital photos we run into all sorts of issues. Photos are simultaneously kept on cameras, hard drives, web-based repositories (flickr, smugmug, etc.) and off-site backup (like mozy.com). Images are deleted and edited differently across environments as well. A while back I wrote a post considering the impact of digital photography on the idea of photographic negatives as the ‘photographers’ sketchbooks’: Capa’s Found Images and Thoughts on Digital Photographers’ Sketchbooks.

I really liked the approach of this project in that it looked at general patterns of behavior rather than attempting to extrapolate from experiences of archivists with individual collections. This sort of research takes a lot of energy, but I am hopeful that creating these general user profiles will lead to best practices for preserving personal digital collections that can be applied easily as needed.

As is the case with all my session summaries from DH2009, please accept my apologies in advance for any cases in which I misquote, overly simplify or miss points altogether in the post above. These sessions move fast and my main goal is to capture the core of the ideas presented and exchanged. Feel free to contact me about corrections to my summary either via comments on this post or via my contact form.

Yahoo & Google’s Search for Reusable Images and the Flickr Commons

When I read about Yahoo Image Search’s recent addition of a filter to return only Creative Commons Flickr images, I got all excited about what this might mean for images in the Flickr Commons. So I raced off to the Yahoo Image Search page to see how it works. The short answer is that the new special rights setting of ‘no known copyright restrictions’ that they created for members of the Flickr Commons apparently doesn’t count.

For my test I searched for an exact match on “Ticket with portrait of George Washington”. This returns one result – the one image in Flickr with the same name, from The Field Museum in Flickr Commons. If you click on the ‘More Filters’ link, you will see other ways to filter your results – including the option to restrict your results to only include images whose creators permit reuse. [Image: Creator permits reuse - Yahoo image search]

Next I clicked on the ‘Creator allows reuse’ option and my one result disappeared! Quite disappointing in my book.

Google is also getting onto the ‘make it easy to search for reusable images’ bandwagon. Search Engine Land reported that Google Images Quietly Adds Creative Commons Filter. That post pointed me to Google Operating System’s search interface that lets you play with the options that Google has available. After clicking through to some of the images returned by a Google Image Search for Creative Commons images of archives, the way the Google model appears to work is to look for Creative Commons badges or links on the page with the image. I even found Flickr Creative Commons images, but when I tried to find my Flickr Commons image of the ticket used above for my Yahoo image search experiment it wasn’t returned by Google either.

So if an archives (or museum or library) posts images on a page that indicates that the content is licensed under Creative Commons, it seems those images will then appear in Google’s image search as reusable. That is good news! Another way to get users to find your public domain images.

The question I am left with is how to resolve the gap between Flickr Commons’ ‘no known copyright restrictions’ rights statement and both Google and Yahoo’s definition of reusable content.

Archivists and New Technology: When Do The Records Matter?

Navigating the rapidly changing landscape of new technology is a major challenge for archivists. As quickly as new technologies come to market, people adopt them and use them to generate records. Businesses, non-profits and academic institutions constantly strive to find ways to be more efficient and to cut their budgets. New technology often offers the promise of cost reductions. In this age of constantly evolving software and technological innovation, how do archivists know when a new technology is important or established enough to take note of? When do the records generated by the latest and greatest technology matter enough to save?

Below I have included two diagrams that seek to illustrate the process of adopting new technology. I think they are both useful in aiding our thinking on this topic.

The first is the “Hype Cycle”, as proposed by analyst Jackie Fenn at Gartner Group. It breaks down the phases that new technologies move through as they progress from their initial concept through to broad acceptance in the marketplace. The generic version of the Hype Cycle diagram below is from the Wikipedia entry on hype cycle.

Gartner Hype Cycle (Wikipedia)

Each summer, Gartner comes out with a new update on Where Are We In The Hype Cycle?. Last summer, microblogging was just entering the ‘Peak of Inflated Expectations’, public virtual worlds were sliding down into the ‘Trough of Disillusionment’ and location aware applications were climbing back up the ‘Slope of Enlightenment’. There is even a book about it: Mastering the Hype Cycle: How to Choose the Right Innovation at the Right Time.

The other diagram is the Technology Adoption Lifecycle from Geoffrey Moore’s Crossing the Chasm. This view of the technology cycle takes the perspective of bringing new technology to market. How do you cross the chasm between early adopters and the general population?

Technology Adoption Lifecycle (Wikipedia)

Archivists need to consider new technology from two different perspectives: when to use it to further their own goals as archivists, and when to address the need to preserve records being generated by new technology. A fair bit of attention has been focused on figuring out how to get archivists up to speed on new web technology. In August 2008, ArchivesNext posted about hunting for Web 2.0 related sessions at SAA2008 and Friends Told Me I Needed A Blog posted about SAA and the Hype Cycle shortly thereafter.

But how do we know when a technology is ‘important enough’ to start worrying about the records it generates? Do we focus our energy on technology that has crossed the chasm and been adopted by the ‘early majority’? Do we watch for signs of adoption by our target record creators?

I expect that the answer (such as there can be one answer!) will be community specific. As I learned in the 2007 SAA session about preserving digital records of the design community, waiting for a single clear technology or software leader to appear can lead to lost or inaccessible records. Archivists working with similar records already come together to support one another through round tables, mailing lists and conference sessions. I have noticed that I often find the most interesting presentations are those that discuss the challenges a specific user community is facing in preserving their digital records. The 2008 SAA session about hybrid analog/digital literary collections discussed issues related to digital records from authors. Those who worry about records captured in geographic information systems (GIS) were trying to sort out how to define a single GIS electronic record when last I dipped my toes into their corner of the world in the Fall of 2006.

It is not feasible to imagine archivists staying ahead of every new type of technology and attempting to design a method for archiving every possible type of digital record being created. What we can do is make it a priority for a designated archivist within every ‘vertical’ community (government, literary, architecture… etc) to keep their ear to the ground about the use of technology within that community. This could be a community of practice of its own: a group that shares info about the latest trends they are seeing while sharing their best practices for handling the latest types of records being seen.

The good news is that archivists aren’t the only ones who want to be able to preserve access to born digital records. Consider Twitter, which only provides easy access to recent tweets. A whole raft of third-party tools built to archive data from Twitter are already out there, answering the demand for a way to back up people’s tweets.

I don’t think archivists always have the luxury of waiting for technology to be adopted by the majority of people and to reach the ‘Plateau of Productivity’. If you are an archivist who works with a community that uses cutting edge technology, you owe it to your community to stay in the loop with how they do their work now. Just because most people don’t use a specific technology doesn’t mean that an individual community won’t pick it up and use it to the exclusion of more common tools.

The design community mentioned above spoke of working with those creating the tools for their community to ensure easy archiving down the line. In our fast-paced world of innovation, a subset of archivists needs to stay involved with the current business practices of each vertical being archived. This group can work together to identify challenges, brainstorm solutions, build relationships with the technology communities and then disseminate best practices throughout the archives community. I did find a web page for the SAA’s Technology Best Practices Task Force and its document Managing Electronic Records and Assets: A Working Bibliography, but I think that I am imagining something more ongoing, more nimble and more tied into each of the major communities that archivists must support. Am I describing something that already exists?