Category: SEO

SAA2010: SEO and Archives Websites

Just a quick reminder that I will be presenting tomorrow morning at SAA2010 on the topic of search engine optimization and archives websites. I am part of session 502, officially titled ‘Not on Google? It Doesn’t Exist: Findability and Search Engine Optimization for Archives’. My specific portion of the presentation is titled ‘Building Archives Websites That Google Will Love’ and will be a general introduction to SEO concepts and why they are important to those involved in the creation of websites for archives and other cultural heritage institutions. It will include some basic tips and techniques.

My two co-presenters, Matt Herbison and Mark Matienzo, will discuss more in-depth issues related to website architecture, URLs and increasing links back into your website. We hope you can join us, even though our session is during the less-than-pleasant 8am Saturday morning time slot. I will be posting my slides after our session and linking to them from my presentations page. I plan to pick up some donuts to sweeten the deal!

SEO Evaluation of an Archival Website: Looking at UMBC’s Digital Collections

Each week brings announcements of archives launching new websites. Today both my email and Twitter told me about the University of Maryland, Baltimore County’s new Digital Collections site. Who can resist peeking at new materials available online?

I have spent much of the past year learning the details of Search Engine Optimization. Usually shortened to SEO, this simply refers to the use of techniques which improve the traffic sent to a website via organic search. Want your webpage to show up at the top of the list for a specific search in Google? You want to work on your SEO.

So when I look at a new archives website, I can’t help but keep an eye open for how well the site is optimized for search engines.

I hope that UMBC will forgive me for nitpicking their new site. A lot of their choices are great for SEO, but they also have room for improvement.

Things Done Well for SEO

  • Home Page Title & Description: The site’s home page has a good meta description. This is the text displayed below the link on a search results page. [Screenshot: UMBC Digital Collection Google result]
  • Unique Page Titles At Collection Level: Each photography collection homepage has a unique page title and a nice block of explanatory text. Google can only read words – so the more unique text on a page, the better the job Google can do in figuring out what your page is about. Example: Ardsley Park Album
  • Good anchor text: (also known as link text) The words used in anchor text tell search engines about the destination page. In the screenshot from UMBC’s site, the blue text is anchor text. [Screenshot: UMBC anchor text example]
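To make these three checks concrete, here is a minimal sketch in Python that pulls the page title, meta description and anchor text out of a page using only the standard library. The sample HTML, the `OnPageAudit` class and the list of "weak" link phrases are all made up for illustration; this is not UMBC's markup, and a real audit would fetch live pages.

```python
from html.parser import HTMLParser


class OnPageAudit(HTMLParser):
    """Collect the page title, meta description, and anchor text of each link."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self.anchors = []      # (href, anchor text) pairs
        self._in_title = False
        self._href = None      # href of the <a> currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content")
        elif tag == "a" and attrs.get("href"):
            self._href = attrs["href"]
            self._text = []

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "a" and self._href is not None:
            # Collapse runs of whitespace inside the link text.
            text = " ".join("".join(self._text).split())
            self.anchors.append((self._href, text))
            self._href = None

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        if self._href is not None:
            self._text.append(data)


# Hypothetical page, invented for this example.
SAMPLE = """
<html><head>
<title>Ardsley Park Album | Example Digital Collections</title>
<meta name="description" content="Photographs from a hypothetical album.">
</head><body>
<a href="/collections/ardsley">Ardsley Park Album</a>
<a href="/more">click here</a>
</body></html>
"""

page = OnPageAudit()
page.feed(SAMPLE)
page.close()

# Link text that says nothing about the destination page (an assumed list).
WEAK_PHRASES = {"click here", "here", "more", "read more"}
weak_links = [href for href, text in page.anchors if text.lower() in WEAK_PHRASES]

print("Title:", page.title)
print("Description:", page.description)
print("Links with uninformative anchor text:", weak_links)
```

Running something like this across a handful of pages is a quick way to spot a missing description or a crop of "click here" links.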

Areas for SEO Improvement

  • Unique Page Titles At Item Level: Individual images and documents all use a generic page title such as ‘UMBC | Digital Archive | Document Viewer’. Document example: Accidental Death of an Anarchist. Image example: 10 year old Bootblack.
  • H1 Tags: In the HTML of each page, the dominant heading of the page should use the <h1> tag. This helps Google know the phrase you are targeting with this page. It is your second-best place to emphasize your content after the page title. In the case of the item pages, there often seems to be a headline-type title at the top of the page – but it currently is not demarcated with an <h1> tag.
  • Think About Search Results and Indexing: Pages displaying results of internal searches on your site are not likely to be useful as indexed pages in Google. The thinking here is that they can dilute the focus on the item and collection level pages on your site if Google also has many search results pages in the index. If UMBC wanted their search pages to be indexed, then those pages’ URLs should be simplified and the search results pages would need a page title that somehow includes the search criteria. There are two ways that I know of to keep these pages out of the index: blocking them via the site’s robots.txt file, or adding a robots meta tag to the header of the search results page. The robots.txt file asks obliging search engines not to crawl certain parts of your site, while the meta tag asks them not to index a page they have crawled.
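The heading and robots points above can be checked with a short script. The following is only a sketch using Python's standard library: the robots.txt rules, the example.org URLs and the sample search page are invented for illustration and say nothing about how UMBC's site is actually configured.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking all crawlers to stay out of internal search results.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""

robots = RobotFileParser()
robots.parse(ROBOTS_TXT.splitlines())


class HeadingAndRobotsCheck(HTMLParser):
    """Record <h1> headings and any <meta name="robots"> directives on a page."""

    def __init__(self):
        super().__init__()
        self.h1 = []
        self.robots_meta = []
        self._in_h1 = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "h1":
            self._in_h1 = True
            self.h1.append("")
        elif tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.robots_meta += [d.strip().lower() for d in content.split(",") if d.strip()]

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False

    def handle_data(self, data):
        if self._in_h1:
            self.h1[-1] += data


# Hypothetical search results page: a styled headline, but no <h1>.
SEARCH_PAGE = """
<html><head>
<title>Search results for theater</title>
<meta name="robots" content="noindex, follow">
</head><body><div class="headline">Search results</div></body></html>
"""

check = HeadingAndRobotsCheck()
check.feed(SEARCH_PAGE)
check.close()

may_crawl_search = robots.can_fetch("*", "https://example.org/search?q=theater")
may_crawl_item = robots.can_fetch("*", "https://example.org/items/123")

print("Crawlers may fetch search results:", may_crawl_search)
print("Crawlers may fetch item pages:", may_crawl_item)
print("Page has an <h1>:", bool(check.h1))
print("Page asks not to be indexed:", "noindex" in check.robots_meta)
```

One design note: the two approaches do not combine well on the same page. A crawler that obeys a robots.txt block never fetches the page, so it never sees a robots meta tag placed there. In practice you would pick one method per section of the site.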

Final Thoughts

There are plenty of other things that UMBC could do to support this new website. They could create an XML sitemap of all their pages and submit it to Google (maybe they already have). They might re-title some of their pages after using a tool like Google Insights for Search to see which variations of a phrase are searched for most frequently. My goal here was to give you a taste of the sorts of things that catch my eye. Also, SEO is still more of an art than a science – so you will sometimes notice that what one SEO expert recommends is the opposite of what the next expert would tell you.
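For anyone curious what an XML sitemap involves, here is a minimal sketch that builds one in Python from a list of page URLs. The example.org URLs and the `build_sitemap` helper are placeholders of my own; the namespace is the one defined by the sitemaps.org protocol. A real sitemap for a large collection would be generated from the site's database and may need to be split into several files.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & in the URL text for us.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical collection and item pages.
pages = [
    "https://example.org/collections/ardsley",
    "https://example.org/items/123",
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

The resulting file is typically saved as sitemap.xml at the root of the site and then submitted through the search engine's webmaster tools.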

In many cases, changes such as the Unique Page Titles at Item Level mentioned above may not even be possible due to software or programmer resource limitations. The trick is to take advantage of every option that is available. There are also trade-offs to be made. UMBC’s site provides some very slick interfaces for viewing the details of a group of documents, such as theater programs and other materials related to a theatrical production. The implementation elegantly handles the situation of multiple scanned images which relate to a coherent set of documents. Sometimes you can’t have both your innovative UI and perfect SEO. Then it gets down to what your goals are for your website. Are you trying to make a specific community of existing users happy by providing them with tools they can use? Or does your mission focus more on reaching out to a broader audience?

There is no silver bullet to search engine optimization. It just takes knowledge of the available tools and techniques combined with a willingness to keep learning and experimenting. Like the ‘Do-It-Yourself-Woman’ pictured above in the Nationaal Archief’s photo I found on the Flickr Commons, you too can learn the basics and do-it-yourself. A great starting point is Google’s free SEO Guide. Also, please remember that the best time to plan your SEO strategy is before you have built your site in the first place!

I would love to do research on how much progress archives websites can make in their organic search traffic after SEO improvements. My thinking is to take a snapshot of a month of analytics (the statistics that tell you how many people are visiting your website) and then apply some SEO-inspired changes. After a suitable delay (it takes some time for SEO to do its job) we would look at another month of analytics to determine any change in organic traffic.
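The before-and-after comparison boils down to simple arithmetic. Here is a rough sketch in Python; the visit counts are invented purely to show the calculation, and the traffic source names are assumptions about how an analytics tool might label them.

```python
def percent_change(before, after):
    """Percent change from the baseline month to the follow-up month."""
    if before == 0:
        raise ValueError("baseline month has no visits; percent change is undefined")
    return (after - before) / before * 100.0


def organic_share(visits_by_source):
    """Fraction of all visits that arrived via organic search."""
    return visits_by_source["organic"] / sum(visits_by_source.values())


# Made-up monthly visit counts, grouped by traffic source.
baseline = {"organic": 1200, "direct": 800, "referral": 400}
follow_up = {"organic": 1500, "direct": 820, "referral": 410}

organic_growth = percent_change(baseline["organic"], follow_up["organic"])
share_before = organic_share(baseline)
share_after = organic_share(follow_up)

print(f"Organic visits changed by {organic_growth:.1f}%")
print(f"Organic share of traffic: {share_before:.1%} before, {share_after:.1%} after")
```

Looking at organic search as a share of all traffic, and not just the raw count, helps separate the effect of SEO changes from an overall rise or dip in visits, though seasonal swings would still need to be kept in mind.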

Do you want me to do a quick review of your archives website to see if there is room for SEO improvement? Please contact me or add a comment to this post. I feel like there is a conference presentation in all this if we can find a good set of websites to optimize.

Finally, thank you to unsuspecting UMBC – your new website really is beautiful.

Image credit: Doe-het-zelf vrouw / Do-it-yourself-woman from Nationaal Archief on Flickr Commons.

Video News Archives: Digitization as Good Business

My work now involves more SEO (Search Engine Optimization), so I have added SEO-focused blogs to my RSS feed reader. Today I spotted Search Engine Land’s post Business Opportunities For Video News Archives. Stephen Baker calculates that 35 years’ worth of archive footage equals 51,100 hours of content per station. With approximately 20 stations per broadcast group, he estimates a cost of $30 million per group to digitize each broadcast group’s archive of news footage. See the original article for more details on his calculations.
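To get a feel for the scale of those figures, here is a quick back-of-the-envelope calculation in Python. Only the three input numbers come from the summary above; the derived values (hours per day, cost per hour) are my own arithmetic and assume a 365-day year and that the $30 million covers every hour of footage. Baker's article may break the costs down differently.

```python
# Figures quoted from Baker's estimate.
YEARS = 35
HOURS_PER_STATION = 51_100
STATIONS_PER_GROUP = 20
COST_PER_GROUP = 30_000_000  # US dollars

# Derived values (my arithmetic, not from the article).
hours_per_year = HOURS_PER_STATION / YEARS
hours_per_day = hours_per_year / 365           # assumes a 365-day year
group_hours = HOURS_PER_STATION * STATIONS_PER_GROUP
cost_per_hour = COST_PER_GROUP / group_hours   # assumes all hours are digitized

print(f"Footage per station per day: {hours_per_day:.1f} hours")
print(f"Footage per broadcast group: {group_hours:,} hours")
print(f"Implied digitization cost: ${cost_per_hour:.2f} per hour")
```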

He then proposes 3 approaches to monetizing these efforts and leveraging the resulting digitized video:

  1. Media-Centric Wikipedia – complete with an expectation that social media contributions would provide a “scalable way for creating editorial metadata, such as descriptions and story summaries that would be costly to otherwise create”. This makes me think of Flickr Commons for video.
  2. Education Site – akin to NBCU’s iCue site I mentioned in my post about NBC News Archive footage on Hulu. “Efforts like this provide educational/subscription opportunities as well as sponsorship/advertising opportunities—what advertiser doesn’t want to get in front of 13 – 18 year olds?”
  3. News Site Extension – described as “bolting the news archive onto the existing site”. The major benefit of this is that “more content provides more SEO opportunity and, hence, larger audience reach.”

Baker concludes:

In a market where traditional media is struggling to create unique and compelling online experiences and business models, the archive represent a differentiator that can jump-start audience building and monetization initiatives. Not only is it an important representation of world history that must be saved for “preservation-sake”, the archive represents a large, untapped online opportunity.  Who will be first to realize its potential?

The ultimate goal of all three of these scenarios is to offset the extreme expense of digitizing thousands of hours of news footage. I think it is refreshing to see a perspective from outside the cultural heritage corner of the world that still sees video archives as rich resources worth preserving. I also like seeing ideas that are pitched in a manner that should catch the attention of those making budgets and struggling to find funding for large digitization efforts.

Image Credit: Flickr photo OSU Spring Game 2006 Media Lineup by Chris Metcalf