Each week brings announcements of archives launching new websites. Today both my email and Twitter told me about University of Maryland, Baltimore County’s new Digital Collections  site. Who can resist peeking at new materials available online?
I have spent much of the past year learning the details of Search Engine Optimization. Usually shortened to SEO, this simply refers to the use of techniques which improve the traffic sent to a website via organic search. Want your webpage to show up at the top of the list for a specific search in Google? You want to work on your SEO.
So when I look at a new archives website, I can’t help but keep an eye open for how well the site is optimized for search engines.
I hope that UMBC will forgive me for nitpicking their new site. A lot of their choices are great for SEO, but they also have room for improvement.
Things Done Well for SEO
- Home Page Title & Description: The site’s home page has a good meta description. This is the text displayed below the link on a search results page.
- Unique Page Titles At Collection Level: Each photography collection homepage has a unique page title and a nice block of explanatory text. Google can only read words – so the more unique text on a page, the better the job Google can do in figuring out what your page is about. Example: Ardsley Park Album 
- Good anchor text (also known as link text): The words used in anchor text tell search engines about the destination page. For example, a link that reads ‘Ardsley Park Album’ says far more about its destination than a link that reads ‘click here’.
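If you want to check these three things on your own pages, here is a minimal sketch using only Python's standard library. It pulls out the page title, the meta description, and the anchor text of each link. The sample HTML is invented for illustration and is not UMBC's actual markup, and the class name SeoSnapshot is my own.

```python
from html.parser import HTMLParser


class SeoSnapshot(HTMLParser):
    """Collects the page title, meta description, and anchor text from HTML."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self.anchors = []      # list of (href, link text) pairs
        self._in_title = False
        self._href = None      # href of the <a> currently open, if any
        self._text = []        # text pieces seen inside that <a>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""
        elif tag == "a" and attrs.get("href"):
            self._href = attrs["href"]
            self._text = []

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "a" and self._href is not None:
            self.anchors.append((self._href, "".join(self._text).strip()))
            self._href = None

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._href is not None:
            self._text.append(data)


# Hypothetical page, invented for illustration; not UMBC's actual markup.
SAMPLE_HTML = """
<html><head>
<title>Example Photograph Album | Example Digital Collections</title>
<meta name="description" content="Browse 120 photographs from an example album.">
</head><body>
<a href="/collections/album">Example Photograph Album</a>
<a href="/about">click here</a>
</body></html>
"""

snapshot = SeoSnapshot()
snapshot.feed(SAMPLE_HTML)
print("Title:", snapshot.title)
print("Description:", snapshot.description)
for href, text in snapshot.anchors:
    print("Link:", href, "->", text)
```

Running this against a handful of your own pages makes generic titles, empty descriptions, and vague link text easy to spot.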
Areas for SEO Improvement
- Unique Page Titles At Item Level: Individual images and documents all use a generic page title such as ‘UMBC | Digital Archive | Document Viewer’. Document example: Accidental Death of an Anarchist. Image example: 10 year old Bootblack.
- H1 Tags: In the HTML of each page, the dominant heading of the page should use the <h1> tag. This helps Google know the phrase you are targeting with this page. It is your second best place to emphasize your content after the page title. On the item pages, there often seems to be a headline-style title at the top of the page, but it currently is not marked up with an <h1> tag.
- Think About Search Results and Indexing: Pages displaying the results of internal searches on your site are not likely to be useful as indexed pages in Google. The thinking here is that having many search results pages in Google’s index can dilute the focus on your item and collection level pages. If UMBC wanted their search pages to be indexed, then those pages’ URLs should be simplified and each search results page would need a page title that somehow includes the search criteria. There are two ways that I know of to keep these pages out of the index: blocking them via the site’s robots.txt file or adding a robots meta tag to the <head> of the search results page. The robots.txt file asks obliging search engines not to crawl certain parts of your site, while the meta tag lets them crawl a page but asks them not to index it.
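To see how the robots.txt approach behaves, here is a small sketch using Python's built-in urllib.robotparser. The rules, the /search path, and the archives.example.org URLs are all assumptions made up for this example; I do not know how UMBC's search URLs are actually structured. The meta tag alternative would be a line such as <meta name="robots" content="noindex"> in the page's <head>.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: the /search path is an assumption, not UMBC's URL scheme.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical URLs on a made-up archives site.
SEARCH_URL = "https://archives.example.org/search?q=bootblack"
COLLECTION_URL = "https://archives.example.org/collections/album"

for url in (SEARCH_URL, COLLECTION_URL):
    verdict = "may crawl" if parser.can_fetch("*", url) else "blocked"
    print(url, "->", verdict)
```

A check like this is a quick way to confirm that a new rule blocks the search results pages without accidentally blocking your collection and item pages too.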
There are plenty of other things that UMBC could do to support this new website. They could create an XML sitemap of all their pages and submit it to Google (maybe they already have). They might re-title some of their pages after using a tool like Google Insight to see which variations of a phrase are searched on most frequently. My goal here was to give you a taste of the sorts of things that catch my eye. Also, SEO is still more of an art than a science, so you will sometimes notice that what one SEO expert recommends is the opposite of what the next expert would tell you.
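An XML sitemap is simple enough to generate yourself. The sketch below builds a minimal one with Python's standard library, following the sitemaps.org format. The page URLs are placeholders I made up, and build_sitemap is my own helper name; a real site would pull its list of pages from its database or collection management system.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML sitemap listing each URL in a <loc> element."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder URLs, invented for illustration.
PAGES = [
    "https://archives.example.org/collections/album",
    "https://archives.example.org/items/1234",
]

sitemap_xml = build_sitemap(PAGES)
print(sitemap_xml)
```

Save the output as sitemap.xml at the root of your site and then submit its address to the search engines.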
In many cases, changes such as the Unique Page Titles At Item Level mentioned above may not even be possible due to software or programmer resource limitations. The trick is to take advantage of every option that is available. There are also trade-offs to be made. UMBC’s site provides some very slick interfaces for viewing the details of a group of documents, such as theater programs and other materials related to a theatrical production. The implementation elegantly handles the situation of multiple scanned images which relate to a coherent set of documents. Sometimes you can’t have both your innovative UI and perfect SEO. Then it gets down to what your goals are for your website. Are you trying to make a specific community of existing users happy by providing them with tools they can use? Or does your mission focus more on reaching out to a broader audience?
There is no silver bullet for search engine optimization. It just takes knowledge of the available tools and techniques combined with a willingness to keep learning and experimenting. Like the ‘Do-It-Yourself-Woman’ pictured above in the Nationaal Archief’s photo I found on the Flickr Commons, you too can learn the basics and do it yourself. A great starting point is Google’s free SEO Guide. Also, please remember that the best time to plan your SEO strategy is before you have built your site in the first place!
I would love to do research on how much progress archives websites can make in their organic search traffic after SEO improvements. My thinking is to take a snapshot of a month of analytics (the statistics that tell you how many people are visiting your website) and then apply some SEO-inspired changes. After a suitable delay (it takes some time for SEO to do its job), I would compare another month of analytics against the first to determine any change in organic traffic.
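The comparison itself is simple arithmetic. Here is a sketch of how I might compute it; the visit counts are invented purely to show the calculation and are not real measurements from any site.

```python
def percent_change(before, after):
    """Percentage change in organic visits from the first month to the second."""
    if before == 0:
        raise ValueError("the baseline month needs at least one visit")
    return (after - before) / before * 100


# Invented numbers, for illustration only.
VISITS_BEFORE = 1200   # organic search visits in the month before the changes
VISITS_AFTER = 1500    # organic search visits in the later comparison month

change = percent_change(VISITS_BEFORE, VISITS_AFTER)
print(f"Organic traffic changed by {change:.1f}%")
```

Seasonal swings can muddy a comparison like this, so comparing against the same month of the previous year is a useful sanity check where that data exists.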
Do you want me to do a quick review of your archives website to see if there is room for SEO improvement? Please contact me or add a comment to this post. I feel like there is a conference presentation in all this if we can find a good set of websites to optimize.
Finally, thank you to unsuspecting UMBC – your new website really is beautiful.
Image credit: Doe-het-zelf vrouw /Do-it-yourself-woman  from Nationaal Archief on Flickr Commons.