- Spellbound Blog - https://www.spellboundblog.com -

German Federal Archives, Crowdsourcing & the Wikimedia Commons

Posted By Jeanne On January 26, 2009 @ 1:01 am In access,digitization,metadata,photography,virtual collaboration,web 2.0 | Comments Disabled

^[1]

I spotted the New York Times article Historical Photos in Web Archives Gain Vivid New Lives ^[2] via Dan Cohen’s Twitter Feed ^[3]. The article is a nice treatment of the difference between the Library of Congress ^[4]’s 50-photos-a-week contributions to the Flickr Commons ^[5] and the German Federal Archives ^[6]’ contribution of 100,000 images to the Wikimedia Commons ^[7] (described as “the virtual archive for material used in Wikipedia articles”).

I took a look at the details of this project – starting with the homepage of the Commons:Bundesarchiv ^[8] on the Wikimedia Commons. This passage explains one of the goals of the Bundesarchiv Gallery ^[9]:

Very old photographs have become public domain, and events and persons of today can be photographed by Wikipedians with their digital cameras. But for the time between there is a huge gap in Wikipedia articles. The donation of Federal Archive is important to close that gap, and it is to hope that it can serve as a model to other institutions in Germany or elsewhere.

Also, each individual photo includes this disclaimer:

For documentary purposes the German Federal Archive often retained the original image captions, which may be erroneous, biased, obsolete or politically extreme. Factual corrections and alternative descriptions are encouraged separately from the original description.

There is a special category to call out instances of these types of descriptions – BArch images with biased descriptions ^[10]. In my exploration, I found only a very few whose original image captions had been translated into English. One example is the photo of a single-room home for a family of eleven ^[11].

In contrast to the Library of Congress’s addition of 50 photos a week, the German Federal Archive plans to add “a few thousand images a month”. The Commons:Bundesarchiv To Do ^[12] list is also interesting reading. The To Do page includes tasks in both German and English (though the wiki discussion page is all in German). I love having the opportunity to read about issues confronting those working on this sort of project. For example – there is a discussion about how to determine if an image should remain Uncategorized ^[13]. What if only one person out of three is tagged? Does it still ‘deserve’ to remain marked as ‘uncategorized’?

New categories created for use in this project need to use a special template so that they show up properly within the sub-categories of the Category:Images from the German Federal Archive page ^[14]. For example – the page which sorts images by country ^[15] has 64 sub-categories at the time of this post. A new country added using this template approach would immediately show up on the images by country sub-category page.

I will say that the learning curve for classifying images within the Wikimedia Commons in general, and the Bundesarchiv project in particular, is much steeper than for tagging images in the Flickr Commons. There is a handy CommonSense tool ^[16] (available via the ‘find categories’ tab on any image) that will suggest categories based on keywords, but even that is a bit overwhelming for a beginner.

As an example, let’s look at the image I chose for this post of two boys finishing their ice cream in 1949. Here are the categories currently assigned:

Images from the German Federal Archive, year 1949 ^[17]
Images from the German Federal Archive, location Berlin ^[18]
History of Germany ^[19]
Ice cream ^[20]
Black and white photographs of children ^[21]
Black and white photographs of Germany ^[22]
Standing males ^[23]
Photographs by Brenner ^[24]

Let’s take a look at the wiki text that sets these categories. First there is the special template for the project, which specifies the year and location. I believe that these are attributes uploaded with the original photograph. This gives us the first two categories in our list (emphasis mine):

{{BArch-License|
|signature=Bild 183 1984-0202-506
|batch=Bild 183
|year=1949
|month=
|location=Berlin
|PD=
}}
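To make the link between the template and those first two categories concrete, here is a minimal Python sketch. It rests on an assumption of mine: that the template turns the year and location parameters into the category names shown in the list above. I have not checked how the template is actually implemented, and the function names parse_template_params and project_categories are hypothetical.

```python
def parse_template_params(wikitext):
    """Return {name: value} for the |name=value lines of one template call."""
    params = {}
    for line in wikitext.splitlines():
        line = line.strip()
        # Parameter lines look like "|year=1949"; skip "{{...|" and "}}".
        if line.startswith("|") and "=" in line:
            name, _, value = line[1:].partition("=")
            params[name.strip()] = value.strip()
    return params


def project_categories(params):
    """Build the project categories (ASSUMED naming pattern, taken from
    the category list in this post, not from the template source)."""
    prefix = "Images from the German Federal Archive"
    categories = []
    if params.get("year"):
        categories.append(f"{prefix}, year {params['year']}")
    if params.get("location"):
        categories.append(f"{prefix}, location {params['location']}")
    return categories


SAMPLE = """{{BArch-License|
|signature=Bild 183 1984-0202-506
|batch=Bild 183
|year=1949
|month=
|location=Berlin
|PD=
}}"""

params = parse_template_params(SAMPLE)
for category in project_categories(params):
    print(category)
```

Empty parameters such as month and PD produce no category in this sketch, which matches the fact that only a year and a location category appear on the ice cream photo.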

Then we get to the standard Wikimedia Commons categories. These are the categories most akin to tags in Flickr. These are the categories which will promote discovery of these images alongside images from other sources from across the Wikimedia Commons:

[[Category:History of Germany]]
[[Category:Ice cream]]
[[Category:Black and white photographs of children]]
[[Category:Black and white photographs of Germany]]
[[Category:Standing males]]
[[Category:Photographs by Brenner]]
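For anyone who wants to compare category assignments across many images, the category links can be pulled out of a page's wiki text with a regular expression. This is only a sketch that works on text you already have in hand: it does not call any Wikimedia API, and the extract_categories name is my own invention.

```python
import re

# Matches [[Category:Name]] and [[Category:Name|sort key]], capturing Name.
CATEGORY_LINK = re.compile(
    r"\[\[\s*Category\s*:\s*([^\]|]+?)\s*(?:\|[^\]]*)?\]\]",
    re.IGNORECASE,
)


def extract_categories(wikitext):
    """Return the category names linked in the given wiki text, in order."""
    return [match.group(1) for match in CATEGORY_LINK.finditer(wikitext)]


SAMPLE = """[[Category:History of Germany]]
[[Category:Ice cream]]
[[Category:Black and white photographs of children]]
[[Category:Black and white photographs of Germany]]
[[Category:Standing males]]
[[Category:Photographs by Brenner]]"""

categories = extract_categories(SAMPLE)
print(len(categories), "categories found")
for name in categories:
    print(name)
```

Counting the results per image would be one rough way to track how much hand categorization the public has contributed over time.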

These categories were clearly added by hand, since the original caption reads (by my rough translation) At the beach: “Is it already gone?”. I suppose I could go in and add [[Category:Beaches]] ^[25], but I am honestly not sure if there is enough beach in the photo to warrant such a classification.

I am very curious to see comparison stats of the assignment of categories/tags to images in both the Flickr & Wikimedia Commons a year from now. How will we measure success? How will we grade the accuracy of metadata assigned by the public? Which images will get more public views and usage – those added to the Flickr Commons or those added to the Wikimedia Commons?

For now, I am happy to set aside all these thorny questions. I am just so pleased to see a new and ambitious experiment in crowdsourcing image metadata.


Comments Disabled To "German Federal Archives, Crowdsourcing & the Wikimedia Commons"

#1 Comment By Gary On January 28, 2009 @ 6:17 am

Interesting article, thanks! I was not aware of either the German Federal Archives or the Library of Congress contributions to freely available photos.

#2 Pingback By Open Knowledge Foundation Blog » Blog Archive » Open Everything Berlin + CC Salon Berlin On February 27, 2009 @ 10:13 am

[…] The donation received good press coverage (see articles in the New York Times, and Spiegel Online) and is an outstanding example of a cultural heritage institution making material available under an open license. (The other high-profile example is Flickr Commons. There’s an interesting blog post comparing the two here.) […]

#3 Pingback By DH2009: Digital Curiosities and Amateur Collections – Spellbound Blog On June 29, 2009 @ 10:24 pm

[…] The Flickr Commons is a big step forward, but it isn’t the only option. There are also varying opinions about how successful the crowdsourcing aspect of the Flickr Commons is for memory institutions. A lot of this goes back to a core question “how do we know if we have succeeded?”. There is much to be said for setting out clear goals when launching online initiatives. Is your goal increased traffic to your site or crowdsourcing of metadata? A great example of an initiative whose goal is clearly collection of crowdsourced metadata is the German Federal Archives who chose to use the Wikimedia Commons for their photo metadata initiative. […]

#4 Comment By Mark On December 28, 2010 @ 2:46 pm

A few thousand images per month? That’s no small undertaking, but I can’t imagine how 50 photos from the Library of Congress is acceptable. Overworked staff members, sure… but come on guys, that should only take an hour or so. Certainly we have a couple of interns who can help.