Category: virtual collaboration

Why Preserve? To Connect!

In honor of 2025’s World Digital Preservation Day (WDPD), I am finally taking a leap back into posting here. My last post was in February of 2020 – and while I can see a half-dozen partially written posts lurking behind the scenes, none of them were ever finished “enough” to actually post.

So… Happy World Digital Preservation Day! I just spent the last 4 days attending iPRES 2025 virtually. I was in Maryland while most of the attendees were in person on the other side of the planet in New Zealand. Luckily, I’m a night owl, so attending sessions from 3pm – 10:30pm my time was just fine with me.

The conference closed last night (still Wednesday) for me – but now I’ve caught up to Thursday, November 6th and have the time to reflect on this year’s WDPD theme of “Why Preserve?”. Please keep in mind that the contents of this post, along with everything here on Spellbound Blog, reflect only my thoughts as an individual.

First, some context about me. I love stories and I love connection of all kinds – connections among people, connections between the past and all our possible futures, and connections that build community. Somewhere at their intersection is where I see the role for preservation. Without our digital records (preserved in such a way that they retain their context, can be trusted to be authentic, and can be interacted with in a meaningful way) we will lose stories of the past and all the evidence they contain. We will lose many kinds of connection.

Many communities have decided that this reason for preserving means that time, energy, and funding should be allocated toward this goal. One of iPRES 2025’s themes was Tūhono (Connect). This thread ran through keynotes, posters, bake-off demonstrations, and presentations/panels of all kinds. And for me – the theme of Tūhono elegantly ties into my understanding of “Why Preserve?”.

We preserve to connect. To connect the past to the future. To connect with both our professional digital preservation community and with those whose records are being preserved. Digging into my copious notes from the last few days, here are a few tidbits from iPRES 2025 that kept the focus on connection.

  • Late Sunday my time, I attended a workshop on Archival Resource Keys (ARKs). The ARK Alliance is a community that supports the ARK infrastructure. ARKs and the ARK Alliance are all about connection. ARKs are being used by libraries, archives, museums, government agencies, and more. From their website: “ARKs are open, mainstream, non-paywalled, decentralized persistent identifiers that you can start creating in under 48 hours.” Want to connect your stuff to anyone who wants to refer to it in a durable way? ARKs can help.
  • Tuesday paper session 2 included a paper on “A Collaborative Framework for Migrations”, talking about digital preservation in Finland. The presenters highlighted that collaboration was key to success. Cultural institutions are experts in semantics and understanding while the digital preservation service is responsible for bit-level preservation, but you need both to ensure logical preservation. Without that collaboration, you can’t ensure the future usability of the information.
  • Wednesday’s keynote on “Encountering Collapse: Power, Community, and the Future of Open Infrastructure” was delivered by Rosalyn Metz, Chief Technology Officer for Libraries and Museum at Emory University. There were so many compelling elements to this talk, but I’ll share the one that spoke to me most strongly of connection. Community is the backbone of open infrastructure: “The resilience of infrastructure depends on the relationships that sustain it. Communities, not technologies, make infrastructure possible.”
  • I spent pretty much all day Wednesday in the Bake-Offs, in which people demo tech tools and solutions. To my eye, it was a fantastic parade of people sharing. So many opportunities for speakers to literally demonstrate their expertise. I always love seeing what other folks are working on, especially open source projects that might be just the thing someone needs to move their own project forward. It’s like speed dating for future collaboration.
  • I saw many posters and lightning talks – but one that jumps out as fitting this theme was presented by Amy Pienta, Research Professor at ICPSR at University of Michigan. She spoke about the role of data stewards in safeguarding public data. DataLumos is a great example of a community coming together to ensure crucial resources are preserved. I’m glad that they exist, doing the work – and perhaps serving as inspiration for others to work on whatever challenges they find.
  • The closing keynote address from Peter-Lucas Jones, CEO of Te Hiku Media, was tied specifically to the conference theme of Connect. In order to understand traditional data, you must understand the importance of indigenous language. The efforts of Te Hiku Media include multiple ways of leveraging technology to both preserve the Māori language and give back to the community keeping the language alive (a few examples: teaching computers te reo Māori, creating a synthetic voice that can run on assistive devices and speaks te reo Māori, live bilingual captioning). He also emphasized that it was important to “empower communities to lead the change they need” – and that data licensing is key to ensuring that what they are creating can only be used for purposes in sync with the community’s wishes.
  • The last session I attended was Panel 7: “Working with ICT in Digital Preservation”. My connection thread from this panel discussion was the need for all of us to support one another as we navigate the multi-fold challenges to building the technical environments we need to preserve at-risk records. Yes, we do need to plug in old tech bought off eBay to see if it will work (and hope it won’t catch fire!). Yes, we need to leverage other teams’ success and use it as a “hey it worked for them” kind of argument to help us go around institutional rules that are keen on standardization. And yes – we need to connect with as many parts of our organizations as possible to explain what digital preservation work is, how we do it, and why it is important.

This list is far from exhaustive, but I hope it gives you a taste of why the strongest thread for me from iPRES 2025 was connection. And why that is also my answer to “Why Preserve?”. To Connect.

PS: I’d like to thank the Web Hypertext Application Technology Working Group (WHATWG) who apparently created this fantastically useful named character reference list of all the character names that HTML recognizes, so that I could accurately publish two of the words I wanted in this post (Tūhono and Māori) via the WordPress text HTML interface. If you are curious, the answer to making the characters ū and ā display is to precede the strings umacr; and amacr; with an &. Yes, I needed the help of a community to share my ideas on connection.
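For anyone who wants to check named character references like these outside of WordPress, here is a minimal sketch in Python. It is only an illustration under one assumption: that Python's standard `html` module (which ships a table of the HTML5 named character references) is an acceptable stand-in for what a browser does. It is not how WordPress itself handles the text.

```python
import html
from html.entities import html5

# A named character reference is an ampersand, the name, then a semicolon.
encoded = "T&umacr;hono and M&amacr;ori"

# html.unescape() replaces each reference with the character it names.
decoded = html.unescape(encoded)
print(decoded)

# The same names can be looked up directly in the standard library's table.
u_macron = html5["umacr;"]
a_macron = html5["amacr;"]
```

The lookup table is handy for confirming that a name exists before pasting it into a post.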

Seeking Diverse Voices: Reflections on Recruiting Chapter Authors

My original book proposal for Partners for Preservation was anonymized and shared by the commissioning editor with a peer in the digital preservation community. One of the main comments I received was that I should make sure that I recruited authors from outside the United States. Given that the book’s publisher, Facet, is UK-based, it made sense that I should work to avoid only recruiting US chapter authors.

But I didn’t want to stop with trying to recruit authors from outside the US. I wanted to work towards as diverse a set of voices for the ten chapters as I could find.

When I started this project, I had no experience recruiting people to write chapters for a book. I definitely underestimated the challenges of finding chapter authors. I sent a lot of emails to a lot of very smart people. It turns out that lots of people don’t reply to an email from someone they don’t already know. I worked hard to balance waiting a reasonable time for a reply with continuing my quest for authors.

I needed people who fit all of the following criteria:

  • topic expert
  • interested in writing a chapter
  • with enough time to write a chapter by my deadlines

… all while keeping an eye on all the other facets of each author that would contribute to a diverse array of voices. There were a lot of moving parts.

This is a non-exhaustive list of sources I used for finding my authors:

  • personal contacts
  • referrals from colleagues and friends
  • LinkedIn
  • lists of presenters from conferences
  • authors of articles related to my topics of interest
  • lots of googling

I am very proud of the eleven chapter authors (one chapter was co-written by two individuals) I recruited. For a book with only 10 chapters, having a balanced gender distribution and five different countries of residence represented feels like a major accomplishment. Each chapter author is shown below, in the order in which their chapters appear in the book.

I picked the “Grow It Yourself” WPA poster featured at the top of this post because the work of recruiting the right balance of authors often felt like planning a garden. I pursued many potential chapter authors with ideas in mind of what they might write. Over the life of the project, my vision of each chapter evolved – much as a garden plan must be based on the availability of seeds, sunlight, and water.

I believe that the extra effort I put into finding these authors made Partners for Preservation a better book. It probably would have been much easier to recruit 5 white men from the US and 5 white men from the UK to write the chapters I needed, but the final product would have been less compelling. I hope you find this to be the case if you choose to read the book. I also hope that, if you work on a similar project, you will consider making a similar extra effort.

Image credit: Grow it yourself Plan a farm garden now. by Herbert Bayer from NYC WPA War Services, [between 1941 and 1943]. https://www.loc.gov/pictures/collection/wpapos/item/99400959/

 

Chapter 10: Open Source, Version Control and Software Sustainability by Ildikó Vancsa


Chapter 10 of Partners for Preservation is ‘Open Source, Version Control and Software Sustainability’ by Ildikó Vancsa. The third chapter of Part III: Data and Programming, and the final chapter of the book, it shifts the lens on programming to talk about the elements of communication and coordination that are required to sustain open source software projects.

When the Pacific Telegraph Route (shown above) was finished in 1861, it connected the new state of California to the East Coast. It put the Pony Express out of business. The first week it was in operation, it cost a dollar a word. Almost 110 years later, 1969 saw the first digital transmission over ARPANET (the precursor to the Internet).

Vancsa explains early in the chapter:

We cannot really discuss open source without mentioning the effort that people need to put into communicating with each other. Members of a community must be able to follow and track back the information that has been exchanged, no matter what avenue of communication is used.

I love envisioning the long evolution from the telegraph crossing the continent to the Internet stretching around the world. With each leap forward in technology and communication, we have made it easier to collaborate across space and time. Archives, at their heart, are dedicated to this kind of collaboration. Our two fields can learn from and support one another in so many ways.

Bio:

Ildikó Vancsa started her journey with virtualization during her university years and has been in connection with this technology in different ways since then. She started her career at a small research and development company in Budapest, where she focused on areas like system management, business process modeling and optimization. Ildikó got involved with OpenStack when she started to work on the cloud project at Ericsson in 2013. She was a member of the Ceilometer and Aodh project core teams. She is now working for the OpenStack Foundation and she drives network functions virtualization (NFV) related feature development activities in projects like Nova and Cinder. Beyond code and documentation contributions, she is also very passionate about on-boarding and training activities.

Image source: Route of the first transcontinental telegraph, 1862.
https://commons.wikimedia.org/wiki/File:Pacific_Telegraph_Route_-_map,_1862.jpg

Chapter 7: Historical Building Information Model (BIM)+: Sharing, Preserving and Reusing Architectural Design Data by Dr. JuHyun Lee and Dr. Ning Gu

Chapter 7 of Partners for Preservation is ‘Historical Building Information Model (BIM)+: Sharing, Preserving and Reusing Architectural Design Data’ by Dr. JuHyun Lee and Dr. Ning Gu. The final chapter in Part II: The physical world: objects, art, and architecture, this chapter addresses the challenges of digital records created to represent physical structures. I picked the image above because I love the contrast between the type of house plans you could order from a catalog a century ago and the way design plans exist today.

This chapter was another of my “must haves” from my initial brainstorm of ideas for the book. I attended a session on ‘Preserving Born-Digital Records Of The Design Community’ at the 2007 annual SAA meeting. It was a compelling discussion, with representatives from multiple fields. Archivists working to preserve born-digital designs. People working on building tools and setting standards. There were lots of questions from the audience – many of which I managed to capture in my notes that became a detailed blog post on the session itself. It was exciting to be in the room with so many enthusiastic experts in overlapping fields. They were there to talk about what might work long term.

This chapter takes you forward to see how BIM has evolved – and how historical BIM+ might serve multiple communities. This passage gives a good overview of the chapter:

“…the chapter first briefly introduces the challenges the design and building industry have faced in sharing, preserving and reusing architectural design data before the emergence and adoption of BIM, and discusses BIM as a solution for these challenges. It then reviews the current state of BIM technologies and subsequently presents the concept of historical BIM+ (HBIM+), which aims to share, preserve and reuse historical building information. HBIM+ is based on a new framework that combines the theoretical foundation of HBIM with emerging ontologies and technologies in the field including geographic information systems (GIS), mobile computing and cloud computing to create, manage and exchange historical building data and their associated values more effectively.”

I hope you find the ideas shared in this chapter as intriguing as I do. I see lots of opportunities for archivists to collaborate with those focused on architecture and design, especially in the case of historical buildings and the proposed vision for HBIM+.

Bios:

Ning Gu is Professor of Architecture in the School of Art, Architecture and Design at the University of South Australia. Having an academic background from both Australia and China, Professor Ning Gu’s most significant contributions have been made towards research in design computing and cognition, including topics such as computational design analysis, design cognition, design communication and collaboration, generative design systems, and Building Information Modelling. The outcomes of his research have been documented in over 170 peer-reviewed publications. Professor Gu’s research has been supported by prestigious Australian research funding schemes from Australian Research Council, Office for Learning and Teaching, and Cooperative Research Centre for Construction Innovation. He has guest edited/chaired major international journals/conferences in the field. He was Visiting Scholar at MIT, Columbia University and Technische Universiteit Eindhoven.

JuHyun Lee is an adjunct senior lecturer at the University of Newcastle (UoN). Dr. Lee has made a significant contribution towards architectural and design research in three main areas: design cognition (design and language), planning and design analysis, and design computing. As an expert in the field of architectural and design computing, Dr. Lee was invited to become a visiting academic at the UoN in 2011. Dr. Lee has developed innovative computational applications for pervasive computing and context awareness in the building environments. The research has been published in Computers in Industry, Advanced Engineering Informatics, and Journal of Intelligent and Robotic Systems. His international contribution has been recognised as: Associate editor for a special edition of Architectural Science Review; Reviewer for many international journals and conferences; International reviewer for national grants.

Image Source: Image from page 717 of ‘Easy steps in architecture and architectural drawing’ by Hodgson, Frederick Thomas, 1915. https://archive.org/details/easystepsinarch00hodg/page/n717

Chapter 4: Link Rot, Reference Rot and the Thorny Problems of Legal Citation by Ellie Margolis

The fourth chapter in Partners for Preservation is ‘Link Rot, Reference Rot and the Thorny Problems of Legal Citation’ by Ellie Margolis. Links that no longer work and pages that have been updated since they were referenced are an issue that everyone online has struggled with. In this chapter, Margolis gives us insight into why these challenges are particularly pernicious for those working in the legal sphere.

This passage touches on the heart of the problem.

Fundamentally, link and reference rot call into question the very foundation on which legal analysis is built. The problem is particularly acute in judicial opinions because the common law concept of stare decisis means that subsequent readers must be able to trace how the law develops from one case to the next. When a source becomes unavailable due to link rot, it is as though a part of the opinion disappears. Without the ability to locate and assess the sources the court relied on, the very validity of the court’s decision could be called into question. If precedent is not built on a foundation of permanently accessible sources, it loses its authority.

While working on this blog post, I found a WordPress Plugin called Broken Link Checker. It does exactly what you expect – scans through all your blog posts to check for broken URLs. In my 201 published blog posts (consisting of just shy of 150,000 words), I have 3002 unique URLs. The plugin checked them all and found 766 broken links! Interestingly, the plugin updates the styling of all broken links to show them with strikethroughs – see the strikethrough in the link text of the last link in the image below:

For each of the broken URLs it finds, you can click on “Edit Link”. You then have the option of updating it manually or using a suggested link to a Wayback Machine archived page – assuming it can find one.

It is no secret that link rot is a widespread issue. Back in 2013, the Internet Archive announced an initiative to fix broken links on the Internet – including the creation of the Broken Link Checker plugin I found. Three years later, on the Wikipedia blog, they announced that over a million broken outbound links on English Wikipedia had been fixed. Fast forward to October of 2018 and an Internet Archive blog post announced that “More than 9 million broken links on Wikipedia are now rescued”.

I particularly love this example because it combines proactive work and repair work. This quote from the 2018 blog post explains the approach:

For more than 5 years, the Internet Archive has been archiving nearly every URL referenced in close to 300 wikipedia sites as soon as those links are added or changed at the rate of about 20 million URLs/week.

And for the past 3 years, we have been running a software robot called IABot on 22 Wikipedia language editions looking for broken links (URLs that return a ‘404’, or ‘Page Not Found’). When broken links are discovered, IABot searches for archives in the Wayback Machine and other web archives to replace them with.
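To make the check-then-replace idea in that quote concrete, here is a minimal Python sketch. It is not IABot's code or the plugin's code. The function names are hypothetical, the choice to treat an unreachable host as broken is an assumption, and the lookup assumes the Internet Archive's public Wayback Machine availability endpoint and its `archived_snapshots.closest` JSON field.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

# Assumed endpoint: the Wayback Machine availability API.
WAYBACK_API = "https://archive.org/wayback/available"


def is_broken(url, timeout=10):
    """Return True if the URL answers 404, the signal named in the quote."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout):
            return False
    except urllib.error.HTTPError as err:
        return err.code == 404
    except urllib.error.URLError:
        # Assumption: a host that cannot be reached at all counts as broken.
        return True


def availability_query(url):
    """Build the availability lookup URL for one original link."""
    return WAYBACK_API + "?" + urllib.parse.urlencode({"url": url})


def closest_snapshot(api_response_text):
    """Pull the archived copy's URL out of the API's JSON reply, if any."""
    data = json.loads(api_response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


def suggest_replacement(url, timeout=10):
    """Ask the archive for a snapshot that could stand in for a broken link."""
    with urllib.request.urlopen(availability_query(url), timeout=timeout) as resp:
        return closest_snapshot(resp.read().decode("utf-8"))
```

A loop over a list of links would call `is_broken` on each one and `suggest_replacement` only on the failures. Parsing is kept apart from the network calls so it can be checked against a saved reply without going online.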

There are no silver bullets here – just the need for consistent attention to the problem. The examples of issues being faced by the law community, and their various approaches to prevent or work around them, can only help us all move forward toward a more stable web of internet links.

Bio:
Ellie Margolis is a Professor of Law at Temple University, Beasley School of Law, where she teaches Legal Research and Writing, Appellate Advocacy, and other litigation skills courses. Her work focuses on the effect of technology on legal research and legal writing. She has written numerous law review articles, essays and textbook contributions. Her scholarship is widely cited in legal writing textbooks, law review articles, and appellate briefs.

Image credit: Image from page 235 of “American spiders and their spinningwork. A natural history of the orbweaving spiders of the United States, with special regard to their industry and habits” (1889)

Overview of Partners for Preservation

This friendly llama (spotted in the Flickr Commons) is here to give you a quick high-level tour of Partners for Preservation.

The book’s ten chapters have been organized into three sections:

Part 1: Memory, Privacy, and Transparency

Part 2: The Physical World: Objects, Art, and Architecture

 Part 3: Data and Programming

As I recruited authors to write a chapter, the vision for each individual chapter evolved. Each author contributed their own spin on the topic I originally proposed. There were two things I had hoped for and was particularly pleased to have come to pass. First was that I learned new things about each of the fields addressed in the book. The second was discovering threads that wove through multiple chapters. While the chapters are each freestanding and you may read the book’s chapters in any order you like, the section groupings were designed to help highlight common threads of interest to archivists focused on digital preservation.

The book also includes a foreword by Nancy McGovern, and my own introductory and final thoughts.

I will be writing a blog post about each chapter’s author(s) and sharing some favorite tidbits along the way. Thanks for your interest in Partners for Preservation. [Updated 1/29/2018 to add links above to the chapter spotlight posts]

Countdown to Partners for Preservation

Yes. I know. My last blog post was way back in May of 2014. I suspect some of you have assumed this blog was defunct.

When I first launched Spellbound Blog as a graduate student in July of 2006, I needed an outlet and a way to connect to like-minded people pondering the intersection of archives and technology. Since July 2011, I have been doing archival work full time. I work with amazing archivists. I think about archival puzzles all day long. Unsurprisingly, this reduced my drive to also research and write about archival topics in the evenings and on weekends.

Looking at the dates, I also see that after I took an amazing short story writing class taught by Mary Robinette Kowal in May of 2013, I only wrote one more blog post before setting Spellbound Blog aside for a while in favor of fiction and other creative side-projects in my time outside of work.

Since mid-2014, I have been busy with many things – including (but certainly not limited to):

I’m back to tell you all about the book.

In mid-April of 2016, I received an email from a commissioning editor in the employ of UK-based Facet Publishing (initially described to me as the publishing arm of CILIP, the UK’s equivalent to ALA). That email was the beginning of a great adventure, which will soon culminate in the publication of Partners for Preservation by Facet (and its distribution in the US by ALA). The book, edited by me and including an introduction by Nancy McGovern, features ten chapters by representatives of non-archives professions. Each chapter discusses challenges with and victories over digital problems that share common threads with issues facing those working to preserve digital records.

Over the next few weeks, I will introduce you to each of the book’s contributing authors and highlight a few of my favorite tidbits from the book. This process was very different from writing blog posts and being able to share them immediately. After working for so long in isolation it is exciting to finally be able to share the results with everyone.

PS: I also suspect that finally posting again may throw open the floodgates to some longer essays on topics that I’ve been thinking about over the past years.

PPS: If you are interested in following my more creative pursuits, I also have a separate mailing list for that.

Harnessing The Power of We: Transcription, Acquisition and Tagging

In honor of the Blog Action Day for 2012 and their theme of ‘The Power of We’, I would like to highlight a number of successful crowdsourced projects focused on transcription, acquisition and tagging of archival materials. Nothing I can think of embodies ‘the power of we’ more clearly than the work being done by many hands from across the Internet.

Transcription:

  • Old Weather Records: “Old Weather volunteers explore, mark, and transcribe historic ship’s logs from the 19th and early 20th centuries. We need your help because this task is impossible for computers, due to diverse and idiosyncratic handwriting that only human beings can read and understand effectively. By participating in Old Weather you’ll be helping advance research in multiple fields. Data about past weather and sea-ice conditions are vital for climate scientists, while historians value knowing about the course of a voyage and the events that transpired. Since many of these logs haven’t been examined since they were originally filled in by a mariner long ago you might even discover something surprising.”
  • FromThePage: “FromThePage is free software that allows volunteers to transcribe handwritten documents on-line.” A number of different projects are using this software, including the San Diego Museum of Natural History’s project to transcribe the field notes of herpetologist Laurence M. Klauber and Southwestern University’s project to transcribe the Mexican War Diary of Zenas Matthews.
  • National Archives Transcription: as part of the National Archives Citizen Archivist program, individuals have the opportunity to transcribe a variety of records. As described on the transcription home page: “letters to a civil war spy, presidential records, suffrage petitions, and fugitive slave case files”.

Acquisition:

  • Archive Team: The ArchiveTeam describes itself as “a rogue archivist collective dedicated to saving copies of rapidly dying or deleted websites for the sake of history and digital heritage.” Here is an example of the information gathered, shared and collaborated on by the ArchiveTeam focused on saving content from Friendster. The rescued data is (whenever possible) uploaded to the Internet Archive and can be found here:

    Springing into action, Archive Team began mirroring Friendster accounts, downloading all relevant data and archiving it, focusing on the first 2-3 years of Friendster’s existence (for historical purposes and study) as well as samples scattered throughout the site’s history – in all, roughly 20 million of the 112 million accounts of Friendster were mirrored before the site rebooted.

Tagging:

  • National Archives Tagging: another part of the Citizen Archivist project encourages tagging of a variety of records, including images of the Titanic, architectural drawings of lighthouses and the Petition Against the Annexation of Hawaii from 1898.
  • Flickr Commons: throughout the Flickr Commons, archives and other cultural heritage institutions encourage tagging of images

This is just a taste of the crowdsourced efforts currently being experimented with across the internet. Did I miss your favorite? Please add it below!

Heading to Austin for SXSW Interactive

Anyone out there going to be at SXSWi? I would love to find like-minded DH (digital humanities) and GLAM (Galleries, Libraries, Archives & Museums) folks in Austin. If you can’t go, what do you wish I would attend and blog about after the fact?

No promises on thoroughness of my blogging of course. I never have mastered the ‘live blogging’ approach, but I do enjoy taking notes and if the past is any guide to the future I usually manage at least 2 really detailed posts on sessions from any one conference. The rest end up being notes to myself that I always mean to somehow go back to and post later. Maybe I need to spend a month just cleaning up and posting old session summaries (or at least those that still seem interesting and relevant!).

Drop me a comment below or contact me directly and let me know if you will be in Austin between March 10 and 15. Hope to see some of you there!