The Final Death(s) of Digital Scholarship—An Ongoing Case Study of DH2005 Projects
Talk presented March 1, 2019, at the Digital Afterlives Symposium at Bard Graduate Center, New York, NY.
My talk is titled “The Final Death(s) of Digital Scholarship—An Ongoing Case Study of DH2005 Projects.” What do I mean by digital scholarship? I mean a scholarly endeavor that uses digital tools or produces a digital product. In this presentation, I’ll be focusing on web projects that were created in a scholarly environment. For example:
- DH Curation Guide, based at the University of Illinois, which offers expert essays about curating digital humanities data (I worked on this as a grad student)
- US Epigraphy Project, based at Brown, which is a database of ancient inscriptions for scholars of antiquity (I designed the nav banner as an undergrad student)
- Walt Whitman Archive, based at UNL, which makes Whitman’s writings publicly available, along with other research resources (I was not at all involved in this but have long admired it)
These examples and the case studies I’ll talk about today are all web-based projects. They’re websites with some kind of interactive feature. The main thrust of my talk is applicable to most digital scholarship projects.
The two deaths of digital scholarship
You may be thinking, “The Final Death(s) of Digital Scholarship,” this is a pretty melodramatic title. Or you may be wondering, that sounds kind of familiar, where did she get that from?
I began thinking about digital scholarship in the middle of watching a very emotional scene in the Disney movie Coco. The character Chicharrón has died once in our world, the world of the living. But after a long time in the spirit world, he finally dissolves into nothingness.
Our protagonist, Miguel, asks, “Wait, what happened?”
His friend Héctor replies, “He’s been forgotten. When there’s no one left in the living world who remembers you, you disappear from this world. We call it the Final Death.”
I will admit I’m on the verge of tears just thinking about Coco, so I will hurry myself into my transition: when Héctor talks about the final death, it reminds me so much of the lifespan of a digital project on the web. A project (like an image database or an oral history collection), after a period of activity and creation, usually gets wrapped up after the grant is finished or the creator moves on. The project may have “died” in the sense that it’s no longer active. But it’s still there, just static. It enjoys an afterlife of usefulness and citation, or maybe just the occasional visitor.
But it’s almost inevitable that this afterlife ends. The project is deleted when its creator forgets to renew a domain name or moves on to another university and declines to transfer the web project. It dies a second and final death — it’s disappeared, gone forever.
Digital scholarly projects will disappear
The disappearance of scholarly work on the web means that there are holes opening up in the scholarly record. If you’re a scholar who works in the digital realm, the sources you use and reuse and cite will disappear. What does that mean for your claims? What does that mean for the reproducibility of your results? If you produce digital projects yourself, what does that mean for the record of your own research?
Digital work is especially prone to this final death. The digital decays fast and it decays completely.
Or as Neal Beagrie puts it, “In the right conditions papyrus or paper can survive by accident or through benign neglect for centuries, or in the case of the Dead Sea Scrolls, for thousands of years … In contrast, digital information will not survive and remain accessible by accident: it requires ongoing active management from as early in the life-cycle as possible.” [Beagrie, Neal. “Digital Curation for Science, Digital Libraries, and Individuals.” International Journal of Digital Curation 1.1 (2006): 3-16.]
As someone whose work is almost all web-based, I care a lot about preserving digital scholarship. The web hasn’t been around long enough for us to tell exactly where we’re headed and how this new technology (new in the grand scheme of things) will pan out for us, but we have some early indications.
We know that a lot of the early web is lost to us. As Megan Sapnar Ankerson puts it: “It is far easier to find an example of a film from 1924 than a website from 1994.” [Ankerson, Megan Sapnar. “Writing Web Histories with an Eye on the Analog Past.” New Media & Society (2011) 14.3: 384-400.]
It’s also easier to find a 100-year-old academic paper, or a 300-year-old book, thanks to libraries, than a 25-year-old website. Now, I’m a librarian, and I’ll be the first to tell you that not everything must be saved. Weeding our collections is the only way to keep a library useful and spacious. And digital preservation is an extremely costly and labor-intensive endeavor.
My argument is that digital, web-based scholarship is part of the scholarly record, so it should be preserved. But it can only be preserved with human care and labor. By changing some scholarly practices, we can bless our digital projects with long and fruitful afterlives.
What’s the “death rate” of digital scholarly projects?
Those of us who do digital work might look for indications of how best to accomplish this. How long have past scholarly projects lasted? What can their longevity tell us about the fate of our own work?
Starting in 2015, I examined academic web projects that were 10 years old. I looked at a specific set of these projects, those that were presented at the Digital Humanities conference in 2005, held at the University of Victoria. The field of digital humanities, aka humanities computing, was well-established by then. (Scene-setting: this is back when social media was just starting out, and Facebook didn’t have a news feed, and Gmail was invite-only.)
I’m looking only at projects presented at DH 2005 that had a web component, something that was publicly available on the web and had some kind of interactive feature. There were 48 of these projects. There may have been more such projects at DH 2005 than the ones I’ve included in my audit, but I’ve only included projects that I could definitely tell had a web component as indicated in the abstract.
By 2015, 7 of these web projects (about 15%) were no longer online at all.
By 2019, that has increased to 10 (21%). Not a shocking leap there, but indicative of the larger trend.
That number will never go down. It will only increase over time. Some of these projects were fully available 4 years ago and are now just partially available, as a result of digital decay. Some have disappeared. What happened to the projects that had this status change? Here are two case studies that trace unexpected afterlives of DH projects.
Case study: Forced Migration Online
Let’s take an in-depth look at one of the projects that disappeared, Forced Migration Online, which was based out of Oxford University. It was a resource repository and information hub about human displacement: refugees, IDPs, and people displaced by disasters. It launched in 2002. In 2005, one of the project team members, Dr. Deegan, presented the essentials of this project at the DH Conference. When I first looked at their site (forcedmigration.org) in 2015, it was still fully online, but today you get a page not found error. There’s no obvious place that it moved to that I could find with my best searching skills. What happened?
Using the Internet Archive’s Wayback Machine, I saw that in 2014, this message was posted. The project team said, “We recognize that Forced Migration Online is a valuable resource … and we are looking for funding opportunities to enable us to continue its development.”
But by 2018, the site’s alert just said that it was no longer being updated, so we can guess that a funding source was not found.
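For anyone who wants to retrace this kind of lookup programmatically, here is a minimal Python sketch. It assumes the Internet Archive's public availability endpoint (`https://archive.org/wayback/available`) and the JSON shape it returns. The function names and the sample response are illustrative assumptions, and this is not necessarily how the lookups for this audit were done.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://archive.org/wayback/available"

def build_query(url, timestamp):
    """Build an availability query for the snapshot closest to timestamp (YYYYMMDD)."""
    return API + "?" + urlencode({"url": url, "timestamp": timestamp})

def closest_snapshot(payload):
    """Return (timestamp, snapshot_url) from a parsed API response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["timestamp"], closest["url"]

def lookup(url, timestamp):
    """Query the live API (needs network access)."""
    with urlopen(build_query(url, timestamp), timeout=30) as response:
        return closest_snapshot(json.load(response))

# Made-up response in the API's shape, so the parsing can be tried offline.
sample = {"archived_snapshots": {"closest": {
    "available": True, "status": "200", "timestamp": "20140701000000",
    "url": "http://web.archive.org/web/20140701000000/http://www.forcedmigration.org/"}}}
print(closest_snapshot(sample))
```

Calling `lookup("forcedmigration.org", "20140101")` would ask for the capture nearest to the start of 2014. An empty `archived_snapshots` object means no capture was found, which `closest_snapshot` reports as `None`.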
The domain name is still owned by Oxford and expires in 2020. I would guess that without funding, the site couldn’t be maintained, so there was a decision to simply take it offline as late in the game as possible, even if the domain name was still owned for longer.
Interestingly, after more digging, I stumbled across a subdomain, repository.forcedmigration.org, which is still online as of spring 2019. The page styling is almost all gone, since everything under the main forcedmigration.org domain was deleted from the web. But the repository still works!
You can search and download any of their 5,000 resources, which date from the 1950s through 2011. I checked some of the older resources, and FMO appears to be the only place you can find copies of some of them online. In my dataset, I cataloged this as a “partially available” resource, since the rest of the FMO site is down.
Look at this metadata! The project team put a lot of work into compiling and adding to this information, and making the fair-use case to make these thousands of documents available to the public. I hope that this repository stays online, especially since we are still in the midst of a global refugee crisis.
So here’s the breakdown of this project’s afterlife:
- Launched in 2002
- Presented at DH Conference in 2005
- Funding issues beginning 2014
- No longer maintained by 2018
- Offline (mostly) in 2018 or 2019
In total, the Forced Migration Online project was fully available on the open web for 16 or 17 years. After 4 years of funding issues, it was taken offline, except for the repository, which is a valuable collection of documents that remains online… For now.
Case study: Clotel Electronic Edition, Documents Compass
Let’s take a look at another project, the Clotel Electronic Edition, which has an unexpected afterlife.
Clotel is said to be the first African-American novel. There were four versions with different endings published. The Electronic Edition project made these available online, with lots of other critical resources as well. In 2015, I cataloged this project as still available online. The full scholarly edition itself was paywalled, but the project had a thorough website with many resources. The project was part of Documents Compass, a grant-funded project out of the Virginia Foundation for the Humanities.
This is what it looked like as late as 2014. The text says that they “provide non-profit assistance to those who are engaged in or planning documentary editing projects…”
This year, the URL for the Clotel scholarly edition brought me to a “Page not found” landing page at the same URL, documentscompass.org. I took a look at the rest of the site, thinking maybe it was moved without a redirect.
On the site’s homepage, much of the wording is the same — “We provide non-profit assistance to those who are engaged in or planning documentary editing projects,” they mention XML, data tagging, and other things you might expect. But something felt “off” to me. The stock images looked too anodyne.
Scrolling down on the page, the site mentions that Documents Compass is a program of the Virginia Foundation for the Humanities, same as the previous site. But what was bugging me, then?
I scrolled down some more and looked at their blog posts, which were riddled with typos! Lots of spare apostrophes, random capitalization, verb tense errors. And suddenly documentary editing was equated with filming documentaries?
That’s when I realized that this was a fraudulent website. It was not operated by the Documents Compass team at all. They used the same wording as the old site, but it was in no way the same project. Honestly, I’m baffled by this — there’s a fake phone number and fake mailing address listed, along with a web form and a working email address. I did email them but haven’t heard back. I’m not sure what their goal is — to defraud people looking to create scholarly editions who were following a tip about the old Documents Compass project? It’s unclear! Other academic projects have been hijacked in sort of the same way, by fake publishers looking to squeeze publication money out of unsuspecting authors, but this isn’t quite the same thing, as far as I can tell.
The WHOIS lookup says that a site called dropcatch.com registered the domain documentscompass.org. This is a company that squats on domain names that have expired. The domain name documentscompass.org is bland enough that it can be useful for someone, so it was worth squatting on.
Using the Wayback Machine again, and going back in time, I saw that there was an empty domain squatting page in 2018…
A page that said that the site has been “archived or suspended” in 2017…
And finally, as late as March 2017, there was a news post on the Documents Compass website detailing the end of the project. The Mellon grant lasted from 2008 to 2016. This very nice post is a project history, and is, to a researcher like me, a thorough documentation of how a digital humanities project comes to a close. After the funding ended, the site stayed online and static for about a year before it disappeared and was then, confusingly, hijacked.
I should note that the Clotel scholarly edition website is, presumably, still online, but it’s behind a UVA login. Previously, the project site was publicly available, but it’s not anymore. Thus this is a site that has an online status change in my dataset.
Brief summary of this digital scholarly project:
- Published by Electronic Text Center (UVA) in 2005
- Presented at DH Conference in 2005
- Absorbed into Documents Compass project, funded 2008-2016
- documentscompass.org taken down in 2017
- Domain bought for squatting in 2017
- Fraudulent documentscompass.org website put up, 2018 or 2019
- Clotel scholarly digital edition website available for 12 years on the open web; still available to UVA community
To summarize, the Clotel electronic edition website was available for 12 years on the open web. It is still available at the University of Virginia.
Abandoned, finished, and ongoing digital scholarship
Let’s get back to looking at the subset of DH 2005 projects as a whole again. There’s a good amount of research about what’s termed “abandoned” digital scholarship.
Other researchers have examined abandonment specifically within the field of DH. Ten years ago, Bethany Nowviskie and Dot Porter designed a survey project called “Graceful Degradation: Managing Digital Projects in Times of Transition and Decline.” The survey was sent out in 2009, with over 100 responses. It asked about digital projects in or related to the humanities, and the authors analyzed the findings to see how digital projects fared when facing difficult times (like funding troubles) and periods of transition (like colleagues leaving the institution).
- 64% of respondents had experienced the decline of a project or had weathered a period of difficult transition. Of these:
  - 51% were identified as still ongoing
  - 26% were abandoned
  - 15% were finished
  - 8% were just getting started.
The survey results indicated that project scopes tended to change, and that reliable funding was sometimes an issue.
Another, more recent addition to the scholarship around abandoned DH projects comes from Luis Meneses and Richard Furuta’s article in Digital Scholarship in the Humanities. [Meneses, Luis, and Richard Furuta. “Shelf Life: Identifying the Abandonment of Online Digital Humanities Projects.” Digital Scholarship in the Humanities, web only, 2019.]
They looked at every DH Conference abstract from 2006 to 2016, extracted the URLs, and ran a test to see how many returned a 200 response code (which is to say, a site was live on the other end) versus other codes, such as a redirect or a 404 error. They called this URL decay, and based on their findings, they ascertained that a DH web project has a shelf life of about 5 years.
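A check of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Meneses and Furuta's actual code: the category labels and function names are hypothetical, any 2xx response is treated as live, and redirects are deliberately not followed so that a 3xx response is recorded rather than silently resolved.

```python
import urllib.error
import urllib.request

class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so the 3xx code itself is reported."""
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None

def classify(status):
    """Map an HTTP status code (or None for no response) to a coarse label."""
    if status is None:
        return "unreachable"
    if 200 <= status < 300:
        return "live"
    if 300 <= status < 400:
        return "redirected"
    return "error"

def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrives."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code   # 3xx (not followed), 4xx, and 5xx land here
    except (urllib.error.URLError, OSError):
        return None       # DNS failure, refused connection, timeout

# Offline demonstration of the labels; fetch_status() needs network access.
for code in (200, 301, 404, None):
    print(code, classify(code))
```

A status code alone says nothing about what is actually being served at the address, which is the limitation the two case studies above illustrate.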
Similar research on URL decay has been done in other fields, too, including law and library science, to identify the “half-life” of a URL. The prognosis isn’t great for URL stability. As we’ve seen in the previous two examples of Forced Migration Online and Documents Compass, though, documenting URL changes is a useful but limited way of finding the true status of the project. I focused on just one year’s worth of DH projects so I could examine each project individually to identify its status.
This is what my data looks like, by the way. I’m logging project info, URLs as they were given, URLs as they were in 2015, URLs as they are in 2019, and any changes in the project’s status, along with those interesting notes about what’s still available and what isn’t.
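One row of a log like this could be modeled as follows. This is a hedged sketch: the field names, helper methods, and example values are hypothetical stand-ins chosen for illustration, and the actual spreadsheet columns are not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ProjectRecord:
    """One audited project; all field names are illustrative."""
    name: str
    url_given: str                 # URL as printed in the 2005 abstract
    url_2015: Optional[str]        # None if the project was offline in 2015
    url_2019: Optional[str]        # None if the project was offline in 2019
    notes: str = ""

    def url_changed(self) -> bool:
        """True when the project was online in both audits at different URLs."""
        return (self.url_2015 is not None
                and self.url_2019 is not None
                and self.url_2015 != self.url_2019)

    def went_offline(self) -> bool:
        """True when the project was online in 2015 but not in 2019."""
        return self.url_2015 is not None and self.url_2019 is None

# Placeholder values, not a real entry from the dataset.
example = ProjectRecord(
    name="Example Project",
    url_given="http://example.org/project",
    url_2015="http://example.org/project",
    url_2019="https://example.edu/project",
    notes="Moved to a new host; old URL does not redirect.",
)
print(example.url_changed(), example.went_offline())
```

Keeping the free-text notes next to the URLs matters, because a changed or dead URL does not by itself explain what happened to the project.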
Of the 40 projects that were still available online in 2015, here’s how their availability changed in the past 4 years:
- 65% of these projects (26 in total) were still online and still located where they were four years ago
- Four projects (or 10%) had disappeared
- 10 projects (25%) had new URLs: 5 redirected, 4 did not redirect, and 1 gave the new link on the old page
What was intriguing to me was that last point: a quarter of the projects in this subset had a new URL. For the four projects that did not have a redirect, I found the new URLs through a web search or by combing the project team’s CVs.
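The percentages in the list above follow directly from the counts. As a quick check, this short Python sketch recomputes them from the figures given in this section. The dictionary keys are just labels chosen for this example.

```python
# Counts reported above for the 40 projects that were online in 2015.
changes = {"same URL": 26, "new URL": 10, "disappeared": 4}
new_url_detail = {"redirected": 5, "no redirect": 4, "link on old page": 1}

total = sum(changes.values())

def share(count, whole):
    """Return count as a percentage of whole, rounded to one decimal place."""
    return round(100 * count / whole, 1)

# The redirect breakdown should add up to the number of moved projects.
assert sum(new_url_detail.values()) == changes["new URL"]

for label, count in changes.items():
    print(f"{label}: {count} of {total} ({share(count, total)}%)")
```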
Thinking about abandonment in particular, I also cataloged the projects’ status:
As of 2019:
- 50% of these projects (24 in total) are clearly finished
- 25% (12) are ongoing, amazingly, 14 years later
- Four projects are clearly abandoned
- The remaining 8 are unclear to me
Here’s an example of a project that I am assuming is abandoned:
I’ll leave the project unnamed, but here’s a snippet from its description page. It’s still available online in 2019, but the final paragraph of the description is frozen since 2005 in the future tense. “This rich image archive will enable us to…” It “will enable users to display the text…” There’s no rich image archive on the site, sadly. I dug a little deeper. The PI retired in 2005. I looked at the CV of one member of the project team, and they list working on this project through 2005 but no later. So with this information, I felt comfortable calling this project abandoned.
What does a clearly finished project look like?
The Online Chopin Variorum Edition was presented at DH 2005. Today, it has a very nice landing page that clearly describes the history of the project, which ended in 2015. At the bottom of the page, it notes who is maintaining the site: “This site is maintained under a Service Level Agreement by King’s Digital Lab.” Clearly, care has been taken to keep this project online, even past the end of its funding.
And what does an ongoing project look like?
The TaPoR project was presented at DH 2005, and 14 years later, it’s going strong! It’s an ongoing database of tools for text analysis. The About page does note that the first two versions of the site are no longer available, but they can provide access on request.
What can we learn from this audit?
So, having taken a close look at 48 projects that had a web component from DH 2005, what are some of the conclusions that I have drawn?
Web-based scholarly projects are disappearing
Sometimes this happens quickly, sometimes slowly: it’s difficult to predict. An enormous amount of funding does not guarantee a long afterlife! Some of the projects that are no longer available were expensive, labor-intensive efforts — one of these had $150,000 in NEH and Mellon funding.
Preservation is unevenly distributed and is the result of many factors. These include hosting issues: as we saw previously, in just 4 years, a quarter of the projects I looked at had changed their URLs. Hosting changes are likely to happen.
So are funding issues. Other research has identified funding fluctuations as a major factor in digital preservation, since keeping even a plain website online and functioning takes active and ongoing maintenance, which is to say, human care and labor.
For those projects that have interactive components, it might not be realistic to expect ongoing maintenance into the future. A complex project may rely on a content management system that needs ongoing security patches, PHP updates, server updates, and so on, and it’s likely that there will be an incompatibility problem eventually. Everything digital dies.
Additionally, project teams change: people retire, people die, people move on to other institutions. This is normal in academia, and it’s not always something you can plan for when you’re writing out your project roadmap.
Some projects have unexpected afterlives
In the course of this research, I found that some projects have unexpected afterlives. It’s not just that some projects survive and some die. Some, like the Documents Compass project, end up being hijacked or otherwise fraudulently presented. It’s hard to hold on to a domain name, especially in an institutional context, so digital projects do run this risk more than you’d think.
Documentation is critical for future researchers
In the course of my research, I found that documentation by project teams is critical for understanding the context and impact of digital projects in the scholarly record. We saw several sites that had project page updates. The best of these had a clearly written project history that detailed funding sources, team members, and important dates and milestones.
For completed or abandoned projects, including the “end of life” notification — or as I call it, the “goodbye cruel world” note — helps future researchers understand what happened to your project and why it’s inactive or gone. This is important, even if your site eventually goes offline, because a future researcher might check the Internet Archive for information and find this note.
*(Note, July 2019: the Digital Documentation Process project’s Archiving Dossier Narrative provides a wonderful standardized template for this “end of life” note. Hat-tip to Micah Vandegrift for passing this on to me.)
Traditional academic outputs, like articles and white papers, were also critical for understanding the role and features of digital projects. For interactive projects, screenshots and videos were key to understanding how the project functioned, even after it broke or disappeared.
Additionally, I was surprised at how much I could rely on people’s CVs to understand the afterlife of a project. Sometimes they told me that funding ran out, or that team members moved on. Often they gave me a glimpse into how digital research can morph into several different, successive projects — a different kind of afterlife.
Cultivate a preservation mindset
Finally, I would not be a responsible librarian if I did not mention the importance of having a preservation mindset. Preservation work begins at the moment of creation. Having a preservation mindset means that your project roadmap includes a section about archiving and its afterlife, who could host it and how it will be funded. It also means making choices with long-term, low-risk thinking. Should your project be based on an older but widely-used CMS, or a shiny new one with limited support? A preservation mindset would indicate that the less-exciting but more stable system is the better choice.
The responsibility of preservation belongs wholly to the project team. If you know that your digital project has a short projected lifespan, that should change how you design your project, how you share it, and how you preserve some part of it for yourself. Libraries, archives, and other repositories of our scholarly heritage are addressing digital preservation problems. But scholars who do digital work must be invested in preserving their work, too — or we run the risk of losing a whole domain of scholarship. It’s up to us to give our digital projects a long afterlife and stave off that final death for as long as possible.
This work builds off of a talk I gave at MLA 2015.
Thank you to Jesse Merandy and Bard Graduate Center for organizing this symposium.