Seeing: final report for EMDA
Final presentation given July 26, 2013 at Early Modern Digital Agendas, a summer research institute funded by NEH at the Folger Institute in Washington, D.C. Our final reports were given after three weeks of high-level discussion centering on digital aspects of early modern scholarship. There was also a steady stream of tweets throughout the institute; reading through these gives a good sense of the depth of discussion and sense of community (as evidenced by the continued use of the hashtag!).
We've talked about text in the past few weeks using a variety of metaphors. We've talked about text as a stream, like a bitstream or a stream of glyphs or a tweet stream: a linear progression without a real beginning or end, or even editions that have iterative lifecycles. We've also talked about discrete units of text: the books as items, splittable into words (maybe), letters, even symbols that have lost their meaning. I'm making a bit of a jump here, but to me it seems that the ways we talk about light (the wave, the particle) and seeing have some traction with the ways we talk about texts and reading. So I'll look at the different ways we've seen this month.
invisibility and ethics of digital labor
Some things we haven't seen, or we've seen only after having them revealed. Ian Gadd led us through a great exploration of EEBO, and I don't think I'm alone in saying that that revealed to me much of what was invisible in the database — its tangled origins, determined by a weird blend of entrepreneurship and war, and the many remediations that flatten out on the screen. We shared an anxiety about the ethics of labor that is essentially invisible to us, and how easy it is to not see it. (Did TCP ever get back to us about fair pay for their keyboarders?)
There's a way in which the digital is a frictionless environment — it's flat, but (speaking from personal experience) human error scales at a rate equal to our aspirations. It's — I hate this phrase — a slippery slope. We can do more, faster, and sometimes that means that, for example, bad metadata is propagated across a large network. Small example: here at the Folger, Teena Rochfort Smith's entity identifier is not associated with the Hamnet record for the 4-Text Edition of Hamlet. Her name is in the title, but if you were looking at her name record in Hamnet, it's not linked to the 4-text edition. (The librarian I talked to said that was strange and she would look into it.)
Physically, too, the digital is slippery. The screen is presenting representations of dimensionality, but the screen itself is flat and slick. There's a great essay by Bret Victor about how the literal smoothing of our physical relationship to machines is completely nonsensical, since what good are our sensitive hands on a flat surface we just poke at all the time?
I was struck by Michael Witmore's visual of the library as a folded space. We can't see all of our texts in the physical library, because we've folded them into each other. Similarly, the folding in of a library's book images creates a condensing of a user's experience of digital collections. What you're looking for online is folded under the search box. Again, I think of that Craig Mod essay I keep mentioning: we don't have a good way of seeing a whole collection, seeing the edges of what we have. It's a question of scope, I guess. I liked Ellen MacKay's comment the other day, that the digital allows us a range of scope. We should be able to get a 30,000-mile view, and a 30-nanometer view. But the practice of providing a 30,000-mile view isn't yet refined. The best we can do for most of our projects is give numbers (e.g., 500,000 books!) and lists of entities and titles to page through, except maybe for Marc Alexander's beautiful visual of the English language. I'm still in total awe of that and I hope it becomes interactive someday.
being looked at
As a librarian, I think a lot about citation, though I don't actually do a lot of advising in that area. It was extremely useful to me to hear about how citation practices are affected by anxieties about digital realness, and about being looked at: other scholars looking at your citations and making judgements. I'll be taking these anecdotes back with me to think more about how we can better align our standards.
(If you've got a digital project, you better be telling people how to cite you. Put that link in your footer.)
We also look through time, backward and forward. As an interested party in data curation and digital preservation, I think a lot about the afterlives of texts. I've spoken with a few of you about our shared concern for digital projects over time. The community of computing humanists has been around for long enough that too many of the first endeavors are now lost or inaccessible. The best place for data curation to happen is actually at the point of data creation, as a matter of good planning and setting up your workflow right, but that's very difficult to achieve. Instead, we see a lot of projects begin, end, and degrade over time. Several years ago, a survey called Graceful Degradation went around before DH2010, and its authors found that 64% of respondents had experienced the decline of a project or had weathered a period of difficult transition. Of projects that had experienced decline, 65% continued to be active or were completed, 26% were abandoned, and 8% were just beginning. We talked a bit here about project management. It's rare that we see a project team that remains whole from beginning to end, maintains the same scope and purpose as at its outset, and doesn't hit roadblocks in funding, technology, or rights issues. That's the nature of our work.
Out of curiosity, I started looking at projects that were presented at DH2005, like a 'where are they now' for work going on well after DH was in full swing but, I think, before the DH Commons and the various registries that are active now. [I made an editable Google Drive spreadsheet for 1990 and 2005 DH conference projects. Feel free to add, as it is sparsely populated.] Anyway, it was often hard to track down the project. Redirects weren't set up from old URLs, 'About' pages tended to be in the future tense, and sometimes I couldn't actually tell if they had planned to offer anything online or where it would be located. In fact, the best way for me to track a project's progress turned out to be the CVs of the investigators, because too often the web sites wouldn't specify the end of the funding period or when the project pivoted. Other people (like funding bodies) have done investigative work like this, and I'd like to as well. I don't want it just for the end result of saying X number of projects stuck around or made a significant difference; I want to do this retracing work for the work itself, to better understand what is needed to ensure future access, either to the resource's information or the resource itself.
We look at project afterlives to see continued access or denied access, or different levels of access. Accordingly, we might also see project reuse, or misuse. That's something I found extremely useful in our conversations this week about visualization and presentation. One source of our anxiety about data visualization is our understanding that it is fundamentally different from text, not just in the strategic display of information that could be better explained in a paragraph, but also in the image's status as a discrete object. Our technology is porous and allows for our images to be plucked from context, propagating what could become misconstrued facts. Add this to the long chain of dependencies we know exist behind every data set, and you've got complicated image origins, hidden from view behind something that can look pretty slick.
We look at what I'm terming project forelives, too. I loved Bonnie Mak's piece on the archaeology of digital resources, following the ESTC and EEBO as exemplars. Archaeology is a great word: the removal of obfuscations to find evidence of forelives. With this approach, we can retrace scholarly pivots (or changes in direction) and the nebulous concept of project scope, which is ostensibly determined by the project charter but which, we wouldn't be surprised to learn, is shaped by technologies that make us uncomfortable, or technologies that weren't developed enough at the time of the endeavor.
Looking ahead for me, I see not just looking, but doing. I'll continue to provide research data management models for the faculty at John Jay and CUNY as a whole. We haven't done a great job yet of shaping humanities data practices at my institution, but that's something I truly believe is beneficial for any project even in its infancy. Also, almost every one of you has approached me to ask if I know someone at John Jay, and almost every time, I had to say no in shame, and I have no excuse since it's been almost a year. So ahead, for me, now that some of the systems I've set up in the library are humming away, I know I need to get out and network and collaborate. And I really do feel that one thing this institute has given me is the vocabulary necessary to work with faculty in the English department or other humanities folks. So, thanks to all of you for that.
I see also not just doing, but playing. There's no failure in play, just pivot points, which is why billing things as prototypes or experiments or proofs of concept is so valuable. I thought I knew what I was doing when I came into this program with an agenda to further my text mining chops, but seeing these linguistics projects has totally blown me away. I'll be exploring more in language analysis within my own studies this fall, armed with a bolstered toolkit and a better understanding of what's been done and what's possible. I'm also not totally sure after all that historical sentiment analysis is a road I want to go further down — DocuScope seems to have covered the positive/negative method, along with a ton of other sentiments. So my initial question seems reductive, to be hitting just positive and negative over time, rather than embracing all word senses, but I also know that since I haven't actually tried it yet, there are divergent paths awaiting me. In any case, I think at this point, I'm feeling pretty overwhelmed and paralyzed with a sense of 'I know nothing'-ness, which can be a fun and low-risk place to be. So for now I'm going to approach the massive unknown by way of amusement and caper rather than serious endeavor with deliverables.
Posted September 22, 2013