Is This Your Book? What we call digitized manuscripts and why it matters

This is a version of a paper I presented as a Rare Book School Lecture at the University of Pennsylvania in Philadelphia on June 12, 2018, originally entitled “Is this your book? What digitization does to manuscripts and what we can do about it.” 

Good afternoon and thank you for coming to my talk today. The title of my talk is “Is this your book? What digitization does to manuscripts and what we can do about it.” However, I want to make a small change to my title. I’m not entirely sure there’s anything we can do about what digitization does to manuscripts, but I do think we can think about it, so that’s what I want to do a bit today. I want us to think about digitized books – specifically about digitized manuscripts, since that’s what I’m particularly interested in.

So, like any self-respecting book history scholar, I’m going to start our discussion of digitized manuscripts by talking about memes.

Memes

Definition of the word “meme” from the Oxford English Dictionary.

The word meme was coined in 1976 by Richard Dawkins in his book The Selfish Gene. In the Oxford English Dictionary, meme is defined as “a cultural element or behavioral trait whose transmission and consequent persistence in a population, although occurring by non-genetic means (especially imitation), is considered as analogous to the inheritance of a gene.” Dawkins was looking for a term to describe something that had existed for millennia – as long as humans have existed – and the examples he gave include tunes, ideas, catchphrases, clothes fashions, ways of making pots or building arches. These are all things that are picked up by a community, ideas and concepts that move among members of that community, are imitated and modified, and which are frequently moved on to new communities as well where the process of imitation and modification continues. More recently the term meme has been applied specifically to images or text shared, often with modification, on the Internet, particularly through social media: If you’ve ever been RickRolled, you have been on the receiving end of a particularly popular and virulent meme.

This is all very interesting, Dot (I hear you say), but what do memes have to do with digitized manuscripts? This is an excellent question. What I want to do now is look at a couple of specific examples of memes and think a bit in detail about how they work, and what it looks like to push the same idea through memes that are similar but that have slightly different connotations. Then I want to look at some different terms that scholars have used to refer to digitized manuscripts and think a bit about how those terms influence the way we think about digitized manuscripts (if they do). My proposition is that these terms, while they may not exactly be memes, function like memes in the way they are adapted and used within the library and medieval studies scholarly communities. So let’s see how this goes.

In the film Black Panther, which was released back in February of this year, there’s a scene where a character has come to the country of Wakanda to challenge the king for the throne. This character, N’Jadaka (also named Erik Stevens, but better known by his nickname Killmonger), is a cousin of the king, T’Challa, but was unknown to pretty much everyone in Wakanda until just before he arrives to make his challenge. At the climax of this scene, during which Killmonger and T’Challa fight hand-to-hand in six inches of water, Killmonger – who is clearly winning – turns to the small audience of Wakandans gathered to witness the battle and exclaims, “IS THIS YOUR KING?” If you haven’t seen the film, I’m about to spoil it for you: it turns out the answer to that question is NO.

This is a phrase that was born to be a meme, and within a month that’s exactly what happened.

According to the Know Your Meme website the first instance of the “Is this your king” meme appeared on March 20 on Twitter when @TheyWant_Nolan tweeted a screen shot of the scene with the caption “is this your spring”. If you think back to March, the weather was pretty terrible everywhere around the country. It was long and tedious going back and forth between snow and heat then back to snow. Is this your Spring? NOPE.

This type of meme is a snowclone, defined as “a type of phrasal templates in which certain words may be replaced with another to produce new variations with altered meanings, similar to the “fill-in-the-blank” game of Mad Libs.” I would like to note here that this term, snowclone, was coined in 2004 by American linguists Geoffrey K. Pullum and Glen Whitman specifically to describe this phenomenon. The concept of a snowclone has been around for much longer than the term – think of “I’m not an X but I play one on TV” which was the most hilarious phrase when I was a kid – and the “Is this your king” meme works the same way, where we replace king with some other word to make a phrase that is understood to elicit a negative response.

Here are some other examples of this meme featured on its Know Your Meme page. These all supply the identity of the question asker, they vary widely by topic, and one of them makes a slight modification to the image, but they all imply a negative response to the question.

I made one myself. My meme features a screen shot of my favorite manuscript, UPenn LJS 101, as seen through the Penn in Hand manuscript interface. In my meme, the question asked is, is this your book? As we know from the context of the original meme, the answer to the question is no. This is not my book. Or: It’s not my real book.

I’ve made a few other memes and for some reason most of them play with the relationship that a digitized version of a manuscript has with the physical object.

Memes such as “Is this your king” and this next one, the “Is this a pigeon” meme, enable us to ask questions with assumed answers. In this meme, the original scene is from an anime where a human-like android sees a butterfly and asks, “Is this a pigeon?” This is another snowclone, where the question asker, the object of the question, and the question itself can be replaced with almost literally anything else. I find these snowclone memes work well for my needs, though I find the differences between the emotions that these two memes elicit fascinating.

As before, I’ve replaced the object of the question with digital images of LJS 101 and specifically identified myself as the question asker. As with the previous meme, we know the answer to the question posed is no, although the context is different: while the king meme is used to express aggressive negativity, the pigeon meme is used to express mild but total confusion. The same idea can be pushed through both memes – is this digital thing a manuscript? – and while the answer is the same – no it’s not – the negative response of the pigeon meme is “oh you silly thing, thinking the digitized manuscript is the same as the manuscript” while the negative response of the king meme is “that thing is NOT the same as the manuscript, I’m offended you think so, and I’m going to throw it off a cliff so you don’t try it again.”

Although both of these memes can be used as a kind of mirror for us to view the relationship between a manuscript and its digitized version, they expect different responses and elicit different emotions, much as different words used to refer to the same situation or person might invoke different emotions. The memes are, in effect, acting as a kind of terminology, so now I want to pivot and talk about how terminology might act as memes.

Terms

I would like to take it as a given that how we talk about things influences how we think about them; therefore, the terms we use to describe things matter. The terms we use to describe other people matter; the terms that we choose to refer to digitized manuscripts matter. I would also like to reiterate the proposition I made a few minutes ago that our terms, while perhaps not memes themselves, are meme-like. In his 2016 article “‘ut legi’: Sir John Mandeville’s Audience and Three Late Medieval English Travelers to Italy and Jerusalem,” Anthony Bale discusses Jerusalem as a meme in medieval English travel writings, but I find that his description of meme fits well with what I would like to do here. He says, “the meme proposes a model of cultural transmission based on audiences’ ongoing use and appropriation of the source, as opposed to the scholarly desire to return to the source as the “best” or “original” iteration.” (For a term, this would mean that common usage points not to the original meaning of the word, but to the word as it is being used. That’s a bit of a circular argument, but I think it makes sense.) He continues, “Memes have not one stable author, no unitary point of origins, and are not retrospective, but rather change with their audiences, causing people to do things; stimulating actions and changing behaviors; leading people to take a particular route, see a particular site, notice one thing but not another, find new meanings in an old source.” (Bale, p. 210)

Following this theory, terms work like this:

  1. A term begins with a specific meaning (e.g., outlined in the OED, citing earlier usage),
  2. A scholar adopts the term because we need some way to describe this new thing that we’ve created. So we appropriate this term, with its existing meaning, and we use it to describe our new thing.
  3. The new thing takes on the old meaning of the term,
  4. The term itself becomes imbued with meaning from what we are now using it to describe.
  5. The next time someone uses that term, it carries along with it the new meaning.

Some scholars take time to define their terms, but some scholars choose not to, instead depending on their audience to recognize the existing definitions and connotations of the terms they use. For example, in her 2013 article “Fleshing out the text: The transcendent manuscript in the digital age,” Elaine Treharne (coming out of a description of how medieval people would have always interacted with a physical book) says: “for the greater proportion of a modern audience on any given day, one has necessarily to rely on the digital replication: the world of the ironically disembodied and defleshed simulacrum, avatar, surrogate.” (Treharne, p. 470) [emphasis mine] Here Treharne uses the terms simulacrum, avatar, and surrogate without defining them, and she groups them together, in that order, placing simulacrum first in that list. More than the other two, simulacrum has a negative connotation – as we can see from its entry in the OED, a simulacrum is a “mere image”; it looks like a thing without possessing its substance or proper qualities; it is a “specious imitation”. Although it is near identical in meaning and from related Latin roots as the term facsimile, which I’ll discuss in a moment, facsimile lacks the negative connotations that simulacrum has. Although the terms are undefined by the author, it seems that this was a purposeful word choice intended to elicit a negative response.

Compare this with Bill Endres, who in his 2012 article “More than Meets the Eye: Going 3D with an Early Medieval Manuscript” spends several paragraphs defining his terms and arguing for why he chooses to use some terms and not others. Endres says, “I will refer to 3D and 2D images as digital artifacts or digital versions, although not totally satisfied with either term as it relates to epistemology. I am tempted to refer to them as digital offspring, the results of a marriage between digital and manuscript technologies, with digital versions having unique qualities and a life of their own. This term is problematic but it speaks to the excesses, commonalities, and deficits when digital versions are measured against their physical antecedent.” (Endres, p. 4) Endres then discusses some other terms, including two of the ones I will consider in a moment, so we’ll return to his thoughts later. The point here is that Endres defines his terms and explains why he is using them, while Treharne relies on us to understand her meaning through the known definition of her terms.

Facsimile

For each term I will discuss pre-digital definitions of the term, using the Oxford English Dictionary as the source.[1] I’ll also include a few quotes where scholars refer to digitized manuscripts using that term, although these quotes are meant to be representative and not exhaustive (that is, I couldn’t tell you the first time that the term was used by someone to refer to a digitized manuscript, but I can give you an impression of how the term has been used or is being used currently).

Let’s begin with the term facsimile.

It is from the Latin meaning literally make similar. The earliest attestation of the term is from 1661, and refers to a transcribed copy of a text, and not necessarily something that looks just like the text it is being copied from. About 30 years later, facsimile is being used to mean an exact copy or likeness; an exact counterpart or representation, and the citations refer to written texts or drawings. The term continues to be used according to this definition into the later 19th century, by the time photography of books and manuscripts has become well-represented in the scholarly landscape. (David McKitterick, Old Books, New Technologies, pp. 117-118)

By the late 19th century, facsimile has been adapted to refer to the communication of images through radio, wire, or similar methods – the modern day “fax” machine, for example. This meaning maintains the previous definitions focusing on a facsimile as some kind of copy, but adds the meaning of communicating over distance, and I expect these combined uses of the terms – print facsimiles plus the sharing of images over distance – are why digital facsimile became an obvious term to use to describe these new representations of old objects.

The use of facsimile to refer to textual materials clearly varies over time and from individual to individual. In his 1926 article ‘Facsimile’ Reprints of Old Books, A. W. Pollard seems to use the term according to its 1661 attestation, not according to its 1691 attestation. He says “It is intended to cover any reprint the form of which has been influenced to any considerable extent by the form of the edition reproduced.” (Pollard, p. 305) Pollard’s ‘Facsimile’ reprints include “1) Photographic facsimiles, 2) Type-facsimiles, i.e. editions in which types of similar founts to those used in the original are set to follow the original setting as closely as possible; 3) more or less luxurious reprints which seek to reproduce the general effect of the original with such concessions to modern usage as the producer may think desirable.” (Pollard, p. 306)

Facsimile or digital facsimile has been, for as long as I can remember, the default term that libraries use to refer to their own digital copies, and that scholars use to refer to the digital images they incorporate into their online projects. In November 1993, Kevin Kiernan gave a presentation at a symposium of the Association of Research Libraries [Kiernan, “Digital Preservation, Restoration, and Dissemination of Medieval Manuscripts”] in which he says that the Electronic Beowulf  “will in its first manifestation make available in early 1994 a full-color electronic facsimile of Cotton Vitellius A. xv to readers in the British Library and at other selected sites.” He continues,  “As this electronic archive grows, it will incorporate facsimiles of many other documents that help us restore parts of the manuscript that were lost or damaged by fire in the early eighteenth century.” Kiernan is referring not only to straightforward digital images, but also to images taken under ultraviolet light that were included in the edition. As he says later in the presentation, because of the UV images “Readers of the electronic facsimile will thus acquire a reproduction of the manuscript that reveals more than the manuscript itself does under ordinary circumstances.”

The use of the term facsimile makes it possible for scholars to consider how digital facsimiles relate to older ways of making similar. In “The Ghost in the Machine: Digital Avatars and Medieval Manuscripts”, Sian Echard discusses the restoration of manuscripts by Matthew Parker and his circle, which she interprets as a kind of facsimile. Dr. Echard says “Today, digital technologies continue to recreate medieval books for a variety of audiences, and the digital facsimiles, like the hand and machine produced examples … both reproduce and relocate their medieval objects. But our current attitudes toward facsimile differ from Parker’s and Dibdin’s, and may in fact inhibit our ability to see the extent to which we too are recreating medieval text objects according to our own tastes. As technology has enabled ever more exact reproduction, the cheerful refashioning proposed by Parker has been replaced by an emphasis on the photographic, on the exact, with at times an accompanying confidence that perfect reproduction can approach the revelation of an object’s truth.” (Echard p. 201)

Surrogate

The term surrogate is interesting because, unlike facsimile – which is a fairly straightforward synonym for a copy – the term refers to something standing in for, or perhaps replacing, something else.

It was first used in the 16th century to describe the act of appointing someone as a delegate or a substitute. In the 17th century the term is adopted as a noun, to refer to a person who is thus delegated. Other uses of the term, meaning more or less similar things, are attested through the 17th century, until in 1644 we have a general meaning: substitute.

Since the 1970s the term has been used in a more intimate way, to refer to sexual surrogates and surrogate mothers. As my colleague Bridget Whearty pointed out to me while we were discussing the word surrogate, the term is almost always used to describe bodies – either a person having power delegated to them, or a body acting as a substitute for another body. So the implication is that using this term to refer to digitized manuscripts doesn’t only mean the digital is standing in for the physical, but it also – by virtue of previous uses of the term – may imply some sort of embodiment or materiality of the digital object that is acting as the surrogate.

Paul Conway has an extensive discussion of the digital surrogate in his 2014 article “Digital transformations and the archival nature of surrogates”, and although he is referring to archival materials and not medieval manuscripts, I would expect that the use of the term comes from the same place, so I will quote him here. He reflects my own thoughts about a surrogate being more than a copy, saying “The creation of digital surrogates from archival sources is fundamentally a process of representation, far more interesting and complex than merely copying from one medium to another. Theories of representation – and the vast literature derived from them – are at the heart of many disciplines’ scholarship and of particular relevance for scholars who work primarily or exclusively in the digital domain.” (Conway pp. 2-3) He then continues to cite several other scholars – Mitchell, Scruton, Geoffrey Yeo, Matthew Kirschenbaum, Michael Taussig, and Johanna Drucker – who discuss the relationship that digital copies continue to have with their sources well after they have been created, even as they have their own materialities.

Bill Endres, who I quoted above, continues his thoughtfulness in the same piece as he considers surrogate as a term for his own use in describing 3D images of manuscripts. He says, “a term that has gained some commonality in 3D is digital surrogate. Bernard Fischer uses the term for 3D renderings of archaeological sites, like the impressive Rome Reborn. Fischer’s interest in 3D is to construct digital cityscapes and large spaces, thus his use of surrogate, the virtual environment functioning as a substitute or proxy, a stand in for the likes of a dig site or what once was, like ancient Rome, as a means to generate and test hypotheses, fulfilling a specific epistemic function. Surrogate fits Fischer’s needs but does not speak as readily to the full range of epistemic considerations that I want to explore for a manuscript, particularly the excesses of a digital artifact that add to our knowledge in other ways and its effect on looking and knowing.” (Endres, p. 4) The excesses that Endres is referring to here are things like special lighting and the affordances of 3D imaging, and he feels that the term surrogate isn’t sufficient to include these things, although Endres’s excesses are very similar to those things that Kiernan was thinking of in 1993 when he used the term electronic facsimile. However, Kiernan did not use the term surrogate in 1993 – it would be interesting to see when the term surrogate was first used to refer to digital objects, and if it would have been available to Kiernan in 1993.

Avatar

The third term, avatar, is relatively new to me, although Sian Echard used it in the chapter quoted above, and the term was also used by classicist Ségolène M. Tarte, in her 2011 presentation “Interpreting Ancient Documents: Of Avatars, Uncertainty and Knowledge Creation,” and is also mentioned by Endres and very recently by Michelle Warren, in a just-published article “Remix the Medieval Manuscript: Experiments with Digital Infrastructure.” This term is not yet common, but it may be gaining purchase because of its inherent complexity.

I really like avatar because of the connotations brought along with its original definition. According to Hindu mythology, an avatar is the incarnate, human manifestation of a deity. It is thus the avatar that is embodied, not the thing that the avatar represents. This can be contrasted with the term surrogate, which is also embodied, but the surrogate embodiment is in replacement of something else, while the embodiment of the avatar is the same thing, but in different form. And compare both of these again with facsimile, which again is a copy – these are three very different terms, and yet we have the desire to apply these terms to… if not the exact same things, then at least to the same kind of things.

The term avatar has also been used to mean more generally a manifestation, and I actually think that this is the usage of the term that is closest to its application to digitized manuscripts, although there is another recent usage that is relevant: avatar as a term to describe a character in a computer game or environment, a character that represents a person or a player within that virtual environment (think of Second Life, or, to use a more current example, Minecraft).

(There was also a popular movie by this name that came out in 2009, right around the same time Second Life was reaching peak popularity, and I can’t give short shrift to Avatar: The Last Airbender, an animated show that ran from 2003-2008.)

So what is an avatar when it comes to medieval manuscripts? Echard uses the term to refer both to physical objects and to digital ones, first describing the digital avatars of the Sherborne Missal included in the British Library exhibit celebrating its purchase. These include large-screen installations in the Library gallery, a CD-ROM available for purchase, an online version, and a 3D animation sequence that plays as an introduction to the CD-ROM. However, as Echard says, “The avatars for these rare objects have … been books themselves- manipulable, tangible, physical … the physicality of the book is part of its cultural role, whether as public object or private delight. The digital facsimiles I have discussed here all attempt in one way or another to offer these medieval and early modern books to the fulfilling of both roles, and yet I would argue that they are ultimately stymied by the requirement to disembody the objects they display. The resulting tension, between access and absence, creates the ghosts that haunt the digital realm.” (Echard, p. 214) I’ve always loved this description of the tension of digitized manuscripts, and I am tickled to notice only now that the term avatar is attached to it.

I know that I keep quoting Endres, but I find here that again his thoughtfulness in exploring the terminology is really refreshing and I wish more scholars did this kind of intellectual work. He says,  “I find Ségolène Tarte’s impulse to call digital versions avatars most consistent with my needs, the digital version as an incarnation, the physical artifact crossing over and into a digital form. Since I am working on a gospel book, I cannot help but to think about this issue’s echo in early Christian prohibitions against depictions of Christ in the flesh, the prohibition motivated by the belief that physical matter is mundane, not divine, and therefore a painting or statue could not portray Christ’s divine nature, thus could not portray Christ and was blasphemous. In a similar vein, without the blasphemy, a digital version cannot portray all of the features of a physical artifact, but as mentioned, it also includes excesses. I appreciate Tarte’s choice of the word avatars, its recognition that digital artifacts have excesses and exist in a different reality and with different rules and potentials, offering unique advantages and experiences, a recognition that I want to carry forward in my sense of digital artifact or version.” (Endres, p. 4)

Before I conclude, I would like to remark on our apparent desire as a community to apply meaning to digital versions of manuscripts by using existing terms, rather than by inventing new terms. After all, we coin new words all the time – just in this paper, I’ve mentioned snowclone and meme, so it would be understandable if we decided to make up a new term rather than reusing old ones. But as far as I know we haven’t, and if anyone has, it hasn’t caught on enough to be reused widely in the scholarly community. I expect this comes from a desire to describe a new thing in terms that are understandable, as well as to define the new thing according to what came before. After all, both snowclone and meme are terms for things that have existed long before there were words for them, while digital versions of manuscripts are new things that have a close relationship with things that existed before, so while we want to differentiate them we also want to be able to acknowledge their similarities, and one way to do that is through the terms we call them.

Although we use these three terms – facsimile, surrogate, and avatar – to refer to digitized manuscripts, it is clear that these terms don’t mean the same thing, and that by choosing a specific term to refer to digitized manuscripts we are drawing attention to particular aspects of them. If I call a digitized manuscript a facsimile, I draw attention to its status as a copy. If I call it a surrogate, I draw attention to its status as a stand-in for the physical object. And if I call it an avatar, I draw attention to its status as a representation of the physical object in a digital world. Not a copy, not a replacement, but another version of that thing. Like pushing an idea through different memes, pushing the concept of a digitized manuscript through different terms gives us flexibility in how we consider them and how we explain them, and our feelings about them, to our audiences. That we can so easily apply terms with vastly different meanings to the digital versions of manuscripts says something about the complexity of these objects and their digital counterparts.

Thank you.

Sincere thanks to Bridget Whearty, Keri Thomas, Johanna Green, and Anna Levine, for their help getting this paper ready for the public eye.

[1] In the paper presented at the Rare Book School (which was recorded; I will add a link here when it becomes available) I used the Historical Thesaurus of English as the source for the term definitions, but I found during further editing that the Thesaurus timelines weren’t doing what I needed them to. If I continue this work, I expect to bring the timelines back in again.

How to download images using IIIF manifests, Part I: DownThemAll

IIIF manifests are great, but what if you want to work with digital images outside of a IIIF interface? There are a few different ways I’ve figured out that I can use IIIF manifests to download all the images from a manuscript. The exact approach will vary since different institutions construct their image URLs in different ways. Here’s the first approach, which is fairly straightforward and uses e-codices as an example. Tomorrow I’ll publish a second post using the Vatican Digital Library as an example. Please remember that most institutions license their images, so don’t repost or publish images unless the institution specifically allows this in their license.

Method 1: The manifest has urls that resolve directly to image files

This is the easiest method, but it only works if the manifest contains urls that resolve directly to image files. If you can copy a url and paste it into a browser and an image displays, you can use this method. The manifests provided by e-codices follow this approach.
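If you would rather check this from a script than by pasting urls into a browser, here is a minimal Python sketch. It assumes the manifest follows the IIIF Presentation API 2.x layout (sequences, then canvases, then images, each with a resource whose “@id” is the image url). The sample manifest and its example.org urls are invented for illustration and are not real e-codices data, and the function name image_urls is hypothetical.

```python
import json


def image_urls(manifest: dict) -> list[str]:
    """Collect image resource urls from a IIIF Presentation 2.x manifest.

    Assumes the 2.x layout: sequences -> canvases -> images -> resource["@id"].
    """
    urls = []
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            for image in canvas.get("images", []):
                url = image.get("resource", {}).get("@id")
                if url:
                    urls.append(url)
    return urls


# A tiny, made-up manifest (hypothetical urls) standing in for a real one.
SAMPLE_JSON = """
{
  "@type": "sc:Manifest",
  "sequences": [{
    "canvases": [
      {"label": "1r", "images": [{"resource": {
        "@id": "https://example.org/iiif/ms-0001/ms-0001_001r.jp2/full/full/0/default.jpg"}}]},
      {"label": "1v", "images": [{"resource": {
        "@id": "https://example.org/iiif/ms-0001/ms-0001_001v.jp2/full/full/0/default.jpg"}}]}
    ]
  }]
}
"""

if __name__ == "__main__":
    for url in image_urls(json.loads(SAMPLE_JSON)):
        print(url)
```

If the printed urls end in an image file name such as “default.jpg”, the manifest is probably a candidate for this method; opening one in a browser is still the real test.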

  1. Install DownThemAll, a Firefox browser plugin that allows you to download all the files linked to from a webpage. (There is a similar browser plugin for Chrome, called Get Them All, but it did not recognize the image files linked from the manifest.)
  2. Go to e-codices, search for a manuscript, and click the “IIIF manifest” link on the Overview page.
    IIIF manifest link (look for the colorful IIIF logo)

    The manifest will open in the browser. It will look like a mess, but it doesn’t need to look good.

    Messy manifest.
  3. Open DownThemAll. It will recognize all the files linked from the manifest (including .json files, .jpg, .jp2, and anything else) and list them. Click the box next to “JPEG Images” at the bottom of the page (under “Filters”). It will highlight all the JPEG images in the list, including the various “default.jpg” images and files ending with “.jp2”.

    JPEG images highlighted in Down Them All
  4. Now, we only want the images that are named “default.jpg”. These are the “regular” jpeg files; the .jp2 files are the masters and, although you could download them, your browser wouldn’t know what to do with them. So we need to create a new filter so we get only the default.jpg files. To do this, first click “Preferences” in the lower right-hand corner, then click the “Filters” button in the resulting window.
    Filters.

    There they are. To create a new filter, click the “Add New Filter” button, and call the new filter “Default Jpg” (or whatever you like). In the Filtered Extensions field, type “/\/default.jpg/” – the filter will select only those files that end with “default.jpg” (yes, you do need three slashes there!). Note that you do not need to press save or anything; the filter list updates and saves automatically.

    New filter
  5. Return to the main Down Them All view and check the box next to your newly-created filter. Be amazed as all the “default.jpg” files are highlighted.

    AMAZE
  6. Don’t hit download just yet. If you do, it will download all the files with their given names, and since they are all named “default.jpg” it won’t end well. It will also download them all directly to whatever is specified under “Save Files in” (in my case, my Downloads folder), which also may not be ideal. So you need to change the Renaming Mask to at least give you unique names for each one, and specify where to download all those files. In the case of e-codices the manifest urls include both the manuscript shelfmark and the folio number for each image, so let’s use the Renaming Mask to name the files according to the file path. Simply change *name* to *flatsubdirs* (flat subdirectories). Under “Save Files in”, browse to wherever you want to download all these files.

    Renaming Mask and Save Files in, ready to go
  7. Press “Start” and wait for everything to download.
    Downloading…

    Congratulations, you have downloaded all the images from this manuscript! You’ll probably want to rename them (if you’re on Mac you can use Automator to do this fairly easily), and you should also save the manifest alongside the images.
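For anyone curious what the filter and the Renaming Mask are doing, the two ideas can be sketched in a few lines of Python. This is only a rough imitation under stated assumptions, not DownThemAll’s actual code: the regular expression mirrors the “ends with /default.jpg” filter, and flat_name approximates *flatsubdirs* by joining the url’s directory segments with hyphens (the separator is my assumption). The example.org urls and both function names are hypothetical.

```python
import posixpath
import re
from urllib.parse import urlparse

# Matches urls whose final path segment is exactly "default.jpg".
DEFAULT_JPG = re.compile(r"/default\.jpg$")


def keep_default_jpgs(urls):
    """Keep only the urls that end with /default.jpg."""
    return [u for u in urls if DEFAULT_JPG.search(u)]


def flat_name(url):
    """Build a unique file name from the url's directory path.

    Every image is called default.jpg, so the name has to come from the
    directories (shelfmark, folio, and so on) rather than the file name.
    """
    segments = [s for s in urlparse(url).path.split("/") if s]
    extension = posixpath.splitext(segments[-1])[1]
    return "-".join(segments[:-1]) + extension


# Made-up urls shaped like IIIF image links (hypothetical, not real data).
URLS = [
    "https://example.org/iiif/ms-0001/ms-0001_001r.jp2/full/full/0/default.jpg",
    "https://example.org/iiif/ms-0001/ms-0001_001r.jp2/info.json",
    "https://example.org/iiif/ms-0001/ms-0001_001v.jp2/full/full/0/default.jpg",
]

if __name__ == "__main__":
    for url in keep_default_jpgs(URLS):
        print(flat_name(url))
```

The names that come out are long and ugly, which is why renaming the files afterwards is still a good idea.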

    TOMORROW: THE VATICAN!