Reflection: Sleuthing

Prior to my employment at END, I’d never seen an 18th century edition of a novel (or, frankly, many contemporary editions of 18th century novels, either; the earliest book I’ve read is Huck Finn), so many of the day’s common paratexts and features were unusual and surprising to me. I’d never seen the word ‘advertisement’ used to refer to anything besides a product listing, and I’d never heard of a subscribers’ list. I’d never seen books with titles 20 words long, and didn’t understand why those titles had so many semicolons in them. I’m a social scientist at heart, however, so the vast majority of my questions focused on the social world that produced the book-artifacts I held in my hands. How did this book move from the mind of the original author to my foam cradle at UPenn’s Van Pelt-Dietrich Library, summer 2016?

As a result, I came to love the 700 and 710 fields of my catalogs. In these fields, all of the nonfictional names in a book’s paratexts are listed and, if possible, authorized: the author of the text, the authors of the paratexts, the author of the epigraph, a former owner whose name appears on a bookplate or in an inscription, etc. In the best of circumstances, I can just search a given full name on VIAF, find it, and authorize it. Rarely, however, is this the case for the set of novels we’re cataloging. For many names, only a last name exists, or none at all. In more challenging circumstances, a library bookplate has covered up an inscription, or else the inscription was written in some illegible hand. In many cases, as with the names of subscribers, the identities behind these names would be impossible to authorize, were I to embark on such a task, even with a full name given.

While I could easily just leave a name unauthorized, I have come to enjoy the obscure successes of matching the name in a book to a name online. I’ve become a literary internet sleuth, combing through bad OCR of a dictionary of Scottish emigrants to Canada, or census lists and marriage licenses for small Virginia towns, or, my slightly morbid favorite, entries on A book I cataloged recently featured a bookplate signed by the man whose residence (according to an odd post on an odd website) hosted one of the first meetings of the Westmeath Hunt Club, an Irish organization of recreational hunters who made use of foxhounds. Another had a subscriber named Preserved Fish, a name that is, amazingly, not exclusive to this subscriber–there are at least two others, but this subscriber is the only one from Vermont. One inscriber was the close relative of an Australian colonist responsible for instigating biowarfare on an Aborigine community (he sold them poisoned flour).

My most frequent and most successful 700s Google expeditions are for the first names of publishers, printers, and booksellers listed only by their last name. Only rarely do I have the pleasure of finding interesting back stories. Regardless, my frequent Internet detours have all been an incredibly interesting exercise in what search engines can and cannot do. In my conversations with librarians, and in the history of librarianship, I’ve heard often that the advent of the Internet and Google at one time appeared to threaten the entire profession. If someone can simply type keywords into a search engine, then of what use are a librarian’s research skills and resources? Though I knew when I started that libraries and librarians are indispensable institutions, my constant (but enjoyable!) slog through the 700s has proven that to me in full.

Google’s ability to predict what exactly it is you are looking for continues (terrifyingly) to improve, but I’ve found that its powerful algorithms often still don’t get me where I need to go. I hit paywalls, French blog posts I can’t read, OCR too gibberish-y for me to do a successful command+F search. Google doesn’t know that when I search, for example, “smith dublin printer,” I’m looking for someone, last name Smith, who worked as a printer in Dublin, and not looking for someone, first name Smith, in Dublin, who happens to be selling their ink jet printer on Craigslist. More scholarly search engines like VIAF or WorldCat or the ESTC or ECCO or the Oxford Biography Index &c. &c., are helpful in some ways but not in others: they often store more relevant and specific information, but at the cost of navigating a badly designed user interface and poorly linked data.

While I look forward to the web sleuthing of the 700s fields in each book I catalog, I’m so grateful when I find the information I need quickly and accurately. Working at END has encouraged me to rededicate myself to providing accessible, precise, user-friendly data, online and in print. I think more critically now about tagging, error-free text transcriptions, data organization, and online interfaces. The internet does a lot for libraries, and libraries do a lot for the internet, and my trials and victories in the 700s fields have me excited for the future of that relationship.

Reflection: The Social Side of Cataloging

A lot of the Early Novels Database project feels like common data entry. We see the paratexts, learn the various data fields, and plug and chug from one record to the next. Except for the very first week on the job (which was spent learning with our more experienced peers), the day to day protocols have little to do with social interactions. Looking at it this way, the act of cataloging itself seems like it should be a solitary experience. That couldn’t be more wrong. The END Project in its entirety revolves around the idea of access, shared knowledge, and communal interest. Every step of the novel documenting process we employ here exponentially expands the books’ sphere of influence, meaning the availability of the book and the amount of discourse on it increase constantly.

Let’s start from the very beginning: many of the novels we work with were once part of private libraries. Their original owners and maybe a couple of friends and acquaintances had the opportunity to handle them, and that’s it. From there the novels passed from hand to hand until eventually they ended up here in one of our libraries’ collections. With this simple move, private became public. However, though thousands of curious people can get in to interact with the novels, this level of access is not enough; for the most part these lesser known texts are still hovering in dusty obscurity in dark shelving units.

Now END plays a more active role in the socializing of these novels. Why are we cataloging these things? Who really cares that Blah-Blah a Novel was written in 1863? The answer is we do. There is a reason we, the catalogers, sit in a single room together for nearly seven hours Monday through Friday when there are plenty of places (warmer places) we could spread out to. Instead of completely burying our heads in clicking keys and xml displays we ask questions, share amusing footnotes, and work together to puzzle out whether messily written inscriptions say “Bill” or “Belle.” This isn’t just for the sake of accuracy in the records, but for our own curiosity as well. There is something exciting about the act of discovery that compels us to share our finds, if for no other reason than one person finds it interesting and therefore another might as well.

So far this is only one group of catalogers in one room, but the END has branched off to another school as well and is hoping to bring in even more. Now there is twice as much exposure for the novels as we swap back and forth with check-ins, discussions, and all manner of live interaction. We get to know each other in this digital humanities community. On top of that we are generating a discourse through our records.

By far the largest step cataloging takes into the social sphere, and the last step of our process, is digitizing various pages of the novel and throwing them up onto several social media platforms, notably Flickr, where we post pictures of every novel we catalog. Illustrations and titles that used to be seen only by a handful of people over the course of a lifetime have become available for literal millions to observe. It is quite common over the course of a cataloging day to hear someone casually mention tweeting one of their pictures or commenting verbally on another person’s post. To put it simply, cataloging with this project is a highly social experience not just for the catalogers but for the novels as well.

Reflection: Digital Confidence

This post is a collection of thoughts that came out of a conversation I had with Colette about our experience at END this summer, and our feelings about learning new digital tools.

As English majors without a lot of prior experience with or exposure to digital humanities, Colette and I found the idea of using digital tools in conjunction with studying eighteenth century novels to be somewhat foreign. We agreed that one of the great things about END is the fact that it helps us reconcile these two fields, by demonstrating how digital tools can be integrated into the study of English. END emphasizes the fact that the humanities and the digital realm are not two totally separate spheres, and it’s been useful to see how the two can work together and reinforce one another. It’s been especially helpful to learn this through actually working on a project, as opposed to reading theory about digital humanities. Writing code in MarcXML while simultaneously paging through fragile eighteenth-century novels may sound odd, but it’s become routine and feels completely natural at this point, which says a lot about how successfully END links the material and the digital spheres.

Because this feels so natural, coding and digital tools as a whole seem less intimidating now. As Colette mentions in her post, so much of my intimidation in this realm stemmed from my lack of exposure to it, while what exposure I had often felt discouraging. Here, however, the fact that I had no experience with MarcXML, and had never heard of topic modeling prior to this summer wasn’t treated as a drawback, and I echo Colette in saying how much this has impacted me. For instance, we’ve started playing around with the command line, and while we may not be able to do anything significant there (or even fully understand what it is, let alone what it does) knowing that it exists still feels significant. While I doubt I’ll use the command line much outside of END, it’s been so affirming to be treated as though I can learn these tools.

Because we’re in such a supportive and comfortable environment, and because we’re beginning to see how digital tools and the humanities can be related, we’re also motivated to learn more about digital tools. In the past, my perception of computer science, coding, and digital tools was that they belonged in the realm of the sciences or STEM. I felt intimidated by them, didn’t understand them, and didn’t really want to understand them, because it didn’t seem like there was much point for someone with my interests. As a result, the fact that END makes coding and the digital realm relevant to English and the humanities feels like a huge deal to me. If I’m being completely honest, my heart still belongs more to the humanities aspect of DH than to the digital one, but I do feel like I have a much better understanding of what it actually means to work with digital tools, and my attitude towards them is much less reluctant than before.

In discussing this with Colette, we both agreed that this evolution has been possible largely because of the encouraging, open setting we’re in at END. Being able to talk casually with one another and pose questions to the group makes the job less stressful and more fun, and it removes any degree of intimidation we may have felt at the start. We’ve both been struck by how we’ve never been made to feel bad about our lack of knowledge or understanding about a topic, and the way we’ve been encouraged and patiently taught has also done a lot to motivate us to learn more.

Reflection: The book as an object, in two lectures

We asked our END student researchers to reflect on some aspect of their experience this summer. Katy Frank wrote about two of the lectures from the team at NYU.

Both Charlotte and Jeremy investigated the book’s status as an object in their lectures. Charlotte’s lecture was about Eliza Gifford, a female book collector in early 19th century England (and Wales). Collecting was the province of wealthy men; the collectible-object status of the book was a privileged, gendered status. This means that to treat the book as an object was a male privilege that Gifford appropriated. She collected many non-canonical books, further subverting gendered methods of privileging books as objects, whereby both the collector, and to a lesser extent, perhaps, the books themselves, would be by or about men. Gifford collected both fiction and nonfiction, and about 75% of her novels were by women. Here, Gifford used the typically male position of collector to elevate the voices of women. It’s impossible to tell exactly how purposeful this was, but regardless of her intention – ur-feminist gender loyalty or literary preference for the style of book women tended to write, or some combination – the effect is one of magnification of female voices in a space typically reserved for men.

Reading was a contested space for women and girls, with moral panic over how much they ought to be reading, what they should be reading, and concern that every story ought to have a moral and depict a virtuous heroine. In this climate Gifford’s status as a female collector, elevating books to objecthood despite their femininity (as both possessions of a woman and books that frequently were about women), is especially noteworthy. Charlotte’s talk addressed the book as an object by describing the ways in which Gifford participated in that construction through collection – and simultaneously destabilized the notion of objecthood/collection by being not only a female book collector, but a female book collector who collected books not typically deemed worthy as objects, either for their monetary or literary value. The conflation of those two types of value is what renders the book an object, and it was exactly this conflation that Jeremy discussed in his talk.

Jeremy took as his starting point William St. Clair’s lecture “The Political Economy of Reading,” which discussed the supply and demand curve of novels in varying stages of physical preciousness. Large, fancy editions that were very much objects in and of themselves were marketed towards the upper classes and didn’t sell in very high numbers; as the quality and size decreased they sold in higher numbers (and of course this was all very calculated on the part of publishers/booksellers). The Magnum Opus style of book, starting with the publication of a lavish, collected edition of the Waverley novels by Sir Walter Scott, ushered in a steroidal version of the ideology that dictated the more expensive, precious versions of books be produced first – that is, the notion of a mutually constitutive literary and monetary value of a book. The paratexts in this book served to add value to it, both literary and financial; they gave it added credibility as something worth purchasing when one already owned some of the same content in a different, less majestic form. Jeremy discussed the growth and new versions of this phenomenon, just as Charlotte discussed the ways in which Gifford both participated in it and subverted it in her collecting practices. Both lectures were concerned with the gendered status of objecthood, and the ways in which the gendered status of collecting can be reinforced or subverted depending upon who does it and what they collect, ultimately showing its fragility as a masculine practice.

Reflection: Learning Through Glitches

This post is a collection of thoughts that came out of a conversation I had with Abby about our experience at END this summer, and our feelings about learning new digital tools.

Abby and I discussed our experiences at END this summer as women and as humanities majors. Both of us have often felt uncomfortable learning STEM-related or digital techniques in school, particularly in male-dominated environments; we ended up sharing many sentiments about this summer’s END program as a space for learning and gaining confidence with methods and technologies that we have often felt averse to, or shut out from, in classroom environments.

I shared my memory of learning Marc XML, the computer language we use at END to catalog books and their metadata, during training week. I was a little nervous to start this learning process. My experience as a woman has included countless classroom environments in which teachers and peers have discouraged me from approaching numerical sequences or computer languages, subtly conveying to me that I am not naturally apt at problem-solving and that someone else will always approach and complete a puzzle more quickly and thoroughly than I could. I appreciated how during this initial Marc XML lesson at END, Alice taught the language in a way that was both welcoming and validating; I felt that I was given time and room to learn, but also that my intelligence and capability were affirmed and acknowledged. Coming into work every day in an environment that recognized my capacity to learn made me more deeply consider how my aversion to learning about computers and numbers is not due to any “natural” inaptitude in this subject, but instead, to the vast amount of subtle messaging I have received over my life that has caused me to internalize the narrative of my own inaptitude.

In addition to agreeing that END’s status as an all-woman space this summer has enhanced our experience gaining confidence with computers and digital methods, Abby and I also discussed how END emphasizes the digital as a helpful complement to more traditional humanities methods. This emphasis has been useful for me and Abby because it contrasts with widely-circulating narratives of the digital and the humanities as opposing, antagonistic fields. END’s environment of validation of our intelligence as women, as well as its demonstration that digital methods are relevant to us as humanities majors, have made us much more comfortable with (and interested in) using digital methods.

Abby pointed out that in classroom environments, attempting new, computer-related methods for research has often seemed intimidating, and that she often stuck with her usual research methods and avoided trying new techniques. I identified strongly with this experience. We both feel that END has changed this instinct in us, and has made us more likely to try computer-related methods in the future. We talked about feeling less inclined to give up if a digital method does not work immediately, understanding that its not working is probably due to a fixable issue, not to our inherent inability to understand computers. Abby said that she now knows it’s not necessarily her fault if something goes wrong in a digital process—it’s probably just an issue with Java! I feel similarly; I now know that there are many glitches that can come up when using digital methods. Usually, the glitches are in the computer, not in me.

Dublin Printing: The Rise of the Provincial Novel, and what that meant for the rise of the Novel itself

“The making of provincial literature is best understood through attending the production of books and the circulation of material texts between London and the provincial literary centres of Dublin, Edinburgh, and Philadelphia… they formed the condition of possibility for provincial literature to emerge.” – Joseph Rezek, London and the Making of Provincial Literature


Literature in the 18th century was a means of spreading a culture amongst its consumers. More than anything, the rise of the novel gave birth to a new sect: “provincial literature” that aimed to spread the metropolitan lifestyle of the urban areas to readers who could not afford to live in the city. Life in London was being qualified, commodified, and sold in the form of the novel in three volumes.


The print industry in Ireland fits into this in several ways. As I found out in my research last summer, the print industry in Ireland was going through a particularly interesting period, in which the copyright of books printed in London did not prevent the reprinting of London novels in Dublin, meaning that books could essentially be pirated in Dublin and sold at a much lower cost, albeit at a lower quality. Last summer, I started “The Dublin Print Project”, a personal DH project in which I compared and contrasted the qualities of books published in London and their Dublin counterparts. Ultimately, the differences I began to notice between the Dublin books and the London books were mostly physical. However, although the novels were the same, the London books and the Dublin books would occasionally differ in the errata corrections, changing the text between the two publishings contextually.

This raises a question: we know what the relationship between the book and the Dublin print industry is. But what is the novel, and in what ways has the emergence of Dublin publishing affected the novel? In the research that I have done, I have discovered that the boom of the print industry in Ireland and the transition of the novel into a metropolitan commodity actually go hand-in-hand. The popularity of spreading London books as a means of creating a “provincial literature” market has, in turn, promoted the publishing of novels in London that depict a metropolitan life.


The challenge of this project, however, was how to quantify my findings, to digitally represent the impact of Dublin novels on the print industry in the 18th century. I found this very difficult, as not all the novels I had access to had discernible contextual differences, at least among the ones I could find at Penn. Moving forward, I think “The Dublin Print Project” should focus less on comparing London and Dublin prints of books and instead look into quantifying exactly what kind of book is reprinted in Dublin, and, through that, qualify what defines a popular eighteenth century novel.

An alternative archive of 1760s novels, in a series of Vines

This project aims to create an alternative archive of Vine videos on the subject of 1760s self-labeled novels (books that include the term “novel” in their titles). It was conceived of as a response to a data problem in more traditional archives relating to these books, one that I believe represents a more integral problem in those archives and, by extension, in the research that takes place in them: data and metadata tend to be available for works which we have historically valued, and so research centers on those works, and canonical, traditional views of literature and its history are cyclically perpetuated (more on that here). I contributed to this problematic cycle when I eliminated the genre-category of “novel,” and so a large group of canonically “unimportant” books, from a project I worked on this past summer; I was deterred by the historically-determined data problems I encountered (specifically, lack of OCR digitized copies of these works). With these videos, I am hoping to in some sense work against this cycle, firstly by creating an alternative archive to record and preserve 1760s novels, and secondly, by involving as many people as possible in a project and a conversation around these works. The project is inspired by and connected to END’s metadata project; hopefully, this archive and the linked END archive will provide alternative sources of information on often ignored works.

The Vine archive is available here!

Adventure, History, Letters, and Memoir: Mapping title to text in the 18th century novel

Eighteenth-century fictions often announce their genre in their titles: adventures, memoirs, etc. But what, if anything, do these “genre keywords” in titles actually indicate about the texts? Because researchers in the digital humanities frequently use metadata, like these titles, as a representation of a full work, it is important to investigate the connection between these two elements – here, title and text. To begin to analyze connections between title metadata and full-text data, I focused on the 1760s, using a small dataset of texts specifically from 1760-1770 (these texts represented the entirety of the “genred” texts I found searching through clean full text databases for titles on a full list of 1760s fiction in English created by gathering titles from ESTC and Raven’s bibliography of English fiction 1750-1770). I encountered significant data problems here – see this blogpost about those. My analysis of the four genres I ultimately worked with (adventure, history, letters, and memoir) suggests some very preliminary conclusions we can draw about the connection between the “genre” of a text and the genre indicated in its title in the 1760s.



I collected as many titles published from 1760-1770 as I could that included one of four “genre keywords,” including both titles first published in that decade and reprints. I chose these four keywords after making a wordcloud using the full list of 1760s fiction titles compiled by END;[1] the wordcloud helped me determine some of the major, repeated keywords that might plausibly indicate some kind of category for the books they were labeling. Novel was originally a fifth category that I planned to use as one of my genres, but (see separate END post on this) I had to eliminate it due to a lack of available clean full texts of “novels” from this period. Some of the works I found fell into multiple genre categories, and I assigned each of them to one category based on which genre categories had fewer or more works.[2]

I ran a topic model across all my files,[3] which resulted in two pieces of output that I used in this analysis: a “composition” file, which shows the percent of each individual piece of text in the corpus that appears in each “topic;” and a “topic key,” which shows topics, or sets of words that probabilistically appear together throughout the corpus, and the relative “weight” or prominence of those topics. By going through this composition file, I was able to take the individual texts from each genre and re-calculate the percentage with which each of the genres appears in each topic. These are the percentages that appear throughout the analysis. The topics in the topic key are each actually very large, as all the words (minus stop words) in the entire corpus are divided amongst them, but what appears in each “topic” of the key is the top twenty most significant words of that topic – my analysis focuses on these words as indicative of the full topic. The “topics” in the topic key each have a numerical label, but because the computer generates them using probability, without any understanding of their “meaning,” it is up to the researcher to determine what the topic output actually means.
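To make the recalculation step concrete, here is a minimal sketch of one way to turn per-text topic proportions into per-genre percentages. It is an illustration under assumptions, not the script used for this analysis: the function name `genre_topic_shares`, the row layout, and the numbers are all hypothetical, and it assumes the composition file has already been read into (genre, proportions) pairs.

```python
def genre_topic_shares(rows):
    """Sum each genre's topic proportions and rescale them to percentages.

    rows: iterable of (genre, proportions) pairs, one per text, where
    proportions is a list with one value per topic (hypothetical layout).
    Returns {genre: [percent in topic 0, percent in topic 1, ...]}.
    """
    totals = {}
    for genre, proportions in rows:
        if genre not in totals:
            totals[genre] = [0.0] * len(proportions)
        for topic, value in enumerate(proportions):
            totals[genre][topic] += value

    shares = {}
    for genre, sums in totals.items():
        grand_total = sum(sums)
        # Each genre's percentages add up to (approximately) 100.
        shares[genre] = [100.0 * s / grand_total for s in sums]
    return shares


# Made-up composition rows for three texts and three topics.
rows = [
    ("adventure", [0.5, 0.3, 0.2]),
    ("adventure", [0.3, 0.5, 0.2]),
    ("letters", [0.2, 0.2, 0.6]),
]
shares = genre_topic_shares(rows)
for genre, percents in shares.items():
    print(genre, [round(p, 1) for p in percents])
```

Every text counts equally in this version; weighting texts by their length would be a different, equally defensible choice and would change the resulting percentages.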

This meaning-assignment stage of analysis is one where the researcher’s subjective interpretations can get confused with the computer’s “objective” output. Throughout the analysis, I try to refer to the contents of each topic and explain the logic behind the “working titles” I used to compare the different topics and by extension the genres. But these working titles inevitably limited (even as they enabled) my analysis. For example, I called topic 1 “general positive human world,” certainly an interpretive leap from the computer-generated cluster of words in my topic key. For this and many of the other topics, I could have interpreted the key differently, or even given the topic a slightly different “name” and thus worked with the corpus at large differently.

The immediate work of looking through the differences between these title-defined genres revealed some skewing of the data in both the memoir and letter categories. 29% of memoir appeared to be in a single topic in which no other genre appeared, and 35% of letters likewise appeared in a topic with a 1% showing for each of the other genres. Each of these topics seemed to correlate with a single work (in letters, Pamela, and in memoirs, The adventures of Peregrine Pickle. In which are included, Memoirs of a lady of quality), both of which were originally published before the 1760s, both of which were very long compared to the other works in their genre sections, and one of which included two genre keywords in its title. All of these factors may have contributed to their skewing of the data; to deal with the problem, I recalculated the percentages in each topic for the two genres without these particular works. All further analysis was done with these altered percentages in mind. The problem with my solution to this problem is that it makes my corpus of texts in both these categories smaller, and more subject to the particularities of other individual works, something I tried to bear in mind as I did my analysis. Another note on the particularities of this data as a product of topic modeling is that some of the topics should more accurately be split into two topics that happen to frequently appear together, or two topics might really be practically identical and could easily be combined into one; this, too, I tried to account for in the analysis.
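The kind of check that exposes this skew can be sketched as follows. This is only an illustration under assumptions: the helper `dominant_work`, the placeholder titles, and the proportions are hypothetical, and the rows are assumed to be (genre, title, proportions) triples already read from the composition file.

```python
def dominant_work(rows, genre, topic):
    """Find the work contributing most to one genre's weight in one topic.

    rows: iterable of (genre, title, proportions) triples (hypothetical layout).
    Returns (title, fraction of the genre's total weight in that topic).
    """
    contributions = {}
    for row_genre, title, proportions in rows:
        if row_genre == genre:
            contributions[title] = contributions.get(title, 0.0) + proportions[topic]
    total = sum(contributions.values())
    if total == 0:
        return None, 0.0
    title = max(contributions, key=contributions.get)
    return title, contributions[title] / total


# Made-up rows: one work accounts for almost all of topic 1 in "letters".
rows = [
    ("letters", "Work A", [0.10, 0.90]),
    ("letters", "Work B", [0.95, 0.05]),
    ("letters", "Work C", [0.95, 0.05]),
]
title, fraction = dominant_work(rows, "letters", 1)
print(title, round(fraction, 2))

# Dropping that work before recalculating the genre percentages.
remaining = [row for row in rows if row[1] != title]
```

When a single title supplies most of a genre's weight in a topic, that topic is probably describing the work and not the genre, which is the situation described above.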




Although the majority of my analysis is genre-based, there are a few interesting categories that showed up with similar rates in all of the genres, some of them in very large quantities. The first, and largest, of these is topic 1, which I gave the working title “positive present human world.”

This is what that looks like in the MALLET output:

1     1.23702    make give time good great life world present part mind thought person reason find pleasure manner love till kind heart

Broken down by genre, all the categories of texts fit into the topic with between 19% and 22%. This seems to suggest that all the texts approach the present human life as something that is ultimately positive: pleasurable, kind, reasonable. The verbs here are make, find, and give, and suggest that action in this positive world is generative and creative, with a purpose that propels into the future and is positive in the moment. More on this specifically in the analysis of the histories.
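For anyone who wants to work with the topic key programmatically, a line in the format shown above can be split into its three parts: the numerical label, the weight, and the top words. This is a small sketch that assumes whitespace-separated fields exactly as they appear in this post; the function name `parse_topic_key` is my own.

```python
def parse_topic_key(line):
    """Split one topic-key line into (topic id, weight, list of top words)."""
    parts = line.split()
    return int(parts[0]), float(parts[1]), parts[2:]


line = (
    "1     1.23702    make give time good great life world present part mind "
    "thought person reason find pleasure manner love till kind heart"
)
topic_id, weight, words = parse_topic_key(line)
print(topic_id, weight, len(words))
```

Once every line is parsed this way, the topics can be sorted by weight to see which are most prominent across the corpus.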

Another much weaker but still notably evenly distributed category (9% in letters and 4% in all the other categories) was topic 11, which I dubbed “English people and especially men as intelligent and powerful.”

11    0.31762    man country men people letter great nature learning genius english proper beauty public human found author wisdom history taste china

This topic seems to locate wisdom, genius, and perhaps a history of those things with nature and English people, particularly men. It presents a nationalistic, patriarchal and (again) positive understanding of the world, when the world is premised on those nationalistic patriarchal terms. This is a weak topic, but its relatively even distribution suggests that the “weakness” of the topic refers to its lack of centrality in any particular text – it isn’t the “point” of any of the narratives, but rather a simply accepted fact in all of them, always present but never prominent, just as the kind of implicit assumptions the topic suggests tend to be.


The adventure texts, along with the history texts, grouped very clearly in a certain set of topics (perhaps because they are the categories I have the best data on, perhaps for other reasons – more on that when I get to the more ambiguous “letter” and “memoir” categories).

By far, adventure had its highest percentage in topic 19, at 25%, and (leaving out topic 1) topic 16, at 14%. Topic 19, which I named “public masculine economic activity,” looks like this:

19    0.10232    master guinea adventures sir made directly proper general chapter moment success power service business gave raised nature make human money

It suggests that “adventures” show up around business and money, and that they yield success, with power mixed in somewhere, perhaps in the conditions or the yield of adventures. All of this is (or at least shows up around things that are) natural, human, and proper. The topic is “masculine” in that the few person-indicators here are masculine, but probably first because the public economic world “adventures” seem to take place in is predominantly masculine in this period.

Topic 16 is formatted similarly, in a different location – if 19 is a public economic world for adventures to inhabit, 16 is the social world it inhabits:

16    0.65228    made man time gentleman place money company day young put house immediately gave master set honour friend people good gentlemen

As with topic 19, 16 is masculine-specific, but it focuses on the home; where the economic movements of the adventure meet the social world, they result in this “masculine domestic” topic, interesting in the context of debate that often focuses on hard-defined edges between male-female, public-private binaries. There aren’t a lot of verbs visible in the topic here, which implies a contrast to the economic activity that dominated topic 19. But the non-verb words that do appear in the topic suggest movement and action (e.g. day and immediately); the kind of action here suggests, however, the way in which the masculine domestic plays into the masculine public of topic 19. Immediately and day imply an outward focus, the possibility of motion in the future pointed towards windows signaling morning; even the word “company” suggests a porous boundary between the home and outside world. These promises of motion and references to the outside, without the overt action of verbs, suggest that the masculine-domestic allows adventures a connection to, and perhaps purpose within, the social schema it largely eschews, while still decentralizing those things from its narrative (this may function similarly to the socially-defined positive qualities like “proper” that appear with adventures in topic 19). Topic 2, which is adventure’s third-largest topic and references family positions and roles of both genders, totaling 11%, is actually the lowest genre showing in that category. And adventure is the only genre category with 0% in all the topics that suggest social/familial roles and relationships. 
It is focused on a world that lies outside of the social world of family and women in general, and it is perhaps for that reason that it seems to take characters that embody the norms of the social world – they are “gentleman” (although this could refer to their social position rather than their social behavior); they display “honour”; and they are significantly attached to their homes and households.

It is surprising that these proper, active men seem to be openly pursuing economic activity; markedly absent are words that signify glory – although adventure shows up with 6% in a maritime-focused topic, #7, the words there seem to signify means towards an end rather than the “ends” we associate with military and often adventure, e.g. commodore, hatchway, and consequence rather than glory, justice, freedom, etc. Topic 19 is also one of two topics (the other is concentrated in history) that include both the past and present tense of the verb “make.” It is worth noting that “make,” in both present and past tense, is the most frequent verb to appear in the top 20 words across all topics (what we see in the “topic key”), with 7 instances. But this particular combination of made and make in one topic suggests something generative in whatever else is happening in the topic, something that is creative in the past tense and moves forward into the present. It isn’t completely clear here what is being “generated” or made in this topic; perhaps it is money, perhaps it is the kind of proper rather than dangerous masculinity that adventures seem to rely on. Or perhaps it is the social order that adventure stakes a strong, if inattentive, claim to.


When I broke down the topic model I ran across all the texts by genre, the group of genre texts with the widest distribution amongst the topics was the history group (history texts appear in 16 of the 20 topics, followed by memoir at 15 and letters and adventure at 13). This means that “high” instances in history are comparatively lower than in other genres. The topics within which histories cluster most significantly, however, could all be grouped together as social category-focused. Aside from topic 1 (21%), the top categories for history are 2, 4, 9, 12, and 15. Topic 2 is the family role topic that was comparatively weak in adventures, at 15% for histories. Topic 12 is a similar family-specific topic, but with a focus on a male-led family household (father, master, etc., without female equivalents like mother or madam). Topic 4 (8%) seems to delineate social roles (sir, gentleman, madam) in combination with speech and social qualifiers like age (“young”) and “manner,” but eschews family-specific social roles (mother, father); it seems to present the public face rather than private face of social interaction. Topic 8 is very similar to this one, at 4%, also noting social (but not family) roles, conversation, and youth. It is interesting that the specific age that gets mentioned in these two categories is youth – perhaps this is because that is the age most worth noting in a character, or perhaps because the focus of histories is on the youth (within the context of their social world and families).

This plays into several interesting qualities in topic 9, which is almost exclusive to histories, at 8% (1% from both letters and memoir, 0% from adventures). It is the only topic that seems to focus on romantic relationships – not, notably, on the emotions of romance, but on its formal social elements; this appears in words like “love,” “hand,” “dear,” and “hope.” But the category is also, in addition to adventure-heavy topic 19, one of two topics to include both the past and present forms of the verb to make. What is being “generated” in this topic is more suggestive than in topic 19:

9    0.07818    sir man dear miss lady madam lord love charles good heart harriet lucy woman brother made hand hope make tho

The combination of a socially sanctioned and protected romance with a generative quality, in the wider context of the social-role and perhaps youth-focused histories suggests that what is generated in these histories, the aim of the socialization of youths through their families, is the regeneration of the social structure from the past to the present.

Topic 15, also notably strong in history (10%), is the only other topic that seems to approach physical, bedroom-located relationships – it probably denotes either romantic/sexual relationships or emotional lying-on-the-bed-crying-or-praying scenes.

15    0.69551    hand eyes night head face replied hands door found began room time heard soul bed lay left stood fell cried

If it denotes the former, this topic, distinct from topic 9, is not generative, and it is not positive. “Good” and “hope” can accompany “hand” in topic 9, and even “love” is present, but in topic 15, where the body is expanded through four body signifiers and the verb “lay,” urgency and negative emotion replace positivity – “cried” is combined with “left” and “fell.” The high level of motion here isn’t directed as the simple “make” is in topic 9 – “began” is combined with “left” and falling and finding and crying. 15 is slightly stronger than topic 9 amongst the histories, but it is also evident in all of the other genres, while topic 9 is almost exclusive to histories. 15, then, might represent implicit negative associations with undirected, uncontrolled sexuality or emotions, while 9, and the social history, represent a means of controlling it. If the topic is more representative of religious emotion, the chaotic motion and physicality of the topic is actually socially harnessed and controlled. In that case, the topic might rather indicate the otherwise negative and dangerous passions and directionless energies that religion contains.
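The observations about which topics contain both “made” and “make” can be checked mechanically against the topic keys. The sketch below is only an illustration: it uses the five topic keys quoted in this essay (not the full set of twenty, so it cannot confirm the count of seven), and the helper names `parse_topic_keys` and `topics_containing` are my own, not part of MALLET.

```python
# Topic keys exactly as quoted in this essay: topic id, weight, top-20 words.
TOPIC_KEYS = """\
11 0.31762 man country men people letter great nature learning genius english proper beauty public human found author wisdom history taste china
19 0.10232 master guinea adventures sir made directly proper general chapter moment success power service business gave raised nature make human money
16 0.65228 made man time gentleman place money company day young put house immediately gave master set honour friend people good gentlemen
9 0.07818 sir man dear miss lady madam lord love charles good heart harriet lucy woman brother made hand hope make tho
15 0.69551 hand eyes night head face replied hands door found began room time heard soul bed lay left stood fell cried
"""


def parse_topic_keys(text):
    """Return {topic_id: list of top words} from whitespace-separated key lines."""
    topics = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        topics[int(parts[0])] = parts[2:]
    return topics


def topics_containing(topics, *words):
    """Sorted ids of topics whose key contains every one of the given words."""
    return sorted(t for t, key in topics.items() if all(w in key for w in words))


topics = parse_topic_keys(TOPIC_KEYS)
both_forms = topics_containing(topics, "made", "make")
only_made = topics_containing(topics, "made")
print("made and make:", both_forms)
print("made:", only_made)
```

Among the five quoted keys, only topics 9 and 19 contain both forms of the verb, which matches the reading above.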


The topics that letters cluster in, excluding the general topic 1, are 5 (politics and war from the social position of the aristocracy), 11 (men and intelligence) and 14 (a topic focused on you in different variations, for example “thee” and “thou,” and on proper names). They also have a notably low presence of 7% in topic 2 (family roles).

Because the corpus of data, with one text removed, was very low for the letters, I am wary of assuming these results are due to qualities of letters in general rather than to specific texts. Topics 5 and 11 particularly seem like they may have been skewed towards their high percentages (23% and 9%, respectively) by particular texts: one of the “letters” is a history of England from the perspective of letters between a nobleman and his son, hence 5, and two are letters between two men, so perhaps mutual compliments of one another leading to 11.

The skewing here, first the extreme skewing from one text and then the possibility that the remaining texts are still skewing the results, may be a product of the fact that a letter is a form as well as a “genre.” As a form, “letters” can be filled with different kinds of content, across the 1760s and certainly over time.

If taken as a form that is the basis for something I decided to call “genre,” letters’ focus on “you” and on proper names (topic 14) in combination with their diminished focus on the family and general low showings in all the social role/relationship topics may suggest something about a particular perspective inherent in the (mostly second person) letter.

Letters approach topics through the filtered lens of a particular personal relationship, addressing “thou” and using lots of personal names. If the names stand in for individuals other than “thou,” then perhaps this personal relationship on which a letter is based leads to a more personal focus on other individuals immediately present in that filtering relationship. This personal focus doesn’t mean that letters are “emotional” or intimate in the way we might imagine a personal relationship, because the text itself is not about the personal relationship (see: about anything). It simply means that the approach to the topic is through a particular relationship and the particularities of that personal relationship rather than a larger social schema.


The memoirs, like the histories, were pretty evenly distributed, with low percentages across the topics. But they were distinctively strong in two topics: 2 (28%), the “family positions and roles” topic, and 14 (10%), the “you/proper names” topic. This is interesting in that the only other genre with a significant showing in 14 is letters, and in the case of letters, high percentage in 14 is paired with a distinctly low showing in topic 2 and in general across all the social category topics, both family-centric and more general or public. Memoir, in comparison, has a fairly even and high distribution across social category topics, second to history and ahead of the low letter showing and the almost-absent adventure showing. This is an interesting pairing, then: unlike letters, which perhaps privilege the one-to-one personal relationship over familial and social relationships, memoir seems to privilege one-to-one relationships in addition to social and, especially, familial relationships (re: high presence in topic 2). If a memoir is generally expected to focus on the life of an individual, whatever that individual’s life might contain, this is perhaps surprising. But it makes sense that immediate relationships, to “you”s and to the family, take extra precedence, followed by broader social relationships. This might place memoir somewhere in between the “form” that unites letters and the more clearly “genred” histories and adventures. Any life can be recorded in a memoir, but that life, at least in 1760s memoirs, seems to start with one-to-one relationships, expand to familial relationships, and generalize into social relationships.
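The genre percentages used throughout this analysis come from breaking the model’s per-document topic proportions down by genre. The essay does not spell out the exact arithmetic, so the following is only one plausible sketch: it averages the proportions of every document in a genre, then counts how many topics a genre “appears” in. The function names, the 0.5% threshold (a share that would round up to 1%), and the three-topic toy data are all assumptions for illustration, not the project’s real figures.

```python
from collections import defaultdict


def genre_topic_shares(doc_topics, doc_genres):
    """Average the topic proportions of all documents in each genre.

    doc_topics: {doc_id: [proportion for topic 0, topic 1, ...]}
    doc_genres: {doc_id: genre label}
    """
    sums, counts = {}, defaultdict(int)
    for doc_id, proportions in doc_topics.items():
        genre = doc_genres[doc_id]
        if genre not in sums:
            sums[genre] = [0.0] * len(proportions)
        sums[genre] = [s + p for s, p in zip(sums[genre], proportions)]
        counts[genre] += 1
    return {g: [s / counts[g] for s in totals] for g, totals in sums.items()}


def topics_present(shares, threshold=0.005):
    """Count topics whose share would round to at least 1% (assumed cutoff)."""
    return sum(1 for s in shares if s >= threshold)


# Hypothetical three-topic proportions, for illustration only.
doc_topics = {
    "adv_1": [0.70, 0.30, 0.00],
    "adv_2": [0.50, 0.50, 0.00],
    "hist_1": [0.40, 0.20, 0.40],
}
doc_genres = {"adv_1": "adventure", "adv_2": "adventure", "hist_1": "history"}

shares = genre_topic_shares(doc_topics, doc_genres)
for genre, values in shares.items():
    print(genre, [f"{v:.0%}" for v in values], topics_present(values))
```

In this toy example the history genre is present in all three topics while adventure is present in only two, which is the kind of breadth comparison made above for the real twenty-topic model.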

[1] I also made wordclouds of the 1760s titles in the END database and of all the titles END has catalogued thus far, which span the 18th century. The results [available in this public file] for specifically END 1760s titles and all 1760s titles are approximately the same, which suggests that the END database is a fairly representative sample of all texts! The 18th century results are unsurprisingly different, with, among other things, notably higher rates of romances and tales.

[2] I chose not to double count these texts in an effort to keep my categories as even as possible with the texts available, so as not to overweigh certain genres in the topic modeling output, but with a better dataset I would have preferred to double count these multi-genred texts.

[3] The program I used for this topic modeling was MALLET, an open source program created at UMASS Amherst, using their automatic settings (stop word list, 1000 iterations, etc.). It runs best with a large number of shorter pieces of texts, so I split all the books I was working with into 500-line documents before feeding them into the program. The data that I got from MALLET and used in this analysis is available [here.]
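The splitting step described in note [3] could be done along these lines. This is a minimal sketch rather than the script actually used: it assumes each book is a single UTF-8 plain-text file, and the function names and output file naming are hypothetical.

```python
from pathlib import Path


def split_into_chunks(lines, chunk_size=500):
    """Yield consecutive lists of at most chunk_size lines."""
    for start in range(0, len(lines), chunk_size):
        yield lines[start:start + chunk_size]


def write_chunks(book_path, out_dir, chunk_size=500):
    """Write each chunk of one plain-text book to its own file; return the paths."""
    book_path, out_dir = Path(book_path), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    lines = book_path.read_text(encoding="utf-8").splitlines(keepends=True)
    written = []
    for i, chunk in enumerate(split_into_chunks(lines, chunk_size), start=1):
        # Hypothetical naming scheme: <book name>_0001.txt, <book name>_0002.txt, ...
        target = out_dir / f"{book_path.stem}_{i:04d}.txt"
        target.write_text("".join(chunk), encoding="utf-8")
        written.append(target)
    return written


# A 1,201-line text yields two full documents and one shorter remainder.
sizes = [len(chunk) for chunk in split_into_chunks(list(range(1201)))]
print(sizes)
```

The resulting directory of short files is the kind of input MALLET’s `import-dir` command expects, with one file treated as one document.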

The Preface Project

[Abstract]   The Preface Project is a multi-modal digital archive exploring the relationships among truth claims, direct address of the reader, and authorial voice in the prefaces of 1760s novels. The archive conceives of prefaces as products of and catalysts for relationality, while weaving together new layers of enmeshment, through curated cataloging and audio, visual, and textual digital reproductions.  Through the proliferation and linking of metadata, through the multimedia presentation of the prefaces, and through the open-access publication of the archived materials, the Preface Project generates new networks, entangling them with the conversations and relations of 1760s prefaces. While currently housed in a publicly-accessible Google Drive Folder, by the spring of 2016, the materials will transform into an Omeka exhibition.

Page viii-ix, preface of Mr. Cleveland.

There he first shewed me his father’s papers, which gave me so much pleasure and satisfaction, that I was very urgent with him to have them printed, persuaded that they would be, a very acceptable present to the public. The only objection he made to my proposal, was, the confused method in which they were writ, and the difficult task it would be to digest them…[Mr. Cleveland]

This quote is an excerpt from the preface which launched a thousand xml record searches. It comes from one of the first books I cataloged, The History of Mr. Cleveland.  I expected a preface about the “natural born son of Oliver Cromwell” to be salacious and scandalmongering. Instead, it provides a staid and detailed backstory of the book: how the memoirs of Mr. Cleveland came to be edited, printed, and bound as a book for public sale. It elaborates on the staunchly virtuous character of Mr. Cleveland, who according to this preface, is nothing like the Merry King Charles II type of rake that I imagined. One should not, however, according to the preface, rely on imagined fancies. The preface claims authorship and historical existence for Mr. Cleveland. And it makes these truth claims through direct address of the reader: “…The histories of… private persons…serve as an excellent lesson to all who are desirous of avoiding those rocks on which others have split, and of meriting the highest character to which human nature can attain, that of wise men. That the following piece may justly be ranked among the latter, will, I believe, be readily granted by all judicious readers.” The reader! Seeing those words was like being called out of hiding; I was re-positioned from a pair of occulted eyes spying on an unaware text to a participant in a space-time crossing conversation. The preface acknowledges the reader’s role in the book, in the transmission of content; this acknowledgement signifies a mutual constitution of textual and extratextual worlds. This experience inspired a desire to learn more about how 18th century novels conceived of the reader. And so was planted the seed for my project, an archive probing the triangulation and construction of readership, truth claims, and authorial voice in the prefaces of 1760s novels.

First page, preface of Dialogues of the dead.

Lucian among the ancients, and among the moderns Fenelon, Arch-bishop of Cambray, and Monsieur Fontenelle, have written Dialogues of the Dead with applause. But in our language nothing of that kind has been published worthy of notice: for the very ingenious and learned dialogues written by Mr. Hurde are all supposed to have past between living persons. The plan I have followed takes in a much greater compass…[Dialogues of the dead]

The centerpiece of the project was the creation of digital copies of prefaces from 1760s novels, to address the fact that early novel repositories like Google Books, Hathi Trust, and ECCO do not consistently include paratext in their digitization and OCR processes. I obtained my sample by searching the 520 and 500 fields of END’s catalog records, metadata without which I would not have been able to execute this project. The sample is based on the 1760s novels held in the British and American Fiction Collection of the University of Pennsylvania’s Rare Book and Manuscript Library; on novels drawn from that collection and cataloged by END; and on novels whose prefatory invocations of the reader were caught by catalogers.

To make the prefaces as accessible as possible, I photographed and transcribed them. This means that questions of emphasis (are certain words printed with capital letters or set off from the rest of the text, for instance) can be resolved by looking at the photographs, while issues of searching (for example, how a researcher is supposed to know which prefaces may be relevant to their line of study) can be resolved by doing a command F on the transcriptions. To the greatest extent possible, a balance has been struck between preservation of the preface-as-book object and the operationalization of the preface for research.

With the assistance of other END team members, I also created audio recordings of the prefatory texts. The audio recordings demand time and a deep listening—a close ‘reading’ of the prefaces, an attention to the prefaces as works of literature and not mere addenda. Hearing the texts read aloud highlights their invocatory nature: the strength of a narrative, authorial voice in the preface, and the importance of that voice’s interaction with the reader.

First page, preface of the Faithful Fugitives.

As curiosity is natural to the mind of man, and as every thing which tends to excite, without satisfying it, must prove, in some degree irksome, I have thought proper to give the reader some account how these memoirs fell into my hands. [Faithful fugitives]

Because utility for other scholars is one of my primary goals for the archive, I have kept the new, digitized representations of the preface I have created–images, audio files, and transcriptions–connected with the catalog metadata on the prefaces’ books. This ensures that future scholars can put the prefaces in their respective books’ contexts. The archive elevates the prefaces without divorcing them from the novels in which they were published in the eighteenth century. Researchers can easily view information about the other paratext in the book (footnotes, table of contents), the narrative form of the main text, the people associated with the writing and publication of the book, and so on. Transparent metadata also allows a scholarly conversation to take place around my archive and the future exhibition. Without context for the paratext and clear paths back to my sources, it would not be difficult for me to make wild claims about prefaces. Without this metadata, no one could engage or critique my work, unless they sorted through the entire collection of British and American fiction at Penn by hand.

First page, the preface of the Hermit.

The preface. Truth and fiction have, of late, been so promiscuously blended together, in performances of this nature; that, in the present case, it seems absolutely necessary to distinguish the one from the other. If Robinson Crusoe, Moll Flanders, and Colonel Jack, have had their admirers among the lower rank of readers; it is certain, that the morality in masquerade, which may be discovr’d in the travels of Lemuel Gulliver, has been an equal entertainment to the superior class of mankind. Now it may, without the least arrogance, be affirmed, that tho’ this surprising narrative be not so replete with vulgar stories as the former, or so interspersed with a satirical vein, as the last of the above-mentioned treatises; yet it is certainly of more use to the public, than either of them, because every incident, herein related, is real matter of fact. [Hermit]

“Promiscuously blended together” is an apt description for the relationship among fact, fiction, reader identity, authorial voice, and textual authority in the 1760s prefaces I have encountered. Prefaces are places for the provision of the context of a story (how it came to be written, why it is published with a frontispiece) and for the establishment of legitimacy (how a story will improve the reader’s morals, why moral improvement is utterly irrelevant in an entertaining tale, why its use of ‘pagan’ allegories is not heretical). This is somewhat obvious.

More illuminating is a focus on the fundamentally conversational nature of prefaces. They are not the print-version of a single voice piping context and the air of legitimacy into the willing ear of a generic reader. Rather, the prefaces are a melee of voices in negotiation. “Promiscuously blended together” aptly describes the relationships under contestation in the prefaces, relationships among fiction, fact, authorial authority, and the identity of the reader. A preface may envision and address multiple types of readership, may respond to the idea of the preface as an obligatory writing convention, may compare the novel of which it is a part to popular published works, may include quotes from Greek philosophers long dead, may claim to be historical fact and yet also claim that a factual text could be one that has fictional events, as long as those events could have conceivably taken place (even if they hadn’t in the particular manner and circumstances the novel imagined).

There is a palpable transactionality to the prefaces, one that reminds the contemporary cataloger that these novels were not always stored in air-controlled, dark rooms, but had lives embedded in the economic and cultural webs of the 18th century. The novels and prefaces, although set in print, were not static or insulated objects, but vehicles for and responses to human interaction.

Last page, preface of the Hermit.

Whate’er we do, or wheresoe’er we’re driv’n–Still, we must own, such is the will of heav’n. [Hermit]

The Preface Project is still a work in progress. Eventually, the photographs, transcriptions, and audio records will be available, along with their corresponding novels’ END catalog records, as part of an Omeka exhibition. I am only now developing a full plan for the organization of the exhibition, the secondary sources on which it will draw, the data visualizations and analyses it will contain, and the more granular themes it will address. For now, a folder containing preface transcriptions, audio recordings, catalog records, and images is publicly available through END’s website. Documents further detailing my sampling and archiving methods are also included in the folder.

Works Referenced

Genette, Gerard. Paratexts: Thresholds of Interpretation. Translated by Jane E. Lewin. New York: Cambridge University Press, 1997.

Langford, Larry L. “Retelling Moll’s Story: the Editor’s Preface to ‘Moll Flanders.’” The Journal of Narrative Technique 22, no. 3 (1992): 164-179.

Ratner, Joshua Kopperman. “Introduction” to “American Paratexts: Experimentation and Anxiety in the Early United States.” University of Pennsylvania ScholarlyCommons: Publicly Accessible Penn Dissertations. 2011.

Barchas, Janine. “Expanding the Literary Text: a Textual Studies Approach.” Graphic Design, Print Culture, and the Eighteenth-Century Novel. (Cambridge: Cambridge University Press, 2003), 1-18.

Barthes, Roland. “The Death of the Author.” Image Music Text. Translated by Stephen Heath. (New York: Hill and Wang, 1977), 142-148.

Geographical Locations in Footnotes

Early in the cataloguing process I catalogued “Vaughan’s Voyages” and it inspired me to work with footnotes for my END project because of the abundance of geographical locations in its footnotes. I started thinking about why the author had chosen to include footnotes, whether the extreme frequency of geographical locations could tell me anything about the importance of footnotes in assigning genre, and what the relationship is between the reader and the explanation of referential and fictional locations. Because these geographical locations are in footnotes, I wondered if I could figure out if certain places required more explanation, or if the presence of a location in a footnote meant that the reader was unlikely to be familiar with it. I also wanted to see if I could compare the presence of geographical locations in footnotes with that in the titles of novels END has worked with, and whether the comparison could reveal anything about the number of footnotes with geographical locations or any such correlation. If a title has a geographical location, is it more likely to also have footnotes with geographical locations?

I will now outline my methods in collecting the data for this experimental project. My main source for gathering the data is the END Flickr and the photos that have been added to the Footnotes album. These photos are added as the END cataloguers find and upload photos with footnotes. The first novel in the album I looked at was “Giphantia,” and I worked my way back through the preceding novels towards the beginning of the album. I had no idea how many footnotes a given title would have or if there would be geographical locations in any of them, but I knew that if “Vaughan’s Voyages” did then there must be other novels where referential and imaginary places required footnote explanations. I created a spreadsheet and organized my data in columns: Title, Place, Flickr link, Tag, Notes, Year, and Franklin link (to the record of the title). I collected data from 23 novels, which is an arbitrary stopping point. Ideally for this project I would have looked through every photo in our footnotes album for all the novels that END has catalogued, and I hope to eventually expand this project. Of these 23 novels, 22 are from Penn and one is from Swarthmore. In the time allotted I looked through 657 photos of our footnotes, and of those 206 had at least one geographical location in a footnote. In these 206 photos I found 510 separate instances of geographical location. I was able to organize and sort my collected data into Excel spreadsheets to be used for analysis and visualization. To visualize my data I used RAW to show all the locations and their frequency values, I used to collect the data needed to make maps on CartoDB, which I used to map the locations I tagged as city and country. All the spreadsheets I used for data analysis and organization can be found in the public Google Drive folder Footnotes.
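To make the shape of the spreadsheet and the headline totals concrete, here is a small sketch. The `FootnoteRecord` class and its field names are hypothetical stand-ins for the spreadsheet columns listed above, and the three constants are simply the totals reported in this paragraph.

```python
from dataclasses import dataclass


@dataclass
class FootnoteRecord:
    """One spreadsheet row; fields mirror the column headings listed above."""
    title: str
    place: str
    flickr_link: str
    tag: str
    notes: str
    year: str
    franklin_link: str


# Totals reported in the text.
PHOTOS_EXAMINED = 657
PHOTOS_WITH_LOCATION = 206
LOCATION_INSTANCES = 510

# Share of examined photos whose footnotes name at least one place.
share_with_location = PHOTOS_WITH_LOCATION / PHOTOS_EXAMINED
# Average number of place names per photo that has any.
instances_per_photo = LOCATION_INSTANCES / PHOTOS_WITH_LOCATION

print(f"{share_with_location:.1%} of photos contain a location")
print(f"{instances_per_photo:.2f} locations per photo that has any")
```

Put another way, a little under a third of the footnote photos mention a place, and those that do mention about two and a half places each.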

During the project there were challenges that came up due to the amount of data I collected and its content. I struggled to decide how to sort and organize the data because there were so many unique locations and I needed to figure out how to make the data more easily analyzed. I also needed to figure out what to do with instances of geographical locations that are ancient, imaginary, or fictional, and how or if to map them on a modern geographical representation of the world. For the ancient locations in my data I was able to map them as their modern counterparts, but for the majority of the fictional and imaginary places or ones I tagged as “unknown” I was unable to map the data. Some alternative mapping examples: I added the frequency value of “Cordova” to the frequency value for “Cordoba” because it was an older spelling for the Spanish city, I added the values for “The City of Jupiter” and “Diospolis” to the value for “Cairo” because these are ancient names for modern parts of Cairo, and I left out locations such as “Forrest’s Coffee-house” and “Wenlo” that are fictional. Cleaning my data in such a manner allowed me to better map the cities and countries represented in footnotes and use the maps for analysis. I had a large amount of data but I had to work with the automated mapping program and this left me with only a percentage of my data I could easily work with. The unmappable locations can be found in the spreadsheet titled “fictional and imaginary locations” in the public Google Drive folder Footnotes.
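The cleaning rules described above amount to an alias table plus an exclusion list. The sketch below shows that idea; I did this work by hand in spreadsheets, so the code is an illustration rather than the method used, the function name is hypothetical, and the example counts are invented.

```python
from collections import Counter

# Alias -> modern name, taken from the examples given in the text.
ALIASES = {
    "Cordova": "Cordoba",
    "The City of Jupiter": "Cairo",
    "Diospolis": "Cairo",
}

# Places the text identifies as fictional and leaves off the map.
UNMAPPABLE = {"Forrest’s Coffee-house", "Wenlo"}


def clean_frequencies(raw_counts):
    """Merge aliases into their modern names and set unmappable places aside."""
    mappable, unmapped = Counter(), Counter()
    for place, count in raw_counts.items():
        if place in UNMAPPABLE:
            unmapped[place] += count
        else:
            mappable[ALIASES.get(place, place)] += count
    return mappable, unmapped


# Hypothetical counts, for illustration only (not the project's real figures).
example = {"Cordova": 2, "Cordoba": 1, "Diospolis": 1, "Cairo": 3, "Wenlo": 1}
mappable, unmapped = clean_frequencies(example)
print(dict(mappable))
print(dict(unmapped))
```

Keeping the unmappable places in their own tally, rather than discarding them, mirrors the separate “fictional and imaginary locations” spreadsheet.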

From the RAW visualization of all the frequency values for the locations I was able to clearly see which locations occurred most frequently in my dataset. These locations are Egypt, China, Africa, Spain, Paris, Europe, France, Mediterranean, and Constantinople, and they clearly stand out on the bubble visualization. I learned from looking at this visualization which locations are mentioned the most and were therefore most relevant to the understanding of the corpus of my dataset.

footnotes all values

While this is a broad and mostly clear way of looking at the bulk of my data, delving deeper into the questions the footnotes raise requires a closer look and a different organization of the data. I made a chart for the novels I looked at comparing the number of photos on Flickr with footnotes with the number of those photos with geographical locations in the footnotes. The spreadsheet with this data can be found in the public Google Drive folder Footnotes and is labeled “# of photos vs # with location in footnote”. The image of the bar chart is labeled “photos with footnotes comparison.png” in the folder, and can be seen below:

photos with footnotes comparison

Looking at individual footnotes in a few of these novels allowed me to start to think about whether the number of footnotes with geographical locations is an indicator of type of narrative. In order to answer this question and to think more about the purpose of the footnotes, I took a closer look at “Vaughan’s Voyages”, which has 24 photos with footnotes, 23 of which have locations in the footnotes, and compared it with the kinds of footnotes in “Adventures of Captain Greenland” and “The history of Sir Charles Grandison”, both of which have 20 photos with footnotes, a similar number to that of “Vaughan’s Voyages”. The footnotes in “Vaughan’s Voyages” seek to explain customs and provide referential information in order to enhance and illuminate the text. A quote from “Vaughan’s Voyages” is as follows: “Tho’ perhaps I have not, in my past days, had any great regard for religion, and might leave it to be decided by chance, as the king of Macafar did*:”(141) and is enhanced by the following footnote: “*Macafar is a large kingdom on the south part of the Celebes, an island in the Indian Sea. Near three centuries ago, they worshipp’d the sun and moon, as the most worthy objects of their adoration…”. This footnote gives more information about the place the author is writing about. “Vaughan’s Voyages” is a travel novel, so it appears as though there could be a connection between the genre of travel narrative and the presence of explanatory footnotes with geographical locations. Both “Adventures of Captain Greenland” and “The history of Sir Charles Grandison”, following this conjecture, do not fit into the travel narrative, seeing as out of 20 photos with footnotes, only 2 and 5, respectively, have geographical locations in the footnotes. The footnotes for “Adventures of Captain Greenland” seem to be more comical and less a serious addition to the information in the main text.
An example of this is the following passage: “…seasonable as well as unwholesome, for people, who have good healthful appetites to hold a fast at such a * biting time of the year.” (110), which has the footnote: “*Being about the middle of winter”. The footnote adds no explanatory information; it is entertaining but unnecessary. Similarly, in “The history of Sir Charles Grandison”, there is no indication in the title that it is a travel narrative, and the footnotes it employs give information about the particulars of the characters and the content. For example, the heading for Letter VII on page 39 includes the following information: “Miss Byron. In continuation. [On Sir Charles’s first letter from Bologna, Vol. IV. Letter XL. p. 277.]”, which has the footnote: “*Several letters of Miss Byron, Lady G. Lady L. and Miss Jervois, which were written between the date of the preceding letter and the present, are omitted”. What is interesting about this example is that there is a geographical location in the text itself, but one that does not require an explanatory footnote. Judging by the content of the footnote, what matters is what is happening with the letters and the people involved, not the geographical location from which the letter was sent.
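The comparison above can also be restated as a share per novel. What follows is only a minimal sketch using the three pairs of counts quoted in the text; the dictionary layout and the function name `location_share` are assumptions made for illustration and do not reflect how the spreadsheet itself is organized.

```python
# Counts quoted in the text:
# (photos with footnotes, photos whose footnotes name a geographical location)
counts = {
    "Vaughan's Voyages": (24, 23),
    "Adventures of Captain Greenland": (20, 2),
    "The history of Sir Charles Grandison": (20, 5),
}


def location_share(with_footnotes, with_locations):
    """Fraction of footnote photos whose footnotes mention a location."""
    if with_footnotes == 0:
        return 0.0  # avoid dividing by zero for novels without footnotes
    return with_locations / with_footnotes


shares = {title: location_share(*pair) for title, pair in counts.items()}

# Print the novels from the highest share to the lowest.
for title, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{title}: {share:.0%}")
```

Expressed this way, the gap between the travel novel and the other two titles is easier to compare than with raw counts alone, although three novels are far too few to support a general claim.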

The last way I visualized and interpreted my data was through mapping visualizations. Using the tags “city” and “country”, among others that I assigned to my raw data, I was able to map the frequencies of these two types of locations on a modern map of the world using CartoDB. The spreadsheet with all the instances of geographical locations and the tags I assigned to them (some but not all have explanatory notes in the spreadsheet) can be found in the Footnotes folder under the title “Geographical footnotes raw data”.

For the purpose of comparison, it should be noted that a similar project was done by a former END researcher, Emma Madarasz, who mapped the geographical locations in the titles of 18th century novels (link to her project on our END With Known site). Her data shows that a significant number of places in the United Kingdom are mentioned in titles, something that contrasts with the locations in the footnotes of the novels I have chosen to work with. The titles she looked at also include countries such as Italy, France, and India, though not as many unique countries appear in titles as do in footnotes, as shown by my map visualization below. However, a significant number of titles include places in America or are connected with America, a place that is represented only three times in my raw data. Another difference between my project and Emma’s, and one I would consider for future work on this project, is that she included adjectives in her mapping of geographical locations. In the example of “America”, she included in her data the title “Amelia; or, the faithless Briton. An original American novel, founded upon recent facts.”, whereas I only recorded place names, and would have left such a title out of my data. For the future of my project I would consider collecting adjectives that relate to geographical locations, and it would be interesting to see the effect of that decision on the frequency values of places in my dataset.

countries in footnotes

The map visualization I created for “countries” is shown above. Using the locations tagged as “country” allowed me to narrow the scope of my data, and it was easy to map because almost all the countries in the footnotes I looked at were referential. The spreadsheet with the mapping information for the map seen above can be found in the Footnotes folder and is titled “countries mapping data”. The only location I had to alter was Turkey, which was spelled “Turky” in a footnote for “Adventures of Sig. Gaudentio”. The map shows that a significant number of footnotes mention countries in Europe, especially Spain, France, and England, but that a good number are also clustered around the Middle East, with a few in North America, Asia, and Africa. Places in Europe would have been more accessible to people in England, and I would argue that these referential countries would have helped orient the reader to familiar places as well as foreign ones. The absence of places in South America and Australia suggests that these places were either not well known to the authors or readers, or not relevant to the narrative of the novels I looked at for this project.
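The cleaning and counting step behind a frequency map like this one could look roughly like the sketch below. It assumes that each mention of a country is available as a plain string; the sample list is made up for illustration and is not the real data. The lookup table and the function names are likewise hypothetical, and “Turky” is the only spelling correction the project actually needed.

```python
from collections import Counter

# Hypothetical lookup of period spellings to modern names.
# "Turky" is the only correction mentioned in the text.
SPELLING_FIXES = {"Turky": "Turkey"}


def normalize_country(name):
    """Trim stray whitespace and replace a known period spelling."""
    cleaned = name.strip()
    return SPELLING_FIXES.get(cleaned, cleaned)


def country_frequencies(mentions):
    """Count how often each normalized country name appears."""
    return Counter(normalize_country(mention) for mention in mentions)


# Made-up sample rows for illustration only.
sample_mentions = ["Spain", "Turky", "France", "Turkey ", "Spain"]
frequencies = country_frequencies(sample_mentions)

for country, count in frequencies.most_common():
    print(country, count)
```

Keeping the corrections in one table means every altered spelling stays visible and can be reviewed later, rather than being silently overwritten in the spreadsheet.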

cities in footnotes

The map above shows the location and frequency of the places I tagged as “city” in my raw data. Mapping these locations was more of a challenge because many were ancient cities or cities that no longer have the name used in the 18th century. An interesting example is Leghorn, one of the cities mentioned in a footnote for “Vaughan’s Voyages”, which I found out was the traditional English name for the Italian port city Livorno; I therefore mapped Leghorn using the coordinates for Livorno, Italy. I had to do a lot more data cleaning for this visualization, and all my notes for this can be found in the Footnotes folder in the spreadsheet labeled “city mapping data”. From this map I can conclude that, just as in the “country” visualization, the nodes are mostly clustered in Europe and around the Middle East. This suggests that it was important for the authors to include clarifying notes about the cities in these regions, and that although many of these places could have been known or accessible to readers, it was still necessary to explain or embellish them in a footnote. Maybe this gave readers additional information about places already within their grasp of knowledge, or it simply added to the scope of their geographical understanding. It is also interesting that only one city mentioned in the footnotes is in the Americas, and that is Santo Domingo.
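Resolving historical city names to modern map coordinates can be sketched as a simple lookup, with anything unmatched set aside for manual research. This is an illustration under stated assumptions, not the procedure recorded in the “city mapping data” spreadsheet: the table and function names are hypothetical, the coordinates for Livorno are approximate, and “Example Ancient City” is an invented placeholder.

```python
# Historical name -> (modern name, latitude, longitude).
# Only the Leghorn/Livorno pairing comes from the text; coordinates are approximate.
MODERN_PLACES = {
    "Leghorn": ("Livorno, Italy", 43.55, 10.32),
}


def resolve_city(name):
    """Return (modern name, lat, lon), or None if the name needs manual research."""
    return MODERN_PLACES.get(name.strip())


def split_resolved(names):
    """Separate names that can be mapped from those that still need a lookup."""
    resolved, unresolved = {}, []
    for name in names:
        match = resolve_city(name)
        if match is None:
            unresolved.append(name)
        else:
            resolved[name] = match
    return resolved, unresolved


# "Example Ancient City" is an invented placeholder, not a place from the data.
resolved, unresolved = split_resolved(["Leghorn", "Example Ancient City"])
print(resolved)
print(unresolved)
```

Returning the unresolved names separately keeps ancient or renamed cities from being dropped from the map without a record.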

What stands out from these maps is the concentration of this mappable data around Europe. I had assumed that the locations most in need of explanation would be faraway and exotic ones. Such locations do exist in my raw data, but they are not easily mapped, and they do not fall into the more modern and restrictive categories of “city” and “country”. These two mostly Eurocentric maps put a spotlight on Europe and the nearby places in the world that would have been more accessible through travel, and moreover ones that are more conveniently referential.

This experimental project on geographical footnotes has been beneficial to me not only in the collection, cleaning, and piecing together of the data, but also in teaching me the value of looking back on my methods and realizing what can be done to make this project better and more comprehensive in the future. This ranges from the simple task of recording the number of footnotes in each novel and noting the novels I look at that do not have footnotes, to the larger conceptual choices of which novels to look at and making educated decisions based on the topic of the project and the available information about the novels. The future of this project would no doubt include even more meticulous data collection and recording of notes. For example, in the future, when an instance of a geographical location in a footnote is recorded, all the information about that place, including the tag, explanatory notes for the tag both from the footnote itself and from the recorder, latitude, and longitude, should be entered into the raw data spreadsheet at the same time. I went back and did many of these things after all the instances had been recorded, and it would be far more beneficial to already have this detailed information for each instance at the end of data collection. I would also recommend that this project delve deeper into the content of the footnotes and the novels themselves, and I am sure this would help me answer a lot of the research questions that this project has posed and that I have grappled with in attempting to determine the purpose of footnotes in these 18th century novels. Ideally this project will in the future include every instance of geographical location in every footnote of every novel END catalogues, and the possibilities for the interpretation and organization of this information are endless.
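One possible shape for such a complete record, captured at the moment an instance is found, is sketched below. The field names, the class name `FootnoteLocation`, and the CSV layout are all assumptions for illustration, not the columns of the existing raw data spreadsheet; the coordinates in the example row are approximate.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class FootnoteLocation:
    """One geographical location found in one footnote (hypothetical layout)."""
    novel: str
    place: str
    tag: str                          # e.g. "city" or "country"
    footnote_note: str = ""           # explanation taken from the footnote itself
    recorder_note: str = ""           # explanation added by the recorder
    latitude: Optional[float] = None
    longitude: Optional[float] = None


def to_csv(records):
    """Serialize records to CSV text with one column per field."""
    buffer = io.StringIO()
    column_names = [field.name for field in fields(FootnoteLocation)]
    writer = csv.DictWriter(buffer, fieldnames=column_names)
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


example = FootnoteLocation(
    novel="Vaughan's Voyages",
    place="Leghorn",
    tag="city",
    recorder_note="Mapped as Livorno, Italy",
    latitude=43.55,   # approximate
    longitude=10.32,  # approximate
)
csv_text = to_csv([example])
print(csv_text)
```

Leaving latitude and longitude optional lets a location be recorded even when it cannot yet be placed on a modern map.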
