Georeferencing the Early Modern London Book Trade: 3. What’s in an Imprint?
In my previous blog post, I proposed a model TEI-XML tree for encoding and geocoding bibliographic datasets. In the conclusion of that piece, I suggested that, in order to implement such a model, print historians must pool their resources and expertise through collaborative data mining and sharing. Over the past year (2014-2015), Janelle Jenstad and I have been working with David Eichmann and Blaine Greteman of the University of Iowa to extract precise geographic data points from the English Short Title Catalogue (ESTC). Dr. Eichmann and Dr. Greteman have developed some groundbreaking methods to data-mine the ESTC as part of the Shakeosphere project, which enables researchers and students to search, analyze, and contribute to a visualization of the early modern print and manuscript network.
As I suggested in my first blog post, print historians have hitherto been unable to extract meaningful geographic data points from the ESTC because book imprints (data field 260 in MARC-tagged database entries) remain unparsed. Record number 006182591 served as my former example:
260 |a London : |b Printed by Thomas Creede, for Tho. Millington, and Iohn Busby. And are to be sold at his house in Carter Lane, next the Powle head, |c 1600.
Note that field 260|b contains the printer’s name, the booksellers’ names, and the bookshop location, each undemarcated. To surmount this technical barrier and realize the full potential of ESTC data, David Eichmann has developed a named-entity recognition algorithm that recognizes and parses the various data points contained within field 260|b of ESTC entries based on linguistic patterns. Dr. Eichmann’s algorithm parses an entry and wraps custom tags around the individual pieces of information in the imprint. For example, his algorithm transforms record number 006182591 into the following XML code:
<location>London</location> : Printed by <stationer><forename>Thomas</forename> Creede</stationer>, for <stationer>Tho. Millington</stationer>, and <stationer>Iohn Busby</stationer>. And are to be sold <locational>at</locational> <location>his house</location> <locational>in</locational> <location>Carter Lane</location>, <locational>next</locational> <location>the Powle head</location>, <date>1600</date>.
From this initial mark-up, it becomes apparent that book imprints, though pointing to one location of print activity (i.e., a single set of supposed geo-coordinates), contain multiple toponyms. Each toponym expresses a different level of spatial precision (e.g., “his house” is more precise than “Carter Lane”). Moreover, the spatial hierarchies between toponyms are expressed by way of locational prepositions. Insofar as geography may be described as the “depiction or analysis of the way the constituent parts of something interact, or of their arrangement in relation to one another” (OED geography n.5), the “geography of the book” emerges from the intersection of relations between toponyms and names associated with a book, expressed in a book imprint.
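The pairing of locational prepositions with toponyms can be sketched in a few lines of Python. This is only an illustration and not Dr. Eichmann’s algorithm: it starts from the already-tagged output for record 006182591, and the wrapping <imprint> root element and the function names are assumptions introduced here so that the fragment parses as well-formed XML.

```python
import xml.etree.ElementTree as ET

# Tagged imprint for record 006182591, wrapped in a hypothetical <imprint>
# root element (an assumption; the tagged output quoted in the post has no root).
TAGGED_IMPRINT = (
    "<imprint>"
    "<location>London</location> : Printed by "
    "<stationer><forename>Thomas</forename> Creede</stationer>, for "
    "<stationer>Tho. Millington</stationer>, and "
    "<stationer>Iohn Busby</stationer>. And are to be sold "
    "<locational>at</locational> <location>his house</location> "
    "<locational>in</locational> <location>Carter Lane</location>, "
    "<locational>next</locational> <location>the Powle head</location>, "
    "<date>1600</date>."
    "</imprint>"
)


def prepositional_toponyms(xml_text):
    """Pair each <location> with the <locational> preposition preceding it.

    A location with no preceding preposition (e.g. "London") is paired
    with None.
    """
    pairs = []
    pending = None  # most recent preposition not yet attached to a location
    for child in ET.fromstring(xml_text):
        if child.tag == "locational":
            pending = child.text
        elif child.tag == "location":
            pairs.append((pending, child.text))
            pending = None
    return pairs


def stationer_names(xml_text):
    """Return the full text of every <stationer>, including nested tags."""
    root = ET.fromstring(xml_text)
    return ["".join(el.itertext()) for el in root.iter("stationer")]


if __name__ == "__main__":
    for preposition, toponym in prepositional_toponyms(TAGGED_IMPRINT):
        print(preposition, "->", toponym)
    print(stationer_names(TAGGED_IMPRINT))
```

Read in order, the pairs move from the least precise toponym (London, with no preposition) through “at his house” and “in Carter Lane” to “next the Powle head”, which is the chain of spatial relations that the imprint expresses in prose.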
Dr. Eichmann has been kind enough to share his parsed imprint data with Dr. Jenstad and me. Using XSLT, I was able to transform his data into a set of five TEI-XML databases:
<listBibl> database of sources (i.e., ESTC entries), each identified by its Shakeosphere ID number and date.
<listPers> database of identified stationers, containing each stationer’s forename, surname, and floruit dates.
<listPlace> database of identified locations, containing the various toponyms used to refer to each location across imprints.
<listRelation> database of relations between identified locations, expressed by locational prepositions.
<listRelation> database of relations between identified stationers and identified locations, expressed by locational prepositions.
I would like to suggest that the fifth and final database, which cross-references data contained in the other databases, holds enormous potential for early modern print historians interested in the geography of the book. As I mentioned in my first blog post, in order to interact meaningfully with the burgeoning field of geohumanities, print historians need to be able to make large-scale queries about “locations of print activity.” What is meant by this term is a matter of precision: a location of print activity may be a continent, a country, a county, a city, a ward, a street, or a building. For some print historians, it may suffice to georeference data according to broader categories such as country or city; however, for print historians interested in the early modern London book trade, high precision is critically important. The ESTC, for example, contains almost 66,000 entries of books published in London between 1475 and 1666. By situating stationers in relation to locations, the fifth database enables print historians to make queries about individual stalls, stands, and shops occupied by individual stationers in London. Consider the following entries taken from the prototype database:
<relation name="in" active="16665" passive="8762" notBefore-custom="1624" notAfter-custom="1679" datingMethod="mol:julian" source="109143; 141029; 149836; 154172; 165144; 165554"></relation>
<relation name="at" active="14575" passive="7562" notBefore-custom="1633" notAfter-custom="1641" datingMethod="mol:julian" source="89376; 188336"></relation>
This first XML element essentially states that, according to Shakeosphere entries 109143, 141029, etc., stationer 16665 in the <listPers> database (i.e., Francis Coles) worked in place 8762 in the <listPlace> database (i.e., Vine Street) from 1624 to 1679. From this relation we can infer a location of print activity.
The second XML element similarly suggests that, according to Shakeosphere entries 89376 and 188336, stationer 14575 in the <listPers> database (i.e., John Rothwell) worked at place 7562 in the <listPlace> database (i.e., The Sun) from 1633 to 1641. From this relation we can likewise infer a location of print activity; the sign of the Sun was located in St. Paul’s Churchyard. Further research could then determine the spatial coordinates that correlate with this qualitative description of a place.
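To make the reading of these entries concrete, here is a minimal Python sketch that resolves the two relation elements quoted above into plain statements. It rests on several assumptions: the enclosing <listRelation> wrapper is added here so that the fragment has a single root, the two dictionaries are stand-ins for the <listPers> and <listPlace> databases and contain only the names given in this post, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# The two prototype entries, wrapped in a <listRelation> root (an assumption).
RELATIONS_XML = """<listRelation>
<relation name="in" active="16665" passive="8762" notBefore-custom="1624" notAfter-custom="1679" datingMethod="mol:julian" source="109143; 141029; 149836; 154172; 165144; 165554"></relation>
<relation name="at" active="14575" passive="7562" notBefore-custom="1633" notAfter-custom="1641" datingMethod="mol:julian" source="89376; 188336"></relation>
</listRelation>"""

# Stand-ins for the <listPers> and <listPlace> databases. Only the
# identifiers and names mentioned in the post are included.
STATIONERS = {"16665": "Francis Coles", "14575": "John Rothwell"}
PLACES = {"8762": "Vine Street", "7562": "The Sun"}


def describe(relation):
    """Resolve one <relation> element into a dictionary of readable values."""
    return {
        "stationer": STATIONERS[relation.get("active")],
        "preposition": relation.get("name"),
        "place": PLACES[relation.get("passive")],
        "from": int(relation.get("notBefore-custom")),
        "to": int(relation.get("notAfter-custom")),
        "sources": [s.strip() for s in relation.get("source").split(";")],
    }


def read_relations(xml_text):
    """Return one resolved dictionary per <relation> in the document."""
    return [describe(r) for r in ET.fromstring(xml_text).iter("relation")]


def to_sentence(rel):
    """Render a resolved relation as a short English statement."""
    return (
        f"{rel['stationer']} worked {rel['preposition']} {rel['place']} "
        f"from {rel['from']} to {rel['to']} ({len(rel['sources'])} sources)"
    )


if __name__ == "__main__":
    for rel in read_relations(RELATIONS_XML):
        print(to_sentence(rel))
```

Treating the active attribute as the stationer and the passive attribute as the place follows the reading of the entries given above; a fuller implementation would look both identifiers up in the actual databases rather than in hard-coded dictionaries.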
A wealth of information is embedded in the syntax of book imprints. My research suggests that, in its simplest form, a book imprint expresses three data points in prose: (1) a stationer, (2) a toponym referring to a location of print activity, and (3) a preposition describing the relationship between 1 and 2.[1] However, until recently, the resources (i.e., software, know-how, funding, etc.) required to retrieve and cross-reference these three data points have not been available to print historians. Collaboration has provided Dr. Jenstad and me with an innovative and effective way to transcend this barrier. Working with the Shakeosphere team in Iowa, we have successfully developed prototype methods for parsing the relational data contained in early modern book imprints.
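The three-part model can also be written down as a small data structure. The sketch below is an assumption-laden illustration rather than a description of the XSLT used for the prototype: it groups hypothetical (stationer, preposition, place) statements and takes the earliest and latest source years as the date range of each relation, which is one plausible way such ranges could be derived. All identifiers and years in SAMPLE are made-up placeholders, not ESTC or Shakeosphere data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ImprintStatement:
    """One stationer-preposition-place statement drawn from one imprint."""
    source_id: str
    year: int
    stationer_id: str
    preposition: str
    place_id: str


def aggregate(statements):
    """Group identical triples and summarise each group as one relation.

    The date range is simply the minimum and maximum source year in the
    group (an assumption made for this sketch).
    """
    groups = defaultdict(list)
    for s in statements:
        groups[(s.stationer_id, s.preposition, s.place_id)].append(s)

    relations = []
    for (stationer, preposition, place), items in sorted(groups.items()):
        years = [s.year for s in items]
        relations.append({
            "name": preposition,
            "active": stationer,
            "passive": place,
            "notBefore": min(years),
            "notAfter": max(years),
            "source": "; ".join(sorted(s.source_id for s in items)),
        })
    return relations


# Placeholder data for illustration only.
SAMPLE = [
    ImprintStatement("A", 1633, "S1", "at", "P1"),
    ImprintStatement("B", 1641, "S1", "at", "P1"),
    ImprintStatement("C", 1650, "S2", "in", "P2"),
]

if __name__ == "__main__":
    for relation in aggregate(SAMPLE):
        print(relation)
```

Keeping the preposition in the grouping key means that a stationer recorded both “at” and “next” the same place yields two separate relations, which preserves the distinctions that a taxonomy of prepositions would need.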
[1] Further research might consider the way that different prepositions signify different geographic relationships. Based on my rudimentary work in this area, I believe that it may be possible to develop a taxonomy of prepositions used in book imprints, describing the different types of relationships stationers had with locations of print activity. (TLG)
Last modification: 2016-05-27 14:37:29 -0700 (Fri, 27 May 2016) (tlandels)