For thorough documentation on how to encode primary sources, see Encode a Primary Source Transcription. The following documentation explains how we encode semi-diplomatic transcriptions of primary source texts—in particular, the semi-diplomatic transcriptions housed in MoEML’s library. The purpose of these revised guidelines is to 1) standardize the encoding of our library texts, and to 2) limit and simplify the CSS required to adequately render these texts.
In our library texts we encode:
  • Front matter (<titlePage>)
  • Textual gaps (<supplied>)
  • Page breaks with linked facsimile images (<pb>)
  • Woodcut images (<figure>)
  • Foreign words (<foreign>)
  • Dates, names, organizations, and toponyms (<date>, <name>, <ref>)
In our library texts we do not encode:
  • Line beginnings throughout prose (<lb>)
  • Formeworks (i.e., running titles, signatures, and catchwords)
  • Last-word wraps
  • Printer’s ornaments or line rulings
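To lessen the amount of time spent on CSS, we have created a set of standard renditions for our library texts. Within <tagsDecl>, there are standardized renditions for:
  • Headings ("mainHead" and "subHead")
  • Dropcaps ("dropCap")
  • Indented lines ("indentedLine" and "indentedLineExtra")
  • Marginal labels ("lmlabel" and "rmlabel")
General information about renditions and how to use the <tagsDecl> element can be found here.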

Transcribe a Library Text

Transcription Conventions

Semi-diplomatic transcriptions are transcriptions of texts that are not modernized or corrected for clarity. They are not as strict as facsimile transcriptions, which attempt to replicate the exact layout of the page. Rather, our goal is to normalize and regularize the features of the text that cannot be adequately captured through encoding (e.g., spacing, font size, typographical ligatures) while retaining other significant features such as spelling, punctuation, abbreviations, and typographical errors. Our conventions for semi-diplomatic transcriptions can be found here. In summary, we:
  1. Silently normalize the long ſ
  2. Silently expand typographical ligatures (e.g., fl)
  3. Preserve capitalization, italicization, interchangeable characters (i.e., u/v, i/j, vv/w), vowel digraphs (i.e., æ, œ), nasal tildes over vowels (i.e., ã, ẽ, ĩ, õ, ũ), macrons over vowels (i.e., ā, ē, ī, ō, ū), and quotation marks
  4. Close up extra spaces between words and punctuation marks
  5. Preserve the line breaks in verse but not in prose
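The conventions above can be sketched on a single line of prose. The printed reading in the comment is invented for illustration and does not come from a MoEML text; it only shows which features are normalized and which are kept.

```xml
<!-- Hypothetical printed reading: "The  ſecond  pageãt ,  vvas  ſet  vp" -->
<!-- Long s is normalized and extra spaces are closed up;
     the nasal tilde (ã), vv for w, and v for u are preserved. -->
<p>The second pageãt, vvas set vp</p>
```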

Unicode Characters

If you run across an unusual character while transcribing, you may be able to find it as a Unicode character. For example, note the fleuron in this heading:
See The Magnificent Entertainment sig. A3r for this image in context.
In this case, the encoder can use the Unicode character U+2767 (rotated floral heart bullet):
<head>❧ A DEVICE (projected downe, but till now not <hi style="font-style:italic;">publisht) that should have served at his Maiesties first accesse to the Citie</hi>.</head>
If you cannot find an appropriate Unicode character for the character you need to transcribe, bring it to the MoEML team so a protocol can be established.
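If typing the character directly is awkward, an XML numeric character reference is one alternative, since an XML parser reads the reference and the literal character identically. Whether MoEML prefers literal characters over references is not stated here, so treat this as an assumption. The heading text is shortened for illustration.

```xml
<!-- The reference below resolves to U+2767, the same fleuron as the literal character -->
<head>&#x2767; A DEVICE</head>
```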

Special Characters

A common non-standard character that appears in early modern texts is a thorn (þ) that looks like a small Latin letter y with a reversed hook above:
See The Queen’s Majesty’s Passage sig. A4v for this image in context.
If you run across this character in your text, you will need to add a <charDecl> to your document. General information about encoding non-standard characters can be found here. Since we have already written a <char> for this particular figure, all you need to do is paste the following <char> into the <charDecl> of your document:
<char xml:id="QMPS1_ye">
  <desc>An abbreviated form of <mentioned>the</mentioned>. This character takes the form of a small latin letter y with a reversed hook above. The closest Unicode character we have to represent this is a small latin letter y with a combining left half ring above. This character appears twice in the text, which is in black letter gothic.</desc>
  <mapping type="standard">y͑</mapping>
  <mapping type="simplified">ye</mapping>
  <mapping type="medieval">þe</mapping>
  <mapping type="modern">the</mapping>
</char>
Make sure to change the @xml:id on <char> to match your document and update the <desc> to reflect how many times the character appears in your text. When you come across this character in the text, transcribe it as y͑ (regular y + U+0351) and tag it with the <g> element and a @ref attribute that points to the @xml:id of your <char> (i.e., "#" + xml:id of the text + "_ye"):
<p>eche cōteining <g ref="#QMPS1_ye">y͑</g> title of those two princes. And these personages wer so set, <g ref="#QMPS1_ye">y͑</g> the one of thē ioyned han-</p>
See The Queen’s Majesty’s Passage or Orders Appointed to be Executed in the City of London for examples of this encoding in practice.
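As a rough sketch of where the declaration sits, the TEI content model places <charDecl> inside <encodingDesc> in the header. The document ID "ABCD1" is hypothetical, and the other header content is left out.

```xml
<teiHeader>
  <!-- fileDesc and other header content omitted from this sketch -->
  <encodingDesc>
    <charDecl>
      <!-- "ABCD1" is a hypothetical document ID; use your own text's ID -->
      <char xml:id="ABCD1_ye">
        <desc>An abbreviated form of <mentioned>the</mentioned>.</desc>
        <!-- Remaining mappings as in the char element shown earlier -->
        <mapping type="modern">the</mapping>
      </char>
    </charDecl>
  </encodingDesc>
</teiHeader>
```

In the body of that hypothetical text, each <g> would then point to "#ABCD1_ye".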


While transcribing early modern texts, you will likely stumble across something that does not lend itself well to encoding. It is up to the MoEML team to decide on a case-by-case basis how these irregularities should be encoded. In our library texts, we want to avoid the use of extensive in-line CSS. For example, note this passage from The Magnificent Entertainment:
See The Magnificent Entertainment sig. E4r for this image in context.
Now see how it was transcribed:
<p><name ref="mol:AGLA1" style="font-style:italic;">Aglaia</name>, <name ref="mol:THAL1" style="font-style:italic;">Thalia</name>, <name ref="mol:EUPH1" style="font-style:italic;">Euphrosine</name>, } Figuring { Brightnesse, or Maiestie. Youthfulnes, or florishing. Chearfulnes, or gladnes.</p>
This encoding avoids the use of extensive CSS while preserving the author’s intent.

Textual Gaps

When you are transcribing a library text—especially if you are working from an EEBO TCP transcription—you will need to supply textual gaps. Our documentation on how to supply gaps can be found here.

Encode a Library Text

Front Matter

If you are only encoding an excerpt from a primary source text, it is unlikely that you will need to encode front matter. If you are encoding a full text that includes a title page and other preliminaries (e.g., a dedicatory epistle, a letter to the reader, an introduction), you will want to nest this information in <front>. Our documentation on how to encode front matter can be found here. Here is the title page from A Remembrance of the Worthy Show and Shooting by the Duke of Shoreditch without entity tagging or styling:
  <pb facs="https://search.proquest.com/eebo/docview/2240956608/pageLevelImage/?imgSeq=25" n="D1r" xml:id="REME2_sig_D1r"/>
  <titlePage>
    <docTitle>
      <titlePart type="main">A REMEMBRANCE Of the worthy SHOW and SHOOTING BY THE DUKE of SHOREDITCH, AND HIS ASSOCIATES THE Worshipful Citizens of London, UPON Tuesday the 17th of September, 1583.</titlePart>
      <titlePart type="desc">Set forth according to the Truth thereof, to the everlasting Honour of the Game of Shooting in the Long bow.</titlePart>
    </docTitle>
    <docAuthor>By W. M.</docAuthor>
    <docImprint>London, Printed in the Year 1682.</docImprint>
  </titlePage>
Note that we do not use <lb> elements to add padding between lines. Guidelines on how to style title pages with standardized renditions can be found below. All text that appears after the title page and other preliminaries should be nested within <body>.
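The overall nesting described here can be sketched as follows. All titles and text are placeholders, and the use of <div> for a preliminary such as a dedicatory epistle is an assumption about how the preliminaries would be divided.

```xml
<text>
  <front>
    <titlePage>
      <docTitle>
        <titlePart type="main">Placeholder Title</titlePart>
      </docTitle>
    </titlePage>
    <!-- Other preliminaries (here, a placeholder epistle) also sit in front -->
    <div>
      <head>Placeholder Epistle</head>
      <p>Text of the preliminary.</p>
    </div>
  </front>
  <body>
    <!-- Everything after the preliminaries goes in body -->
    <div>
      <p>Text of the work.</p>
    </div>
  </body>
</text>
```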

Page Breaks

After you have encoded the basic structure of your text, you will need to add page breaks. In our library texts, we:
  1. Mark all page breaks with the <pb> element
  2. Link to the facsimile image of each page with a @facs attribute on the <pb> element
  3. Note each page’s signature number with an @n attribute on the <pb> element
  4. Add an @xml:id attribute to each <pb> element so we can create links to specific pages throughout the website
Our documentation on how to link to facsimile pages on EEBO and EEBA can be found here. Once you have linked to the correct facsimile image, you will need to add an @n attribute with the page’s signature number and an @xml:id attribute with the page’s xml:id (i.e., the xml:id of the text, "sig", and the signature number of the page, joined by underscores):
<pb facs="https://search.proquest.com/eebo/docview/2264202925/pageLevelImage/?imgSeq=148" n="2M1v" xml:id="PRAI1_sig_2M1v"/>
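The same pattern can be sketched for two consecutive pages. The document ID "ABCD1", the signature numbers, and the facsimile URLs are all placeholders invented for this example; they are not real EEBO links.

```xml
<!-- Recto: xml:id = document ID + "_sig_" + signature number -->
<pb facs="https://example.com/facsimile/1" n="A2r" xml:id="ABCD1_sig_A2r"/>
<p>Text of the recto page.</p>
<!-- Verso of the same leaf -->
<pb facs="https://example.com/facsimile/2" n="A2v" xml:id="ABCD1_sig_A2v"/>
<p>Text of the verso page.</p>
```

Because each @xml:id repeats the value of @n, a link to a given page can be worked out from the document ID and the signature alone.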
If the text you are encoding is a broadside, it will not have any signature numbers. To exclude broadsides from our diagnostic that requires an @n attribute on all <pb> elements (see our diagnostics here), you will need to give them the mdt category mdtPrimarySourceLibraryBroadside. More information about document categories can be found here.


In our library texts we no longer encode printer’s ornaments or line rulings. If there is a woodcut image in the text, you can use <figure> to describe the image. For example, see the description given of this woodcut in The Great Boobee:
See The Great Boobee for this image in context.
  <figDesc>Woodcut of a traveller with black hat, satchel, and walking stick being approached by man in black clothes and cape, with ruffled white cuffs and prominent white collar. Both men are bearded with moustaches. The pair appear on a white background, with shaded ground beneath their feet.</figDesc>
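As a minimal sketch, the description sits inside the <figure> element at the point where the woodcut appears in the text. The description is shortened here, and whether a <graphic> element should accompany it is not stated in this documentation, so none is shown.

```xml
<figure>
  <!-- Shortened version of the description above -->
  <figDesc>Woodcut of a traveller with black hat, satchel, and walking stick being approached by man in black clothes and cape.</figDesc>
</figure>
```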

Style a Library Text

Page Width

The first part of the text you will need to style is the page width. This styling goes on the <text> element:
<text style="width: 34em; padding-left: 7em; padding-right: 7em; line-height: 1.2; margin-left:auto; margin-right:auto;"></text>
Currently our standard page width is "34em", which allows for easy reading. In CSS, you can use absolute length units (e.g., cm, mm) or relative length units (e.g., em, rem) to describe length. Relative length units specify their length in relation to another length property. We use relative length units at MoEML because they scale better when aspects of rendering, such as browser size, change. An em specifies its length in relation to font size. "34em", therefore, means that the page width will be 34 times the size of the font. In the future, set page widths may be created for different book sizes (i.e., folio, quarto, octavo, broadside).
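To make the arithmetic concrete, the sketch below works through the em values in a comment. The 16px font size is an assumption (a common browser default), as is the default CSS box model in which padding is added to the width.

```xml
<!-- Assuming a rendered font size of 16px:
     width          34em = 34 × 16px = 544px
     padding-left    7em =  7 × 16px = 112px
     padding-right   7em =  7 × 16px = 112px
     total box width     = 544 + 112 + 112 = 768px -->
<text style="width: 34em; padding-left: 7em; padding-right: 7em; line-height: 1.2; margin-left:auto; margin-right:auto;">
  <body>
    <p>Text of the document.</p>
  </body>
</text>
```

If the reader's font size changes, every one of these lengths scales with it, which is the reason for preferring relative units.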

Text Alignment

A quick way to add simple styling to your text is with "text-align". This property can be used to align text to the left, right, or center and is particularly useful when styling lines of text that are not headings:
<l style="text-align: center;">Vnicus à Fato surgo non Degener Hæres.</l>
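The same property can be sketched on other elements that are not headings. The elements and text below are placeholders chosen for illustration, not taken from a MoEML library text.

```xml
<!-- A centred closing word -->
<p style="text-align: center;">FINIS.</p>
<!-- A line set against the right edge of the page -->
<p style="text-align: right;">Placeholder line set to the right.</p>
```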

Standardized Renditions

While we want to use CSS to describe how a text looks, we do not want to add CSS that takes a lot of guesswork and tweaking on the part of the encoder. Our overriding concern when encoding primary source texts is to tell the truth. For example, we cannot discern exactly how many ems an indent or dropcap may be, especially when we are working with scans of facsimiles. Standardized renditions, therefore, provide a quick way to style the main components of a text (e.g., headings, dropcaps) for the reader.
Below is the <tagsDecl> from The Magnificent Entertainment:
<tagsDecl>
  <rendition scheme="css" xml:id="MAGN3_mainHead">font-family: Georgia; text-align: center; font-size: 150%; margin-bottom: 1em; margin-top: 1em;</rendition>
  <rendition scheme="css" xml:id="MAGN3_subHead">font-family: Georgia; text-align: center; font-size: 100%; margin-bottom: 1em; margin-top: 1em;</rendition>
  <rendition scheme="css" xml:id="MAGN3_dropCap">float: left; font-size: 250%; margin-right: 0.05em; padding: 0; line-height: 90%; display: inline-block;</rendition>
  <rendition scheme="css" xml:id="MAGN3_indentedLine">text-indent: 2em;</rendition>
  <rendition scheme="css" xml:id="MAGN3_indentedLineExtra">text-indent: 4em;</rendition>
  <rendition scheme="css" xml:id="MAGN3_lmlabel">display: block; float: left; margin-left: -8em; width: 7em; font-size: 80%; line-height: 1; clear: left; font-family: "Georgia"; text-indent: 0; font-style: italic</rendition>
  <rendition scheme="css" xml:id="MAGN3_rmlabel">display: block; float: right; margin-right: -8em; width: 7em; font-size: 80%; line-height: 1; clear: right; font-family: "Georgia"; text-indent: 0; font-style: italic</rendition>
</tagsDecl>
Note that there are currently seven standardized renditions. These renditions should be pasted into the <tagsDecl> of all future library texts. If you believe that a rendition should be tweaked or a new rendition should be created, bring your proposal to the MoEML team.
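As a minimal sketch of adapting these renditions for a new text, the block below copies two of them and changes only the prefix of each @xml:id. The document ID "ABCD1" is hypothetical, and the placement of <tagsDecl> inside <encodingDesc> follows the TEI content model rather than anything stated on this page.

```xml
<encodingDesc>
  <tagsDecl>
    <!-- CSS copied unchanged; only the xml:id prefix differs from MAGN3 -->
    <rendition scheme="css" xml:id="ABCD1_mainHead">font-family: Georgia; text-align: center; font-size: 150%; margin-bottom: 1em; margin-top: 1em;</rendition>
    <rendition scheme="css" xml:id="ABCD1_indentedLine">text-indent: 2em;</rendition>
  </tagsDecl>
</encodingDesc>
```

Elements in that text would then refer to the renditions with a leading "#", for example rendition="#ABCD1_mainHead".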

Heading Renditions

There are two different renditions for headings: "mainHead" and "subHead". The "mainHead" rendition is used for substantial titles and headings and the "subHead" rendition is used for less substantial titles and subheadings. If your text does not have a title page, it is likely that you will use "mainHead" to style the title. If your text does have a title page, you can use "mainHead" and "subHead" to style individual <titlePart> elements, depending on what the text calls for:
  <titlePart rendition="#MAGN3_mainHead" type="main">THE MAGNIFICENT Entertainment:</titlePart>
  <titlePart rendition="#MAGN3_subHead" type="desc">Giuen to <name ref="mol:JAME1">King <hi style="font-style:italic;">Iames</hi></name>, <name ref="mol:ANNE2">Queene <hi style="font-style:italic;">Anne</hi></name> his wife, and <name ref="mol:HENR9" style="font-style:italic;">Henry Frederick</name> the Prince, vpon the day of his Maiesties Trvumphant Passage (from the <ref target="mol:TOWE5">Tower</ref>) through his Honourable Citie (and Chamber) of <ref target="mol:LOND5" style="font-style:italic;">London</ref>, being the <date when-custom="1603-03-15" datingMethod="mol:julianSic" calendar="mol:julianSic">15. of March. 1603</date>.</titlePart>
Note that "mainHead" and "subHead" can be used throughout a text. In Nine Worthies of London, substantial headings appear every time a worthy is introduced:
See Nine Worthies of London sig. B3r for this image in context.
The differing font sizes of "Sir William Wallworth Fishmon-" and "er, sometime Maior of London" demonstrate the type of styling that we do not want to reproduce with CSS in our library texts. In this case, we would style the entire heading with "mainHead":
<head rendition="#NINE2_mainHead">Sir William <hi style="font-style: italic;">Wallworth</hi> Fishmonger, sometime Maior of London.</head>

Dropcap Rendition

While dropcaps come in various sizes and styles, the "dropCap" rendition provides standard rendering for all dropcaps in library texts:
See Cheapside’s Triumphs and Chyron’s Cross’s Lamentation for this image in context.
<lg>
  <l><hi rendition="#CHEA4_dropCap">S</hi>Ee the guilding</l>
  <l>Of <hi style="font-style: italic;">Cheapsides</hi> famous building</l>
  <l>the glorious Crosse,</l>
</lg>

Indentation Renditions

Indents appear throughout the library texts in both poetry and prose. The "indentedLine" rendition is most often used on the <p> element to indent paragraphs that do not begin with a dropcap. The "indentedLineExtra" rendition is often used in conjunction with the "indentedLine" rendition to indent poetry. As the names suggest, "indentedLineExtra" will indent a line more than "indentedLine". A good example of these renditions being used together is this poem in The Magnificent Entertainment:
See The Magnificent Entertainment sig. F2r for this image in context.
  <l><hi style="font-style:italic;">Troynouant</hi> is now no more a Citie:</l>
  <l rendition="#MAGN3_indentedLineExtra">O great pittie! is’t not pittie?</l>
  <l rendition="#MAGN3_indentedLine">And yet her Towers on tiptoe stand,</l>
  <l rendition="#MAGN3_indentedLine">Like Pageants built on Fairie land,</l>
  <l rendition="#MAGN3_indentedLineExtra">And her Marble armes,</l>
  <l rendition="#MAGN3_indentedLineExtra">Like to Magicke charmes,</l>
  <l rendition="#MAGN3_indentedLine">binde thousands fast vnto her,</l>
  <l>That for her wealth &amp; beauty daily wooe her,</l>
  <l rendition="#MAGN3_indentedLine">yet for all this, is’t not pittie?</l>
  <l><hi style="font-style:italic;">Troynouant</hi> is now no more a Cittie.</l>
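Since the example above covers verse only, here is a rough sketch of the prose case, where "indentedLine" goes on a <p> that does not begin with a dropcap. The document ID "ABCD1" and the paragraph text are placeholders.

```xml
<!-- Opens with a dropcap, so no indent rendition is applied -->
<p><hi rendition="#ABCD1_dropCap">T</hi>He first paragraph opens with a dropcap.</p>
<!-- No dropcap, so the paragraph takes the standard 2em indent -->
<p rendition="#ABCD1_indentedLine">The second paragraph is indented.</p>
```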

Marginal Label Renditions

The renditions "lmlabel" and "rmlabel" are used to style the marginal labels that appear in many early modern texts. Note that in The Magnificent Entertainment’s <tagsDecl>, font-style: italic is included within the rendition since most of the text’s marginal labels are italicized. If most of the marginal labels in your text are not italicized, font-style: italic can be removed.
<rendition scheme="css" xml:id="MAGN3_lmlabel">display: block; float: left; margin-left: -8em; width: 7em; font-size: 80%; line-height: 1; clear: left; font-family: "Georgia"; text-indent: 0; font-style: italic</rendition>
<rendition scheme="css" xml:id="MAGN3_rmlabel">display: block; float: right; margin-right: -8em; width: 7em; font-size: 80%; line-height: 1; clear: right; font-family: "Georgia"; text-indent: 0; font-style: italic</rendition>
These renditions are placed directly on the <label> element:
  <l>And put his bounty off with a demurre.</l>
  <label place="margin-left" rendition="#COLD2_lmlabel">* An vnconscionable Broker.</label>
  <l>The third a Broker*, a base Houndsditch hound,</l>
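A label in the right margin would presumably follow the same pattern with "rmlabel". The value "margin-right" for @place is inferred by analogy with "margin-left" above, and the document ID "ABCD1" and the text are placeholders.

```xml
<!-- The label precedes the line it annotates, as in the left-margin example -->
<label place="margin-right" rendition="#ABCD1_rmlabel">* A placeholder marginal note.</label>
<l>A placeholder line of verse with a note*,</l>
```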

Changes in Typeface

Note that we differentiate between typefaces within a text, but not from text to text. For example, note how we encode Roman typeface in a mostly Blackletter Gothic text:
See The Queen’s Majesty’s Passage sig. B1r for this image in context.
<p>wreathe was written the name, and title of the same, which was. <hi style="font-style: italic;">The vniting of the two howses of Lancastre and Yorke</hi>. Thys pageant was grounded vpon the Queenes maiesties name.</p>
Now note how we encode italics in a mostly Roman text:
See The Cold Tearm for this image in context.
  <l>Then there past Wherries in a month and more,</l>
  <l>’Twixt <hi style="font-style: italic;">Essex</hi>, <hi style="font-style: italic;">Middl’sex</hi>, <hi style="font-style: italic;">Kent</hi> and <hi style="font-style: italic;">Surry</hi> shore.</l>
  <l>And though for two mon’ths time, that fell together,</l>
In both cases, we use the <hi> element, @style attribute, and "font-style:italic" value to mark the change in typeface.


While superscripts are not very common in early modern texts, you may stumble across some that need to be styled. Many superscripts appear throughout The Praise and Virtue of a Jail and Jailers:
See The Praise and Virtue of a Jail and Jailers sig. 2M1v for this image in context.
We would style this superscript as follows:
  <l>That it in History is not enrold.</l>
  <l>And <hi style="vertical-align:super; font-size: 50%;">h</hi> Woodstreet Counters age we may deriue,</l>
  <l>Since Anno fifteene hundred fifty fiue.</l>

Tag a Library Text

We tag all dates, names, organizations, and toponyms in our library texts. For a brief overview of how to tag these entities, see Tagging Dates, Companies, Toponyms, and People. While this quickstart is directed at those encoding Survey of London, the principles are the same for those encoding library texts.
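As a quick illustration, the sentence below combines the three tags using the identifiers and date attributes from the title page of The Magnificent Entertainment quoted earlier. The sentence itself is composed for this example and is not a transcription.

```xml
<!-- name for people, ref for places, date for dates -->
<p><name ref="mol:JAME1">King Iames</name> passed from the <ref target="mol:TOWE5">Tower</ref> through <ref target="mol:LOND5">London</ref> on the <date when-custom="1603-03-15" datingMethod="mol:julianSic" calendar="mol:julianSic">15. of March. 1603</date>.</p>
```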

