
Conventions for Semi-Diplomatic Transcriptions

Regularization Practices

Our practice is to preserve most of the typographical, orthographical, and compositorial features of the original text. To do this, our transcribers, editors, and encoders follow these conventions:
Textual Component Rule
Long ſ
We silently normalize the long ſ by replacing it with a short s. We do not tag the s in any special way.
Capitalization
We preserve the capitalization of characters in the source, including the second upper-case letter after a woodblock dropped capital.
Italicization
We preserve the italicization of words by tagging them with a <hi> element whose @style attribute has the value "font-style: italic". We consider italicization to be a bibliographical code rather than a linguistic code.1
Interchangeable Characters
We retain the interchangeable u/v and i/j and the use of vv for w.
Ligatures
We retain the vowel digraphs using the appropriate Unicode characters (e.g., æ). We silently expand typographical ligatures (e.g., ﬂ is transcribed as fl).
Nasal Tildes
We retain the nasal tilde over vowels (e.g., õ) using the appropriate Unicode characters.
Abbreviations
We retain abbreviations of words (e.g., thē) using the appropriate Unicode characters. Abbreviations often appear as vowels with macrons (e.g., ā, ē, ī, ō, ū).
Spacing Within Lines
We close up extra spaces between words and punctuation marks. However, we retain the spacing in authorial initials, such as A. M. (for Anthony Munday). We add a single space after a comma when the comma has been used to separate two words.
Lineation
We preserve the line breaks in verse sections. All line breaks in verse are produced by the use of <l> elements contained by <lg> elements. We no longer preserve the line wrapping in the prose sections of our semi-diplomatic library works, with the exception of our mayoral pageant books. For more information, see Editorial Declaration for Mayoral Shows.
Hyphenation
We preserve internal hyphenation in words, but not hyphenation at the end of lines.
Quotation Marks
We retain all quotation marks in the text using the appropriate Unicode characters. We do not use the <quote> element for quotations in primary-source texts. Note that the MoEML Guide to Editorial Style calls for curly apostrophes and straight double quotation marks in both transcriptions and born-digital texts.
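
Several of the conventions above are mechanical enough to illustrate in code. The following Python sketch is an illustration only and does not describe the tooling our encoders actually use. All function names are hypothetical. It assumes that "extra spaces" means runs of spaces and spaces before punctuation, that end-of-line hyphens are plain "-" characters, and that TEI namespace declarations can be left out for brevity.

```python
import re
import xml.etree.ElementTree as ET

LONG_S = "\u017f"  # ſ, LATIN SMALL LETTER LONG S

# Typographical ligatures are expanded silently. The vowel digraphs (æ, œ)
# are deliberately absent from this table because they are retained.
LIGATURES = {
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
    "\ufb05": "st",  # long s + t ligature
    "\ufb06": "st",
}


def normalize_characters(text: str) -> str:
    """Expand ligatures and replace long s; leave u/v, i/j, vv, tildes,
    macrons, and vowel digraphs untouched."""
    for ligature, expansion in LIGATURES.items():
        text = text.replace(ligature, expansion)
    return text.replace(LONG_S, "s")


def normalize_spacing(text: str) -> str:
    """Close up extra spaces and add one space after a comma that directly
    joins two words. Spaces after full stops (as in "A. M.") are kept."""
    text = re.sub(r"[ \t]+", " ", text)          # collapse runs of spaces
    text = re.sub(r" +([,;:.!?])", r"\1", text)  # no space before punctuation
    text = re.sub(r",(?=[^\W\d_])", ", ", text)  # comma followed by a letter
    return text.strip()


def join_prose_lines(lines: list[str]) -> str:
    """Unwrap prose: drop a hyphen at the end of a line, keep internal ones."""
    joined = ""
    for line in lines:
        line = line.strip()
        if joined.endswith("-"):
            joined = joined[:-1] + line
        elif joined:
            joined += " " + line
        else:
            joined = line
    return joined


def italic(text: str) -> ET.Element:
    """Wrap italic text in <hi style="font-style: italic">."""
    hi = ET.Element("hi", {"style": "font-style: italic"})
    hi.text = text
    return hi


def verse_group(lines: list[str]) -> ET.Element:
    """Encode verse so that every line break is an <l> inside an <lg>."""
    lg = ET.Element("lg")
    for line in lines:
        ET.SubElement(lg, "l").text = line
    return lg


if __name__ == "__main__":
    print(normalize_characters("Cæ\u017far \u017faid vnto th\u0113"))
    print(normalize_spacing("the  Citie ,and the Maior"))
    print(join_prose_lines(["a well-belo-", "ued Citie"]))
    print(ET.tostring(italic("Troynouant"), encoding="unicode"))
    print(ET.tostring(verse_group(["First line", "Second line"]), encoding="unicode"))
```

The sketch uses an explicit replacement table rather than Unicode compatibility normalization (NFKC). NFKC would also fold ſ and the ligatures, but it rewrites many other compatibility characters as well, which would work against the goal of preserving the source's orthography.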

Notes

  1. For definitions of bibliographical code and linguistic code, see Encode a Primary Source Transcription. (JJ)