Simple documents
Most of the basic information pages on this site, such as the About page, fall into the category of simple documents. They consist of a <teiHeader> and a single <text> element. The basic structure of a document looks like this:
<TEI version="5.0" xml:id="about">
<teiHeader>
<fileDesc>
<titleStmt>
<title>
Map of London: About the Project
</title>
</titleStmt>
<publicationStmt>
<availability>
<p>
[Information about copyright...]
</p>
</availability>
</publicationStmt>
<sourceDesc>
<p>
[Information about the source document, if applicable...]
</p>
</sourceDesc>
</fileDesc>
</teiHeader>
<text>
<body>
<div xml:id="content">
<head>
About the Project
</head>
<p>
[Paragraph content...]
</p>
</div>
</body>
</text>
</TEI>
Some points to note:
- The @xml:id attribute on the root <TEI> element is the same as the name of the file, without its extension. So when the file is called about.xml, the @xml:id attribute should be "about".
- There is only one <text> element.
- The main content is contained in a <div> element with the @xml:id attribute "content". This is by convention, and makes it simpler to find and render key content during processing.
Tags and attributes used to mark up the content itself are covered in more detail below.
Some simple documents will have subsections and even sub-subsections. This is achieved by using nested <div> elements. Nesting can be as deep as required, but should be kept as simple as possible. For instance, this is a page which has an introduction and two subsections:
<text>
<body>
<div xml:id="content">
<head>
Page title
</head>
<p>
Introduction
</p>
<div>
<head>
Subsection 1 title
</head>
<p>
Para one...
</p>
<p>
Para two...
</p>
</div>
<div>
<head>
Subsection 2 title
</head>
<p>
Para one...
</p>
<p>
Para two... &amp; things.
</p>
</div>
</div>
</body>
</text>
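The same pattern extends one level further for sub-subsections. The following is a minimal sketch rather than an excerpt from a real page; the titles and paragraph text are placeholders:

```xml
<text>
  <body>
    <div xml:id="content">
      <head>Page title</head>
      <p>Introduction</p>
      <div>
        <head>Subsection 1 title</head>
        <p>Para one...</p>
        <!-- A sub-subsection is simply a <div> nested inside the subsection. -->
        <div>
          <head>Sub-subsection 1.1 title</head>
          <p>Para one...</p>
        </div>
        <div>
          <head>Sub-subsection 1.2 title</head>
          <p>Para one...</p>
        </div>
      </div>
    </div>
  </body>
</text>
```

Note that within a <div>, paragraphs come before any nested <div> elements; a paragraph cannot follow a nested <div> at the same level.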
Location documents
Location documents are like simple documents in that they contain only one text element, and usually have a straightforward structure, but they typically have three elements which don't appear in simple documents. This is the basic outline of a location document:
<TEI version="5.0" xml:id="ABCH1">
<teiHeader>
[...Normal teiHeader; more below...]
</teiHeader>
<facsimile>
<surface>
<graphic url="C6.jpg">
</graphic>
<zone xml:id="C6_ABCH1" points="266,233 256,232 254,186 274,112 283,115 264,183 266,233 266,233">
</zone>
</surface>
</facsimile>
<text>
<body>
<div type="placeInfo">
<head>
Abchurch Lane
</head>
<listPlace>
<place type="street" corresp="#C6_ABCH1">
<placeName>
Abchurch Lane
</placeName>
<location>
<geo>
[Geographical coordinates will go here when available...]
</geo>
</location>
</place>
</listPlace>
</div>
<div>
<p>
<ref target="mol:ABCH1">Abchurch Lane</ref> runs north-south from <ref target="mol:LOMB1">Lombard Street</ref> to <ref target="mol:CAND1">Candlewick (Cannon) Street</ref>. [More about Abchurch Lane...]
</p>
</div>
</body>
</text>
</TEI>
There are two key aspects to this file: the <facsimile> element, in which the location of the place on a specific tile of the map is specified, and the <div> element with a @type attribute of "placeInfo". These are both crucial.
<facsimile>
<surface>
<graphic url="C6.jpg">
</graphic>
<zone xml:id="C6_ABCH1" points="266,233 256,232 254,186 274,112 283,115 264,183 266,233 266,233">
</zone>
</surface>
</facsimile>
The <facsimile> element in the example specifies that there is one "surface" on which this place appears. The graphic for that surface is C6.jpg, the tile for C6; this is encoded in a simple TEI <graphic> tag. On that graphic, there is one <zone>. The <zone> is defined as a series of points on the graphic. Each point consists of a pair of coordinates, x and y, separated by a comma, and the points are separated by spaces. In this example, you can see that the final point is the same as the first point, thus closing the polygon to make a shape. This will almost always be the case.
The <zone> element has an @xml:id attribute, which enables us to point to it later in the file. The @xml:id attribute value, "C6_ABCH1", is constructed from the name of the tile (C6), an underscore, and the @xml:id of the file itself (ABCH1).
<div type="placeInfo">
<head>
Abchurch Lane
</head>
<listPlace>
<place type="street" corresp="#C6_ABCH1">
<placeName>
Abchurch Lane
</placeName>
<location>
<geo>
[Geographical coordinates will go here when available...]
</geo>
</location>
</place>
</listPlace>
</div>
The <div> with @type="placeInfo" starts with a <head> element with an overall title for the place/page. Then there is a <listPlace> element. In this case, it contains only one <place> element, but in some cases there will be several; the same street may appear on multiple map tiles, and may be split by other features into separate sections. Each section might have a slightly different title (Main Street West, Main Street East), so there is a <placeName> tag inside each <place>.
The <place> element has two attributes. The first is @type="street", which defines the type of place (other values include "church" and "ward"). This value determines which index page the location will appear on ("Streets", "Churches", etc.). The second attribute is @corresp="#C6_ABCH1". This points to the <zone> element in the <facsimile> section above. It specifies that this <place> element is located at the position specified by the <zone> with @xml:id="C6_ABCH1".
More complex locations will have multiple <surface> elements with multiple <zone> elements, and multiple <place> elements below. See /data/locations/THAM1.xml for a good example.
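As a rough illustration of that pattern, here is a sketch of a street that crosses two map tiles. Everything specific in it is hypothetical: the file id MAIN1, the choice of tiles C6 and C7, and the coordinates are invented for illustration and are not taken from THAM1.xml.

```xml
<TEI version="5.0" xml:id="MAIN1">
  <teiHeader>
    [...Normal teiHeader; more below...]
  </teiHeader>
  <facsimile>
    <!-- One <surface> per map tile on which the place appears. -->
    <surface>
      <graphic url="C6.jpg"/>
      <!-- Zone id = tile name + underscore + file id; the polygon is closed. -->
      <zone xml:id="C6_MAIN1" points="10,40 120,40 120,55 10,55 10,40"/>
    </surface>
    <surface>
      <graphic url="C7.jpg"/>
      <zone xml:id="C7_MAIN1" points="0,40 90,40 90,55 0,55 0,40"/>
    </surface>
  </facsimile>
  <text>
    <body>
      <div type="placeInfo">
        <head>Main Street</head>
        <listPlace>
          <!-- Each <place> points at its own <zone> through @corresp. -->
          <place type="street" corresp="#C6_MAIN1">
            <placeName>Main Street West</placeName>
            <location>
              <geo>[Geographical coordinates will go here when available...]</geo>
            </location>
          </place>
          <place type="street" corresp="#C7_MAIN1">
            <placeName>Main Street East</placeName>
            <location>
              <geo>[Geographical coordinates will go here when available...]</geo>
            </location>
          </place>
        </listPlace>
      </div>
    </body>
  </text>
</TEI>
```

Each @corresp value is the target zone's @xml:id preceded by "#", so every value used in a <place> should match exactly one <zone> in the <facsimile> section.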
Each <place> element has a <location> element with an empty <geo> tag inside it. In the future, this will contain the GIS coordinates of the place, so that we can provide links to shapes on Google Maps or other GIS-based mapping systems.
Following the <div> with @type="placeInfo", the remainder of the document consists of regular <div> elements containing the content of the document. See below for more information on textual markup.
Complex documents
The final type of document is the much larger contribution, which typically consists of a long essay divided into several sections. These are often found in the Topics or Library areas of the site. Sections in the same document may be written by different people, so we encode them as separate <text> elements. The typical document structure looks like this:
<TEI version="5.0" xml:id="HENS2">
<teiHeader>
[...Normal teiHeader; more below...]
</teiHeader>
<text>
<group>
<text>
[First text...]
</text>
<text>
[Second text...]
</text>
<text>
[Third text...]
</text>
</group>
</text>
</TEI>
Each <text> element will have its own <body> containing <div> elements, which may of course be nested. When a document like this is processed for display, a table of contents is automatically created from the hierarchy of <text> and nested <div> elements and their <head> children. See /data/topics/HENS2.xml for an example of this kind of XML structure, and its HTML rendering to see the result.
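To make the nesting concrete, here is a sketch of how the inner <text> elements in the outline above might be filled in. The headings and paragraph text are placeholders, not content from HENS2.xml:

```xml
<text>
  <group>
    <text>
      <body>
        <div>
          <head>First section title</head>
          <p>Introduction to the first section...</p>
          <!-- A nested <div> with its own <head> adds a lower level to the hierarchy. -->
          <div>
            <head>Subsection title</head>
            <p>Para one...</p>
          </div>
        </div>
      </body>
    </text>
    <text>
      <body>
        <div>
          <head>Second section title</head>
          <p>Para one...</p>
        </div>
      </body>
    </text>
  </group>
</text>
```

Going by the description above, the <head> elements in a structure like this are what the generated table of contents would be built from, so each section and subsection should have one.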
One final note: when a table of contents is created for a complex file like this, XHTML @id attributes are generated automatically to support the links between the TOC and the various sections. However, these automatically generated @id values may be different each time the page is rendered. If you want to create static internal ids to link to, add @xml:id attributes to each <text> element, or to each <div> element inside a <text>. The page you are currently reading provides an example of this; click on the XML link above to see the code.
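For example, static ids might be added as follows. This is a sketch only; the id values HENS2_intro and HENS2_intro_sources are hypothetical and do not reflect a naming scheme prescribed by the project:

```xml
<text xml:id="HENS2_intro">
  <body>
    <!-- Hypothetical ids: choose values that will stay stable over time. -->
    <div xml:id="HENS2_intro_sources">
      <head>Sources</head>
      <p>Para one...</p>
    </div>
  </body>
</text>
```

Every @xml:id value must be unique within the file and must be a valid XML name, so it cannot begin with a digit or contain spaces.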