Most MoEML documents, or significant fragments of them, are identified with the mol: prefix and accessed through the web application with their id + .xml.
The molagas prefix points to the shape representation of a location on MoEML’s OpenLayers3-based rendering of the Agas Map.
Other prefixes provide links to page-images in the Chadwyck-Healey collection.
The mdt (MoEML Document Type) prefix is also used in linking attributes.
The mdtlist (MoEML Document Type listing) prefix used in linking attributes points to a listings page constructed from a category in the central MDT taxonomy in the includes file. There are two variants: one with the plain category id, and one with _subcategories appended, meaning all subcategories of the category.
The molgls (MoEML gloss), molvariant, molajax, and molstow prefixes are used in linking attributes as well.
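To make the mechanism concrete, prefixes of this kind are conventionally declared with TEI prefixDef elements in a document's header. The sketch below is illustrative only: the id BLAC3 is hypothetical, and the match pattern, replacement pattern, and base URL are assumptions rather than MoEML's actual declarations.

<listPrefixDef>
  <!-- Hypothetical declaration: a pointer such as mol:BLAC3
       would resolve to https://mapoflondon.uvic.ca/BLAC3.xml -->
  <prefixDef ident="mol"
             matchPattern="(.+)"
             replacementPattern="https://mapoflondon.uvic.ca/$1.xml"/>
</listPrefixDef>

Declaring each prefix once in this way means that the same short pointer (mol:BLAC3) can be used in any linking attribute, while the resolution rule lives in a single place.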
Our editorial and encoding practices are documented in detail in the Praxis section of our website.
It is easy for MoEML’s automated diagnostics to check that every internal link between pages, people, places, and so on is functional. However, it is much more difficult to do automated checking of external links; if one of our servers launches a link-checking task on any scale, its bandwidth will quickly be throttled by network agents designed to detect exactly this sort of potentially dangerous behaviour and constrain it.
As a result, we have to rely on the
W3C’s excellent link-checking service. This
service is designed to parse a web page and check all the links within it. However, we do not
want to have the service parse all our pages and end up checking the same links over and over
again. Instead, our build process generates a set of simple HTML pages which contain only
links, where each link appears only once. You can find these pages in the products directory of the Jenkins build artifacts. They are named externalLinks_001.html, externalLinks_002.html, and so on.
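Each of these pages is deliberately bare-bones. A hypothetical sketch of what one of them might contain (the URLs are placeholders, not actual MoEML link targets):

<!DOCTYPE html>
<html>
  <head><title>externalLinks_001</title></head>
  <body>
    <!-- One anchor per unique external URL, and nothing else, so that
         the Link Checker parses quickly and checks each link exactly once -->
    <a href="https://example.com/some-resource">https://example.com/some-resource</a>
    <a href="https://example.org/another-page">https://example.org/another-page</a>
  </body>
</html>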
We cannot simply provide the URL of one of these files to the W3C Link Checker, because our
Jenkins server is configured to disallow access from robot processes such as the one the
Link Checker service uses (this is to prevent our Jenkins server from being overwhelmed by
search engine crawlers and so on). Instead, we have to copy these files to another server
where the link checker can access them. When you want to do some link checking, let an administrator know so that the files can be copied to an accessible location.
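Once a copy is in place, running a check amounts to handing the copied file's URL to the W3C service, with a request along these lines (the example.org host is a placeholder for whichever server the files are copied to):

https://validator.w3.org/checklink?uri=https://example.org/externalLinks_001.html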
The check will typically take a long time, and you will see status messages coming in on the page as it proceeds. When it is done, you will have a report listing potential problems. Information on how to handle the results appears below.
The report provided by the link checker divides problems into several types based on the HTTP response code received when retrieving the page.
The following guidelines should help you resolve problems found by the checker. Remember that when you fix a link in the MoEML XML files, you need to fix every instance of it; many links appear many times in different pages.
404 (Not Found)
Search for another location for the same content. If there is one, and it remains appropriate to the link(s) in question, change the links to point to it.
If there is no sign of the original content, check the Wayback Machine. If the content is available there, take the latest version from the Wayback Machine which is functional, complete, and appropriate, and change the links to point to that, as in the sketch below.
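For example, a dead target in the XML might be swapped for a dated Wayback Machine capture of the same page. The element and URLs below are illustrative, not taken from MoEML's files:

<!-- Before: the original page has vanished -->
<ref target="http://example.com/old-article.html">An Old Article</ref>

<!-- After: pointing at the latest complete Wayback Machine snapshot -->
<ref target="https://web.archive.org/web/20200101000000/http://example.com/old-article.html">An Old Article</ref>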
301 (Moved Permanently)
Check that the content at the new location is in fact the original content or a new version of it. If so, change the link. If the content at the new location is something else (e.g., a message saying the content has been removed, or nothing at all), proceed as for a 404.
Access not allowed due to robots.txt
This happens when the site rejects crawlers, so the checker does not check the URL. In this case, check the URL manually. If you find that you arrive at the page, but the URL in the browser bar is not what you originally requested, then the situation is probably a 301, and you should proceed appropriately. If the content is missing or you get a 404, again proceed as above.
In cases where a site appears to be down, but you do not know whether it is temporary or not, make a note and come back to it after a few days. If it is still down, treat it as a 404.