Our very first XSLT assignment is an Identity Transformation, a kind of transformation we perform frequently in our projects when we need to make specific changes to our encoding. We want to make some small changes to our Georg Forster file so that some of our tags use better-suited TEI elements.

To begin, download the Georg Forster file from here: ForsterGeorgComplete.xml and open it in <oXygen>. We don’t want to change much about this file, but we do want to alter its tagging just a little, and that is a good occasion to write an Identity Transformation XSLT, converting our XML to XML that is meant to be (for the most part) identical to the original.

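Here are the two changes we want to make to our XML file: convert every <emph> element into <hi rend="italics">, and replace each <l> element that is a child of a <head> element with a self-closing <lb/> placed at the end of the line.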

You may already be calculating how to do these tasks with a regular expression Find and Replace, and while we know you could do that, our purpose with this exercise is to make the changes using an XSLT transformation, and we hope you will learn some things about how XSLT works through this exercise!

To begin, open a new XSLT stylesheet in <oXygen> and switch to the XSLT view. We will have some housekeeping to do as we get started.

Namespaces matter! Setting up an XSLT stylesheet to Read TEI

Georg Forster’s A Voyage Round the World is coded in the TEI namespace, which means that your XSLT stylesheet must include an instruction at the top to specify that when it tries to match elements, it needs to match them in that TEI namespace. When you create a new XSLT document in <oXygen/> it won’t contain that instruction by default, so whenever you are working with TEI you need to add it (see the xpath-default-namespace line below). We also need to make sure that our XSLT processor understands it is outputting results to the TEI namespace, so we change one more line (see the xmlns line below). Our modified stylesheet template looks like the following:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0"
    xmlns="http://www.tei-c.org/ns/1.0"
    version="3.0">
    
</xsl:stylesheet>
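
To see what the xpath-default-namespace line is saving us from, here is a minimal sketch of the more verbose alternative, in which we bind a prefix to the TEI namespace and write it on every element name we match. The prefix tei: is our own arbitrary choice for illustration, and the single template rule is only a placeholder. Without either approach, a pattern such as match="head" looks for a <head> in no namespace and quietly matches nothing in a TEI file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    xmlns="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="tei"
    version="3.0">

    <!-- With a bound prefix (tei: is an arbitrary, illustrative choice),
         every element name in a match pattern or XPath step must carry
         that prefix explicitly. -->
    <xsl:template match="tei:head">
        <!-- This output element needs no prefix: the xmlns declaration
             above places literal result elements in the TEI namespace. -->
        <head>
            <xsl:apply-templates/>
        </head>
    </xsl:template>

</xsl:stylesheet>
```

Both forms select the same elements; we prefer xpath-default-namespace because it keeps our match patterns short.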

Writing the Identity Transformation!

  1. We will give you your first template rule, to set this up as an identity transformation. We’re going to use a new form for this that was introduced in XSLT version 3.0, which is why we have set version="3.0" in our stylesheet template above. On future assignments we will set the default version to 2.0 for transforming to HTML, mostly because the older version is better tested for processing HTML output, but for an identity transformation of XML to XML, we like the efficient new code we can write in version 3.0. (You can see an old form here in the first template rule of our Identity transformation of Shakespeare’s sonnets, which you can download, save and open from here. That old first rule matches on all nodes, elements and attributes throughout the document and simply copies them. It’s perfectly fine to use that older template rule in place of the one we show you below, but we like the simplicity of this new form even better!)
    <xsl:mode on-no-match="shallow-copy"/>

    This XSLT statement works as the opposite of the xsl:template match rules we have been showing you in our XSLT tutorial. In effect it says: for any element, attribute, comment, or other node in the document that no template rule matches, XSLT should simply copy that node to the output and keep processing what is inside it. Try running this and look at your output: it will be identical to the current XML document. Obviously we do not need to do this unless we want to make changes with template match rules! There is another way to copy, called "deep copy" in XSLT, but we do not want to use it here. A deep copy reproduces the full tree underneath a given element, so we would match on the root element only and reproduce all the descendants of that one node just as they are. We like the on-no-match="shallow-copy" approach because we do not necessarily want to copy every node just as it is in the original. We only want to copy a node when we have not written a template rule to change it.

  2. Next, we will simply write our template rules to match on the particular elements we wish to change. You may wish to start with the simpler of the two, to convert all the <emph> elements into <hi rend="italics"> in the output XML. Review our Introduction to XSLT to see how to write a template match on any particular element, and how to output a different element in its place, using <xsl:apply-templates/> to process its contents.
  3. Now, write the template rule that will match only on <l> elements that are children of <head> elements. See if you can figure out how to replace these with the self-closing line-break element <lb/>, positioned in the correct spot in relation to <xsl:apply-templates/> so that the <lb/> sits at the end of a line.
  4. We often use an Identity transformation with Attribute Value Templates (AVTs), a handy XSLT format for adding attributes to output elements like <p> or <l>, for example where we want to include an attribute (like @n or @number) to hold a paragraph or line number. An AVT lets us write an XPath expression inside an attribute value, so that it pulls information from our input XML and outputs the result as the value (for instance, we can use count() to work out where the particular line we are processing sits in relation to all the preceding:: line elements ahead of it). To learn how to write these into the attribute values you are constructing for your output file, read and look at the examples in Obdurodon’s Attribute Value Templates (AVT) tutorial. We have already numbered the paragraphs in the Forster file, but as your final challenge for this assignment, read the AVT tutorial and then write two more template rules to add @n attributes that automatically number the <div> elements for Books, and the <div> elements for Chapters. (Hint: For a little extra help, look at my old example ID-transform stylesheet that adds line numbers to a series of sonnets, downloadable from here if you didn’t download it earlier.) We will return to this later, since you will be working with AVTs in later XSLT exercises.
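
The four steps above follow patterns that can be sketched on a toy example. The stylesheet below is a sketch only, not the solution for the Forster file: the elements <term>, <seg> inside <title>, and <lg> are hypothetical stand-ins we chose for illustration, and it counts with the preceding-sibling:: axis, which may or may not be the axis you want for your own numbering. The comment near the top shows the common older form of the identity rule for comparison; the one in the sonnets stylesheet may be worded a little differently.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0"
    xmlns="http://www.tei-c.org/ns/1.0"
    version="3.0">

    <!-- Step 1: copy any node that no template rule below matches. -->
    <xsl:mode on-no-match="shallow-copy"/>

    <!-- Common older form of the same idea (XSLT 1.0 and 2.0), for comparison only:
         <xsl:template match="@*|node()">
             <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
         </xsl:template>
    -->

    <!-- Step 2 pattern: output a different element in place of the matched one,
         keeping its contents. <term> is a hypothetical stand-in. -->
    <xsl:template match="term">
        <hi rend="bold">
            <xsl:apply-templates/>
        </hi>
    </xsl:template>

    <!-- Step 3 pattern: match an element only in one context (a <seg> whose
         parent is <title>), drop its tags, and place an empty element after
         its contents. Both element names are hypothetical stand-ins. -->
    <xsl:template match="title/seg">
        <xsl:apply-templates/>
        <lb/>
    </xsl:template>

    <!-- Step 4 pattern: an Attribute Value Template. The XPath inside the
         curly braces is evaluated for each matched <lg> (a hypothetical
         stand-in) and its result becomes the value of @n. -->
    <xsl:template match="lg">
        <lg n="{count(preceding-sibling::lg) + 1}">
            <!-- Carry over existing attributes and contents. If the input
                 element already had an @n, that copied value would replace
                 the computed one. -->
            <xsl:apply-templates select="@* | node()"/>
        </lg>
    </xsl:template>

</xsl:stylesheet>
```

Notice that only the nodes we want to change get a template rule of their own; everything else passes through untouched because of the xsl:mode statement.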

When you are finished, save your XSLT file and your XML output of the Georg Forster file, following our usual homework file naming conventions, and upload these to the appropriate place in Courseweb.