The Fall 2020 DIGIT 400 James Bond project team has prepared XML for the screenplay Goldeneye, which you can access by right-clicking on the file and downloading it from here: Goldeneye.xml. Open the file in oXygen and work with the XPath Window set to version 3.1. Respond to the XPath questions below in a text or markdown file, and upload it to Canvas for this assignment when you’re finished. (Please use an attachment! If you paste your answer into the text box, Canvas may munch the code formatting.) Some of these tasks are thought-provoking, and even difficult. If you get stuck, do the best you can, and if you can’t get a working answer, give the expressions you tried and explain where they failed to get the results you wanted. Sometimes doing that will help you figure out what’s wrong, and even when it doesn’t, it will help us identify the difficult moments.
You should consult The XPath Functions We Use Most page and especially its section 4 on Strings. As always, consult our class notes and our introductory guide Follow the XPath!. Be sure to give the XPath expression you used in your answer, and don’t just report your results. This way, if the answer is incorrect, we can help explain what went wrong.
First of all, skim through the document to get a sense of how it is coded. Then, to warm up and familiarize yourself with the file, see if you can write XPath expressions to find all the scenes, stage directions, speeches, and speakers.
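If you have not written path expressions in a while, here is a small warm-up sketch. It does not use the Goldeneye file at all: it assumes your XPath 3.1 processor supports the parse-xml() function, and it builds a toy document whose element names (memo, para, note) are made up for illustration.

```xpath
(: Toy document built on the fly; memo, para, and note are hypothetical names. :)
let $doc := parse-xml('<memo><para>one</para><para>two</para><note>three</note></memo>')
return (
  count($doc//para),      (: how many para elements sit anywhere in the tree :)
  $doc//note/string()     (: the text content of each note element :)
)
```

The // step looks at every level of the tree, so the same pattern should work for any element name you spot while skimming the screenplay.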
Start with the sd elements. These contain the stage directions.
Use the matches() function to locate the stage directions that hold a regular expression pattern of three or more capital letters in a row.
Locate a Heading element. How can you reliably find the first stage direction immediately following that Heading element? (Hint: our solution uses the following-sibling:: axis and a position predicate to indicate the first in a sequence.)
Of all the Heading elements, how can you find out which ones contain a reference to the character "Q"? (Hint: add a predicate.)
Look up the string-length() function, which indicates the number of characters in the XML node that you visit.
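The functions and hints mentioned so far can be tried out on a toy document before you turn to the screenplay. This is only a sketch: it assumes parse-xml() is available in your processor, and the element names (list, title, item) and the digit pattern are invented for the example, not taken from the Goldeneye file.

```xpath
(: Toy document; list, title, and item are made-up names for this sketch. :)
let $doc := parse-xml(
  '<list><title>Room 101</title><item>first</item><item>second</item>' ||
  '<title>Room 7</title><item>third</item></list>')
return (
  (: matches() in a predicate: titles holding two or more digits in a row :)
  $doc//title[matches(., '\d{2,}')]/string(),
  (: position predicate: only the first item sibling that follows each title :)
  $doc//title/following-sibling::item[1]/string(),
  (: string-length() of each title's text :)
  $doc//title/string-length()
)
```

This should return "Room 101", then "first" and "third", then the numbers 8 and 6. Notice that the last line returns plain numbers rather than elements, which is worth keeping in mind for the questions that follow.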
Find the string-length() of all the stage directions coded in sd elements.
Use the max() function to find out the longest length of a stage direction in the Goldeneye script.
The string-length() and max() functions took us off the XML tree to yield calculated results. How can we write XPath to return the XML element sd that has the maximum string-length()? Hint: Try searching for sd elements with a predicate that checks to see if the string-length() is equal to the maximum string-length you found in the previous step.
The spk elements are nested as children inside the sp elements. Write an XPath expression to return all the speakers (spk) who deliver speeches that contain the word "Iraq".
The spk elements are entered in block caps. Use the XPath lower-case() function to return all the spk elements lower-cased instead and record your expression.
We can do some string-surgery in XPath by working with substrings. Consult this page to learn about the XPath
substring() function and see how to write it out. Now, see if you can apply the substring() function to isolate the 2nd letter onward in the spk elements. Then, lower-case() that substring!
Once you have used substring() to isolate letters 2 to the end, you should be able to change it to return only the very first letter. Try it and record your expression.
To put the first letter and the rest back together, XPath provides the concat() function, and there is a convenient shorthand for it in XPath 3.1 which sets two vertical bars || between the expressions you want to connect. However, we need to be careful because concatenation requires joining exactly one thing to exactly one other thing. (XPath can't figure out on its own how to concat (or tie together) the whole sequence of substrings of the first letter to the whole sequence of the substrings of the rest.) To help XPath work one at a time over sequences of spk substrings, look up the for $i in (sequence) return ... XPath expression. (This is a for-loop in XPath, and $i is known as a range variable that isolates each member of the series, one by one.) With the for-loop, you can go one step at a time through the series of //spk nodes and return a concatenation of the substring functions you figured out, using $i as the first argument of your substring functions. See if you can work out how to write this XPath.
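If the pieces of that last task feel abstract, it may help to try each one on literal strings first. The sketch below is not the solution: the strings are made up, and the for expression does something deliberately different (upper-casing and adding an exclamation mark) so that you can see the mechanics of substring(), the for expression, and || without the screenplay in the way.

```xpath
(: substring() counts characters starting from 1, not 0. :)
(
  substring('Goldeneye', 5),      (: from the 5th character to the end: "eneye" :)
  substring('Goldeneye', 1, 4),   (: four characters starting at the 1st: "Gold" :)
  (: the for expression hands || one string at a time through $i :)
  for $i in ('tea', 'coffee') return upper-case($i) || '!'
)
```

The last line returns "TEA!" and "COFFEE!", one result per member of the sequence. Because $i holds exactly one string on each pass, || always has exactly one thing on each side, which is the condition the task above asks you to satisfy.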