Instructors: Elisa Beshero-Bondar and David J. Birnbaum
Description:
If you work with TEI or XML for scholarly editions, archives, or digital collections, this course will professionalize your XML workflow. Whether you’re a librarian managing digital collections, a graduate student working on an institutional project, or a scholar building a critical edition, you will gain confidence as a project developer working sustainably and efficiently with structured documents. You’ll learn to use the XML family of languages to validate encoding consistency, transform documents for multiple outputs, query complex textual relationships, and build infrastructure that adapts as technologies evolve.
You may be wondering how AI will affect this work. Perhaps surprisingly, AI makes XML processing expertise more essential than ever. High-quality AI applications, particularly in scholarly edition workflows, require consistent, structured data, and digital humanities projects need validation strategies to ensure that AI-generated content fully conforms to scholarly and editorial standards.
New and experienced XML coders alike will benefit from this course. Our goals are 1) to share strategies for systematically building archives and databases, and 2) to increase participants’ confidence and fluency in extracting information coded in XML in those archives and databases. This class teaches you how to navigate and process XML using tools designed for the purpose: XSLT, XQuery, and Schematron. We cover these together as members of the same XML family, sharing a common syntax in XPath.
XPath is the center of the course, and we will show you how it applies in multiple XML processing contexts so that you learn what these contexts have in common and how each is used, whether to validate documents or to transform them for publication and other reuse. We’ll apply XPath to check the accuracy of text encoding by writing schema rules to manage your coding (or your project team’s coding). You’ll practice and gain fluency in writing XPath expressions and patterns, including sequence expressions, regular expressions, datatypes, predicates, operators, and functions (both core-library and user-defined).
We’ll write XPath to calculate how frequently you’ve marked a certain phenomenon, or to locate which names of people are mentioned together in the same chapter, paragraph, sentence, stanza, or annotation. You’ll learn how XPath can help you build exciting visualizations from XML code (such as a timeline or a network graph). Whether you are an XML beginner or a more experienced coder, you’ll find that strengthened skills in XPath and the XML family will help you with systematic encoding, document processing, and project management. These skills will also prepare you to create custom processing workflows, from preparing datasets for AI applications to building quality control pipelines that combine your scholarly expertise with emerging tools.
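To give a flavor of the counting and co-occurrence tasks described above, here is a minimal sketch. It is not course material: it uses Python's standard-library ElementTree (which supports only a small subset of XPath) as a stand-in, and the sample document, its element names, and its contents are invented for illustration. In full XPath, the frequency count would be written `count(//persName)`.

```python
import xml.etree.ElementTree as ET
from itertools import combinations

# Hypothetical TEI-like sample; names and structure are invented for illustration.
SAMPLE = """
<text>
  <div type="chapter" n="1">
    <p>Then <persName>Ada</persName> wrote to <persName>Mary</persName>.</p>
    <p><persName>Ada</persName> waited alone.</p>
  </div>
  <div type="chapter" n="2">
    <p><persName>Mary</persName> replied to <persName>Percy</persName>.</p>
  </div>
</text>
"""

root = ET.fromstring(SAMPLE)

# Frequency: how many times a person name has been marked up.
# Equivalent XPath: count(//persName)
name_count = len(root.findall(".//persName"))

# Co-occurrence: pairs of distinct names tagged within the same paragraph.
pairs = []
for p in root.findall(".//p"):
    names = sorted({n.text for n in p.findall("persName")})
    pairs.extend(combinations(names, 2))

print(name_count)
print(pairs)
```

The same pattern extends to chapters, stanzas, or annotations by changing the element that defines the grouping context (here, `p`).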
This is a hands-on course. Consider this offering as a complement to, and/or a foundation for: Text Encoding Fundamentals and their Application, Out-of-the-Box Text Analysis for the Digital Humanities, Text Processing - Techniques & Traditions, XML Applications for Historical and Literary Research. No advanced knowledge of XML processing is necessary, but those with interests in document processing who have taken Digital Documentation and Imaging for Humanists; Advanced TEI Concepts / TEI Customization; A Collaborative Approach to XSLT; or Geographical Information Systems in the Digital Humanities will certainly benefit.