Disruptive TEI: De-/Encoding Normative Frameworks

Lesson 1.2: TEI-XML and Intelligent Search

Introduction

TEI is based on XML (Extensible Markup Language), a markup language that represents and stores textual data in a hierarchical structure, making the data easier to analyze, retrieve, and preserve. Unlike HTML, which is designed primarily for web display, XML does not define how data should be displayed; instead, it provides a flexible structure for representing complex relationships in data. As Lou Burnard explains, TEI-XML gives us a framework for representing whatever is considered of importance about the text, not just its appearance, so that software can act on the distinctions identified, generating new visualisations and new perspectives (9).

XML is highly adaptable, allowing users to encode information using custom tags that define the structure and meaning of content. This makes XML ideal for organizing textual data in TEI projects, where structured markup facilitates advanced search capabilities, fosters data interoperability, and ensures long-term preservation of digitally encoded texts.

XML Features for Structured Text Analysis

  • Self-descriptive: XML itself does not predefine tags; users and vocabularies such as TEI define their own meaningful tags.
  • Hierarchical: XML data is structured as a tree of nested elements, which makes content easy to categorize and retrieve.
  • Extensible: New elements can be added without breaking the structure.
  • Interoperable: XML works across different platforms and systems.
  • Adaptable: XML separates content and presentation as it does not dictate how data is displayed.
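
To make the hierarchical feature concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The sample fragment is invented for illustration and is simplified (it would not validate against the full TEI schema); the helper names local_name and outline are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified TEI-style fragment (not schema-valid TEI).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample Letter</title></titleStmt>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>Dear <persName>John</persName>, I write from <placeName>London</placeName>.</p>
    </body>
  </text>
</TEI>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.split("}", 1)[-1]


def outline(element, depth=0):
    """Return one indented line per element, mirroring the XML tree."""
    lines = ["  " * depth + local_name(element.tag)]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


tree_lines = outline(ET.fromstring(SAMPLE))
print("\n".join(tree_lines))
```

The printed outline shows each element indented beneath its parent, so <persName> and <placeName> appear nested inside <p>, which is nested inside <body>. This nesting is what later structured searches rely on.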

How XML Enables Intelligent Search in TEI

TEI leverages XML to enhance the searchability and retrieval of textual data. Unlike plain text, XML provides a structured way to represent a document’s content, allowing for semantic tagging, metadata inclusion, and hierarchical organization. These features enable a more advanced kind of search, which Lou Burnard calls intelligent search (8). For example, he explains, such a search distinguishes ‘London’ as the name of a place in Canada from ‘London’ as a place in England or as the surname of an author (8). By enabling nuanced searches that distinguish between different contexts and meanings, TEI helps scholars express deeper interpretive insights and supports more inclusive and accurate textual analysis across disciplines.

Structured Data for Precise Queries

One of XML’s key advantages is its ability to structure textual data hierarchically, meaning that elements such as paragraphs, sections, and even words can be explicitly marked up. This structure allows search engines and computational tools to distinguish between different types of information. For example, in a TEI-encoded document, a user could search for a specific name only within <persName> tags, filtering out irrelevant mentions of the same word in other contexts.

Example Comparison:

  • Plain text search: Searching for “John” in a plain text file returns all instances, whether it refers to a person, a location (e.g., “John Street”), or a reference in another context (e.g., “Dear John”).
  • TEI-based XML search: Searching for <persName> containing “John” ensures that only named persons matching “John” appear in the results, excluding unrelated mentions.
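
The comparison above can be sketched in Python with the standard library's xml.etree.ElementTree. The sample sentences are invented for illustration, and the variable names (plain_hits, person_hits) are hypothetical; treat this as one possible approach rather than a standard TEI workflow.

```python
import re
import xml.etree.ElementTree as ET

# Prefix-to-namespace map so paths can refer to TEI elements as tei:...
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical sample text, invented for this example.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <p>Dear <persName>John</persName>, meet me on <placeName>John Street</placeName>.</p>
    <p><persName>Mary Shelley</persName> sent a Dear John letter.</p>
  </body></text>
</TEI>"""

root = ET.fromstring(SAMPLE)

# Plain-text search: flatten the document and match the word anywhere.
plain_text = "".join(root.itertext())
plain_hits = re.findall(r"\bJohn\b", plain_text)

# Structured search: look only inside <persName> elements.
person_hits = [
    "".join(el.itertext())
    for el in root.findall(".//tei:persName", TEI_NS)
    if "John" in "".join(el.itertext())
]

print(len(plain_hits))  # 3: the person, the street, and "Dear John letter"
print(person_hits)      # ['John']
```

The plain-text search cannot tell the three uses apart, while the structured search returns only the occurrence an encoder marked as a personal name. The result is only as good as the markup: an untagged name would be missed.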

Semantic Markup for Meaning-Based Search

XML allows TEI users to apply semantic tagging, which means encoding texts with meaningful labels rather than relying on basic string-matching techniques. This capability enhances context-aware searches, enabling researchers to retrieve results based on concepts rather than mere keywords.

Example Comparison:

  • Plain text search: Searching for “queen” will return results including “queen bee,” “Queen Elizabeth,” and “queen-sized bed.”
  • TEI-based XML search: Searching within <roleName> returns only uses of “queen” as a title, as in “Queen Elizabeth” or “Queen Victoria,” filtering out unrelated uses.

Metadata-Driven Search Capabilities

Metadata is a key part of XML-based TEI because it provides useful details about a text, such as the author’s name, the publication date, and the historical context, enabling more precise filtering and organization. This structured approach helps users search and categorize texts more effectively, allowing them to retrieve relevant materials based on attributes such as date, author, or document type.

Example Comparison:

  • Plain text search: Searching for “Shakespeare” retrieves all references, whether he is mentioned as an author, a character, or a subject of discussion.
  • TEI-based XML search: A query restricted to <author> elements returns only works attributed to Shakespeare, rather than texts that merely mention him.
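
A rough sketch of this metadata filter in Python follows. The two-document corpus is invented for illustration (the second title and its author are hypothetical), the headers are simplified and not schema-valid, and the helper names make_doc, titles_by_author, and titles_mentioning are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def make_doc(title, author, body):
    """Build a tiny TEI-style document string (hypothetical sample data)."""
    return f"""<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><fileDesc><titleStmt>
    <title>{title}</title><author>{author}</author>
  </titleStmt></fileDesc></teiHeader>
  <text><body><p>{body}</p></body></text>
</TEI>"""


CORPUS = [
    make_doc("Hamlet", "William Shakespeare", "To be, or not to be."),
    make_doc("On Tragedy", "A. Critic", "Shakespeare shaped the form."),
]


def title_of(root):
    """Return the text of the title in the header's titleStmt."""
    return root.find(".//tei:titleStmt/tei:title", TEI_NS).text


def titles_by_author(docs, name):
    """Titles of documents whose header <author> contains `name`."""
    results = []
    for doc in docs:
        root = ET.fromstring(doc)
        authors = root.findall(
            "./tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:author", TEI_NS
        )
        if any(name in (a.text or "") for a in authors):
            results.append(title_of(root))
    return results


def titles_mentioning(docs, name):
    """Titles of documents where `name` appears anywhere in the text."""
    roots = [ET.fromstring(doc) for doc in docs]
    return [title_of(r) for r in roots if name in "".join(r.itertext())]


by_shakespeare = titles_by_author(CORPUS, "Shakespeare")
mentions = titles_mentioning(CORPUS, "Shakespeare")
print(by_shakespeare)  # ['Hamlet']
print(mentions)        # ['Hamlet', 'On Tragedy']
```

The author filter reads only the header, so the essay that merely discusses Shakespeare drops out of the results.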

XPath and XQuery for Advanced Search

TEI-encoded XML documents benefit from powerful query languages such as XPath and XQuery, which are designed for searching and extracting information from XML documents. In TEI-encoded texts, they facilitate fine-grained search and text analysis.

XPath is a language used to navigate the structure of an XML document and retrieve specific elements based on their position within the hierarchy. It enables searches based on hierarchical relationships within the text, allowing users to define paths to find elements nested within specific sections or structures.

Example of XPath in TEI: Find all <title> elements, regardless of their depth within <teiHeader>, ensuring precise retrieval of document titles.
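
In full XPath, this search could be written as //tei:teiHeader//tei:title, assuming the prefix tei is bound to the TEI namespace. The sketch below runs an equivalent path in Python; note that xml.etree.ElementTree supports only a limited subset of XPath, and the sample document is an invented, simplified fragment.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical, simplified fragment (not schema-valid TEI).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Letters: A Digital Edition</title></titleStmt>
      <sourceDesc><bibl><title>Collected Letters</title></bibl></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text><body>
    <p>She had just read <title>Frankenstein</title>.</p>
  </body></text>
</TEI>"""

root = ET.fromstring(SAMPLE)

# Titles at any depth, but only inside <teiHeader>.
header_titles = [
    t.text for t in root.findall(".//tei:teiHeader//tei:title", TEI_NS)
]

# For contrast: every <title> in the whole document.
all_titles = [t.text for t in root.findall(".//tei:title", TEI_NS)]

print(header_titles)    # ['Letters: A Digital Edition', 'Collected Letters']
print(len(all_titles))  # 3
```

The double slash matches descendants at any depth, so both the edition title and the source title are found, while the title mentioned in the body is excluded by the teiHeader step.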

XQuery builds upon XPath and provides additional functionalities for filtering, transforming, and structuring results from XML documents. It enables complex text retrieval operations, such as finding all instances of a term appearing within footnotes but not in the main text.

Example of XQuery in TEI: Return the <p> elements in the text body, but only from documents whose header lists “Shakespeare” as the author.

How XQuery Extends XPath:

  • XPath is mainly used to locate and select XML nodes.
  • XQuery retrieves, filters, transforms, and organizes data into structured outputs.

By using XPath and XQuery, TEI-encoded XML documents become highly searchable and adaptable, allowing users to query texts based on structure, not just keywords.

Example Comparison:

  • Plain text search: A search for “revolution” returns all results, regardless of where it appears in the document.
  • TEI-based XML search with XPath/XQuery: A user can query for “revolution” only within <note> elements, retrieving only notes (such as footnotes) that discuss the term.
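
In XPath, the note-only search could be expressed as //tei:note[contains(., 'revolution')], assuming the prefix tei is bound to the TEI namespace. Python's xml.etree.ElementTree does not support the contains() function, so the sketch below selects the <note> elements with a path and filters their text in Python. The sample paragraphs are invented for illustration.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical sample text, invented for this example.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <p>The revolution began in spring.<note place="foot">On the term revolution, see the preface.</note></p>
    <p>Trade resumed within a year.<note place="foot">Prices are given in francs.</note></p>
  </body></text>
</TEI>"""

root = ET.fromstring(SAMPLE)


def text_of(element):
    """Concatenate all text inside an element, including nested elements."""
    return "".join(element.itertext())


# Plain search: count the term anywhere in the flattened document.
plain_count = text_of(root).lower().count("revolution")

# Structured search: keep only <note> elements that contain the term.
notes_with_term = [
    text_of(note)
    for note in root.findall(".//tei:note", TEI_NS)
    if "revolution" in text_of(note).lower()
]

print(plain_count)      # 2
print(notes_with_term)  # ['On the term revolution, see the preface.']
```

A dedicated XPath or XQuery processor would let the whole condition live in the query itself; the two-step version here is a workaround for the standard library's limited path support.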

Machine-Readable Texts

Because XML is machine-readable, TEI-encoded texts can be indexed and processed by search engines in ways that plain text cannot. Instead of searching blindly through unstructured text, indexing mechanisms can categorize content by author, title, date, genre, or thematic elements. This supports filtered and ranked results and makes cross-corpus searching more effective: users can analyze large text collections with structured queries that distinguish between types of content.
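
As a rough illustration of such indexing, the sketch below builds a small in-memory index from header metadata. The document identifiers, the simplified header layout, and the function name build_index are assumptions made for this example; real search platforms index TEI in their own ways.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def make_doc(title, author, year):
    """Build a simplified TEI-style header (not schema-valid TEI)."""
    return f"""<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><fileDesc>
    <titleStmt><title>{title}</title><author>{author}</author></titleStmt>
    <sourceDesc><bibl><date when="{year}">{year}</date></bibl></sourceDesc>
  </fileDesc></teiHeader>
</TEI>"""


# Hypothetical corpus keyed by made-up document identifiers.
CORPUS = {
    "frankenstein": make_doc("Frankenstein", "Mary Shelley", "1818"),
    "last_man": make_doc("The Last Man", "Mary Shelley", "1826"),
    "hamlet": make_doc("Hamlet", "William Shakespeare", "1603"),
}


def build_index(docs):
    """Map (field, value) pairs to the set of document ids that carry them."""
    index = defaultdict(set)
    for doc_id, xml_text in docs.items():
        root = ET.fromstring(xml_text)
        for author in root.findall(".//tei:titleStmt/tei:author", TEI_NS):
            index[("author", author.text)].add(doc_id)
        for date in root.findall(".//tei:sourceDesc//tei:date", TEI_NS):
            index[("year", date.get("when"))].add(doc_id)
    return index


index = build_index(CORPUS)

# Combine two metadata filters by intersecting their result sets.
shelley_1826 = index[("author", "Mary Shelley")] & index[("year", "1826")]
print(sorted(index[("author", "Mary Shelley")]))  # ['frankenstein', 'last_man']
print(shelley_1826)                               # {'last_man'}
```

Because each entry records which field a value came from, a query can combine filters across the whole collection without rereading the texts.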


Work Cited

  • Burnard, Lou. What Is the Text Encoding Initiative? OpenEdition Press, 2014, https://doi.org/10.4000/books.oep.426.

Suggested Readings

  • TEI Consortium. A Gentle Introduction to XML. TEI: Guidelines for Electronic Text Encoding and Interchange, TEI Consortium, 2025, tei-c.org/release/doc/tei-p5-doc/en/html/SG.html. Accessed February 20, 2025.
  • Birnbaum, David J. What is XML and Why Should Humanists Care? An Even Gentler Introduction to XML. Digital Humanities obdurodon.org, 7 Dec. 2024, dh.obdurodon.org/what-is-xml.xhtml. Accessed February 20, 2025.
  • A shamelessly short intro to XML for DH beginners (includes TEI). LaTeX Ninja’ing and the Digital Humanities, 2 Feb. 2022, latex-ninja.com/2022/02/02/a-shamelessly-short-intro-to-xml-for-dh-beginners-includes-tei/. Accessed February 20, 2025.
  • Hawkins, Kevin S. “Introduction to XML for Text.” ultraslavonic.info, 31 Oct. 2019, ultraslavonic.info/intro-to-xml/. Accessed February 20, 2025.

© 2025 Disruptive TEI