Since the publication of the first "Internet edition" by the Herzog August Bibliothek, the field of digital editing has evolved from its experimental beginnings into an established discipline. On the one hand, there is the increasing establishment of standards, particularly the TEI guidelines and regulations for research data and publication licenses. On the other hand, the dynamic technological development of tools and infrastructures presents digital editions with ever-new opportunities and challenges. Especially in publicly funded long-term projects, this tension often leads to particular difficulties in implementing and conceiving an editing practice that simultaneously observes digital standards, incorporates and anticipates innovations, and ensures the sustainability of the generated data and workflows. This fundamental tension, which will always remain inherent in digital editing because technological change is intertwined with the generation of long-term basic research, must be addressed at the level of editorial workflows.
This paper therefore presents four principles that should underlie the careful planning of such workflows, derived from the so-called Unix philosophy (programs should do one task well, work together, and process text streams as a universal interface). Translated to digital editing, these become the four principles of modularization, interoperability, redundancy and explicitness, and chaining potential. They are illustrated by a sample workflow subsequently developed for the digital edition of the Decretum Burchardi, a hybrid edition with a particular focus on the practices of text editing (conversions, emendations, interventions, etc.) attested in a group of five manuscripts. The goal is a digital edition with synoptic viewing options, text/image linking, and a derivative for print publication in the MGH as a ‘classical edition’.1 The aforementioned principles were integrated into the project's edition workflow as follows:
Modularization: Given the expected technological change, each workflow component was integrated in a way that allows functionally equivalent solutions to replace it easily.
Interoperability: When selecting the methods, care was taken to ensure that they generate or consume open text streams or text-based files (TEI XML, PageXML, JSON). This facilitates processing at various stages of the workflow (or in external contexts) and enables the interchangeability of the aforementioned modules.
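The idea of text streams as a universal interface can be sketched as follows. This is a minimal illustration, not one of the project's actual scripts; the sample PageXML fragment and file name are invented. It shows how the textual content of a PageXML export can be reduced to a plain list of lines that any downstream module can consume, which is what makes the modules interchangeable:

```python
# Minimal sketch: extract the plain-text stream from a PageXML fragment,
# so that any module accepting text lines can follow in the chain.
import xml.etree.ElementTree as ET

# Namespace of the PAGE content schema (2013-07-15 version).
PAGE_NS = "{http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15}"

# Invented sample fragment in the shape Transkribus exports (PcGts > Page >
# TextRegion > TextLine > TextEquiv > Unicode).
SAMPLE_PAGEXML = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="fol_001r.jpg">
    <TextRegion id="r1">
      <TextLine id="l1"><TextEquiv><Unicode>Incipit liber primus</Unicode></TextEquiv></TextLine>
      <TextLine id="l2"><TextEquiv><Unicode>de potestate ecclē</Unicode></TextEquiv></TextLine>
    </TextRegion>
  </Page>
</PcGts>"""

def pagexml_to_lines(xml_string: str) -> list[str]:
    """Return the Unicode text of every TextLine, in document order."""
    root = ET.fromstring(xml_string)
    return [
        unicode_el.text or ""
        for line in root.iter(f"{PAGE_NS}TextLine")
        for unicode_el in line.iter(f"{PAGE_NS}Unicode")
    ]

lines = pagexml_to_lines(SAMPLE_PAGEXML)
print(lines)  # ['Incipit liber primus', 'de potestate ecclē']
```

Because the output is an ordinary list of strings, the next module in the chain only needs to accept text lines; it does not need to know anything about PageXML.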
Redundancy and explicitness: In keeping with good scientific practice, individual work steps are explicitly represented in the data. For instance, special characters and abbreviations are preserved by the HTR model and only normalized later, when the PageXML is processed and transferred to TEI. This redundancy and explicitness of the data not only increase their openness to other contexts and thus their sustainability, but also guarantee editorial transparency.
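The principle can be illustrated with a small sketch, assuming a glyph-to-expansion table (the table below is purely illustrative, not the project's). The diplomatic form produced by HTR is kept alongside its normalization, here encoded as a TEI `<choice>` element, instead of being silently overwritten:

```python
# Hypothetical sketch of explicit redundancy: the diplomatic reading from the
# HTR output is preserved next to its normalized expansion in a TEI <choice>.
ABBREVIATIONS = {   # illustrative expansions only, not an editorial claim
    "ē": "em",      # e with macron, a common nasal abbreviation
    "ꝑ": "per",     # p with stroke through descender
    "ꝙ": "quod",    # q with diagonal stroke
}

def normalize(token: str) -> str:
    """Expand every known abbreviation glyph in a token."""
    out = token
    for glyph, expansion in ABBREVIATIONS.items():
        out = out.replace(glyph, expansion)
    return out

def to_tei_choice(token: str) -> str:
    """Emit a TEI <choice> keeping both forms; pass unchanged tokens through."""
    expanded = normalize(token)
    if expanded == token:
        return token
    return f"<choice><abbr>{token}</abbr><expan>{expanded}</expan></choice>"

print(to_tei_choice("tēpus"))
# <choice><abbr>tēpus</abbr><expan>tempus</expan></choice>
print(to_tei_choice("liber"))
# liber
```

Because both readings survive in the TEI, the normalization step remains inspectable and reversible, which is exactly the transparency the principle demands.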
Chaining potential: Solutions with high chaining potential were chosen for all these steps. The forwarding within the chain is accomplished by Python scripts that transmit, manipulate, and post-process the generated data. In this way, for example, observations recorded in Transkribus can be automatically generated as annotations across the different links of the chain and displayed in Mirador.
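The last link of such a chain, turning a recorded observation into an annotation Mirador can display, can be sketched with the W3C Web Annotation model that IIIF viewers consume. The canvas URI, coordinates, and note below are invented placeholders, not the project's data:

```python
# Sketch: wrap an editorial observation as a W3C Web Annotation targeting an
# image region (media-fragment #xywh selector), the format Mirador displays.
import json

def observation_to_annotation(canvas_uri: str, xywh: tuple[int, int, int, int],
                              note: str, anno_id: str) -> dict:
    """Build a Web Annotation commenting on a rectangular canvas region."""
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        "body": {"type": "TextualBody", "value": note, "format": "text/plain"},
        "target": f"{canvas_uri}#xywh={x},{y},{w},{h}",
    }

# Hypothetical example values:
anno = observation_to_annotation(
    "https://example.org/iiif/burchard/canvas/fol-12r",
    (150, 420, 800, 60),
    "Marginal correction by a second hand",
    "https://example.org/anno/1",
)
print(json.dumps(anno, indent=2))
```

Since the result is plain JSON, the same annotation can be stored in eXist-db, served from an annotation list, or loaded into Mirador without any further transformation.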
Following these principles, a multi-stage and semi-automated workflow was developed, consisting of (1) automated transcription in Transkribus, (2) post-processing, (3) framework-based TEI encoding in OxygenXML, (4) storage and retrieval in eXist-db, (5) collation by CollateX, (6) preparation for digital viewing by Webstack, and (7) display and annotation of the manuscripts via IIIF in Mirador. The goal is a semi-automated pipeline of editorial practice from the transcription of the manuscript in Transkribus to the editorially prepared view on the web and in Mirador, which can be used as a blueprint for further edition projects.
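The overall architecture can be condensed into a small sketch that combines modularity and chaining: every stage is a callable that consumes and produces a plain data structure, so any stage can be swapped for a functionally equivalent tool. The stage functions below are toy stand-ins for the real tools (Transkribus, CollateX, etc.), with invented names and data:

```python
# Sketch of the pipeline idea: stages are interchangeable callables composed
# in sequence, passing plain data (text lines, TEI strings, JSON) along.
from typing import Callable

Stage = Callable[[object], object]

def run_pipeline(data: object, stages: list[Stage]) -> object:
    """Feed the output of each stage into the next one."""
    for stage in stages:
        data = stage(data)
    return data

# Toy stand-ins for the real tools; names and behavior are purely illustrative.
def transcribe(image_id):           # stands in for HTR in Transkribus
    return [f" line from {image_id} "]

def postprocess(lines):             # stands in for the post-processing scripts
    return [line.strip() for line in lines]

def encode_tei(lines):              # stands in for framework-based TEI encoding
    return "<p>" + " ".join(lines) + "</p>"

result = run_pipeline("fol_001r", [transcribe, postprocess, encode_tei])
print(result)  # <p>line from fol_001r</p>
```

Replacing, say, the transcription stage with another HTR engine only requires supplying a different callable with the same text-stream output, which is the blueprint quality the paper aims for.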