A Python package for converting TEI XML collations to NEXUS and other formats.
Textual scholars have been using phylogenetics to analyze manuscript traditions since the early 1990s.
Many standard phylogenetic software packages accept as input the NEXUS file format.
The teiphy program takes a collation of texts encoded using the Text Encoding Initiative (TEI) guidelines
and converts it to NEXUS format so that it can be used for phylogenetic analysis.
It can also convert to other formats, including Hennig86 (for TNT), PHYLIP (for RAxML), FASTA, and the XML format used by BEAST 2.7.
(Note: Some features that teiphy includes in BEAST 2 outputs will not work correctly with versions of BEAST 2 prior to v2.7.7.
Other features depend on the BEASTLabs, BEAST_CLASSIC, and BDSKY packages.
Make sure you have a recent version of BEAST 2 and these packages installed!)
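To make the conversion described above more concrete, here is a minimal sketch of the general idea: each variation unit in a collation becomes one character (column), and each reading becomes a state. This is not teiphy's implementation or its exact output layout. The sample collation, the function names `reading_matrix` and `to_nexus`, and the choice of numbering readings in document order are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# TEI namespace used by <app> and <rdg> elements.
NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical two-unit collation of three witnesses (A, B, C).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <app>
      <rdg wit="A B">in the beginning</rdg>
      <rdg wit="C">at the beginning</rdg>
    </app>
    <app>
      <rdg wit="A">word</rdg>
      <rdg wit="B C">speech</rdg>
    </app>
  </body></text>
</TEI>"""


def reading_matrix(tei_xml):
    """Map each witness siglum to a string with one state symbol per <app>."""
    root = ET.fromstring(tei_xml)
    apps = root.findall(".//tei:app", NS)
    matrix = {}
    for unit_index, app in enumerate(apps):
        for state, rdg in enumerate(app.findall("tei:rdg", NS)):
            for siglum in rdg.get("wit", "").split():
                # Witnesses absent from a unit keep the missing symbol "?".
                row = matrix.setdefault(siglum.lstrip("#"), ["?"] * len(apps))
                row[unit_index] = str(state)
    return {witness: "".join(states) for witness, states in matrix.items()}


def to_nexus(matrix):
    """Serialize the witness-by-unit matrix as a minimal NEXUS DATA block."""
    nchar = len(next(iter(matrix.values())))
    symbols = "".join(sorted({s for row in matrix.values() for s in row} - {"?"}))
    lines = [
        "#NEXUS",
        "BEGIN DATA;",
        f"  DIMENSIONS NTAX={len(matrix)} NCHAR={nchar};",
        f'  FORMAT DATATYPE=STANDARD MISSING=? SYMBOLS="{symbols}";',
        "  MATRIX",
    ]
    lines += [f"  {witness} {row}" for witness, row in matrix.items()]
    lines += ["  ;", "END;"]
    return "\n".join(lines)


matrix = reading_matrix(SAMPLE)
nexus_text = to_nexus(matrix)
```

In this toy example, witnesses A and B share the first reading of the first unit, so they receive the same state in the first column. A real collation carries much more information than this sketch handles.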
The software can be installed using pip:
pip install teiphy
Alternatively, you can install the package by cloning this repository and installing it with poetry:
git clone https://github.com/jjmccollum/teiphy.git
cd teiphy
pip install poetry
poetry install
Once the package is installed, you can run all unit tests via the command:
poetry run pytest
To use the software, run the teiphy command line tool:
teiphy <input TEI XML> <output file>
teiphy can export to NEXUS, Hennig86 (TNT), PHYLIP (in the relaxed form used by RAxML), FASTA, BEAST 2.7 XML, CSV, TSV, Excel, and stemma formats.
teiphy will try to infer the file format to export to from the extension of the output file. Accepted file extensions are:
".nex", ".nexus", ".nxs", ".ph", ".phy", ".fa", ".fasta", ".xml", ".tnt", ".csv", ".tsv", ".xlsx".
To specify the export format explicitly, use the --format option. For example:
teiphy <input TEI XML> <output file> --format nexus
For more information about the other options, see the help with:
teiphy --help
Alternatively, see the documentation for explanations of advanced usage.
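The extension-based format inference described above can be pictured as a simple lookup table. The sketch below is an illustration only and is not taken from teiphy's source. The function name `infer_format`, the case-insensitive matching, and every format label other than `nexus` are assumptions. The extensions themselves are the accepted ones listed above. No extension is listed for the stemma format, so this sketch assumes it has to be requested with --format.

```python
from pathlib import Path

# Accepted extensions (from the list above) mapped to illustrative format labels.
# Only "nexus" is a label confirmed by the usage example; the rest are hypothetical.
FORMAT_BY_EXTENSION = {
    ".nex": "nexus",
    ".nexus": "nexus",
    ".nxs": "nexus",
    ".ph": "phylip",
    ".phy": "phylip",
    ".fa": "fasta",
    ".fasta": "fasta",
    ".xml": "beast",
    ".tnt": "hennig86",
    ".csv": "csv",
    ".tsv": "tsv",
    ".xlsx": "excel",
}


def infer_format(output_path):
    """Return the format label implied by the output file's extension."""
    suffix = Path(output_path).suffix.lower()  # case-insensitivity is an assumption
    if suffix not in FORMAT_BY_EXTENSION:
        raise ValueError(
            f"Cannot infer a format from {suffix!r}; specify one with --format."
        )
    return FORMAT_BY_EXTENSION[suffix]
```

When the extension is ambiguous or missing, passing --format explicitly avoids relying on inference at all.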
The software can also be used in Python directly. See API Reference in the documentation for more information.
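The Python API itself is described in the API Reference and is not reproduced here. As a sketch that avoids assuming any of its names, the command line tool can also be driven from a Python script with the standard library's `subprocess` module. The helper names `build_teiphy_command` and `convert` are hypothetical. Only the positional arguments and the --format option shown above are used.

```python
import subprocess


def build_teiphy_command(input_xml, output_file, fmt=None):
    """Build the argument list for the teiphy command line tool."""
    command = ["teiphy", str(input_xml), str(output_file)]
    if fmt is not None:
        # Explicit format; otherwise teiphy infers it from the output extension.
        command += ["--format", fmt]
    return command


def convert(input_xml, output_file, fmt=None):
    """Run teiphy and raise CalledProcessError if it exits with an error."""
    return subprocess.run(
        build_teiphy_command(input_xml, output_file, fmt), check=True
    )
```

For example, `convert("collation.xml", "collation.nex", fmt="nexus")` would run the same command as the --format example above, with hypothetical file names.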
teiphy was designed by Joey McCollum (Australian Catholic University) and Robert Turnbull (University of Melbourne).
We received additional help from Stephen C. Carlson (Australian Catholic University).
If you use this software, please cite the paper: Joey McCollum and Robert Turnbull, "teiphy: A Python Package for Converting TEI XML Collations to NEXUS and Other Formats," JOSS 7.80 (2022): 4879, DOI: 10.21105/joss.04879.
@article{MT2022,
author = {Joey McCollum and Robert Turnbull},
title = {{teiphy: A Python Package for Converting TEI XML Collations to NEXUS and Other Formats}},
journal = {Journal of Open Source Software},
year = {2022},
volume = {7},
number = {80},
pages = {4879},
publisher = {The Open Journal},
doi = {10.21105/joss.04879},
url = {https://doi.org/10.21105/joss.04879}
}
Further details on the capabilities of teiphy, particularly in terms of the text-critically valuable features it can map from TEI XML collations to BEAST 2 inputs, are discussed in Joey McCollum and Robert Turnbull, "Using Bayesian Phylogenetics to Infer Manuscript Transmission History," DSH 39.1 (2024): 258–279, DOI: 10.1093/llc/fqad089.
@article{MT2024,
author = {Joey McCollum and Robert Turnbull},
title = {{Using Bayesian Phylogenetics to Infer Manuscript Transmission History}},
journal = {Digital Scholarship in the Humanities},
year = {2024},
volume = {39},
number = {1},
pages = {258--279},
doi = {10.1093/llc/fqad089},
url = {https://doi.org/10.1093/llc/fqad089}
}