This repository provides a containerized version of McmcDate, developed by Dominik Schrempf (@dschrempf), that can date a phylogenetic tree with constraints.
By using this container image, you can omit the necessity of installing Haskell, the computer language mcmc-date was written in, the Glasgow Haskell Compiler, and Cabal on your system and just run mcmc-date directly instead.
Currently an Apptainer container is provided which is supported by most HPC clusters.
If you don't have Apptainer installed on your system yet, please follow the installation instructions here. Apptainer version 1.4.2 was used for creating and testing this container.
Go to the directory with your dataset (rooted tree, treelist, calibrations and constraints, etc).
cd workdir
Let's assume that the data under workdir/ looks as the following:
- workdir/data/rooted.tree
- workdir/data/trees.list
- workdir/data/calibrations.csv
- workdir/data/constraints.csv
- workdir/analysis.conf
Clone this repository into the workdir:
git clone https://github.com/oist/mcmc-date-container.git
Create a symlink to the run and analyze scripts:
ln -s mcmc-date-container/run
ln -s mcmc-date-container/analyze
./run -a -r "$(pwd)/mcmc-date-container/mcmcdate.sif" -f analysis.conf -c -k ug f p
where
- with the
-aoption we ask forApptainerinstead of a local Haskell installation, - with
-rwe specify the absolute path to the Apptainer image, - with
-fwe specify the analysis configuration file, - with
-cwe activate calibrations, - with
-kwe activate constraints, - with
ugwe ask for the uncorrelated gamma molecular clock model (writealinstead for the autocorrelated lognormal model), - with
fwe ask for a full covariant likelihood matrix - and finally with
pwe run the preparation step ofmcmc-date.
See the usage description of run below and the McmcDate tutorial for more information.
./run -a -r "$(pwd)/mcmc-date-container/mcmcdate.sif" -f analysis.conf -c -k ug f r
where all the options are the same as above except replacing p (prepare) with r (run), thus running the tree dating analysis.
./analyze -a -r "$(pwd)/mcmc-date-container/mcmcdate.sif"
- analyze this wrapper script is a modified version of McmcDate's
analyzescript including the option (-a -r PATH) to call an Apptainer container instead of running it directly in a local Haskell environment. The script is backwards compatible with the originalanalyzescript. - run this wrapper script is a modified version of McmcDate's
runscript including the option (-a -r PATH) to call an Apptainer container instead of running it directly in a local Haskell environment. The script is backwards compatible with the originalrunscript. - mcmcdate.sif Apptainer's sif container image that runs a slim version of Debian Trixie (13.1) and contains the
mcmc-dateand its helper scripts in binary format - mcmcdate_debian_global_v2.def Apptainer container definition file to create a fully functional Haskell environment able to compile
mcmc-dateand its helper scripts - mcmcdate_debian_global_v2_multistage.def Apptainer multistage container definition file that after creating the
mcmc-datebinaries, it copies them into a fresh Debian installation getting rid of the Haskell environment and reducing final image size - example/analysis.conf Example
analysis.conffile formcmc-date. Please note, that even though calibrations and constraints have to be specified in this file, they will be only used if you activate them with the corresponding switches for therunscript (-cand-k) also. - example/calibrations.csv Example time calibrations file (the header line is mandatory). Two leaf names pinpoint their most recent common ancestor node the calibration is to be set on.
- example/constraints.csv Example relative constraints file (the header line is mandatory). Two leaf names pinpoint their most recent common ancestor node the constraint is to be set on.
Usage: run [OPTIONS] RELAXED_MOLECULAR_CLOCK_MODEL LIKELIHOOD_SPECIFICATION COMMANDS
Auxiliary data options:
-b Activate braces
-c Activate calibrations
-k Activate constraints
Algorithm related options:
-i NAME Initialize state and cycle from previous analysis with NAME
-H Activate Hamiltonian proposal (slow, but great convergence)
-m Use Mc3 algorithm insteahd of Mhg
Other options:
-f FILE Use a different analysis configuration file (relative path)
-n SUFFIX Use an analysis suffix
-p Activate profiling
-s Use Haskell stack instead of cabal-install
-a Use Apptainer image instead of cabal-install or Haskell stack
-r FILE Absolute path to the mcmcdate Apptainer SIF file, if not set, by default: <script's directory>/mcmcdate.sif is used
Relaxed molecular clock model:
ug Uncorrelated gamma model
ul Uncorrelated log normal model
uw Uncorrelated white noise model
al Autocorrelated log normal model
Likelihood specification:
f Full covariance matrix
s Sparse covariance matrix
u Univariate approach
n No likelihood; use prior and auxiliary data only
Available commands:
p Prepare analysis
r Run dating analysis
c Continue dating analysis
m Compute marginal likelihood
A configuration file "analysis.conf" is required.
For reference, see the sample configuration file.
Usage: analyze [options]
Options:
-h Display help and exit
-s Skip files with previous analysis
-a Use Apptainer image instead of cabal-install or Haskell stack
-r FILE Absolute path to the mcmcdate Apptainer SIF file, if not set, by default: <script's directory>/mcmcdate.sif is used
For development:
APPTAINER_TMPDIR="/your/local/tmp/dir" apptainer build --sandbox mcmcdate_debian_global_v2 mcmcdate_debian_global_v2.def
Now you can enter the image and modify it to your needs via:
apptainer shell --writable mcmcdate_debian_global_v2
For distribution use the multistage build that removes Cabal, GHC, etc and only keeps the mcmc-date binaries and their library dependencies:
APPTAINER_TMPDIR="/your/local/tmp/dir" apptainer build mcmcdate.sif mcmcdate_debian_global_v2_multistage.def