- Departments of Statistics and of Botany, University of Wisconsin - Madison, Madison, United States of America
- Bioinformatics & Computational Biology, Hybridization / Introgression, Macroevolution, Phylogenetics / Phylogenomics
Relative time constraints improve molecular dating
Dating with constraints
Estimating the absolute age of diversification events is challenging, because molecular sequences provide timing information in units of substitutions, not years. Additionally, the rate of molecular evolution (in substitutions per year) can vary widely across lineages. Accurate dating of speciation events traditionally relies on non-molecular data. For very fast-evolving organisms such as SARS-CoV-2, for which samples are obtained over a time span, the collection times provide this external information from which we can learn the rate of molecular evolution and date past events (Boni et al. 2020). In groups for which the fossil record is abundant, state-of-the-art dating methods use fossil information to complement molecular data, either in the form of a prior distribution on node ages (Nguyen & Ho 2020), or as data modelled with a fossilization process (Heath et al. 2014).
Dating is a challenge in groups that lack fossils or other geological evidence, such as very old lineages and microbial lineages. In these groups, horizontal gene transfer (HGT) events have been identified as informative about relative dates: the ancestor of the gene's donor must be older than the descendants of the gene's recipient. Previous work using HGTs to date phylogenies has relied on methodologies that are ad hoc (Davín et al. 2018) or that employ only a small number of HGTs (Magnabosco et al. 2018, Wolfe & Fournier 2018).
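To make the relative-date argument concrete, here is a minimal Python sketch of how an HGT can be encoded as an "older than" constraint between two nodes and checked against a set of candidate node ages. The node names, the ages, and the helper `violated_constraints` are hypothetical and chosen for illustration only; they do not come from any of the cited studies.

```python
from typing import Dict, List, Tuple

# A relative constraint is a pair (older_node, younger_node): the first node
# must be strictly older than the second. For an HGT, the older node is the
# ancestor of the gene's donor and the younger node is a descendant of the
# gene's recipient.
Constraint = Tuple[str, str]


def violated_constraints(
    node_ages: Dict[str, float], constraints: List[Constraint]
) -> List[Constraint]:
    """Return the constraints that the candidate node ages do not satisfy."""
    return [
        (older, younger)
        for older, younger in constraints
        if not node_ages[older] > node_ages[younger]
    ]


# Hypothetical node ages, in arbitrary time units before the present.
ages = {"donor_ancestor": 2.1, "recipient_descendant": 1.4}
hgt_constraints = [("donor_ancestor", "recipient_descendant")]

# Empty list: these ages are compatible with the transfer.
compatible = violated_constraints(ages, hgt_constraints)

# Swapping the two ages makes the transfer impossible, so the constraint is reported.
swapped = {"donor_ancestor": 1.4, "recipient_descendant": 2.1}
incompatible = violated_constraints(swapped, hgt_constraints)
```

A constraint of this kind carries no absolute date. It only rules out age assignments in which the recipient's descendants predate the donor's ancestor.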
Szöllősi et al. (2021) present and validate a Bayesian approach to estimate the age of diversification events based on relative information on these ages, such as that implied by HGTs. This approach is flexible because it is modular: constraints on relative node ages can be combined with absolute age information from fossil data, and with any substitution model of molecular evolution, including complex state-of-the-art models. To ease the computational burden, the authors also introduce a two-step approach, in which the complexity of estimating branch lengths in substitutions per site is decoupled from the complexity of timing the tree with branch lengths in years, while accounting for uncertainty in the first step. The method currently has two limitations: the tree topology needs to be known, and the constraints need to be certain. Users of this method should be mindful of the latter when hundreds of constraints are used, as done by Szöllősi et al. (2021) to date the trees of Cyanobacteria and Archaea.
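As a toy illustration of how a relative constraint can inform node ages within a Bayesian framework, the Python sketch below treats the constraint as a hard condition on the prior: age assignments that violate it are given zero prior probability. Everything here is an assumption made for illustration, including the independent uniform priors, the node names "A" and "B", and the rejection sampler itself. It is not the RevBayes implementation, which uses MCMC on a full tree model.

```python
import random
from typing import Dict, List, Tuple


def sample_constrained_ages(
    n_samples: int,
    node_names: List[str],
    constraints: List[Tuple[str, str]],
    max_age: float = 1.0,
    seed: int = 0,
) -> List[Dict[str, float]]:
    """Draw node ages from independent Uniform(0, max_age) priors and keep
    only the draws that satisfy every (older, younger) constraint."""
    rng = random.Random(seed)
    accepted = []
    while len(accepted) < n_samples:
        ages = {name: rng.uniform(0.0, max_age) for name in node_names}
        # Hard constraint: reject any draw in which an "older" node is not
        # strictly older than its "younger" partner.
        if all(ages[older] > ages[younger] for older, younger in constraints):
            accepted.append(ages)
    return accepted


# Toy setting: two nodes with identical priors and one constraint, A older than B.
samples = sample_constrained_ages(2000, ["A", "B"], [("A", "B")])
mean_a = sum(s["A"] for s in samples) / len(samples)
mean_b = sum(s["B"] for s in samples) / len(samples)
```

Without the constraint, both nodes have a prior mean age of 0.5. Conditioning on "A older than B" shifts the expected ages to 2/3 and 1/3, which shows how purely relative information can sharpen age estimates. Because the constraint is treated as certain, a single erroneous constraint would exclude the true ages entirely, which is why caution is warranted when hundreds of constraints are combined.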
The method of Szöllősi et al. (2021) is implemented in RevBayes, a highly modular platform for phylogenetic inference that is rapidly growing in popularity (Höhna et al. 2016). The RevBayes tutorial page features a step-by-step tutorial, "Dating with Relative Constraints", which makes the method highly approachable.
Boni MF, Lemey P, Jiang X, Lam TT-Y, Perry BW, Castoe TA, Rambaut A, Robertson DL (2020) Evolutionary origins of the SARS-CoV-2 sarbecovirus lineage responsible for the COVID-19 pandemic. Nature Microbiology, 5, 1408–1417. https://doi.org/10.1038/s41564-020-0771-4
Davín AA, Tannier E, Williams TA, Boussau B, Daubin V, Szöllősi GJ (2018) Gene transfers can date the tree of life. Nature Ecology & Evolution, 2, 904–909. https://doi.org/10.1038/s41559-018-0525-3
Heath TA, Huelsenbeck JP, Stadler T (2014) The fossilized birth–death process for coherent calibration of divergence-time estimates. Proceedings of the National Academy of Sciences, 111, E2957–E2966. https://doi.org/10.1073/pnas.1319091111
Höhna S, Landis MJ, Heath TA, Boussau B, Lartillot N, Moore BR, Huelsenbeck JP, Ronquist F (2016) RevBayes: Bayesian Phylogenetic Inference Using Graphical Models and an Interactive Model-Specification Language. Systematic Biology, 65, 726–736. https://doi.org/10.1093/sysbio/syw021
Magnabosco C, Moore KR, Wolfe JM, Fournier GP (2018) Dating phototrophic microbial lineages with reticulate gene histories. Geobiology, 16, 179–189. https://doi.org/10.1111/gbi.12273
Nguyen JMT, Ho SYW (2020) Calibrations from the Fossil Record. In: The Molecular Evolutionary Clock: Theory and Practice (ed Ho SYW), pp. 117–133. Springer International Publishing, Cham. https://doi.org/10.1007/978-3-030-60181-2_8
Szöllősi GJ, Höhna S, Williams TA, Schrempf D, Daubin V, Boussau B (2021) Relative time constraints improve molecular dating. bioRxiv, 2020.10.17.343889, ver. 8 recommended and peer-reviewed by Peer Community in Evolutionary Biology. https://doi.org/10.1101/2020.10.17.343889
Wolfe JM, Fournier GP (2018) Horizontal gene transfer constrains the timing of methanogen evolution. Nature Ecology & Evolution, 2, 897–903. https://doi.org/10.1038/s41559-018-0513-7