Latest recommendations

19 Jul 2021
Host phenology can drive the evolution of intermediate virulence strategies in some obligate-killer parasites

Modelling parasitoid virulence evolution with seasonality

Recommended by Samuel Alizon based on reviews by Alex Best and 2 anonymous reviewers

The harm most parasites cause to their host, i.e. the virulence, is a mystery because host death often means the end of the infectious period. For obligate killer parasites, or “parasitoids”, that need to kill their host to transmit to other hosts, the question is reversed. Indeed, more rapid host death means shorter generation intervals between two infections, and mathematical models show that, in the simplest settings, natural selection should always favour more virulent strains (Levin and Lenski, 1983). Adding biological details to the model modifies this conclusion: for instance, if the relationship between the infection duration and the number of parasite transmission stages produced in a host is non-linear, strains with intermediate levels of virulence can be favoured (Ebert and Weisser 1997). Other factors, such as spatial structure, could yield similar effects (Lion and van Baalen, 2008).

In their study, MacDonald et al. (2021) explore another type of constraint, which is seasonality. Earlier studies, such as that by Donnelly et al. (2013), showed that this constraint can affect virulence evolution, but they focused on directly transmitted parasites. Using a mathematical model capturing the dynamics of a parasitoid, MacDonald et al. (2021) show that if two main assumptions are met, namely that at the end of the season only transmission stages (or “propagules”) survive and that there is a constant decay of these propagules with time, then strains with intermediate levels of virulence are favoured.

Practically, the authors use delay differential equations and an adaptive dynamics approach to identify evolutionarily stable strategies. As expected, the shorter the season length, the higher the virulence (because propagule decay matters less). The authors also identify a non-linear relationship between the variation in host development time and virulence. Generally, the larger the variation, the higher the virulence, because the parasitoid has to kill its host before the end of the season. However, if the variation is too wide, some hosts become physically impossible for the parasite to use, whence a decrease in virulence.

Finally, MacDonald et al. (2021) show that the consequence of adding trade-offs between infection duration and the number of propagules produced is in line with earlier studies (Ebert and Weisser 1997). These mathematical modelling results provide predictions that can be tested using well-described systems in evolutionary ecology such as Daphnia parasitoids, baculoviruses, or lytic phages.

References

Donnelly R, Best A, White A, Boots M (2013) Seasonality selects for more acutely virulent parasites when virulence is density dependent. Proc R Soc B, 280, 20122464. https://doi.org/10.1098/rspb.2012.2464

Ebert D, Weisser WW (1997) Optimal killing for obligate killers: the evolution of life histories and virulence of semelparous parasites. Proc R Soc B, 264, 985–991. https://doi.org/10.1098/rspb.1997.0136

Levin BR, Lenski RE (1983) Coevolution in bacteria and their viruses and plasmids. In: Futuyma DJ, Slatkin M eds. Coevolution. Sunderland, MA, USA: Sinauer Associates, Inc., 99–127.

Lion S, van Baalen M (2008) Self-structuring in spatial evolutionary ecology. Ecol. Lett., 11, 277–295. https://doi.org/10.1111/j.1461-0248.2007.01132.x

MacDonald H, Akçay E, Brisson D (2021) Host phenology can drive the evolution of intermediate virulence strategies in some obligate-killer parasites. bioRxiv, 2021.03.13.435259, ver. 8 peer-reviewed and recommended by Peer Community in Evolutionary Biology. https://doi.org/10.1101/2021.03.13.435259

Title: Host phenology can drive the evolution of intermediate virulence strategies in some obligate-killer parasites
Authors: Hannelore MacDonald, Erol Akçay, Dustin Brisson
Abstract: The traditional mechanistic trade-offs resulting in a negative correlation between transmission and virulence are the foundation of nearly all current theory on the evolution of parasite virulence. Several ecologica...
Thematic fields: Evolutionary Dynamics, Evolutionary Ecology, Evolutionary Epidemiology, Evolutionary Theory
Recommender: Samuel Alizon
Submission date: 2021-03-14 13:47:33
13 Apr 2023
The landscape of nucleotide diversity in Drosophila melanogaster is shaped by mutation rate variation

An unusual suspect: the mutation landscape as a determinant of local variation in nucleotide diversity

Recommended by Fernando Racimo based on reviews by David Castellano and 1 anonymous reviewer

Sometimes, important factors for explaining biological processes fall through the cracks, and it is only through careful modeling that their importance eventually comes to light. In this study, Barroso and Dutheil introduce a new method based on the sequentially Markovian coalescent (SMC, Marjoram and Wall 2006) for jointly estimating local recombination and coalescent rates along a genome. Unlike previous SMC-based methods, however, their method can also co-estimate local patterns of variation in mutation rates.

This is a powerful improvement which allows them to tackle questions about the reasons for the extensive variation in nucleotide diversity across the chromosomes of a species - a problem that has plagued the minds of population geneticists for decades (Begun and Aquadro 1992, Andolfatto 2007, McVicker et al., 2009, Pouyet and Gilbert 2021). The authors find that variation in de novo mutation rates appears to be the most important factor in determining nucleotide diversity in Drosophila melanogaster. Although this finding seemingly contradicts previous attempts at addressing this problem (Comeron 2014), they take care to investigate and explain why that might be the case.

Barroso and Dutheil have also taken care to explain the details of their new approach and have carried out a very thorough set of analyses comparing competing explanations for patterns of nucleotide variation via causal modeling. The reviewers raised several issues involving choices made by the authors in their analysis of variance partitioning, the proper evaluation of the role of linked selection and the recombination rate estimates emerging from their model. These issues have all been extensively addressed by the authors, and their conclusions seem to remain robust. The study illustrates why the mutation landscape should not be ignored as an important determinant of local variation in genetic diversity, and opens up questions about the generalizability of these results to other organisms.

REFERENCES

Andolfatto, P. (2007). Hitchhiking effects of recurrent beneficial amino acid substitutions in the Drosophila melanogaster genome. Genome research, 17(12), 1755-1762. https://doi.org/10.1101/gr.6691007

Barroso, G. V., & Dutheil, J. Y. (2021). The landscape of nucleotide diversity in Drosophila melanogaster is shaped by mutation rate variation. bioRxiv, 2021.09.16.460667, ver. 3 peer-reviewed and recommended by Peer Community in Evolutionary Biology. https://doi.org/10.1101/2021.09.16.460667

Begun, D. J., & Aquadro, C. F. (1992). Levels of naturally occurring DNA polymorphism correlate with recombination rates in D. melanogaster. Nature, 356(6369), 519-520. https://doi.org/10.1038/356519a0

Comeron, J. M. (2014). Background selection as baseline for nucleotide variation across the Drosophila genome. PLoS Genetics, 10(6), e1004434. https://doi.org/10.1371/journal.pgen.1004434

Marjoram, P., & Wall, J. D. (2006). Fast "coalescent" simulation. BMC genetics, 7, 1-9. https://doi.org/10.1186/1471-2156-7-16

McVicker, G., Gordon, D., Davis, C., & Green, P. (2009). Widespread genomic signatures of natural selection in hominid evolution. PLoS genetics, 5(5), e1000471. https://doi.org/10.1371/journal.pgen.1000471

Pouyet, F., & Gilbert, K. J. (2021). Towards an improved understanding of molecular evolution: the relative roles of selection, drift, and everything in between. Peer Community Journal, 1, e27. https://doi.org/10.24072/pcjournal.16

Title: The landscape of nucleotide diversity in Drosophila melanogaster is shaped by mutation rate variation
Authors: Gustavo V Barroso, Julien Y Dutheil
Abstract: What shapes the distribution of nucleotide diversity along the genome? Attempts to answer this question have sparked debate about the roles of neutral stochastic processes and natural selection in molecular evolutio...
Thematic fields: Bioinformatics & Computational Biology, Population Genetics / Genomics
Recommender: Fernando Racimo
Submission date: 2022-10-30 07:52:07
12 Feb 2024
How do plant RNA viruses overcome the negative effect of Muller's ratchet despite strong transmission bottlenecks?

How to survive the mutational meltdown: lessons from plant RNA viruses

Recommended by Kavita Jain based on reviews by Brent Allman, Ana Morales-Arce and 1 anonymous reviewer

Although most mutations are deleterious, the strongly deleterious ones do not spread in a very large population as their chance of fixation is very small. Another mechanism by which deleterious mutations can be eliminated is recombination or sexual reproduction. However, in a finite asexual population, the subpopulation without any deleterious mutation will eventually acquire a deleterious mutation, resulting in a reduction of the population size or, in other words, an increase in genetic drift. This, in turn, will lead the population to acquire deleterious mutations at a faster rate, eventually leading to a mutational meltdown.

This irreversible (or at least irreversible over long time scales) accumulation of deleterious mutations is especially relevant to RNA viruses due to their high mutation rate, and while prior work has dealt with bacteriophages and RNA viruses, the study by Lafforgue et al. [1] makes an interesting contribution to the existing literature by focusing on plants.

In this study, the authors ask how RNA viruses manage to survive in plants despite the repeated increase in the strength of genetic drift. Following a series of experiments and some numerical simulations, the authors find that, as expected, the fitness of the population decreases significantly after severe bottlenecks. But if the bottlenecks are followed by population expansion, Muller’s ratchet can be halted due to the genetic diversity generated during population growth. They propose this mechanism as a potential way by which RNA viruses can survive the mutational meltdown.

As a theoretician, I find this investigation quite interesting and would like to see more studies addressing, e.g., the minimum population growth rate required to counter the potential extinction for a given bottleneck size and deleterious mutation rate. Of course, it would be interesting to see in future work if the hypothesis in this article can be tested in natural populations.
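
The ratchet described in this recommendation can be illustrated with a toy simulation. The sketch below is not the simulation used by Lafforgue et al.; it assumes a Wright-Fisher asexual population with multiplicative fitness, and the selection coefficient `s`, mutation rate `u`, and population-size schedules are hypothetical values chosen only for illustration. It tracks the least-loaded class (the smallest number of deleterious mutations carried by any individual), which can never decrease in the absence of recombination or back mutation.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) variate (Knuth's method, adequate for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def ratchet(sizes, s=0.02, u=0.3, seed=0):
    """Return the least-loaded class (minimum mutation count) per generation.

    sizes: population size in each generation (hypothetical schedule).
    s: fitness cost per deleterious mutation (assumed multiplicative).
    u: mean number of new deleterious mutations per offspring (assumed).
    """
    rng = random.Random(seed)
    loads = [0] * sizes[0]  # start from a mutation-free population
    least_loaded = []
    for size in sizes:
        # Parents are sampled in proportion to fitness (1 - s) ** load.
        weights = [(1.0 - s) ** k for k in loads]
        parents = rng.choices(loads, weights=weights, k=size)
        # Offspring inherit the parental load plus new mutations.
        loads = [k + poisson(rng, u) for k in parents]
        least_loaded.append(min(loads))
    return least_loaded


# Two illustrative schedules over 90 generations.
constant = [64] * 90
cycle = [1, 2, 4, 8, 16, 32, 64, 128, 256]  # bottleneck of 1, then doubling
bottlenecked = cycle * 10

print("constant size, final least-loaded class:", ratchet(constant)[-1])
print("bottleneck cycles, final least-loaded class:", ratchet(bottlenecked)[-1])
```

Changing the expansion rate in `cycle` or the values of `s` and `u` is one way to explore, in this toy setting, how growth after a bottleneck affects the pace of the ratchet.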

References

[1] Guillaume Lafforgue, Marie Lefebvre, Thierry Michon, Santiago F. Elena (2024) How do plant RNA viruses overcome the negative effect of Muller's ratchet despite strong transmission bottlenecks? bioRxiv, ver. 3 peer-reviewed and recommended by Peer Community In Evolutionary Biology
https://doi.org/10.1101/2023.08.01.550272

Title: How do plant RNA viruses overcome the negative effect of Muller's ratchet despite strong transmission bottlenecks?
Authors: Guillaume Lafforgue, Marie Lefebvre, Thierry Michon, Santiago F. Elena
Abstract: Muller's ratchet refers to the irreversible accumulation of deleterious mutations in small populations, resulting in a decline in overall fitness. This phenomenon has been extensively observed in experiments involving microorganisms, including ...
Thematic fields: Experimental Evolution, Genome Evolution
Recommender: Kavita Jain
Submission date: 2023-08-04 09:37:08
05 Feb 2019
The quiescent X, the replicative Y and the Autosomes

Replication-independent mutations: a universal signature?

Recommended by Nicolas Galtier based on reviews by Marc Robinson-Rechavi and Robert Lanfear

Mutations are the primary source of genetic variation, and there is an obvious interest in characterizing and understanding the processes by which they appear. One particularly important question is the relative abundance, and nature, of replication-dependent and replication-independent mutations - the former arise as cells replicate due to DNA polymerization errors, whereas the latter are unrelated to the cell cycle. A recent experimental study in fission yeast identified a signature of mutations in quiescent (=non-replicating) cells: the spectrum of such mutations is characterized by an enrichment in insertions and deletions (indels) compared to point mutations, and an enrichment of deletions compared to insertions [2].
What Achaz et al. [1] report here is that the very same signature is detectable in humans. This time the approach is indirect and relies on two key aspects of mammalian reproduction biology: (1) oocytes remain quiescent over most of a female's lifespan, whereas spermatocytes keep dividing after male puberty, and (2) X chromosome, Y chromosome and autosomes spend different amounts of time in a female vs. male context. In agreement with the yeast study, Achaz et al. show that in humans the male-associated Y chromosome, for which quiescence is minimal, has by far the lowest ratios of indels to point mutations and of deletions to insertions, whereas the female-associated X chromosome has the highest. This is true both of variants that are polymorphic among humans and of fixed differences between humans and chimpanzees.
So we appear to be learning here about an important and general aspect of the mutation process. The authors suggest that, to a large extent, chromosomes tend to break in pieces at a rate that is proportional to absolute time - because indels in the quiescent stage presumably result from double-strand DNA breaks. A very recent analysis of numerous mother-father-child trios in humans confirms this prediction in demonstrating an effect of maternal age, but not of paternal age, on the recombination rate [3]. This result also has important implications with respect to the interpretation of substitution rate variation among taxa and genomic compartments, particularly mitochondrial vs. nuclear, and their relationship with the generation time and longevity of organisms (e.g. [4]).

References

[1] Achaz, G., Gangloff, S., and Arcangioli, B. (2019). The quiescent X, the replicative Y and the Autosomes. BioRxiv, 351288, ver. 3 peer-reviewed and recommended by PCI Evol Biol. doi: 10.1101/351288
[2] Gangloff, S., Achaz, G., Francesconi, S., Villain, A., Miled, S., Denis, C., and Arcangioli, B. (2017). Quiescence unveils a novel mutational force in fission yeast. eLife, 6:e27469. doi: 10.7554/eLife.27469
[3] Halldorsson, B. V., Palsson, G., Stefansson, O. A., Jonsson, H., Hardarson, M. T., Eggertsson, H. P., … Stefansson, K. (2019). Characterizing mutagenic effects of recombination through a sequence-level genetic map. Science, 363: eaau1043. doi: 10.1126/science.aau1043
[4] Saclier, N., François, C. M., Konecny-Dupré, L., Lartillot, N., Guéguen, L., Duret, L., … Lefébure, T. (2019). Life History Traits Impact the Nuclear Rate of Substitution but Not the Mitochondrial Rate in Isopods. Molecular Biology and Evolution, in press. doi: 10.1093/molbev/msy247

Title: The quiescent X, the replicative Y and the Autosomes
Authors: Guillaume Achaz, Serge Gangloff, Benoit Arcangioli
Abstract: From the analysis of the mutation spectrum in the 2,504 sequenced human genomes from the 1000 genomes project (phase 3), we show that sexual chromosomes (X and Y) exhibit a different proportion of indel mutations than autosomes (A), ranking the...
Thematic fields: Bioinformatics & Computational Biology, Genome Evolution, Human Evolution, Molecular Evolution, Population Genetics / Genomics, Reproduction and Sex
Recommender: Nicolas Galtier
Submission date: 2018-07-25 10:37:48
11 Dec 2020
Quantifying transmission dynamics of acute hepatitis C virus infections in a heterogeneous population using sequence data

Phylodynamics of hepatitis C virus reveals transmission dynamics within and between risk groups in Lyon

Recommended by David Rasmussen based on reviews by Chris Wymant and Louis DuPlessis

Genomic epidemiology seeks to better understand the transmission dynamics of infectious pathogens using molecular sequence data. Phylodynamic methods have given genomic epidemiology new power to track the transmission dynamics of pathogens by combining phylogenetic analyses with epidemiological modeling. In recent years, applications of phylodynamics to chronic viral infections such as HIV and hepatitis C virus (HCV) have provided some of the best examples of how phylodynamic inference can provide valuable insights into transmission dynamics within and between different subpopulations or risk groups, allowing for more targeted interventions.
However, conducting phylodynamic inference under complex epidemiological models comes with many challenges. It is not always straightforward or even possible to perform likelihood-based inference. Structured SIR-type models where infected individuals can belong to different subpopulations provide a classic example. In this case, the model is both nonlinear and has a high-dimensional state space due to tracking different types of hosts. Computing the likelihood of a phylogeny under such a model involves complex numerical integration or data augmentation methods [1]. In these situations, Approximate Bayesian Computation (ABC) provides an attractive alternative, as Bayesian inference can be performed without computing likelihoods as long as one can efficiently simulate data under the model to compare against empirical observations [2].
Previous work has shown how ABC approaches can be applied to fit epidemiological models to phylogenies [3,4]. Danesh et al. [5] further demonstrate the real-world merits of ABC by fitting a structured SIR model to HCV data from Lyon, France. Using this model, they infer viral transmission dynamics between “classical” hosts (typically injecting drug users) and “new” hosts (typically young MSM) and show that a recent increase in HCV incidence in Lyon is due to considerably higher transmission rates among “new” hosts. This study provides another great example of how phylodynamic analysis can help epidemiologists understand transmission patterns within and between different risk groups, and of the merits of expanding our toolkit of statistical methods for phylodynamic inference.
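
The ABC principle mentioned above (simulate under the model, compare with observations, never compute a likelihood) can be sketched with a basic rejection sampler. This is a toy illustration and not the method of Danesh et al.: the simulator (Poisson-distributed counts), the uniform prior on the rate, the summary statistic (the sample mean), and the tolerance are all assumptions made for the example.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) variate (Knuth's method, adequate for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate(rng, rate, n):
    """Toy simulator: n counts drawn from a Poisson distribution with mean `rate`."""
    return [poisson(rng, rate) for _ in range(n)]


def summary(data):
    """Summary statistic used to compare simulated and observed data."""
    return sum(data) / len(data)


def abc_rejection(observed, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary is close to the observed one."""
    target = summary(observed)
    accepted = []
    for _ in range(n_draws):
        rate = rng.uniform(0.0, 10.0)  # assumed prior: Uniform(0, 10)
        simulated = simulate(rng, rate, len(observed))
        if abs(summary(simulated) - target) <= tolerance:
            accepted.append(rate)
    return accepted


rng = random.Random(1)
observed = simulate(rng, 2.5, 50)  # stand-in for empirical data
posterior = abc_rejection(observed, n_draws=5000, tolerance=0.2, rng=rng)
posterior_mean = sum(posterior) / len(posterior)
print("accepted draws:", len(posterior))
print("approximate posterior mean:", round(posterior_mean, 2))
```

The accepted values approximate the posterior distribution of the rate. With phylogenies, the summary statistics and the simulator are far richer, but the accept-or-reject logic follows the same pattern.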

References

[1] Rasmussen, D. A., Volz, E. M., and Koelle, K. (2014). Phylodynamic inference for structured epidemiological models. PLoS Comput Biol, 10(4), e1003570. doi: https://doi.org/10.1371/journal.pcbi.1003570
[2] Beaumont, M. A., Zhang, W., and Balding, D. J. (2002). Approximate Bayesian computation in population genetics. Genetics, 162(4), 2025-2035.
[3] Ratmann, O., Donker, G., Meijer, A., Fraser, C., and Koelle, K. (2012). Phylodynamic inference and model assessment with approximate bayesian computation: influenza as a case study. PLoS Comput Biol, 8(12), e1002835. doi: https://doi.org/10.1371/journal.pcbi.1002835
[4] Saulnier, E., Gascuel, O., and Alizon, S. (2017). Inferring epidemiological parameters from phylogenies using regression-ABC: A comparative study. PLoS computational biology, 13(3), e1005416. doi: https://doi.org/10.1371/journal.pcbi.1005416
[5] Danesh, G., Virlogeux, V., Ramière, C., Charre, C., Cotte, L. and Alizon, S. (2020) Quantifying transmission dynamics of acute hepatitis C virus infections in a heterogeneous population using sequence data. bioRxiv, 689158, ver. 5 peer-reviewed and recommended by PCI Evol Biol. doi: https://doi.org/10.1101/689158

Title: Quantifying transmission dynamics of acute hepatitis C virus infections in a heterogeneous population using sequence data
Authors: Gonche Danesh, Victor Virlogeux, Christophe Ramière, Caroline Charre, Laurent Cotte, Samuel Alizon
Abstract: Opioid substitution and syringes exchange programs have drastically reduced hepatitis C virus (HCV) spread in France but HCV sexual transmission in men having sex with men (MSM) has recently arisen as a significant public health concern. The fa...
Thematic fields: Evolutionary Epidemiology, Phylogenetics / Phylogenomics
Recommender: David Rasmussen
Submission date: 2019-07-11 13:37:23
18 Aug 2020
Early phylodynamics analysis of the COVID-19 epidemics in France

SARS-Cov-2 genome sequence analysis suggests rapid spread followed by epidemic slowdown in France

Recommended by B. Jesse Shapiro based on reviews by Luca Ferretti and 2 anonymous reviewers

Sequencing and analyzing SARS-CoV-2 genomes in nearly real time has the potential to quickly confirm (and inform) our knowledge of, and response to, the current pandemic [1,2]. In this manuscript [3], Danesh and colleagues use the earliest set of SARS-CoV-2 genome sequences available from France to make inferences about the timing of the major epidemic wave, the duration of infections, and the efficacy of lockdown measures. Their phylodynamic estimates -- based on fitting genomic data to molecular clock and transmission models -- are reassuringly close to estimates based on 'traditional' epidemiological methods: the French epidemic likely began in mid-January or early February 2020, and spread relatively rapidly (doubling every 3-5 days), with people remaining infectious for a median of 5 days [4,5]. These transmission parameters are broadly in line with estimates from China [6,7], but are currently unknown in France (in the absence of contact tracing data). By estimating the temporal reproductive number (Rt), the authors detected a slowing down of the epidemic in the most recent period of the study, after mid-March, supporting the efficacy of lockdown measures.
Along with the three other reviewers of this manuscript, I was impressed with the careful and exhaustive phylodynamic analyses reported by Danesh et al. [3]. Notably, they take care to show that the major results are robust to the choice of priors and to sampling. The authors are also careful to note that the results are based on a limited sample size of SARS-CoV-2 genomes, which may not be representative of all regions in France. Their analysis also focused on the dominant SARS-CoV-2 lineage circulating in France, which is also circulating in other countries. The variations they inferred in epidemic growth in France could therefore be reflective of broader control policies in Europe, not only those in France. Clearly more work is needed to fully unravel which control policies (and where) were most effective in slowing the spread of SARS-CoV-2, but Danesh et al. [3] set a solid foundation to build upon with more data. Overall this is an exemplary study, enabled by rapid and open sharing of sequencing data, which provides a template to be replicated and expanded in other countries and regions as they deal with their own localized instances of this pandemic.
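
To make the doubling times quoted above concrete, the short sketch below converts a doubling time into an exponential growth rate and a fold increase. It assumes pure exponential growth, r = ln(2) / T_d, which is a textbook simplification rather than the phylodynamic model fitted by Danesh et al.; the function names are hypothetical.

```python
import math


def growth_rate(doubling_time_days):
    """Exponential growth rate per day, assuming r = ln(2) / doubling time."""
    return math.log(2) / doubling_time_days


def fold_increase(doubling_time_days, elapsed_days):
    """Factor by which case numbers grow over elapsed_days at constant growth."""
    return 2 ** (elapsed_days / doubling_time_days)


for t_d in (3, 5):
    print(
        f"doubling time {t_d} d: r = {growth_rate(t_d):.3f} per day, "
        f"{fold_increase(t_d, 14):.1f}-fold increase over 14 days"
    )
```

Under this assumption, the 3-5 day range corresponds to growth rates of roughly 0.23 to 0.14 per day.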

References

[1] Grubaugh, N. D., Ladner, J. T., Lemey, P., Pybus, O. G., Rambaut, A., Holmes, E. C., & Andersen, K. G. (2019). Tracking virus outbreaks in the twenty-first century. Nature microbiology, 4(1), 10-19. doi: 10.1038/s41564-018-0296-2
[2] Fauver et al. (2020) Coast-to-Coast Spread of SARS-CoV-2 during the Early Epidemic in the United States. Cell, 181(5), 990-996.e5. doi: 10.1016/j.cell.2020.04.021
[3] Danesh, G., Elie, B., Michalakis, Y., Sofonea, M. T., Bal, A., Behillil, S., Destras, G., Boutolleau, D., Burrel, S., Marcelin, A.-G., Plantier, J.-C., Thibault, V., Simon-Loriere, E., van der Werf, S., Lina, B., Josset, L., Enouf, V. and Alizon, S. and the COVID SMIT PSL group (2020) Early phylodynamics analysis of the COVID-19 epidemic in France. medRxiv, 2020.06.03.20119925, ver. 3 peer-reviewed and recommended by PCI Evolutionary Biology. doi: 10.1101/2020.06.03.20119925
[4] Salje et al. (2020) Estimating the burden of SARS-CoV-2 in France. hal-pasteur.archives-ouvertes.fr/pasteur-02548181
[5] Sofonea, M. T., Reyné, B., Elie, B., Djidjou-Demasse, R., Selinger, C., Michalakis, Y. and Alizon, S. (2020) Epidemiological monitoring and control perspectives: application of a parsimonious modelling framework to the COVID-19 dynamics in France. medRxiv, 2020.05.22.20110593. doi: 10.1101/2020.05.22.20110593
[6] Rambaut, A. (2020) Phylogenetic analysis of nCoV-2019 genomes. virological.org/t/phylodynamic-analysis-176-genomes-6-mar-2020/356
[7] Li et al. (2020) Early transmission dynamics in Wuhan, China, of novel coronavirus–infected pneumonia. N Engl J Med, 382: 1199-1207. doi: 10.1056/NEJMoa2001316

Title: Early phylodynamics analysis of the COVID-19 epidemics in France
Authors: Gonché Danesh, Baptiste Elie, Yannis Michalakis, Mircea T. Sofonea, Antonin Bal, Sylvie Behillil, Grégory Destras, David Boutolleau, Sonia Burrel, Anne-Geneviève Marcelin, Jean-Christophe Plantier, Vincent Thibault, Etienne Simon-Loriere, Sylvie va...
Abstract: France was one of the first countries to be reached by the COVID-19 pandemic. Here, we analyse 196 SARS-Cov-2 genomes collected between Jan 24 and Mar 24 2020, and perform a phylodynamics analysis. In particular, we analyse the doubling time, r...
Thematic fields: Evolutionary Epidemiology, Molecular Evolution, Phylogenetics / Phylogenomics
Recommender: B. Jesse Shapiro
Submission date: 2020-06-04 13:13:57
12 Jul 2017
Assortment of flowering time and defense alleles in natural Arabidopsis thaliana populations suggests co-evolution between defense and vegetative lifespan strategies

Towards an integrated scenario to understand evolutionary patterns in A. thaliana

Recommended by Xavier Picó based on reviews by Rafa Rubio de Casas and Xavier Picó

Nobody can ignore that a full understanding of evolution requires an integrated approach from both conceptual and methodological viewpoints. Although some life-history traits, e.g. flowering time, have long been receiving more attention than others, in many cases because the former are more workable than the latter, we must acknowledge that our comprehension about how evolution works is strongly biased and limited. In the Arabidopsis community, such an integration is making good progress as an increasing number of research groups worldwide are changing the way in which evolution is put to the test.

This manuscript [1] is a good example of that, as the authors raise an important issue in evolutionary biology by combining gene expression and flowering time data from different sources. In particular, the authors explore how variation in flowering time, which determines lifespan, and host immunity defenses co-vary, which is interpreted in terms of co-evolution between the two traits. Interestingly, the authors go beyond that pattern by separating lifespan-dependent from lifespan-independent defense genes, and by showing that defense genes with variants known to impact fitness in the field are among the genes whose expression co-varies most strongly with flowering time. Finally, these results are supported by a simple mathematical model indicating that such a relationship can also be expected theoretically.

Overall, readers will find many conceptual and methodological elements of interest in this manuscript. The idea that evolution is better understood under the scope of life history variation is really exciting and challenging, and in my opinion on the right track for disentangling the inherent complexities of evolutionary research. However, when we face complexity, we also face its costs and burdens. In this particular case, the well-known co-variation between seed dormancy and flowering time is a missing piece, as is the identification of (variation in) putative selective pressures accounting for the co-evolution between defense mechanisms and life history (seed dormancy vs. flowering time) along environmental gradients. These are further intellectual, technical and methodological challenges that are without doubt totally worth it.

Reference

[1] Glander S, He F, Schmitz G, Witten A, Telschow A, de Meaux J. 2017. Assortment of flowering time and defense alleles in natural Arabidopsis thaliana populations suggests co-evolution between defense and vegetative lifespan strategies. bioRxiv ver.1 of June 19, 2017. doi: 10.1101/131136

Title: Assortment of flowering time and defense alleles in natural Arabidopsis thaliana populations suggests co-evolution between defense and vegetative lifespan strategies
Authors: Glander S, He F, Schmitz G, Witten A, Telschow A, de Meaux J
Abstract: The selective impact of pathogen epidemics on host defenses can be strong but remains transient. By contrast, life-history shifts can durably and continuously modify the balance between costs and benefits of immunity, which arbitrates the evolutio...
Thematic fields: Adaptation, Evolutionary Ecology, Expression Studies, Life History, Phenotypic Plasticity, Quantitative Genetics, Species interactions
Recommender: Xavier Picó
Reviewers: Sophie Karrenberg, Rafa Rubio de Casas, Xavier Picó
Submission date: 2017-06-21 10:57:14
23 Jan 2020
A novel workflow to improve multi-locus genotyping of wildlife species: an experimental set-up with a known model system

Improving the reliability of genotyping of multigene families in non-model organisms

Recommended by François Rousset based on reviews by Sebastian Ernesto Ramos-Onsins, Helena Westerdahl and Thomas Bigot

The reliability of published scientific papers has been the topic of much recent discussion, notably in the biomedical sciences [1]. Although small sample size is regularly pointed to as one of the culprits, big data can also be a concern. The advent of high-throughput sequencing, and the processing of sequence data by opaque bioinformatics workflows, mean that sequences with often high error rates are produced, and that exact but slow analyses are not feasible.
The troubles with bioinformatics arise from the increased complexity of the tools used by scientists, and from the lack of incentives and/or skills among authors (but also reviewers and editors) to make sure of the quality of those tools. As a much discussed example, a bug in the widely used PLINK software [2] has been pointed to as the explanation [3] for incorrect inference of selection for increased height in European human populations [4].
High-throughput sequencing often generates high rates of genotyping errors, so that the development of bioinformatics tools to assess the quality of data and correct them is a major issue. The work of Gillingham et al. [5] contributes to the latter goal. In this work, the authors propose a new bioinformatics workflow (ACACIA) for performing genotyping analysis of multigene complexes, such as self-incompatibility genes in plants, major histocompatibility genes (MHC) in vertebrates, and homeobox genes in animals, which are particularly challenging to genotype in non-model organisms. PCR and sequencing of multigene families generate artefacts, hence spurious alleles. A key to Gillingham et al.'s method is to call candidate genes based on Oligotyping, a software pipeline originally conceived for identifying variants from microbiome 16S rRNA amplicons [6]. This reduces the number of false positives and the number of dropout alleles compared to previous workflows.
This method is not based on an explicit probability model, and thus it is not conceived to provide a control of the rate of errors as, say, a valid confidence interval should (a confidence interval with coverage c for a parameter should contain the parameter with probability c, so the error rate 1- c is known and controlled by the user who selects the value of c). However, the authors suggest a method to adapt the settings of ACACIA to each application.
To compare and validate the new workflow, the authors have constructed new sets of genotypes representing different extents of copy number variation, using already known genotypes from chicken MHC. In such conditions, it was possible to assess how many alleles are not detected and what the rate of false positives is. Gillingham et al. additionally investigated the effect of using non-optimal primers. They found better performance of ACACIA compared to a preexisting pipeline, AmpliSAS [7], for optimal settings of both methods. However, they do not claim that ACACIA will always be better than AmpliSAS. Rather, they warn against the common practice of using the default settings of the latter pipeline. Altogether, this work and the ACACIA workflow should allow for better ascertainment of genotypes from multigene families.
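
The notion of a controlled error rate invoked above for confidence intervals can be checked by simulation. The sketch below is unrelated to ACACIA itself: it assumes normally distributed data with a known standard deviation (arbitrary illustrative values) and estimates how often an interval with nominal coverage c contains the true mean.

```python
import random
from statistics import NormalDist


def interval_covers(rng, true_mean, sigma, n, coverage):
    """Draw one sample and report whether its interval contains true_mean."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)  # two-sided normal quantile
    sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
    mean = sum(sample) / n
    half_width = z * sigma / n ** 0.5  # sigma is assumed known here
    return mean - half_width <= true_mean <= mean + half_width


def empirical_coverage(coverage, replicates=4000, seed=0):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = sum(
        interval_covers(rng, 10.0, 2.0, 25, coverage) for _ in range(replicates)
    )
    return hits / replicates


for c in (0.80, 0.95):
    print(f"nominal coverage {c:.2f}: empirical {empirical_coverage(c):.3f}")
```

For a valid interval the empirical fraction should sit close to the nominal value c, so the error rate 1 - c is set by the user. A heuristic workflow offers no such guarantee, which is why tuning its settings against known genotypes matters.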

References

[1] Ioannidis, J. P. A., Greenland, S., Hlatky, M. A., Khoury, M. J., Macleod, M. R., Moher, D., Schulz, K. F. and Tibshirani, R. (2014) Increasing value and reducing waste in research design, conduct, and analysis. The Lancet, 383, 166-175. doi: 10.1016/S0140-6736(13)62227-8
[2] Chang, C. C., Chow, C. C., Tellier, L. C. A. M., Vattikuti, S., Purcell, S. M. and Lee, J. J. (2015) Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience, 4, 7, s13742-015-0047-8. doi: 10.1186/s13742-015-0047-8
[3] Robinson, M. R. and Visscher, P. (2018) Corrected sibling GWAS data release from Robinson et al. http://cnsgenomics.com/data.html
[4] Field, Y., Boyle, E. A., Telis, N., Gao, Z., Gaulton, K. J., Golan, D., Yengo, L., Rocheleau, G., Froguel, P., McCarthy, M. I. and Pritchard, J. K. (2016) Detection of human adaptation during the past 2000 years. Science, 354(6313), 760-764. doi: 10.1126/science.aag0776
[5] Gillingham, M. A. F., Montero, B. K., Wilhelm, K., Grudzus, K., Sommer, S. and Santos, P. S. C. (2020) A novel workflow to improve multi-locus genotyping of wildlife species: an experimental set-up with a known model system. bioRxiv 638288, ver. 3 peer-reviewed and recommended by Peer Community In Evolutionary Biology. doi: 10.1101/638288
[6] Eren, A. M., Maignien, L., Sul, W. J., Murphy, L. G., Grim, S. L., Morrison, H. G. and Sogin, M. L. (2013) Oligotyping: differentiating between closely related microbial taxa using 16S rRNA gene data. Methods in Ecology and Evolution 4(12), 1111-1119. doi: 10.1111/2041-210X.12114
[7] Sebastian, A., Herdegen, M., Migalska, M. and Radwan, J. (2016) AMPLISAS: a web server for multilocus genotyping using next‐generation amplicon sequencing data. Mol Ecol Resour, 16, 498-510. doi: 10.1111/1755-0998.12453

A novel workflow to improve multi-locus genotyping of wildlife species: an experimental set-up with a known model system | Gillingham, Mark A. F., Montero, B. Karina, Wilhelm, Kerstin, Grudzus, Kara, Sommer, Simone and Santos, Pablo S. C. | Genotyping novel complex multigene systems is particularly challenging in non-model organisms. Target primers frequently amplify simultaneously multiple loci leading to high PCR and sequencing artefacts such as chimeras and allele amplification... | Bioinformatics & Computational Biology, Evolutionary Ecology, Genome Evolution, Molecular Evolution | François Rousset | Helena Westerdahl, Sebastian Ernesto Ramos-Onsins, Paul J. McMurdie, Arnaud Estoup, Vincent Segura, Jacek Radwan, Torbjørn Rognes, William Stutz, Kevin Vanneste, Thomas Bigot, Jill A. Hollenbach, Wieslaw Babik, Marie-Christin... | 2019-05-15 17:30:44
10 Jan 2020

Probabilities of tree topologies with temporal constraints and diversification shifts

Fitting diversification models on undated or partially dated trees

Recommended by Nicolas Lartillot based on reviews by Amaury Lambert, Dominik Schrempf and 1 anonymous reviewer

Phylogenetic trees can be used to extract information about the process of diversification that has generated them. The most common approach to conduct this inference is to rely on a likelihood, defined here as the probability of generating a dated tree T given a diversification model (e.g. a birth-death model), and then use standard maximum likelihood. This idea has been explored extensively in the context of so-called diversification studies, with many variants for the models and for the questions being asked (diversification rates shifting at certain time points or in the ancestors of particular subclades, trait-dependent diversification rates, etc.).
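As a minimal illustration of likelihood-based inference on a dated tree, consider the simplest diversification model, a pure-birth (Yule) process with speciation rate λ and no extinction. This is a sketch under stated assumptions, not a model taken from the article: the process is conditioned on starting with two lineages at the root, the tree is ultrametric with all tips at the present, and the node ages in the example are made up. Up to a constant that does not depend on λ, the log-likelihood is (n - 2) log λ - λS, where n is the number of tips and S the total branch length, so the maximum likelihood estimate is (n - 2)/S.

```python
import math


def total_branch_length(node_ages):
    """Sum of branch lengths of an ultrametric tree.

    node_ages: ages (time before present) of the n - 1 internal nodes,
    root included. Between consecutive nodes the lineage count grows by one,
    starting from two lineages just after the root.
    """
    ages = sorted(node_ages, reverse=True) + [0.0]
    return sum((i + 2) * (ages[i] - ages[i + 1]) for i in range(len(ages) - 1))


def yule_log_likelihood(rate, node_ages):
    """Yule log-likelihood of the node ages, up to a rate-free constant."""
    n_speciations_after_root = len(node_ages) - 1
    return (n_speciations_after_root * math.log(rate)
            - rate * total_branch_length(node_ages))


def yule_ml_rate(node_ages):
    """Closed-form maximum likelihood estimate of the speciation rate."""
    return (len(node_ages) - 1) / total_branch_length(node_ages)


example_ages = [10.0, 6.0, 3.0, 1.0]  # hypothetical five-tip tree
ml_rate = yule_ml_rate(example_ages)
print(total_branch_length(example_ages), ml_rate)
```

Birth-death models with extinction, rate shifts or trait dependence have more involved likelihoods, but they are used in the same way: write the probability of the dated tree as a function of the parameters and maximize it.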
However, all this assumes that the dated tree T is known without error. In practice, trees (that is, both the tree topology and the divergence times) are inferred based on DNA sequences, possibly combined with fossil information for calibrating and informing the divergence times. Molecular dating is a delicate exercise, however, and much more so in fact than reconstructing the tree topology. In particular, a misspecified model for the relaxed molecular clock, or a misspecified prior, can have a substantial impact on the estimation of divergence dates - which in turn could severely mislead the inference about the underlying diversification process. This thus raises the following question: would it be possible to conduct inference and testing of diversification models without having to go through the dangerous step of molecular dating?
In his article "Probabilities of tree topologies with temporal constraints and diversification shifts" [1], Gilles Didier introduces a recursive method for computing the probability of a tree topology under some diversification model of interest, without knowledge of the exact dates, but only interval constraints on the dates of some of the nodes of the tree. Such interval constraints, which are derived from fossil knowledge, are typically used for molecular dating: they provide the calibrations for the relaxed clock analysis. Thus, what is essentially proposed by Gilles Didier is to use them in combination with the tree topology only, thus bypassing the need to estimate divergence times first, before fitting a diversification model to a phylogenetic tree.
This article, which is primarily a mathematical and algorithmic contribution, is then complemented with several applications: testing for a diversification shift in a given subclade of the phylogeny, just based on the (undated) tree topology, with interval constraints on some of its internal nodes; but also, computing the age distribution of each node and sampling from the joint distribution on node ages, conditional on the interval constraints. The test for the presence of a diversification shift is particularly interesting: an application to simulated data (and without any interval constraint in that case) suggests that the method based on the undated tree performs about as well as the classical method based on a dated tree, even when the classical approach is granted perfect knowledge of the dates, whereas in practice it relies on potentially biased estimates. Finally, an application to a well-known example (rate shifts in cetacean phylogeny) is presented.
This article thus represents a particularly meaningful contribution to the methodology for diversification studies, but also to molecular dating itself. It is a well-known problem in molecular dating that computing and sampling from the conditional distributions on node ages given fossil constraints, and more generally understanding and visualizing how interval constraints on some nodes of the tree affect the distribution at other nodes, is particularly difficult. For that reason, the algorithmic routines presented in the article will be useful in this context as well.
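The idea that an undated topology has a computable probability can be illustrated in the simplest possible setting. The sketch below is not Didier's algorithm: it assumes a pure-birth (Yule) model with no temporal constraints and no rate shifts, for which the probability of a leaf-labelled rooted binary topology with n tips has the classical closed form 2^(n-1)/n! multiplied by the product, over internal nodes v, of 1/(n_v - 1), where n_v is the number of tips below v. Trees are encoded as nested pairs, an encoding chosen here for convenience.

```python
from fractions import Fraction
from math import factorial


def count_tips(tree):
    """Number of tips of a tree given as nested 2-tuples with leaf labels."""
    if not isinstance(tree, tuple):
        return 1
    left, right = tree
    return count_tips(left) + count_tips(right)


def yule_topology_probability(tree):
    """Probability of a leaf-labelled topology under the Yule model."""
    def product_over_internal_nodes(node):
        if not isinstance(node, tuple):
            return Fraction(1)
        left, right = node
        return (Fraction(1, count_tips(node) - 1)
                * product_over_internal_nodes(left)
                * product_over_internal_nodes(right))

    n = count_tips(tree)
    return Fraction(2 ** (n - 1), factorial(n)) * product_over_internal_nodes(tree)


caterpillar = ((("a", "b"), "c"), "d")
balanced = (("a", "b"), ("c", "d"))
print(yule_topology_probability(caterpillar), yule_topology_probability(balanced))
```

With four tips there are 12 labelled caterpillar topologies and 3 balanced ones, so the probabilities returned for the two shapes can be checked to sum to one across all 15 topologies.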

References

[1] Didier, G. (2020) Probabilities of tree topologies with temporal constraints and diversification shifts. bioRxiv, 376756, ver. 4 peer-reviewed and recommended by PCI Evolutionary Biology. doi: 10.1101/376756

Probabilities of tree topologies with temporal constraints and diversification shifts | Gilles Didier | Dating the tree of life is a task far more complicated than only determining the evolutionary relationships between species. It is therefore of interest to develop approaches apt to deal with undated phylogenetic trees. The main result of this ... | Bioinformatics & Computational Biology, Macroevolution | Nicolas Lartillot | 2019-01-30 11:28:58
17 May 2021

Relative time constraints improve molecular dating

Dating with constraints

Recommended by Cécile Ané based on reviews by David Duchêne and 1 anonymous reviewer

Estimating the absolute age of diversification events is challenging, because molecular sequences provide timing information in units of substitutions, not years. Additionally, the rate of molecular evolution (in substitutions per year) can vary widely across lineages. Accurate dating of speciation events traditionally relies on non-molecular data. For very fast-evolving organisms such as SARS-CoV-2, for which samples are obtained over a time span, the collection times provide this external information from which we can learn the rate of molecular evolution and date past events (Boni et al. 2020). In groups for which the fossil record is abundant, state-of-the-art dating methods use fossil information to complement molecular data, either in the form of a prior distribution on node ages (Nguyen & Ho 2020), or as data modelled with a fossilization process (Heath et al. 2014).
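How sampling dates can calibrate a molecular clock is easiest to see with root-to-tip regression, a simple exploratory technique named here for illustration only; it is not the Bayesian analysis used in the works cited above. Assuming a strict clock, the distance from the root to each tip (in substitutions per site) grows linearly with the tip's sampling time, so the slope estimates the rate and the time at which the fitted line reaches zero estimates the date of the root. The numbers below are made up.

```python
def root_to_tip_regression(sampling_times, root_to_tip_distances):
    """Least-squares fit of distance = rate * (time - root_time).

    Returns (rate, root_time): the slope in substitutions per site per unit
    of time, and the time at which the fitted distance equals zero.
    """
    n = len(sampling_times)
    mean_t = sum(sampling_times) / n
    mean_d = sum(root_to_tip_distances) / n
    sxx = sum((t - mean_t) ** 2 for t in sampling_times)
    sxy = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(sampling_times, root_to_tip_distances))
    rate = sxy / sxx
    intercept = mean_d - rate * mean_t
    root_time = -intercept / rate
    return rate, root_time


# Hypothetical data: times in years since the first sampling campaign.
times = [0.4, 0.7, 1.0, 1.3]
distances = [0.0018, 0.0024, 0.0030, 0.0036]
rate, root_time = root_to_tip_regression(times, distances)
print(rate, root_time)
```

When all samples are contemporaneous, as for most extant species, the sampling times carry no such signal, which is why fossils or other external information are needed.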

Dating is a challenge in groups that lack fossils or other geological evidence, such as very old lineages and microbial lineages. In these groups, horizontal gene transfer (HGT) events have been identified as informative about relative dates: the ancestor of the gene's donor must be older than the descendants of the gene's recipient. Previous work using HGTs to date phylogenies has used methodologies that are ad hoc (Davín et al. 2018) or that employ only a small number of HGTs (Magnabosco et al. 2018, Wolfe & Fournier 2018).

Szöllősi et al. (2021) present and validate a Bayesian approach to estimate the age of diversification events based on relative information on these ages, such as implied by HGTs. This approach is flexible because it is modular: constraints on relative node ages can be combined with absolute age information from fossil data, and with any substitution model of molecular evolution, including complex state-of-the-art models. To ease the computational burden, the authors also introduce a two-step approach, in which the complexity of estimating branch lengths in substitutions per site is decoupled from the complexity of timing the tree with branch lengths in years, accounting for uncertainty in the first step. Currently, one limitation is that the tree topology needs to be known, and another limitation is that constraints need to be certain. Users of this method should be mindful of the latter when hundreds of constraints are used, as done by Szöllősi et al. (2021) to date the trees of Cyanobacteria and Archaea.
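One way to picture a certain (hard) relative constraint is as an indicator on node ages: a proposed set of ages is admissible only if every "older than" pair is respected. The sketch below shows that check in isolation. It is an illustration and not the RevBayes implementation; the function name, node labels and ages are hypothetical.

```python
import math


def log_constraint_indicator(node_ages, older_than):
    """Log of the hard-constraint indicator for a set of node ages.

    node_ages: dict mapping node name to age (time before present).
    older_than: iterable of (older_node, younger_node) pairs.
    Returns 0.0 if all constraints hold, -inf if any is violated.
    """
    for older, younger in older_than:
        if node_ages[older] <= node_ages[younger]:
            return -math.inf
    return 0.0


# Hypothetical transfer: the donor's ancestor must predate the
# recipient's descendant.
constraints = [("donor_ancestor", "recipient_descendant")]
admissible_ages = {"donor_ancestor": 2.1, "recipient_descendant": 1.4}
violating_ages = {"donor_ancestor": 1.0, "recipient_descendant": 1.4}

print(log_constraint_indicator(admissible_ages, constraints))
print(log_constraint_indicator(violating_ages, constraints))
```

Because a single violated pair rules out a set of ages entirely, one erroneous constraint among hundreds can exclude the true dates, which is why the certainty of each constraint matters.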

The method of Szöllősi et al. (2021) is implemented in RevBayes, a highly modular platform for phylogenetic inference that is rapidly growing in popularity (Höhna et al. 2016). The RevBayes tutorial page features a step-by-step tutorial, "Dating with Relative Constraints", which makes the method highly approachable.

References:

Boni MF, Lemey P, Jiang X, Lam TT-Y, Perry BW, Castoe TA, Rambaut A, Robertson DL (2020) Evolutionary origins of the SARS-CoV-2 sarbecovirus lineage responsible for the COVID-19 pandemic. Nature Microbiology, 5, 1408–1417. https://doi.org/10.1038/s41564-020-0771-4

Davín AA, Tannier E, Williams TA, Boussau B, Daubin V, Szöllősi GJ (2018) Gene transfers can date the tree of life. Nature Ecology & Evolution, 2, 904–909. https://doi.org/10.1038/s41559-018-0525-3

Heath TA, Huelsenbeck JP, Stadler T (2014) The fossilized birth–death process for coherent calibration of divergence-time estimates. Proceedings of the National Academy of Sciences, 111, E2957–E2966. https://doi.org/10.1073/pnas.1319091111

Höhna S, Landis MJ, Heath TA, Boussau B, Lartillot N, Moore BR, Huelsenbeck JP, Ronquist F (2016) RevBayes: Bayesian Phylogenetic Inference Using Graphical Models and an Interactive Model-Specification Language. Systematic Biology, 65, 726–736. https://doi.org/10.1093/sysbio/syw021

Magnabosco C, Moore KR, Wolfe JM, Fournier GP (2018) Dating phototrophic microbial lineages with reticulate gene histories. Geobiology, 16, 179–189. https://doi.org/10.1111/gbi.12273

Nguyen JMT, Ho SYW (2020) Calibrations from the Fossil Record. In: The Molecular Evolutionary Clock: Theory and Practice (ed Ho SYW), pp. 117–133. Springer International Publishing, Cham. https://doi.org/10.1007/978-3-030-60181-2_8

Szollosi, G.J., Hoehna, S., Williams, T.A., Schrempf, D., Daubin, V., Boussau, B. (2021) Relative time constraints improve molecular dating. bioRxiv, 2020.10.17.343889, ver. 8 recommended and peer-reviewed by Peer Community in Evolutionary Biology. https://doi.org/10.1101/2020.10.17.343889

Wolfe JM, Fournier GP (2018) Horizontal gene transfer constrains the timing of methanogen evolution. Nature Ecology & Evolution, 2, 897–903. https://doi.org/10.1038/s41559-018-0513-7

Relative time constraints improve molecular dating | Gergely J Szollosi, Sebastian Hoehna, Tom A Williams, Dominik Schrempf, Vincent Daubin, Bastien Boussau | Dating the tree of life is central to understanding the evolution of life on Earth. Molecular clocks calibrated with fossils represent the state of the art for inferring the ages of major groups. Yet, other informat... | Bioinformatics & Computational Biology, Genome Evolution, Phylogenetics / Phylogenomics | Cécile Ané | 2020-10-21 23:39:17