- Professorship for Population Genetics, Technical University of Munich, Freising, Germany
- Evolutionary Theory, Genome Evolution, Molecular Evolution, Population Genetics / Genomics, Species interactions
The fate of recessive deleterious or overdominant mutations near mating-type loci under partial selfing
Maintenance of deleterious mutations and recombination suppression near mating-type loci under selfingRecommended by Aurelien Tellier based on reviews by 3 anonymous reviewers
The causes and consequences of the evolution of sexual reproduction are a major topic in evolutionary biology. With advances in sequencing technology, it becomes possible to compare sexual chromosomes across species and infer the neutral and selective processes shaping polymorphism at these chromosomes. Most sex and mating-type chromosomes exhibit an absence of recombination in large genomic regions around the animal, plant or fungal sex-determining genes. This suppression of recombination likely occurred in several time steps generating stepwise increasing genomic regions starting around the sex-determining genes. This mechanism generates so-called evolutionary strata of differentiation between sex chromosomes (Nicolas et al., 2004, Bergero and Charlesworth, 2009, Hartmann et al. 2021). The evolution of extended regions of recombination suppression is also documented on mating-type chromosomes in fungi (Hartmann et al., 2021) and around supergenes (Yan et al., 2020, Jay et al., 2021). The exact reason and evolutionary mechanisms for this phenomenon are still, however, debated.
Two hypotheses are proposed: 1) sexual antagonism (Charlesworth et al., 2005), which, nevertheless, does explain the observed occurrence of the evolutionary strata, and 2) the sheltering of deleterious alleles by inversions carrying a lower load than average in the population (Charlesworth and Wall, 1999, Antonovics and Abrams, 2004). In the latter, the mechanism is as follows. A genetic inversion or a suppressor of recombination in cis may exhibit some overdominance behaviour. The inversion exhibiting less recessive deleterious mutations (compared to others at the same locus) may increase in frequency, before at higher frequency occurring at the homozygous state, expressing its genetic load. However, the inversion may never be at the homozygous state if it is genetically linked to a gene in a permanently heterozygous state. The inversion can then be advantageous and may reach fixation at the sex chromosome (Charlesworth and Wall, 1999, Antonovics and Abrams, 2004, Jay et al., 2022). These selective mechanisms promote thus the suppression of recombination around the sex-determining gene, and recessive deleterious mutations are permanently sheltered. This hypothesis is corroborated by the sheltering of deleterious mutations observed around loci under balancing selection (Llaurens et al. 2009, Lenz et al. 2016) and around mating-type genes in fungi and supergenes (Jay et al. 2021, Jay et al., 2022).
In this present theoretical study, Tezenas et al. (2022) analyse how linkage to a necessarily heterozygous fungal mating type locus influences the persistence/extinction time of a new mutation at a second selected locus. This mutation can either be deleterious and recessive, or overdominant. There is arbitrary linkage between the two loci, and sexual reproduction occurs either between 1) gametes of different individuals (outcrossing), or 2) by selfing with gametes originating from the same (intra-tetrad) or different (inter-tetrad) tetrads produced by that individual. Note, here, that the mating-type gene does not prevent selfing. The authors study the initial stochastic dynamics of the mutation using a multi-type branching process (and simulations when analytical results cannot be obtained) to compute the extinction time of the deleterious mutation. The main result is that the presence of a mating-type locus always decreases the purging probability and increases the purging time of the mutations under selfing. Ultimately, deleterious mutations can indeed accumulate near the mating-type locus over evolutionary time scales. In a nutshell, high selfing or high intra-tetrad mating do increase the sheltering effect of the mating-type locus. In effect, the outcome of sheltering of deleterious mutations depends on two opposing mechanisms: 1) a higher selfing rate induces a greater production of homozygotes and an increased effect of the purging of deleterious mutations, while 2) a higher intra-tetrad selfing rate (or linkage with the mating-type locus) generates heterozygotes which have a small genetic load (and are favoured). The authors also show that rare events of extremely long maintenance of deleterious mutations can occur.
The authors conclude by highlighting the manifold effect of selfing which reduces the effective population size and thus impairs the efficiency of selection and increases the mutational load, while also favouring the purge of deleterious homozygous mutations. Furthermore, this study emphasizes the importance of studying the maintenance and accumulation of deleterious mutations in the vicinity of heterozygous loci (e.g. under balancing selection) in selfing species.
Antonovics J, Abrams JY (2004) Intratetrad Mating and the Evolution of Linkage Relationships. Evolution, 58, 702–709. https://doi.org/10.1111/j.0014-3820.2004.tb00403.x
Bergero R, Charlesworth D (2009) The evolution of restricted recombination in sex chromosomes. Trends in Ecology & Evolution, 24, 94–102. https://doi.org/10.1016/j.tree.2008.09.010
Charlesworth D, Morgan MT, Charlesworth B (1990) Inbreeding Depression, Genetic Load, and the Evolution of Outcrossing Rates in a Multilocus System with No Linkage. Evolution, 44, 1469–1489. https://doi.org/10.1111/j.1558-5646.1990.tb03839.x
Charlesworth D, Charlesworth B, Marais G (2005) Steps in the evolution of heteromorphic sex chromosomes. Heredity, 95, 118–128. https://doi.org/10.1038/sj.hdy.6800697
Charlesworth B, Wall JD (1999) Inbreeding, heterozygote advantage and the evolution of neo–X and neo–Y sex chromosomes. Proceedings of the Royal Society of London. Series B: Biological Sciences, 266, 51–56. https://doi.org/10.1098/rspb.1999.0603
Hartmann FE, Duhamel M, Carpentier F, Hood ME, Foulongne-Oriol M, Silar P, Malagnac F, Grognet P, Giraud T (2021) Recombination suppression and evolutionary strata around mating-type loci in fungi: documenting patterns and understanding evolutionary and mechanistic causes. New Phytologist, 229, 2470–2491. https://doi.org/10.1111/nph.17039
Jay P, Chouteau M, Whibley A, Bastide H, Parrinello H, Llaurens V, Joron M (2021) Mutation load at a mimicry supergene sheds new light on the evolution of inversion polymorphisms. Nature Genetics, 53, 288–293. https://doi.org/10.1038/s41588-020-00771-1
Jay P, Tezenas E, Véber A, Giraud T (2022) Sheltering of deleterious mutations explains the stepwise extension of recombination suppression on sex chromosomes and other supergenes. PLOS Biology, 20, e3001698. https://doi.org/10.1371/journal.pbio.3001698
Lenz TL, Spirin V, Jordan DM, Sunyaev SR (2016) Excess of Deleterious Mutations around HLA Genes Reveals Evolutionary Cost of Balancing Selection. Molecular Biology and Evolution, 33, 2555–2564. https://doi.org/10.1093/molbev/msw127
Llaurens V, Gonthier L, Billiard S (2009) The Sheltered Genetic Load Linked to the S Locus in Plants: New Insights From Theoretical and Empirical Approaches in Sporophytic Self-Incompatibility. Genetics, 183, 1105–1118. https://doi.org/10.1534/genetics.109.102707
Nicolas M, Marais G, Hykelova V, Janousek B, Laporte V, Vyskot B, Mouchiroud D, Negrutiu I, Charlesworth D, Monéger F (2004) A Gradual Process of Recombination Restriction in the Evolutionary History of the Sex Chromosomes in Dioecious Plants. PLOS Biology, 3, e4. https://doi.org/10.1371/journal.pbio.0030004
Tezenas E, Giraud T, Véber A, Billiard S (2022) The fate of recessive deleterious or overdominant mutations near mating-type loci under partial selfing. bioRxiv, 2022.10.07.511119, ver. 2 peer-reviewed and recommended by Peer Community in Evolutionary Biology. https://doi.org/10.1101/2022.10.07.511119
Yan Z, Martin SH, Gotzek D, Arsenault SV, Duchen P, Helleu Q, Riba-Grognuz O, Hunt BG, Salamin N, Shoemaker D, Ross KG, Keller L (2020) Evolution of a supergene that regulates a trans-species social polymorphism. Nature Ecology & Evolution, 4, 240–249. https://doi.org/10.1038/s41559-019-1081-1
Joint inference of adaptive and demographic history from temporal population genomic data
Inference of genome-wide processes using temporal population genomic dataRecommended by Aurelien Tellier based on reviews by Lawrence Uricchio and 2 anonymous reviewers
Evolutionary genomics, and population genetics in particular, aim to decipher the respective influence of neutral and selective forces shaping genetic polymorphism in a species/population. This is a much-needed requirement before scanning genome data for footprints of species adaptation to their biotic and abiotic environment (Johri et al. 2022). In general, we would like to quantify the proportion of the genome evolving neutrally and under selective (positive, balancing and negative) pressures (Kern and Hahn 2018, Johri et al. 2021). We thus need to understand patterns of linked selection along the genome, that is how the distribution of genetic polymorphisms is shaped by selected sites and the recombination landscape. The present contribution by Pavinato et al. (2022) provides an additional method in the population genomics toolbox to quantify the extent of linked positive and negative selection using temporal data.
The availability of genomics data for model and non-model species has led to improvement of the modeling framework for demography and selection (Johri et al. 2022), but also new inference methods making use of the full genome data based on the Sequential Markovian Coalescent (SMC, Li and Durbin 2011), Approximate Bayesian Computation (ABC, Jay et al. 2019), ABC and machine learning (Pudlo et al. 2016, Raynal et al. 2019) or Deep Learning (Sanchez et al. 2021). These methods are based on one sample in time and the use of the coalescent theory to reconstruct the past (demographic) history. However, it is also possible to obtain for many species temporal data sampled over several time points. For species with short generation time (in experimental evolution or monitored populations), one can sample a population every couple of generations as exemplified with Drosophila melanogaster (Bergland et al. 2010). For species with longer generation times that cannot be easily regularly sampled in time, it becomes possible to sequence available specimens from museums (e.g. Cridland et al. 2018) or ancient DNA samples. Methods using temporal data are based on the classical population genomics assumption that demography (migration, population subdivision, population size changes) leaves a genome-wide signal, while selection leaves a localized signal in the close vicinity of the causal mutation. Several methods do assess the demography of a population (change in effective population size, Ne, in time) using temporal data (e.g. Jorde and Ryman 2007) which can be used to calibrate the detection of loci under strong positive selection (Foll et al. 2014). Recently Buffalo and Coop (2020) used genome-wide covariance between allele frequency changes across time samples (and across replicates) to quantify the effects of linked selection over short timescales.
In the present contribution, Pavinato et al. (2022) make use of temporal data to draw the joint estimation of demographic and selective parameters using a simulation-based method (ABC-Random Forests). This study by Pavinato et al. (2022) builds a framework allowing to infer the census size of the population in time (N) separately from the effect of genetic drift, which is determined by change in effective population size (Ne) in time, as well estimates of genome-wide parameters of selection. In a nutshell, the authors use a forward simulator and summarize genome data by genomic windows using classic statistics (nucleotide diversity, Tajima’s D, FST, heterozygosity) between time samples and for each sample. They specifically use the distributions (higher moments) of these statistics among all windows. The authors combine as input for the ABC-RF, vectors of summary statistics, model parameters and five latent variables: Ne, the ratio Ne/N, the number of beneficial mutations under strong selection, the average selection coeﬀicient of strongly selected mutations, and the average substitution load. Indeed, the authors are interested in three different types of selection components: 1) the adaptive potential of a population which is estimated as the population mutation rate of beneficial mutations (θb), 2) the number of mutations under strong selection (irrespective of whether they reached fixation or not), and 3) the overall population fitness which is a function of the genetic load. In other words, the novelty of this method is not to focus on the detection of loci under selection, but to infer key parameters/distributions summarizing the genome-wide signal of demography and (positive and negative) selection. As a proof of principle, the authors then apply their method to a dataset of feral populations of honey bees (Apis mellifera) collected in California across many years and recovered from Museum samples (Cridland et al. 2018). The approach yields estimates of Ne which are on the same order of magnitude of previous estimates in hymenopterans, and the authors discuss why the different populations show various values of Ne and N which can be explained by different history of admixture with wild but also domesticated lineages of bees.
This study focuses on quantifying the genome-wide joint footprints of demography, and strong positive and negative selection to determine which proportion of the genome evolves neutrally or not. Further application of this method can be anticipated, for example, to study species with ecological and life-history traits which generate discrepancies between census size and Ne, for example for plants with selfing or seed banking (Sellinger et al. 2020), and for which the genome-wide effect of linked selection is not fully understood.
Johri P, Aquadro CF, Beaumont M, Charlesworth B, Excoffier L, Eyre-Walker A, Keightley PD, Lynch M, McVean G, Payseur BA, Pfeifer SP, Stephan W, Jensen JD (2022) Recommendations for improving statistical inference in population genomics. PLOS Biology, 20, e3001669. https://doi.org/10.1371/journal.pbio.3001669
Kern AD, Hahn MW (2018) The Neutral Theory in Light of Natural Selection. Molecular Biology and Evolution, 35, 1366–1371. https://doi.org/10.1093/molbev/msy092
Johri P, Riall K, Becher H, Excoffier L, Charlesworth B, Jensen JD (2021) The Impact of Purifying and Background Selection on the Inference of Population History: Problems and Prospects. Molecular Biology and Evolution, 38, 2986–3003. https://doi.org/10.1093/molbev/msab050
Pavinato VAC, Mita SD, Marin J-M, Navascués M de (2022) Joint inference of adaptive and demographic history from temporal population genomic data. bioRxiv, 2021.03.12.435133, ver. 6 peer-reviewed and recommended by Peer Community in Evolutionary Biology. https://doi.org/10.1101/2021.03.12.435133
Li H, Durbin R (2011) Inference of human population history from individual whole-genome sequences. Nature, 475, 493–496. https://doi.org/10.1038/nature10231
Jay F, Boitard S, Austerlitz F (2019) An ABC Method for Whole-Genome Sequence Data: Inferring Paleolithic and Neolithic Human Expansions. Molecular Biology and Evolution, 36, 1565–1579. https://doi.org/10.1093/molbev/msz038
Pudlo P, Marin J-M, Estoup A, Cornuet J-M, Gautier M, Robert CP (2016) Reliable ABC model choice via random forests. Bioinformatics, 32, 859–866. https://doi.org/10.1093/bioinformatics/btv684
Raynal L, Marin J-M, Pudlo P, Ribatet M, Robert CP, Estoup A (2019) ABC random forests for Bayesian parameter inference. Bioinformatics, 35, 1720–1728. https://doi.org/10.1093/bioinformatics/bty867
Sanchez T, Cury J, Charpiat G, Jay F (2021) Deep learning for population size history inference: Design, comparison and combination with approximate Bayesian computation. Molecular Ecology Resources, 21, 2645–2660. https://doi.org/10.1111/1755-0998.13224
Bergland AO, Behrman EL, O’Brien KR, Schmidt PS, Petrov DA (2014) Genomic Evidence of Rapid and Stable Adaptive Oscillations over Seasonal Time Scales in Drosophila. PLOS Genetics, 10, e1004775. https://doi.org/10.1371/journal.pgen.1004775
Cridland JM, Ramirez SR, Dean CA, Sciligo A, Tsutsui ND (2018) Genome Sequencing of Museum Specimens Reveals Rapid Changes in the Genetic Composition of Honey Bees in California. Genome Biology and Evolution, 10, 458–472. https://doi.org/10.1093/gbe/evy007
Jorde PE, Ryman N (2007) Unbiased Estimator for Genetic Drift and Effective Population Size. Genetics, 177, 927–935. https://doi.org/10.1534/genetics.107.075481
Foll M, Shim H, Jensen JD (2015) WFABC: a Wright–Fisher ABC-based approach for inferring effective population sizes and selection coefficients from time-sampled data. Molecular Ecology Resources, 15, 87–98. https://doi.org/10.1111/1755-0998.12280
Buffalo V, Coop G (2020) Estimating the genome-wide contribution of selection to temporal allele frequency change. Proceedings of the National Academy of Sciences, 117, 20672–20680. https://doi.org/10.1073/pnas.1919039117
Sellinger TPP, Awad DA, Moest M, Tellier A (2020) Inference of past demography, dormancy and self-fertilization rates from whole genome sequence data. PLOS Genetics, 16, e1008698. https://doi.org/10.1371/journal.pgen.1008698
Deceptive combined effects of short allele dominance and stuttering: an example with Ixodes scapularis, the main vector of Lyme disease in the U.S.A.
New curation method for microsatellite markers improves population genetics analysesRecommended by Aurelien Tellier based on reviews by Eric Petit, Martin Husemann and 2 anonymous reviewers
Genetic markers are used for in modern population genetics/genomics to uncover the past neutral and selective history of population and species. Besides Single Nucleotide Polymorphisms (SNPs) obtained from whole genome data, microsatellites (or Short Tandem Repeats, SSR) have been common markers of choice in numerous population genetics studies of non-model species with large sample sizes . Microsatellites can be used to uncover and draw inference of the past population demography (e.g. expansion, decline, bottlenecks…), population split, population structure and gene flow, but also life history traits and modes of reproduction (e.g. [2,3]). These markers are widely used in conservation genetics  or to study parasites or disease vectors . Microsatellites do show higher mutation rate than SNPs increasing, on the one hand, the statistical power to infer recent events (for example crop domestication, [2,3]), while, on the other hand, decreasing their statistical power over longer time scales due to homoplasy .
To perform such analyses, however, an excellent and reliable quality of data is required. As emphasized in the article by De Meeûs et al.  three main issues do bias the observed heterozygosity at microsatellites: null alleles, short allele dominance (SAD) and stuttering. These originates from poor PCR amplification. As a result, an excess of homozygosity is observed at the microsatellite loci leading to overestimation of the variation statistics FIS and FST as well as increased linage disequilibrium (LD). For null alleles, several methods and software do help to reduce the bias, and in the present study, De Meeûs et al.  propose a way to tackle issues with SAD and stuttering.
The authors study a dataset consisting of 387 samples from 61 subsamples genotyped at nine loci of the species Ixodes scapularis, i.e. ticks transmitting the Lyme disease. Based on correlation methods and FST, FIS they can uncover null alleles and SAD. Stuttering is detected by evaluating the heterozygote deficit between alleles displaying a single repeat difference. Without correction, six loci are affected by one of these amplification problems generating a large deficit of heterozygotes (measured by significant FIS and FST) remaining so after correction for the false discovery rate (FDR). These results would be classically interpreted as a strong Wahlund effect and/or selection at several loci.
After correcting for null alleles, the authors apply two novel corrections: 1) a re-examination of the chromatograms reveals previously disregarded larger alleles thus decreasing SAD, and 2) pooling alleles close in size decreasing stuttering. The corrected dataset shows then a significant excess of heterozygotes as could be expected in a dioecious species with strong population structure. The FDR correction removes then the significant excess of homozygotes and LD between pairs of loci. FST on the cured dataset is used to demonstrate the strong population structure and small effective subpopulation sizes. This is confirmed by a clustering analysis using discriminant analysis of principal components (DAPC).
While based on a specific dataset of ticks from different populations sampled across the USA, the generality of the authors’ approach is presented in Figure 6 in which they provide a step by step flowchart to cure microsatellite datasets from null alleles, SAD and stuttering. Several criteria based on FIS, FST and LD between loci are used as decision keys in the flowchart. An excel file is also provided as help for the curation steps. This study and the proposed methodology are thus extremely useful for all population geneticists working on non-model species with large number of samples genotyped at microsatellite markers. The method not only allows more accurate estimates of heterozygosity but also prevents the thinning of datasets due to the removal of problematic loci. As a follow-up and extension of this work, an exhaustive simulation study could investigate the influence of these data quality issues on past demographic and population structure inference under a wide range of scenarios. This would allow to quantify the current biases in the literature and the robustness of the methodology devised by De Meeûs et al. .
 Jarne, P., and Lagoda, P. J. (1996). Microsatellites, from molecules to populations and back. Trends in ecology & evolution, 11(10), 424-429. doi: 10.1016/0169-5347(96)10049-5
 Cornille, A., Giraud, T., Bellard, C., Tellier, A., Le Cam, B., Smulders, M. J. M., Kleinschmit, J., Roldan-Ruiz, I. and Gladieux, P. (2013). Postglacial recolonization history of the E uropean crabapple (Malus sylvestris M ill.), a wild contributor to the domesticated apple. Molecular Ecology, 22(8), 2249-2263. doi: 10.1111/mec.12231
 Parat, F., Schwertfirm, G., Rudolph, U., Miedaner, T., Korzun, V., Bauer, E., Schön C.-C. and Tellier, A. (2016). Geography and end use drive the diversification of worldwide winter rye populations. Molecular ecology, 25(2), 500-514. doi: 10.1111/mec.13495
 Broquet, T., Ménard, N., & Petit, E. (2007). Noninvasive population genetics: a review of sample source, diet, fragment length and microsatellite motif effects on amplification success and genotyping error rates. Conservation Genetics, 8(1), 249-260. doi: 10.1007/s10592-006-9146-5
 Koffi, M., De Meeûs, T., Séré, M., Bucheton, B., Simo, G., Njiokou, F., Salim, B., Kaboré, J., MacLeod, A., Camara, M., Solano, P., Belem, A. M. G. and Jamonneau, V. (2015). Population genetics and reproductive strategies of African trypanosomes: revisiting available published data. PLoS neglected tropical diseases, 9(10), e0003985. doi: 10.1371/journal.pntd.0003985
 Estoup, A., Jarne, P., & Cornuet, J. M. (2002). Homoplasy and mutation model at microsatellite loci and their consequences for population genetics analysis. Molecular ecology, 11(9), 1591-1604. doi: 10.1046/j.1365-294X.2002.01576.x
 De Meeûs, T., Chan, C. T., Ludwig, J. M., Tsao, J. I., Patel, J., Bhagatwala, J., and Beati, L. (2019). Deceptive combined effects of short allele dominance and stuttering: an example with Ixodes scapularis, the main vector of Lyme disease in the USA. bioRxiv, 622373, ver. 4 peer-reviewed and recommended by Peer Community In Evolutionary Biology. doi: 10.1101/622373