Recommendation

An unusual suspect: the mutation landscape as a determinant of local variation in nucleotide diversity

Fernando Racimo based on reviews by David Castellano and 1 anonymous reviewer

A recommendation of:

The landscape of nucleotide diversity in Drosophila melanogaster is shaped by mutation rate variation

Gustavo V Barroso, Julien Y Dutheil (2022), bioRxiv, ver.3, peer-reviewed and recommended by PCI Evolutionary Biology https://doi.org/10.1101/2021.09.16.460667

Now published in Peer Community Journal.

Abstract

What shapes the distribution of nucleotide diversity along the genome? Attempts to answer this question have sparked debate about the roles of neutral stochastic processes and natural selection in molecular evolution. However, the mechanisms of evolution do not act in isolation, and integrative models that simultaneously consider the influence of multiple factors on diversity are lacking; without them, confounding factors lurk in the estimates. Here we present a new statistical method that jointly infers the genomic landscapes of genealogies, recombination rates and mutation rates. In doing so, our model captures the effects of genetic drift, linked selection and local mutation rates on patterns of genomic variation. We then formalise a causal model of how these micro-evolutionary forces interact, and cast it as a linear regression to estimate their individual contributions to levels of diversity along the genome. Our analyses reclaim the well-established signature of linked selection in Drosophila melanogaster, but we estimate that the mutation landscape is the major driver of the genome-wide distribution of diversity in this species. Furthermore, our simulation results suggest that in many evolutionary scenarios the mutation landscape will be a crucial factor shaping diversity, depending notably on the genomic window size. We argue that incorporating mutation rate variation into the null model of molecular evolution will lead to more robust inferences in population genomics.

Sequentially Markovian Coalescent, Mutation rate variation, Linked Selection, Drosophila melanogaster

Submission: posted 30 October 2022, validated 31 October 2022
Recommendation: posted 13 April 2023, validated 13 April 2023

Cite this recommendation as:
Racimo, F. (2023) An unusual suspect: the mutation landscape as a determinant of local variation in nucleotide diversity. Peer Community in Evolutionary Biology, 100636. https://doi.org/10.24072/pci.evolbiol.100636

Recommendation

Sometimes, important factors for explaining biological processes fall through the cracks, and it is only through careful modeling that their importance eventually comes to light. In this study, Barroso and Dutheil introduce a new method based on the sequentially Markovian coalescent (SMC, Marjoram and Wall 2006) for jointly estimating local recombination and coalescent rates along a genome. Unlike previous SMC-based methods, however, their method can also co-estimate local patterns of variation in mutation rates.

This is a powerful improvement which allows them to tackle questions about the reasons for the extensive variation in nucleotide diversity across the chromosomes of a species - a problem that has plagued the minds of population geneticists for decades (Begun and Aquadro 1992, Andolfatto 2007, McVicker et al. 2009, Pouyet and Gilbert 2021). The authors find that variation in de novo mutation rates appears to be the most important factor in determining nucleotide diversity in Drosophila melanogaster. Though this seemingly contradicts previous attempts at addressing the problem (Comeron 2014), they take care to investigate and explain why that might be the case.

Barroso and Dutheil have also carefully explained the details of their new approach and have carried out a very thorough set of analyses comparing competing explanations for patterns of nucleotide variation via causal modeling. The reviewers raised several issues involving choices made by the authors in their analysis of variance partitioning, the proper evaluation of the role of linked selection and the recombination rate estimates emerging from their model. These issues have all been extensively addressed by the authors, and their conclusions seem to remain robust. The study illustrates why the mutation landscape should not be ignored as an important determinant of local variation in genetic diversity, and opens up questions about the generalizability of these results to other organisms.

REFERENCES

Andolfatto, P. (2007). Hitchhiking effects of recurrent beneficial amino acid substitutions in the Drosophila melanogaster genome. Genome research, 17(12), 1755-1762. https://doi.org/10.1101/gr.6691007

Barroso, G. V., & Dutheil, J. Y. (2021). The landscape of nucleotide diversity in Drosophila melanogaster is shaped by mutation rate variation. bioRxiv, 2021.09.16.460667, ver. 3 peer-reviewed and recommended by Peer Community in Evolutionary Biology. https://doi.org/10.1101/2021.09.16.460667

Begun, D. J., & Aquadro, C. F. (1992). Levels of naturally occurring DNA polymorphism correlate with recombination rates in D. melanogaster. Nature, 356(6369), 519-520. https://doi.org/10.1038/356519a0

Comeron, J. M. (2014). Background selection as baseline for nucleotide variation across the Drosophila genome. PLoS Genetics, 10(6), e1004434. https://doi.org/10.1371/journal.pgen.1004434

Marjoram, P., & Wall, J. D. (2006). Fast "coalescent" simulation. BMC genetics, 7, 1-9. https://doi.org/10.1186/1471-2156-7-16

McVicker, G., Gordon, D., Davis, C., & Green, P. (2009). Widespread genomic signatures of natural selection in hominid evolution. PLoS genetics, 5(5), e1000471. https://doi.org/10.1371/journal.pgen.1000471

Pouyet, F., & Gilbert, K. J. (2021). Towards an improved understanding of molecular evolution: the relative roles of selection, drift, and everything in between. Peer Community Journal, 1, e27. https://doi.org/10.24072/pcjournal.16

Conflict of interest:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article. The authors declared that they comply with the PCI rule of having no financial conflicts of interest in relation to the content of the article.

Funding:
This work was supported by a grant from the German Research Foundation (Deutsche Forschungsgemeinschaft) attributed to JYD, within the priority program (SPP) 1590 “probabilistic structures in evolution”.

Reviews

Evaluation round #2

DOI or URL of the preprint: https://doi.org/10.1101/2021.09.16.460667

Version of the preprint: 2

Author's Reply, 02 Apr 2023

Dear Recommender,

We thank you and the reviewer for pointing out ways in which we could improve the presentation of our results. We were able to incorporate all suggestions into the updated version of our manuscript: Figure S1 now includes the demography inferred by iSMC (piecewise constant within time intervals, mapped from splines parameters) as well as the smoothed sketch used for coalescent simulations; Greek letters are now used in the legends of Figures 4 and 5; Panel titles for Figure 5 have been updated to reflect that block lengths of constant mutation rate represent averages from a geometric distribution; Tables 1 and S2 now include visual guides. Clarifications regarding these changes were added in lines 696-703. Comments from reviewer 1 are addressed below.

We are thankful that the reviewer considers our work a contribution to the field, and we agree that our model has shortcomings. For example, in line 532 we mention how it can be improved in the future. We also reiterate that the binning of genomic landscapes into windows is independent from genome-wide parameter estimation and do not see it as a step back in terms of interpretability (we tried to clarify this further in lines 688-694). On the contrary, that our linear model explains >99% of the distribution of diversity along the genome is evidence that our framework is adequate to describe the effects of drift, mutation, recombination and linked selection on patterns of DNA variation. Thus we can use typical procedures like ANOVA and standardized coefficients to assess the impact of each micro-evolutionary mechanism on levels of diversity. These are rather easy to interpret in terms of relative importance. In the end, our findings are not incompatible with the existing literature because previous studies on linked selection focused on its relative importance after removing the effect of mutation rate variation, and also because there are, in fact, studies highlighting the importance of mutation rate variation, although for the most part they have been wrongfully ignored. We provide citations to some of these throughout the main text.

https://doi.org/10.24072/pci.evolbiol.100636.ar2

Decision by Fernando Racimo, posted 20 Mar 2023, validated 20 Mar 2023

Dear authors,

The reviewers have read your replies to them and are (mostly) satisfied with them. Thank you for your answers to the queries by me and the reviewers.

I don't think it will be necessary for the reviewers to see your manuscript again, as long as the points below are addressed.

Best,
Fernando

1. It's unclear whether Supplemental Figure S1 is a direct result of your inference, or a "sketch" (inspired by what?) that you then used in simulations. As reviewer 2 rightly points out (R2.3), there should be a global TMRCA emerging from this analysis, even if not the central focus of your study. If this can't be produced for some reason, a better explanation for that reason should be available in the text.
2. Please improve the resolution of Supplemental Figure S1 so that the x-axis can be read.
3. Figure 5 warrants more explanation as to what exactly is being depicted. For example, does "mu block ~500kb" imply that the mutation rate was simulated so as to vary in blocks of (approximately?) 500 kb? In the text it says exactly 500 kb. Also, could you replace "TMRCA" with "tau", and use the Greek symbols in the figure as you do in the text?
4. Can you draw horizontal lines in Table 1 and Table S2 to help the reader figure out when one model ends and another begins?
5. Can you address this comment by the anonymous reviewer in your text? "Couldn’t lower autocorrelation instead result not from frequent variation in recombination rate window-to-window, but relatively few windows with extreme shifts in recombination rate relative to their neighboring windows?"
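
As an illustration of point 3, here is a minimal sketch of a mutation landscape whose blocks of constant rate have geometrically distributed lengths with a 500 kb mean, which is how the authors' reply describes the Figure 5 simulations. Everything else is an assumption made for illustration: the function names, the 30 Mb genome length, the seed, and the choice of a mean-1 gamma distribution for the relative rate of each block. It is not the authors' simulation code.

```python
import math
import random

def geometric_length(mean_length, rng):
    """Draw a block length (>= 1 bp) from a geometric distribution with the given mean (> 1)."""
    p = 1.0 / mean_length          # per-site probability that the block ends
    u = 1.0 - rng.random()         # uniform on (0, 1], avoids log(0)
    return int(math.log(u) / math.log(1.0 - p)) + 1

def simulate_mutation_landscape(genome_length, mean_block_length, rng, shape=2.5):
    """Return contiguous (start, end, rate_scale) blocks covering [0, genome_length)."""
    blocks = []
    start = 0
    while start < genome_length:
        end = min(start + geometric_length(mean_block_length, rng), genome_length)
        # Hypothetical choice: relative mutation rate drawn from a gamma with mean 1.
        rate_scale = rng.gammavariate(shape, 1.0 / shape)
        blocks.append((start, end, rate_scale))
        start = end
    return blocks

rng = random.Random(42)  # arbitrary seed, for reproducibility only
blocks = simulate_mutation_landscape(30_000_000, 500_000, rng)
lengths = [end - start for start, end, _ in blocks]
print(f"{len(blocks)} blocks, mean length {sum(lengths) / len(lengths):.0f} bp")
```

Under this scheme "~500 kb" is the expected block length, not a fixed one. Individual blocks can be much shorter or longer, which is the distinction the panel titles need to convey.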

https://doi.org/10.24072/pci.evolbiol.100636.d2

Reviewed by David Castellano, 17 Mar 2023

Barroso and Dutheil have addressed my main concerns and clarified the issues I raised in my previous review. I do not have any further comments.

https://doi.org/10.24072/pci.evolbiol.100636.rev21

Reviewed by anonymous reviewer 1, 17 Mar 2023

Download the review https://doi.org/10.24072/pci.evolbiol.100636.rev22

Evaluation round #1

DOI or URL of the preprint: https://doi.org/10.1101/2021.09.16.460667

Version of the preprint: 1

Author's Reply, 16 Feb 2023

Download author's reply https://doi.org/10.24072/pci.evolbiol.100636.ar1

Decision by Fernando Racimo, posted 09 Dec 2022, validated 13 Dec 2022

In this manuscript, Barroso & Dutheil present a new method for co-estimating local recombination rates, local mutation rates and local effective population sizes along the genome, and then apply it to a Drosophila melanogaster haploid genome panel from Zambia. They find a strong role for local variation in mutation rate on variation in local patterns of diversity along the genome - a finding that appears to contradict the conclusions of previous approaches to the question of the major determinants of local diversity. The paper is well written, and I agree with the reviewers that the approach is innovative and elegant. I also think the methodology is very well explained. I have some concerns about the robustness of the biological conclusions, and their dependence on particular decisions by the authors. The first reviewer's point about the size of analysis windows should be further explored, and the authors could more thoroughly test the role of linked selection using simulations. The second reviewer also raised some important points about how the chosen shape for the DFE could influence parameter estimation, and about how the recombination rate estimates could be compared to empirical estimates. I would be happy to recommend this manuscript once these concerns are addressed.

https://doi.org/10.24072/pci.evolbiol.100636.d1

Reviewed by anonymous reviewer 1, 06 Dec 2022

Download the review https://doi.org/10.24072/pci.evolbiol.100636.rev11

Reviewed by David Castellano, 28 Nov 2022

In this manuscript, Barroso & Dutheil propose an extension of a statistical method that jointly infers the genomic landscape of genealogies (or "local Ne"), recombination rates and mutation rates. They benchmark the method with simulations and apply it to Drosophila melanogaster from Zambia. They find that, at the genomic window lengths that they analyze (50Kb, 200Kb and 1Mb), the mutation landscape seems to be the most important determinant of the levels of genetic diversity along the Drosophila genome. This conclusion somewhat contradicts Comeron 2014, who concluded that the genetic diversity landscape is mostly affected by linked selection (or tau, or TMRCA, using Barroso & Dutheil's terminology). However, the authors do a good job of reasoning why both studies seem to reach contradictory conclusions. This manuscript is relevant to the population genomics community because it makes available a powerful tool and it brings the mutation landscape back into the spotlight. I agree that the mutation landscape is often an overlooked ingredient to explain the genetic diversity landscape within a genome.

I've divided this review into 4 sections.

1. Is the science sound, with a logical narrative and well-supported results and conclusions?

The manuscript follows a logical narrative and the methods are sound. The simulations and benchmarking are convincing. The literature context provided in the introduction and discussion is very helpful (below I suggest a couple more papers to back up some ideas, though). However, some aspects require further clarification.

1.1 I do not think that the partial R^2 is a good way to assess the relative importance of each variable (mutation rate, recombination rate and TMRCA) on the levels of genetic diversity. The standardized regression coefficients, which are the regression coefficients obtained from estimating a model on the standardized variables (mean = 0, standard deviation = 1), are better suited for this job IMO. I would also suggest reporting the variance inflation factors in Table 1.

1.2 I wonder why the authors do not try to validate their mutation rate and recombination rate estimates using empirical measures. I understand that perhaps empirical measures of the mutation rate (using mutation accumulation inbred lines?) might be hard to find, but the empirical recombination landscape from Comeron is publicly available. Moreover, how does the global or genome-wide TMRCA inferred with the new method compare to previous demographic estimates in this population? If they are not similar, then what could be the cause?

1.3 Related to the previous point, and perhaps for future implementations to mention in the Discussion: would it be possible to plug in empirical mutation and recombination landscapes to infer the local TMRCA (or "local Ne")?

1.4 Regarding the DFE of Drosophila. The authors simulate a shape = 1 (row 654). This is equivalent to an exponential distribution and will produce way more weakly deleterious mutations than expected when the shape = 0.3-0.4 (which is the value more commonly estimated in the literature for this species, see https://academic.oup.com/mbe/article/35/11/2685/5078937 and others). This excess of weakly deleterious mutations could explain, I believe, the results reported in rows 406 to 415.

1.5 Suggested literature.
1.5.1 In here (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1575889/) Spencer et al. 2006 use "wavelet techniques to identify correlations acting at different scales" which in essence is very similar to Barroso & Dutheil work.
1.5.2 In here (https://onlinelibrary.wiley.com/doi/10.1002/bies.201200150) Martincorena and Luscombe 2012 review "the main forces driving the evolution of local mutation rates and identify the main limiting factors" which might be relevant for sentences in rows 495-500.
1.5.3 Bear in mind that Castellano et al. 2018a was already published in GBE "long" ago. The cited preprint version might be outdated.

1.6 Could the authors provide some intuition about the "five mutation rate classes, five recombination rate classes and 30 coalescent time intervals" used? Why this setting and not another one? How relevant is this choice in downstream analyses?
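
To make the suggestion in point 1.1 concrete, the sketch below computes standardized regression coefficients and variance inflation factors for a response with three predictors. It relies on two standard identities: the standardized coefficients solve R_xx b = r_xy, where R_xx is the predictor correlation matrix and r_xy the predictor-response correlations, and each VIF is the matching diagonal element of the inverse of R_xx. The function names and the eight toy windows are hypothetical and are not values from the study.

```python
import math

def standardize(values):
    """Rescale to mean 0 and (population) standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def correlation(a, b):
    """Pearson correlation of two already standardized columns."""
    return sum(x * y for x, y in zip(a, b)) / len(a)

def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting, for a small square matrix."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def standardized_coefficients_and_vif(predictors, response):
    """predictors: dict mapping name -> list of values; response: list of values."""
    names = list(predictors)
    z = [standardize(predictors[name]) for name in names]
    zy = standardize(response)
    r_xx = [[correlation(a, b) for b in z] for a in z]   # predictor correlations
    r_xy = [correlation(a, zy) for a in z]               # predictor-response correlations
    inv = invert(r_xx)
    k = len(names)
    betas = [sum(inv[i][j] * r_xy[j] for j in range(k)) for i in range(k)]
    vifs = [inv[i][i] for i in range(k)]
    return dict(zip(names, betas)), dict(zip(names, vifs))

# Made-up per-window values, for illustration only.
windows = {
    "mutation":      [0.8, 1.1, 0.9, 1.4, 1.2, 0.7, 1.0, 1.3],
    "recombination": [0.5, 0.9, 1.5, 1.1, 0.6, 1.2, 0.8, 1.4],
    "tmrca":         [0.9, 1.0, 1.2, 1.1, 0.8, 1.1, 1.0, 1.2],
}
diversity = [0.0072, 0.0110, 0.0108, 0.0154, 0.0096, 0.0077, 0.0100, 0.0156]
betas, vifs = standardized_coefficients_and_vif(windows, diversity)
for name in windows:
    print(f"{name}: beta = {betas[name]:.3f}, VIF = {vifs[name]:.2f}")
```

Because every variable is on the same scale, the absolute values of the coefficients can be compared directly, while a VIF well above 1 flags a predictor whose coefficient is poorly determined because it is correlated with the others.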

2. Is there enough info to allow verifying and reproducing the data?

The supplementary information, plus the scripts, are easy to access.

3. Are there obscure passages that you (or a potential reader) can’t go through?

3.1 I do not understand the first sentence of the paragraph starting at row 154.
3.2 In figures 4 and 5 it might be helpful to scale the y-axes in log units?
3.3 Typo at row 245 "the contribution of contribution"?

4. Potential extra analysis only if interesting enough to the recommender and/or author:

Relevant to rows 346-355. I am just curious to know how much tau (or TMRCA) varies along the genome in the absence of selection compared to the presence of selection. Could the authors show some density plots in both scenarios? I think it is often assumed that in the absence of selection genetic diversity is entirely explained by the mutation landscape. Still, it seems that the TMRCA (and genetic diversity) can vary along the genome stochastically in the presence of recombination. This is an interesting finding that should be further highlighted.
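
The point about stochastic variation in TMRCA can be illustrated with a deliberately crude sketch. It assumes a sample of two sequences, a constant population size, no selection, and windows made of independent non-recombining segments, so that each segment's pairwise coalescence time is exponential with mean 1 in coalescent units. Adjacent genealogies are correlated under the SMC, so this is only a toy model and not the authors' method. The function names and the seed are hypothetical.

```python
import random
import statistics

def window_mean_tmrca(n_segments, rng):
    """Mean pairwise coalescence time over independent segments, in coalescent units."""
    return sum(rng.expovariate(1.0) for _ in range(n_segments)) / n_segments

def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.fmean(values)

rng = random.Random(1)  # arbitrary seed, for reproducibility only
n_windows = 2000
for n_segments in (1, 10, 100):
    tmrca = [window_mean_tmrca(n_segments, rng) for _ in range(n_windows)]
    # With a constant mutation rate, expected diversity is proportional to TMRCA,
    # so this is also the relative spread of expected diversity among windows.
    print(n_segments, round(coefficient_of_variation(tmrca), 3))
```

In this toy model the spread among windows shrinks as more independent genealogies are averaged within a window, which is one reason the relative weight of drift is expected to depend on window size and on the amount of recombination.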

https://doi.org/10.24072/pci.evolbiol.100636.rev12
