Recommendation

Simulating bacterial evolution forward-in-time

Frederic Bertels based on reviews by 3 anonymous reviewers

A recommendation of:

Simulation of bacterial populations with SLiM

Jean Cury, Benjamin C. Haller, Guillaume Achaz, and Flora Jay (2021), bioRxiv, 2020.09.28.316869, ver. 5 peer-reviewed and recommended by Peer Community in Evolutionary Biology https://doi.org/10.1101/2020.09.28.316869

Now published in Peer Community Journal

Abstract

Simulation of bacterial populations with SLiM

Simulation of genomic data is a key tool in population genetics, yet, to date, there is no forward-in-time simulator of bacterial populations that is both computationally efficient and adaptable to a wide range of scenarios. Here we demonstrate how to simulate bacterial populations with SLiM, a forward-in-time simulator built for eukaryotes. SLiM has gained many users in recent years, due to its speed and power, and has extensive documentation showcasing various scenarios that it can simulate. This paper focuses on a simple demographic scenario, to explore unique aspects of modeling bacteria in SLiM's scripting language. In addition, we illustrate the flexibility of SLiM by simulating the growth of bacteria on a Petri dish with antibiotic. To foster the development of bacterial simulations based upon this recipe, we explain the inner workings of its code. We also validate the simulator, by extensively testing the results of simulations against existing simulators, and against theoretical expectations for some summary statistics. This protocol, with the flexibility and power of SLiM, will enable the community to simulate bacterial populations efficiently under a wide range of evolutionary scenarios.

Bacteria, population genetics, simulations, forward-in-time simulator, SLiM

Submission: posted 02 October 2020
Recommendation: posted 15 February 2021, validated 04 March 2021

Cite this recommendation as:
Bertels, F. (2021) Simulating bacterial evolution forward-in-time. Peer Community in Evolutionary Biology, 100123. https://doi.org/10.24072/pci.evolbiol.100123

Recommendation

Jean Cury and colleagues (2021) have developed a protocol to simulate bacterial evolution in SLiM. In contrast to existing methods that depend on the coalescent, SLiM simulates evolution forward in time. SLiM has, up to now, mostly been used to simulate the evolution of eukaryotes (Haller and Messer 2019), but has been adapted here to simulate evolution in bacteria. Forward-in-time simulations are usually computationally very costly. To circumvent this issue, bacterial population sizes are scaled down. One might expect the results to become inaccurate as a consequence; however, Cury et al. show that scaled-down forward simulations provide very accurate results (similar to those provided by coalescent simulators) that are consistent with theoretical expectations. Simulations were analyzed and compared to existing methods in simple and slightly more complex scenarios where recombination affects evolution. In all scenarios, simulation results from coalescent methods (fastSimBac (De Maio and Wilson 2017), ms (Hudson 2002)) and scaled-down forward simulations were very similar, which is very good news indeed.
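
The idea of scaling down population sizes can be illustrated with a short sketch. It assumes the common rescaling convention in which the population size is divided by a factor while the per-generation mutation and recombination rates are multiplied by the same factor, so that population-scaled parameters (here θ = 2·Ne·μ, a haploid convention taken as an assumption) stay unchanged. The function names and numerical values are hypothetical and are not taken from Cury et al.

```python
import math


def rescale(ne, mu, rho, factor):
    """Shrink the population by `factor` and inflate per-generation rates by it."""
    return ne / factor, mu * factor, rho * factor


def scaled_parameters(ne, mu, rho):
    """Population-scaled mutation and recombination parameters.

    Uses the haploid convention 2 * Ne * rate, which is an assumption here.
    """
    return 2 * ne * mu, 2 * ne * rho


# Hypothetical values chosen only for illustration.
full = (1_000_000, 1e-9, 1e-9)       # (Ne, mutation rate, recombination rate)
small = rescale(*full, factor=100)    # 100-fold smaller population

theta_full, rho_full = scaled_parameters(*full)
theta_small, rho_small = scaled_parameters(*small)

# Both comparisons hold: the scaled parameters are preserved by the rescaling.
print(math.isclose(theta_full, theta_small), math.isclose(rho_full, rho_small))
```

Because the forward simulator then tracks far fewer individuals per generation, the run becomes much cheaper while the expected neutral diversity is, to a first approximation, preserved.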

A biologist who is not aware of the complexities of forward simulations, backward simulations and the coalescent might now naïvely ask why another simulation method is needed if existing methods perform just as well. To address this question, the manuscript closes with a very neat example of what is possible with forward simulations that cannot be achieved using existing methods. The situation modeled is the growth and evolution of a set of 50 bacteria that are randomly distributed on a Petri dish. One side of the Petri dish is covered in an antibiotic; the other is antibiotic-free. Over time, the bacteria grow and acquire antibiotic resistance mutations until the entire artificial Petri dish is covered with a bacterial lawn. This simulation demonstrates that it is possible to simulate extremely complex (e.g. real-world) scenarios to, for example, assess whether certain phenomena are expected under our current understanding of bacterial evolution, or whether there are additional forces that need to be taken into account. Hence, forward simulators could significantly help us to understand what current models can and cannot explain in evolutionary biology.
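
To give a feel for the kind of forward-in-time spatial model described above, here is a deliberately simplified toy sketch in Python. It is not the authors' SLiM recipe: the grid size, number of generations, mutation rate and nearest-neighbour division rule are all hypothetical choices, and the reduced division probability of 0.47 on the antibiotic side simply reuses the fitness value mentioned in the reviews. Only the 50 randomly placed founders and the half-dish antibiotic layout follow the description in the text.

```python
import random


def simulate_dish(size=40, founders=50, generations=200,
                  susceptible_fitness=0.47, mutation_rate=0.01, seed=1):
    """Toy forward simulation of bacteria spreading on a square 'dish'.

    Each grid cell is None (empty), False (susceptible) or True (resistant).
    The right half of the grid carries the antibiotic (hypothetical layout).
    """
    rng = random.Random(seed)
    grid = [[None] * size for _ in range(size)]

    # Place the founders at random, distinct positions; all start susceptible.
    positions = [(x, y) for x in range(size) for y in range(size)]
    for x, y in rng.sample(positions, founders):
        grid[y][x] = False

    history = []  # number of occupied cells after each generation
    for _ in range(generations):
        occupied = [(x, y) for y in range(size) for x in range(size)
                    if grid[y][x] is not None]
        rng.shuffle(occupied)
        for x, y in occupied:
            resistant = grid[y][x]
            on_antibiotic = x >= size // 2
            # Susceptible cells divide less often on the antibiotic side.
            fitness = susceptible_fitness if (on_antibiotic and not resistant) else 1.0
            if rng.random() >= fitness:
                continue  # no division this generation
            dx, dy = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
            nx, ny = x + dx, y + dy
            if 0 <= nx < size and 0 <= ny < size and grid[ny][nx] is None:
                # The daughter inherits resistance or gains it by mutation.
                grid[ny][nx] = resistant or (rng.random() < mutation_rate)
        history.append(sum(cell is not None for row in grid for cell in row))
    return grid, history


grid, history = simulate_dish()
n_resistant = sum(cell is True for row in grid for cell in row)
print(f"occupied after first generation: {history[0]}, "
      f"after last: {history[-1]}, resistant cells: {n_resistant}")
```

Because every individual is tracked explicitly in space and time, ingredients such as local competition, environmental gradients and ongoing selection can be added one rule at a time, which is the flexibility the recommendation highlights.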

References

Cury J, Haller BC, Achaz G, Jay F (2021) Simulation of bacterial populations with SLiM. bioRxiv, 2020.09.28.316869, version 5 peer-reviewed and recommended by Peer Community in Evolutionary Biology. https://doi.org/10.1101/2020.09.28.316869

De Maio N, Wilson DJ (2017) The Bacterial Sequential Markov Coalescent. Genetics, 206, 333–343. https://doi.org/10.1534/genetics.116.198796

Haller BC, Messer PW (2019) SLiM 3: Forward Genetic Simulations Beyond the Wright–Fisher Model. Molecular Biology and Evolution, 36, 632–637. https://doi.org/10.1093/molbev/msy228

Hudson RR (2002) Generating samples under a Wright–Fisher neutral model of genetic variation. Bioinformatics, 18, 337–338. https://doi.org/10.1093/bioinformatics/18.2.337

Conflict of interest:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article. The authors declared that they comply with the PCI rule of having no financial conflicts of interest in relation to the content of the article.

Reviews

Evaluation round #3

DOI or URL of the preprint: 10.1101/2020.09.28.316869

Version of the preprint: 3

Author's Reply, 10 Feb 2021

https://doi.org/10.24072/pci.evolbiol.100356.ar3

Decision by Frederic Bertels, posted 11 Jan 2021

Dear authors,

please revise your manuscript according to the reviewers' comments. Please make sure to address and reply to every comment.

https://doi.org/10.24072/pci.evolbiol.100356.d3

Reviewed by anonymous reviewer 1, 06 Jan 2021

I want to thank the authors for greatly improving their manuscript. I highly appreciate your effort.
I hope you agree that your second version is more readable and precise than the first one.
In my case, almost all the comments I had have been answered:
- You changed the abstract
- You showcase an example
- You clarify the term gene conversion
- You explain the importance of considering a circular genome, etc.

Two major suggestions:
I would like to suggest focusing on two critical aspects of the paper:
1. The figure legends are better, but they do not directly convey the entire message.
For example, when I look at Figure 1, I am not sure whether rescaling factors have also been applied in FastSimBac and ms.

It would be a big plus to include a summary table of your results.

It is a bit of a pain to read through the supplementary figures: the quality is low and the labels are tiny, so it is not easy to go through them.

One minor suggestion:
3. When you show the code on pages 6 and 7, it would be better to avoid this format.
Use a grey shaded area, similar to StackOverflow, when you present code, so that the reader can easily copy, paste, and test it. That's an aesthetic suggestion.

In any case,
Thank you for your effort.
Happy new year

https://doi.org/10.24072/pci.evolbiol.100356.rev31

Reviewed by anonymous reviewer 2, 18 Dec 2020

I have taken a look at the authors' responses to my original comments and found them sound.

https://doi.org/10.24072/pci.evolbiol.100356.rev32

Reviewed by anonymous reviewer 3, 06 Jan 2021

The revised manuscript partly incorporates the reviewers' suggestions. Most of my points were taken care of; however, I see room for improvement in the description of the simulation, the clarity of the figures, and the explanation of the burn-in phase. My largest remaining concern is the discussion of selected recombination events. Here are my detailed points:

An overview figure is provided, however it is only in the supplement and it does not visualise the simulation parameters (Ne, generations, rho, tractlen, genomesize, hgtrate).

An additional simulation of bacteria under antibiotics on a Petri dish is now included. The choice of parameters is quite arbitrary, e.g., the antibiotic reduces the fitness only to 0.47; I guess lower values would be more realistic. However, the simulation is simply meant to display an example of an application, and it serves this purpose. Nevertheless, methodological details are still missing. How does the spatial model work? How is the neighbourhood of a bacterium defined? I guess this is a standard model, so references would help a lot here. I am also missing the information on the recombination rate and tract length for this simulation.

I had provided several references for the discussion of realistic recombination tract lengths. Nevertheless, the authors decided not to discuss the range of recombination tract lengths in the manuscript and instead point to their simulations with lengths of 1,220 bp, 12,200 bp and 122,000 bp. I had also pointed out that their initial choice of a recombination tract length of 122 kbp is based on "selected" recombination events. Thus, this estimate is not a good choice for simulations, which should be based on unselected events. The authors ignored this point in their answer. I understand that at this stage of the manuscript it is not feasible to repeat the simulations, but the distinction between unselected and selected recombination events should at least be mentioned in the discussion. Otherwise the reference to the S. agalactiae length is misleading.

The authors added more information on the burn-in phase and also display it in the figure. As I understand, in the WF model, the burn-in has to be run before the SLiM simulation, however, with the nonWF model it is possible to run it afterwards only on the individuals that have descendants (as displayed in Fig. 8A). However, how can SLiM be run with selection if the diversity and the mutations are not yet clear at the start of the simulation? How is the fitness of the individuals known? I think I am missing a piece of information here.

https://doi.org/10.24072/pci.evolbiol.100356.rev33

Evaluation round #2

DOI or URL of the preprint: 10.1101/2020.09.28.316869

Version of the preprint: 2

Author's Reply, 08 Dec 2020

Dear managing board members,

First of all, we would like to thank everyone (board members, recommenders and reviewers) for their work in general and for the time spent on our manuscript. On Monday, November 30th, we sent our answers to the reviewers’ comments along with an updated version of our manuscript “Simulation of bacterial population with SLiM”. About 14 hours later, we received a notification from Peer Community in Evolutionary Biology informing us that our paper would not be recommended. To our surprise, the recommender did not send our updated paper and replies to the reviewers, and categorically rejected our paper based on criticisms that appear unfair to us. Hence, we would like to appeal this decision for the following reasons.

The main reason for the recommender to reject our paper appears to be that we do not demonstrate that using SLiM opens new simulation possibilities and that, to do so, we should, in addition to our paper, “simply solve an interesting biological problem”. The recommender adds: “The reviewers and I feel like the value of the paper really hangs on the detailed analysis of such an example”. First, we added this example, showcasing new simulation possibilities, as asked by a reviewer, who even suggested adding it to the discussion. Second, we doubt that the reviewers saw this new example, since the paper was not sent for another round of review. We do not intend to take the paper in that direction, because our study belongs to a “method and software” category and it is not our intention to turn it into a “novel biological insights” type of manuscript by deepening the analysis of specific cases. Given PCI’s policy about the scope of a paper (“No need to examine whether the article falls within the scope of the PCI. Once a submission has been validated by the managing board, it is considered suitable for the PCI”), we do not understand the request to include a novel and detailed analysis of an “interesting biological problem”. Besides, PCI Evol Biol states that “Studies of methodologies for evolutionary biology are also appreciated”.

Moreover, it is very solidly established and uncontroversial to say that forward simulation allows many important types of models to be run that are analytically intractable and cannot be simulated with the coalescent. This is the direction in which part of the field of population genetics is moving, as the field realizes how limiting and unrealistic analytical models and the coalescent can be (even while they are obviously fast and powerful, and certainly have their uses). We show how to do such forward simulations for bacteria, for which no efficient and flexible simulator exists in 2020, and demonstrate the power of our method by showing a simple spatial model with environmental pressure and ongoing selection that could never be run with the coalescent, and that could obviously be used or extended to address all sorts of interesting questions.

The other reasons advanced by the recommender to reject our paper are the following:

(i) A criticism that the novel experiment is not reproducible. Yet we included the new model in the GitHub repository associated with this paper at the time of submitting the review; to fully clarify this, we have now added a more explicit sentence in the figure caption itself and in the methods section.

(ii) The recommender was confused about the presence of negative values on a plot of positively defined data. We understood the recommender’s concern to be that showing the mean minus the standard deviation was confusing for readers because it takes negative values while the empirical values are all positive, so we removed this side of the theoretical expectation (the theoretical line only), as asked. However, we did not understand from the comment that there was a misunderstanding: the mean minus the standard deviation of a distribution can be negative even if all values are positive. Although not fully informative about a distribution, showing the mean +/- std is common practice. We want to stress here that no experimental points were removed from the figures.

(iii) A missing supplementary figure and another supplementary figure that we unfortunately forgot to update.

(iv) Vague comments such as we “did not address many points the reviewers have made” without detailing which points, while in our opinion we had addressed all the points, or “the figure legends have not improved”, although we did change the figures and figure captions to include the first round of comments.

(v) Surprisingly, the recommender also questions the meaning of a term in a new supplementary figure. This term is not correctly reported by the recommender (“burn-in through capitation” instead of “burn-in through recapitation”), and it is a term that appears (with its derivatives) no fewer than 12 times in the main text, including in one section title (“Recapitating and adding neutral mutation”). The concept behind this term is explained in detail in different sections of the paper, including the above-mentioned dedicated section.

We do not believe these are fair and constructive criticisms that could justify a rejection. We did not identify in the first (and only) round of reviews a single comment that would question the scientific quality of our work, nor was the request to solve a biological problem a mandatory one. Most of the reviewers’ comments were based on misunderstandings or technical questions, and we believe we have addressed those carefully. For all of these reasons, we are appealing the decision of the recommender and of the managing board to bluntly reject our revised manuscript. Please find attached the updated version of the manuscript (after the recommender’s comments), which is now online on bioRxiv (version 3).

Sincerely,

Jean Cury and co-authors

https://doi.org/10.24072/pci.evolbiol.100356.ar2
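
Point (ii) of the reply above turns on a simple statistical fact: for a skewed sample, the mean minus one standard deviation can fall below zero even when every observation is positive. A minimal numerical illustration follows; the data are hypothetical and unrelated to the manuscript's figures.

```python
import statistics

# Hypothetical, strictly positive and right-skewed sample.
values = [0.1, 0.1, 0.1, 5.0]

mean = statistics.mean(values)
std = statistics.pstdev(values)  # population standard deviation
lower = mean - std               # lower edge of a "mean +/- std" band

print(f"mean={mean:.3f}  std={std:.3f}  mean-std={lower:.3f}")
```

Here the single large value inflates the standard deviation beyond the mean, so the lower edge of the band is negative although no data point is.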

Decision by Frederic Bertels, posted 08 Dec 2020

Dear authors,

Thank you for submitting the revised version of the manuscript. Although I feel that the current version is an improvement, I cannot recommend it for the following reasons:

1. In your reply you wrote “The adaptation of SLiM presented here is not aimed at competing with FastSimBac or ms on “simple” scenarios but rather to open new simulation possibilities”. Indeed, this was not clear in the last version you submitted. One of the main reasons this was not clear is that you failed to present data to support the point that SLiM opens up new simulation possibilities. It is indeed important to compare the SLiM code to existing methods, but what needs to follow is a detailed analysis of a novel use case, or a novel class of use cases, that is impossible or difficult to simulate with existing methods, or a use case that simply solves an interesting biological problem. The use case you present now seems to be an interesting one. Yet, currently there is only a brief mention and a figure of the results. There is no analysis and no code to replicate the figure. The reviewers and I feel that the value of the paper really hangs on the detailed analysis of such an example. Without it, the additional value the manuscript provides over existing SLiM manuscripts and existing simulation methods is small. In my opinion this is also the best way to achieve the aim you state at the end of the discussion: “We hope that our work here will stimulate a wave of development of simulation-based models for bacterial population genetics.” Scientists will certainly be animated by an amazing analysis of a simulated evolution experiment that solves an actual biological question, as has been done with other scripting languages such as Avida.

2. Unfortunately you have not replied to many points the reviewers have made. For example, one of my comments has been left unanswered. Why are there negative values when you normalize the results? I can see that they have now disappeared, but what happened? Also, Supplementary Figure 11 still has those negative values. Is this intended?

3. Supplementary Figures are not numbered correctly and some Supplementary Figures in the text do not seem to exist (or maybe they exist but have a different number).

4. Supplementary Figure 8 is impossible to understand. Why does A start with 2? What does burn-in through capitation mean? What is shown in the figure (e.g. what are the different colored circles?)?

5. Generally, the figure legends have unfortunately not improved.

I wish I could be more positive, but in its current state I cannot recommend the submitted manuscript.

https://doi.org/10.24072/pci.evolbiol.100356.d2

Evaluation round #1

DOI or URL of the preprint: 10.1101/2020.09.28.316869

Version of the preprint: 1

Author's Reply, 30 Nov 2020

https://doi.org/10.24072/pci.evolbiol.100356.ar1

Decision by Frederic Bertels, posted 06 Nov 2020

The reviewers find the approach presented here interesting, but criticize that you have not established the specific advantage of the presented approaches over existing approaches. We feel it is important to analyse common use cases where existing approaches fail or cannot be applied. So far the only comment on the advantage of SLiM over the other methods seems to be that SLiM can take circular genomes into account (How much does this matter?).

Furthermore, we find it difficult to interpret the data and figures presented in the results section. For example, the data presented in Figure 2:

1. There is no 1 to 1 comparison between the WF expectation and the simulation results. For example, a simulation without recombination would be useful to show that in ideal circumstances the simulations perform as expected.

2. Why/how can the normalization lead to negative values? A better explanation of how the normalization works would be helpful for interpreting the figure.

It is also unclear what exactly the figures are intended to show. If the main aim of the figure is to show that rescaling does not have an effect on the data, then the figure should show a direct comparison between different scaling factors. Once it is established that the scaling factors do not change the results, SLiM could then be compared to existing methods. In general, as has been pointed out by the reviewers, improved figure legends would help with understanding the presented data. Finally, jargon and abbreviations are used to an extent that the paper becomes difficult to read.

In conclusion, the manuscript requires very substantial revision in order to be recommended. Importantly, we feel a revision should include data regarding the advantage of SLiM over existing methods.

Additional requirements of the managing board:

As indicated in the 'How does it work?’ section and in the code of conduct, please make sure that:

-Data are available to readers, either in the text or through an open data repository such as Zenodo (free), Dryad or some other institutional repository. Data must be reusable, thus metadata or accompanying text must carefully describe the data.

-Details on quantitative analyses (e.g., data treatment and statistical scripts in R, bioinformatic pipeline scripts, etc.) and details concerning simulations (scripts, codes) are available to readers in the text, as appendices, or through an open data repository, such as Zenodo, Dryad or some other institutional repository. The scripts or codes must be carefully described so that they can be reused.

-Details on experimental procedures are available to readers in the text or as appendices.

-Authors have no financial conflict of interest relating to the article. The article must contain a "Conflict of interest disclosure" paragraph before the reference section containing this sentence: "The authors of this preprint declare that they have no financial conflict of interest with the content of this article." If appropriate, this disclosure may be completed by a sentence indicating that some of the authors are PCI recommenders: “XXX is one of the PCI XXX recommenders.”

https://doi.org/10.24072/pci.evolbiol.100356.d1

Reviewed by anonymous reviewer 1, 06 Nov 2020

Reviewing the paper: Simulation of bacterial populations with SLiM

Dear authors, I appreciate your effort in writing this manuscript. Overall I found the manuscript interesting.

Introduction |- Comments

I found the title of the paper a bit misleading. From the title, I was expecting to see an example of bacterial populations under complex demographic scenarios and selective forces, whereas this paper is more technical. In the first two paragraphs, you explain why simulations are so crucial in bacterial population genomics: simulations can reveal the past and forecast future demographic and evolutionary changes of bacterial populations. In your paper, though, you do not show any direct evidence of SLiM doing that. You do not show any example where SLiM quantifies the eco-evo dynamics of bacterial populations. In the third paragraph, you compare SLiM to other simulators (e.g., it is a forward genetic simulator that can simulate complex scenarios including demography and selective forces, and it has its own language, Eidos, which makes it easy to adjust for simulating bacterial populations).

Methods |- Comments

The methods section confused me considerably, so I will go through it step by step. SLiM has the following characteristics:

1. It is a forward simulator.
2. It has its own coding language, Eidos, which makes it adjustable for simulating bacterial populations.
3. It allows you to simulate bacterial populations under both a Wright-Fisher (WF) model and a non-WF framework.

This is quite clear to me.

Comment: In this manuscript, you perform simulations of bacterial populations under the non-WF framework, but you do not validate the non-WF results with any experimental data.

Methods |- Horizontal gene transfer, recombination and circularity - Comments

Horizontal gene transfer: the exchange of pieces of DNA between different organisms. The piece can be inserted at a random site or at a specific site. If the incoming fragment is homologous, it can be incorporated in a way similar to gene conversion in eukaryotes, where there is no reciprocal exchange of genetic material.

Comment: I see the importance of taking gene conversion into consideration, but you could cite a paper showing its importance in the adaptation of bacterial populations, together with the frequency of gene conversion and homologous recombination. Also, you talk a lot about gene conversion, which in the end you do not consider in your results, unless you refer to recombination as gene conversion, which I doubt. This is not very clear. You rightly claim that SLiM is superior to other programs because it can simulate gene conversion and because you treat the bacterial chromosome as circular. Why is this important? It is known that a bacterial chromosome, in any case, looks like a smear, a chaotic construction where DNA helices are entangled with each other. Also, later in the paper, you contradict your argument about gene conversion by writing: "Because we simulate the entire population, it is not possible to use gene conversion at a significant rate, otherwise ms crashes, thus there is no recombination in burn-in."

Methods |- Burn-in - Comments

It is desirable to start a simulation with a population that is in mutation-drift equilibrium. A population is in mutation-drift equilibrium when both the mutation rate and the effective population size are stable. At this equilibrium, the rate at which variation is lost due to drift equals the rate at which it is gained due to mutation.
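The balance described above can be illustrated with a minimal sketch. It assumes a haploid Wright-Fisher population with an infinite-alleles mutation model, which is an assumption chosen for illustration and not necessarily the model used in the manuscript; the function names are hypothetical. The code iterates the standard recursion for the expected homozygosity F, in which drift raises F through a 1/N chance of common ancestry per generation and mutation lowers it, until the two forces cancel.

```python
def homozygosity_trajectory(pop_size, mu, generations, f0=1.0):
    """Expected homozygosity per generation in a haploid WF population.

    Drift: two lineages share a parent with probability 1 / pop_size.
    Mutation (infinite alleles): identity survives only if neither
    lineage mutates, with probability (1 - mu) ** 2.
    """
    f = f0  # f0 = 1.0 means the population starts monomorphic
    trajectory = [f]
    for _ in range(generations):
        f = (1 - mu) ** 2 * (1 / pop_size + (1 - 1 / pop_size) * f)
        trajectory.append(f)
    return trajectory


def equilibrium_homozygosity(pop_size, mu):
    """Exact fixed point of the recursion above."""
    a = (1 - mu) ** 2
    return (a / pop_size) / (1 - a * (1 - 1 / pop_size))


if __name__ == "__main__":
    n, mu = 1000, 1e-4
    traj = homozygosity_trajectory(n, mu, generations=20000)
    print(traj[-1], equilibrium_homozygosity(n, mu), 1 / (1 + 2 * n * mu))
```

In this sketch the trajectory settles close to 1 / (1 + 2 N mu). When 2 N mu is small, the approach to equilibrium takes on the order of N generations, which is why reaching equilibrium by forward simulation becomes costly for large populations.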

Comments: I do not understand what you mean when you say that the population size is larger than the time-span of interest. I guess you mean that the effective population size needed to reach mutation-drift equilibrium is very high. Could you clarify this?

Methods |- Simulation rescaling - Comments

Here you discuss the effect of rescaling on the summary statistics. This is quite clear to me.

Methods |- Simulation protocol - Comments

Overall, the simulation protocol is detailed and well explained. However, I often got errors when I copy-pasted the code into SLiMgui, e.g., "ERROR (EidosSymbolTable::_GetValue): undefined identifier genomeSize. This error has invalidated the simulation; it cannot be run further. Once the script is fixed, you can recycle the simulation and try again". I suggest making the code more accessible, so that the line numbers are not pasted along with the code when readers test it. I do, however, see the value of numbering the lines. In the end, I used your GitHub code, where the paper's line numbering is hard to follow.
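One possible workaround for the line-number problem described above is sketched below. It assumes that every line of the copied listing starts with its line number followed by a single space or tab, which may not match the manuscript's actual layout. The function name and the example listing are hypothetical.

```python
import re

# A leading line number, followed by exactly one separator (space or tab)
# or by the end of the line (numbered but otherwise empty line).
_LINE_NUMBER = re.compile(r"^\s*\d+(?: |\t|$)")


def strip_line_numbers(listing):
    """Remove the leading line number from each line of a pasted listing,
    keeping the remaining indentation intact."""
    return "\n".join(
        _LINE_NUMBER.sub("", line, count=1) for line in listing.splitlines()
    )


if __name__ == "__main__":
    # Hypothetical pasted listing, not taken from the manuscript.
    pasted = '1 initialize() {\n2     defineConstant("genomeSize", 1e6);\n3 }'
    print(strip_line_numbers(pasted))
```

Note that this would also strip a number from an unnumbered line that happens to begin with digits, so it is only safe when every line of the listing carries a line number.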

Results

The results are quite straightforward. However, when I was reading your introduction, I was prepared for a different type of results. You did what you announced at the end of the introduction (you introduced the model and showed that it behaves according to the WF model). Still, you also present a non-WF model whose results you do not validate against experimental data.

Figure 1: rescaling ~ CPU time and memory
Figure 2: SFS ~ rescaling
Figure 3: LD ~ rescaling
Figure 4: recombination rate & tract length ~ CPU & memory
Figure 5: SFS ~ recombination rate
Figure 6:

Comment: The caption of each figure should convey the main result of the figure, so that it is easier for the reader to skim through your soon-to-be-published paper. For example, for Figure 1, you could write that increasing the rescaling factor reduces CPU time and memory use, and that nonWF simulations are faster.

Discussion

In the discussion, you summarise your results and refer to the drawbacks of your simulator. I could not even find a typo. In general, I have to admit that I admire your efforts. The paper is neat and well structured, and even the bibliography is written accurately. However, there is room for improvement. I believe your methods section needs to be written more clearly; there are several points where the reader gets confused. You have to make your points very clear from the introduction onwards: do not present gene conversion as your strong point, since it is not; clarify what you mean by recombination; and make it clear that this is a technical paper.

https://doi.org/10.24072/pci.evolbiol.100356.rev11

Reviewed by anonymous reviewer 2, 28 Oct 2020

Download the review https://doi.org/10.24072/pci.evolbiol.100356.rev12

Reviewed by anonymous reviewer 3, 22 Oct 2020

This manuscript describes how to adapt the popular simulator SLiM to bacteria, especially to the bacterial mode of recombination. I had wondered about this possibility myself in the past, and I am delighted to see this preprint and the described protocol. However, I see several possibilities for improving the manuscript to better highlight the improvements of the described approach compared to existing approaches.

The manuscript would greatly profit from an overview figure that explains the underlying model and the different parameters used and how they go into the simulation.
The main advantage of the described approach should be presented with an example and discussed. So far, only simulations with comparisons to other programs are done, and they show convincingly that the SLiM approach works well. However, it is not obvious which advantages the presented approach has compared to ms and FastSimBac. Maybe one more complex simulation that includes selection or population structure could be added in the end to show an application of the approach. The advantages over previous approaches could also be added to the "Discussion" section.
Section 2.2.1 "mean recombination tract length of 10kb" First, the distribution could be mentioned here already, although this can be seen in the code. My main point is, however, that this value appears quite large. E.g., unselected recombination events found in 10.1371/journal.ppat.1002745 are on average 2kbp, most of the recombinations inferred in 10.1128/mBio.02494-18 are below 10kb, and the average length of homologous recombination fragments inferred in E. coli is ~500bp (10.1186/1471-2164-13-256). The simulation is presented for parameters from S. agalactiae where the mean length is even above 100kb, and this paper is based on selected recombination events, whereas unselected events should provide the parameters for the simulation (see 10.1371/journal.ppat.1002745 for the difference). Although I understand that these parameters can be adjusted, I wondered how the simulations perform for shorter lengths.
Section 2.2.1 It is not clear to me how the source individual for the recombination event is chosen. Since offspring are added directly to the population, is it possible that recombinants generated earlier can already serve as source individuals for recombinants generated later in the same generation?
The authors should mention the recently released simulator CoreSimul (10.1186/s12859-020-03619-x), maybe in the introduction. If feasible, it would be interesting to see how it compares to SLiM.

Additional comments:

Section 2.1.2 "Because we simulate the entire population, it is not possible to use gene conversion at a significant rate, otherwise ms crashes, thus there is no recombination in burn-in." Maybe you can be more precise and describe why ms crashed. Would more RAM solve the issue? Which population size would be feasible with ms?
Section 2.1.3 "The rescaling factor must also be applied to the duration of the simulation (and the duration of different events that might occur), so that the effects of drift remains similar." Maybe it could be described explicitly how the length and events should be increased or decreased.
Section 2.2.1 "constant 11" Should it read "constant 1"?
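The explicit rescaling rule requested above for Section 2.1.3 could look like the following sketch. It assumes the convention commonly used with forward simulators: for a rescaling factor k, the population size and all durations are divided by k, while per-generation mutation and recombination rates are multiplied by k, so that products such as N times mu stay constant. Whether the manuscript applies exactly this rule is an assumption here, and the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimParams:
    """Hypothetical container for the unscaled simulation parameters."""
    pop_size: int
    mutation_rate: float        # per site per generation
    recombination_rate: float   # per site per generation
    generations: int            # duration of the simulation or of an event


def rescale(params, factor):
    """Apply a rescaling factor k >= 1 under the assumed convention:
    sizes and durations shrink by k, rates grow by k."""
    if factor < 1:
        raise ValueError("rescaling factor must be >= 1")
    return SimParams(
        pop_size=max(1, round(params.pop_size / factor)),
        mutation_rate=params.mutation_rate * factor,
        recombination_rate=params.recombination_rate * factor,
        generations=max(1, round(params.generations / factor)),
    )


if __name__ == "__main__":
    full = SimParams(pop_size=100_000, mutation_rate=1e-9,
                     recombination_rate=1e-9, generations=50_000)
    print(rescale(full, 10))
```

The same division would apply to the timing of any demographic event (for example the generation at which a population size change occurs), so that the amount of drift between events stays comparable to the unscaled model.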

https://doi.org/10.24072/pci.evolbiol.100356.rev13
