Recommendation

SARS-Cov-2 genome sequence analysis suggests rapid spread followed by epidemic slowdown in France

B. Jesse Shapiro based on reviews by Luca Ferretti and 2 anonymous reviewers

A recommendation of:

Early phylodynamics analysis of the COVID-19 epidemics in France

Gonché Danesh, Baptiste Elie, Yannis Michalakis, Mircea T. Sofonea, Antonin Bal, Sylvie Behillil, Grégory Destras, David Boutolleau, Sonia Burrel, Anne-Geneviève Marcelin, Jean-Christophe Plantier, Vincent Thibault, Etienne Simon-Loriere, Sylvie van der Werf, Bruno Lina, Laurence Josset, Vincent Enouf, Samuel Alizon and the COVID SMIT PSL group (2020), medRxiv, 2020.06.03.20119925, ver. 3 peer-reviewed and recommended by Peer Community in Evolutionary Biology https://doi.org/10.1101/2020.06.03.20119925

Read preprint in preprint server Now published in Peer Community Journal

Abstract

Early phylodynamics analysis of the COVID-19 epidemics in France

France was one of the first countries to be reached by the COVID-19 pandemic. Here, we analyse 196 SARS-Cov-2 genomes collected between Jan 24 and Mar 24 2020, and perform a phylodynamics analysis. In particular, we analyse the doubling time, reproduction number (Rt) and infection duration associated with the epidemic wave that was detected in incidence data starting from Feb 27. Different models suggest a slowing down of the epidemic in Mar, which would be consistent with the implementation of the national lock-down on Mar 17. The inferred distributions for the effective infection duration and Rt are in line with those estimated from contact tracing data. Finally, based on the available sequence data, we estimate that the French epidemic wave originated between mid-Jan and early Feb. Overall, this analysis shows the potential to use sequence genomic data to inform public health decisions in an epidemic crisis context and calls for further analyses with denser sampling.

phylodynamics, SARS-Cov-2, virus, France, outbreak

Submission: posted 04 June 2020
Recommendation: posted 07 August 2020, validated 18 August 2020

Cite this recommendation as:
Shapiro, B. (2020) SARS-Cov-2 genome sequence analysis suggests rapid spread followed by epidemic slowdown in France. Peer Community in Evolutionary Biology, 100107. https://doi.org/10.24072/pci.evolbiol.100107

Recommendation

Sequencing and analyzing SARS-Cov-2 genomes in nearly real time has the potential to quickly confirm (and inform) our knowledge of, and response to, the current pandemic [1,2]. In this manuscript [3], Danesh and colleagues use the earliest set of SARS-Cov-2 genome sequences available from France to make inferences about the timing of the major epidemic wave, the duration of infections, and the efficacy of lockdown measures. Their phylodynamic estimates, based on fitting genomic data to molecular clock and transmission models, are reassuringly close to estimates based on 'traditional' epidemiological methods: the French epidemic likely began in mid-January or early February 2020, and spread relatively rapidly (doubling every 3-5 days), with people remaining infectious for a median of 5 days [4,5]. These transmission parameters are broadly in line with estimates from China [6,7], but are currently unknown in France (in the absence of contact tracing data). By estimating the temporal reproductive number (Rt), the authors detected a slowing down of the epidemic in the most recent period of the study, after mid-March, supporting the efficacy of lockdown measures.
Along with the three reviewers of this manuscript, I was impressed with the careful and exhaustive phylodynamic analyses reported by Danesh et al. [3]. Notably, they take care to show that the major results are robust to the choice of priors and to sampling. The authors are also careful to note that the results are based on a limited sample size of SARS-Cov-2 genomes, which may not be representative of all regions in France. Their analysis also focused on the dominant SARS-Cov-2 lineage circulating in France, which is also circulating in other countries. The variations they inferred in epidemic growth in France could therefore reflect broader control policies in Europe, not only those in France. Clearly, more work is needed to fully unravel which control policies (and where) were most effective in slowing the spread of SARS-Cov-2, but Danesh et al. [3] set a solid foundation to build upon with more data. Overall, this is an exemplary study, enabled by rapid and open sharing of sequencing data, which provides a template to be replicated and expanded in other countries and regions as they deal with their own localized instances of this pandemic.
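As a rough illustration of what a doubling time of 3-5 days implies, the sketch below converts a doubling time into an exponential growth rate using r = ln(2) / T. It assumes simple exponential growth; the function name is hypothetical and this is not the authors' phylodynamic inference, which estimates growth from the genealogy.

```python
import math


def growth_rate_from_doubling_time(doubling_time_days: float) -> float:
    """Exponential growth rate r (per day) implied by a doubling time.

    Assumes pure exponential growth, N(t) = N0 * exp(r * t),
    so that N doubles when r * T = ln(2).
    """
    if doubling_time_days <= 0:
        raise ValueError("doubling time must be positive")
    return math.log(2) / doubling_time_days


# Bounds of the 3-5 day doubling-time range quoted in the recommendation.
for t_d in (3.0, 5.0):
    r = growth_rate_from_doubling_time(t_d)
    print(f"doubling time {t_d:.0f} d -> r = {r:.3f} per day")
```

Under this assumption, the quoted range corresponds to growth rates of roughly 0.14 to 0.23 per day.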

References

[1] Grubaugh, N. D., Ladner, J. T., Lemey, P., Pybus, O. G., Rambaut, A., Holmes, E. C., & Andersen, K. G. (2019). Tracking virus outbreaks in the twenty-first century. Nature Microbiology, 4(1), 10-19. doi: 10.1038/s41564-018-0296-2
[2] Fauver et al. (2020) Coast-to-Coast Spread of SARS-CoV-2 during the Early Epidemic in the United States. Cell, 181(5), 990-996.e5. doi: 10.1016/j.cell.2020.04.021
[3] Danesh, G., Elie, B., Michalakis, Y., Sofonea, M. T., Bal, A., Behillil, S., Destras, G., Boutolleau, D., Burrel, S., Marcelin, A.-G., Plantier, J.-C., Thibault, V., Simon-Loriere, E., van der Werf, S., Lina, B., Josset, L., Enouf, V., Alizon, S. and the COVID SMIT PSL group (2020) Early phylodynamics analysis of the COVID-19 epidemic in France. medRxiv, 2020.06.03.20119925, ver. 3 peer-reviewed and recommended by PCI Evolutionary Biology. doi: 10.1101/2020.06.03.20119925
[4] Salje et al. (2020) Estimating the burden of SARS-CoV-2 in France. hal-pasteur.archives-ouvertes.fr/pasteur-02548181
[5] Sofonea, M. T., Reyné, B., Elie, B., Djidjou-Demasse, R., Selinger, C., Michalakis, Y. and Alizon, S. (2020) Epidemiological monitoring and control perspectives: application of a parsimonious modelling framework to the COVID-19 dynamics in France. medRxiv, 2020.05.22.20110593. doi: 10.1101/2020.05.22.20110593
[6] Rambaut, A. (2020) Phylogenetic analysis of nCoV-2019 genomes. virological.org/t/phylodynamic-analysis-176-genomes-6-mar-2020/356
[7] Li et al. (2020) Early transmission dynamics in Wuhan, China, of novel coronavirus–infected pneumonia. N Engl J Med, 382: 1199-1207. doi: 10.1056/NEJMoa2001316

Conflict of interest:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article. The authors declared that they comply with the PCI rule of having no financial conflicts of interest in relation to the content of the article.

Reviews

Evaluation round #1

DOI or URL of the preprint: https://www.medrxiv.org/content/10.1101/2020.06.03.20119925v1

Author's Reply, 03 Aug 2020

https://doi.org/10.24072/pci.evolbiol.100244.ar1

Decision by B. Jesse Shapiro, posted 22 Jul 2020

Thank you for submitting your manuscript, which has now been seen by three reviewers.

In general, the reviewers found the manuscript interesting but noted several points that need clarification and further discussion. Pending these minor revisions, I think the manuscript will merit recommendation.

https://doi.org/10.24072/pci.evolbiol.100244.d1

Reviewed by Luca Ferretti, 22 Jul 2020

This manuscript presents an exhaustive phylodynamic analysis of the early phase of the French COVID-19 epidemic.

The focus is on the dominant clade (B.1) since it is the only informative one for phylodynamic analyses. The lack of sequences from the initial days of the epidemic hinders a more detailed reconstruction of the early dynamics due to lack of resolution.

Nevertheless, the results are quite strong and informative. The beginning of the French epidemic is dated within a reasonable interval (mid-January to early February) consistent with epidemiological evidence for the European epidemic.

The doubling time is also consistent with the epidemiological evidence, although with large uncertainties.

The authors also find an increase in the doubling time over the course of the epidemic, which would be extremely interesting if it reflects a slowdown due to the non-pharmaceutical interventions implemented in France. However, I wonder if either the non-uniform sampling rate or the geographical spread of the strains could have affected this result. Intuitively, both could cause an upward bias in the inferred growth rate in the early part of the tree. In my opinion, while very suggestive, the evidence presented here is not conclusive.

The birth-death skyline analysis used to infer the duration of the infectious period is very intriguing. The authors find an infectious period of 4-6 days from phylodynamic evidence. Note that in the model by Stadler et al. (2013), infectiousness is constant in time until individuals are no longer infectious, while COVID-19 has a bell-like profile of infectiousness centered around 4-6 days post infection. Hence, it is difficult to assess the agreement between the authors' result and the known generation time distribution of COVID-19. Assuming that the relevant comparison is with the duration of the infectious period, the estimates in this manuscript would be reasonably close to the epidemiological evidence. If instead the relevant comparison is with the relation between exponential growth rate and R0 (Euler-Lotka equation), the model used by the authors would lead to a significant underestimation of R0 in the initial phase or an overestimation of the infectious interval, as well as an overestimation of Rt when the epidemic is decreasing. This could be one of the reasons behind the suspiciously low (although very uncertain) value of the inferred Rt in the first period of the epidemic, and the fact that Rt>1 in later phases. In any case, the values of the infectious period and Rt are in the right ballpark.
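To make the point about the growing phase concrete, here is a minimal sketch of how the reproduction number implied by a given exponential growth rate depends on the assumed shape of the generation time distribution. It uses the standard relation 1/R = E[exp(-r * T)] with a gamma-distributed generation time T; shape 1 is the exponential case that corresponds to constant infectiousness. The function name and the numerical values (r = 0.17 per day, mean of 5 days, shape 4) are illustrative assumptions, not quantities taken from the manuscript.

```python
def reproduction_number(r: float, mean_gt: float, shape: float = 1.0) -> float:
    """R implied by growth rate r (per day) for a gamma generation time.

    Uses 1/R = E[exp(-r * T)], with T gamma-distributed with the given
    mean (days) and shape. shape=1.0 is the exponential distribution,
    i.e. constant infectiousness until recovery, giving R = 1 + r * mean.
    """
    scale = mean_gt / shape
    if 1 + r * scale <= 0:
        raise ValueError("growth rate too negative for this distribution")
    return (1 + r * scale) ** shape


# Illustrative values only (assumptions, not estimates from the manuscript).
r = 0.17          # per day, roughly a 4-day doubling time
mean_gt = 5.0     # days

r_exponential = reproduction_number(r, mean_gt, shape=1.0)
r_bell_shaped = reproduction_number(r, mean_gt, shape=4.0)

print(f"exponential generation time: R = {r_exponential:.2f}")
print(f"bell-shaped (gamma, shape 4): R = {r_bell_shaped:.2f}")
```

For a growing epidemic and the same mean, the exponential assumption gives the smaller R, which is the direction of the underestimation discussed above for the initial phase.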

Overall, this is a very good and clear manuscript that provides an excellent example of the power of phylodynamics to infer quantities of epidemiological interest.

https://doi.org/10.24072/pci.evolbiol.100244.rev11

Reviewed by anonymous reviewer 1, 05 Jul 2020

Danesh et al. present a study of the phylodynamics of SARS-CoV-2 sequences from France early in the outbreak. Their analysis is based on 196 genomic sequences collected early in the outbreak (January 24 - March 24 2020) and they estimate several key epidemiological parameters from the sequence data. While this work is important and timely, some clarifications would strengthen the manuscript.

-In Figure 3, the estimated doubling time from the sequences from the second half of the epidemic (France 61-2 set) is lower than the doubling time for the sequences from the whole epidemic (France 122a) or the doubling time for the sequences from the first three quarters of the epidemic (France 81). This does not appear to be consistent with the interpretation that adding more recent sequences increases estimated doubling time. Was estimated doubling time lower at the end of the time period examined as well as at the beginning?

-The methods section states that 196 sequences are analyzed. However, 204 sequence ids are listed in Supplemental Table 1. The set of sequences used for the analysis should be clarified.

-In a few places, more detail on the methods used would be helpful. In particular, it would be helpful to provide more detail on the steps taken to align and clean the data using the augur pipeline (what parameters were used to filter and align the sequences?). Parameters used to run RDP, SMS, and PhyML should be listed (where default parameters were used, this should be specified). The authors should also consider including the phylogenetic tree in Figure 1 as a supplemental file.

-The motivation for the molecular clock settings chosen (above, below and equal to a previously reported value) is described only in the methods; it would be helpful to have this information in the results section when Figure 2 is discussed. It would also be helpful in the caption of Figure 2 to specify that the clock rate is in substitutions per site per year (this is also described only in the methods). Also, in Figure 2, the models should be listed in the legend in either increasing or decreasing order of clock rate (right now the slowest of the three clocks is in the middle of the legend, which is confusing).

-In the introduction (line 14), the Liu et al. reference is not the right one for the genomic sequence of SARS-CoV-2. For the initial sequencing of the virus, cite Wu et al., 2020, A New Coronavirus Associated With Human Respiratory Disease in China and Zhou et al., 2020, A pneumonia outbreak associated with a new coronavirus of probable bat origin.

Typographical suggestions:

-title: epidemics → epidemic

-line 2: pandemics → pandemic

-line 19-20: “Early results allowed to better understand the origin of SARS-Cov-2 and identify” → “Early results allowed better understanding of the origin of SARS-CoV-2 and identification of”

-line 34: “among which the temporal reproduction number” → “including the temporal reproduction number”

-line 42: “epidemics” → “epidemic”

-Fig 1 legend: “because outside the main clade” → “because they are outside the main clade”

-Figure 2 legend: “fix molecular clock” → “fixed molecular clock”

-line 79: In the following of the work → in the following work

-line 87: In appendix → in the appendix

-line 95: in smaller dataset → in the smaller dataset

-line 159: “if we use a,” → delete comma

-line 234: bayesian → Bayesian (fix capitalization)

-line 250: as previous models → as in previous models

-line 264: delete comma

https://doi.org/10.24072/pci.evolbiol.100244.rev12

Reviewed by anonymous reviewer 2, 16 Jul 2020

Danesh et al. perform a phylodynamic analysis of the French SARS-CoV-2 epidemic, and notably estimate the reproduction number and the duration of infection from the phylogeny. They analyze the sensitivity of their results to the sequence sampling, and observe an effect of the lockdown on the reproduction number. The values they infer for the parameters of the epidemic agree with those of contact-tracing analyses.

I found Danesh et al.'s manuscript interesting, in particular that the phylodynamic estimates overlap with contact tracing estimates, but have a few comments and suggestions to make.

First, I found an interpretation of the phylogeny puzzling: it seems a polytomy is interpreted by the authors as evidence for multiple introductions, but I don't understand why it would be so.

Second, the sampling is uneven across French regions, with some regions entirely missing. I think the authors should address this problem, for instance by discussing its origin and its potential consequences on the estimated phylogeny and parameters.

Third, I think the manuscript can be made clearer by reorganizing some parts, and explaining some terms (see below for specific examples). Fourth, I found myself missing some technical information, for instance on the clock models that were used, on convergence diagnostics, and I would have liked to see a more systematic comparison between the prior and the posterior distributions (see below for specific examples).

My opinion is that those points should be addressed before any recommendation.

More specific comments:

p3 l55: I think it would be useful if the authors could address the lack of sequences coming from the Provence-Alpes-Côte d'Azur region. It is also a missing point in Gambaro et al., but those authors circumvented this issue by arguing that they were focusing on the epidemic in the north of France. That's not what the authors here are aiming to do, but can they still talk about the epidemic in all of France if entire regions are missing?

p4 l70: "Another interpretation, could be independent introductions in France (up to 6 events)." : I don't understand. Independent introductions would not necessarily create a polytomy, as is shown by the sequences in black. And I'm not sure where the number 6 comes from.

Fig. 1: there seems to be a contrast between the relative number of sequences coming from Île de France in the data set and the relative number of infections or hospitalizations in Île de France. Based on the latter, I would expect many more red sequences in the phylogeny. Have cases in Île de France been under-sequenced compared to other regions? Finally, the scattered distribution of the sequences from Île de France indicates that they may be the source of many clusters in other French regions, if the support values in the phylogeny are high enough.

Fig. 2: the legend of the distributions could be improved, notably by indicating the units.

Fig. S4: the legend needs to be improved as in Fig. S5.

Fig S3: it is not clear how the priors were specified, in particular their parameters.

p5 l90: "This was also true for the BDSKY model, where the prior shape for the recovery rate had little impact (Figure S3)": I would clarify and add something like "... but the positional parameters of the prior had an important impact."

p6 l115: the data sets (France81, France61-1) are introduced here even though they have been discussed previously when commenting on Table 1.

p6: "Since the first dataset includes more recent" : do the authors count the full dataset France 122a when writing this sentence, or do they just talk about the 3 other ones (the subsets)?

l125: "Adding more recent sequence data indeed leads to an increase in epidemic doubling time. Initially, with the first 61 sequences (which run from Feb 21 to Mar 12)": I am wondering if the doubling time is the only parameter that changes in this experiment. Indeed, altering the sequence sample may change other parameter estimates, which may be correlated with the doubling time. Besides, the sampling effort was probably not the same between before March 12 and after March 24. Are such differences in sampling effort accounted for in the model?

l130: I found this paragraph about molecular clocks confusing because it was unclear to me what clock models had been used.

Fig. S6: "convergence is limited" : what does that mean? That convergence diagnostics were characteristic of a lack of convergence of the MCMC chains?

l150-155: I think it would be useful to define what the authors mean by duration of contagiousness vs distribution of infectious periods. I assume the latter is the time between successive infections in a transmission chain, but I'd like to be sure... Also, it is not clear to me why for these estimates the authors no longer consider the influence of the priors on the rate of evolution or of sampling time.

Fig. S3, S6, S8, S9: it would help in all figures showing the impact of the prior on the posterior distributions to also display the prior distributions.

l180: "allows to infer phylogenies" : allows one to

l215: "As acknowledge in the introduction" : acknowledged

https://doi.org/10.24072/pci.evolbiol.100244.rev13