Recommendation

Predicting small ancestors using contemporary genomes of large mammals

Bruce Rannala based on reviews by Bruce Rannala and 1 anonymous reviewer

A recommendation of:

Reconstruction of body mass evolution in the Cetartiodactyla and mammals using phylogenomic data

Emeric Figuet, Marion Ballenghien, Nicolas Lartillot, Nicolas Galtier (2017), bioRxiv, 139147, ver. 3 peer-reviewed and recommended by PCI Evol Biol https://doi.org/10.1101/139147

Read preprint in preprint server Now published in Peer Community Journal

Abstract

Reconstruction of body mass evolution in the Cetartiodactyla and mammals using phylogenomic data

Reconstructing ancestral characters on a phylogeny is an arduous task because the observed states at the tips of the tree correspond to a single realization of the underlying evolutionary process. Recently, it was proposed that ancestral traits can be indirectly estimated with the help of molecular data, based on the fact that life history traits influence substitution rates. Here we challenge these new approaches in the Cetartiodactyla, a clade of large mammals which, according to paleontology, derive from small ancestors. Analysing transcriptome data in 41 species, of which 22 were newly sequenced, we provide a dated phylogeny of the Cetartiodactyla and report a significant effect of body mass on the overall substitution rate, the synonymous vs. non-synonymous substitution rate and the dynamics of GC-content. Our molecular comparative analysis points toward relatively small Cetartiodactyla ancestors, in agreement with the fossil record, even though our data set almost exclusively consists of large species. This analysis demonstrates the potential of phylogenomic methods for ancestral trait reconstruction and gives credit to recent suggestions that the ancestor to placental mammals was a relatively large and long-lived animal.

ancestral characters, Bayesian inference, mammals, phylogeny, substitution rate, GC content

Submission: posted 18 May 2017
Recommendation: posted 05 December 2017, validated 05 December 2017

Cite this recommendation as:
Rannala, B. (2017) Predicting small ancestors using contemporary genomes of large mammals. Peer Community in Evolutionary Biology, 100042. https://doi.org/10.24072/pci.evolbiol.100042

Recommendation

Recent methodological developments and increased genome sequencing efforts have introduced the tantalizing possibility of inferring ancestral phenotypes using DNA from contemporary species. One intriguing application of this idea exploits the apparent correlation between substitution rates and body size: patterns of substitution rate variation among lineages, estimated from the genomes of extant species, are used to infer the body sizes of ancestral species [1].
The recommended paper by Figuet et al. [2] examines the utility of such approaches by analyzing the Cetartiodactyla, a clade of large mammals with mostly well-resolved phylogenetic relationships and a reasonably good fossil record. This combination of genomic data and fossils allows a direct comparison between body size predictions obtained from the genomic data and empirical evidence from the fossil record. If predictions seem good in groups such as the Cetartiodactyla, where there is independent evidence from the fossil record, this would increase the credibility of predictions made for species with less abundant fossils.
Figuet et al. [2] analyze transcriptome data for 41 species and report a significant effect of body mass on overall substitution rate, synonymous vs. non-synonymous rates, and the dynamics of GC-content, thus allowing a prediction of small ancestral body size in this group despite the fact that the extant species that were analyzed are nearly all large.
A comparative method based solely on morphology and phylogenetic relationships would be very unlikely to make such a prediction. There are many sources of uncertainty in the variables and parameters associated with these types of approaches: phylogenetic uncertainty (topology and branch lengths), uncertainty about inferred substitution rates, and so on. Although the authors do not account for all these sources of uncertainty, the fact that their predicted body sizes appear sensible is encouraging, and undoubtedly the methods will become more statistically sophisticated over time.
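The logic of predicting body size from molecular rates can be illustrated with a deliberately simplified sketch. This is not the method of Figuet et al. [2], who use Bayesian phylogenetic models; it is an ordinary least-squares fit that ignores phylogeny entirely, and it only shows how a positive relationship between dN/dS and log10 body mass in extant species could turn an estimated ancestral dN/dS into a body mass prediction. The function names (`fit_line`, `predict_mass`) and all numbers below are invented for illustration and are not taken from the paper.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy values for five hypothetical extant species (NOT real data).
dnds = [0.10, 0.12, 0.15, 0.18, 0.20]        # branch-specific dN/dS
body_mass_g = [1e3, 1e4, 1e5, 1e6, 1e7]      # body mass in grams
log_mass = [math.log10(m) for m in body_mass_g]

slope, intercept = fit_line(dnds, log_mass)


def predict_mass(ancestral_dnds):
    """Predicted body mass (g) for a lineage with the given dN/dS."""
    return 10 ** (intercept + slope * ancestral_dnds)


# A hypothetical ancestral branch with a low dN/dS maps to a small body mass,
# even though every species used in the fit is larger.
print(f"slope = {slope:.2f} log10(g) per unit dN/dS")
print(f"predicted ancestral mass = {predict_mass(0.08):.0f} g")
```

In this toy setting the ancestral prediction falls below the smallest extant species because the fitted line is extrapolated, which is the same kind of inference the recommendation highlights: small ancestors predicted from a data set of large species. A real analysis would also have to propagate the uncertainties listed above.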

References

[1] Romiguier J, Ranwez V, Douzery EJP and Galtier N. 2013. Genomic evidence for large, long-lived ancestors to placental mammals. Molecular Biology and Evolution 30: 5–13. doi: 10.1093/molbev/mss211

[2] Figuet E, Ballenghien M, Lartillot N and Galtier N. 2017. Reconstruction of body mass evolution in the Cetartiodactyla and mammals using phylogenomic data. bioRxiv, ver. 3 of 4th December 2017. 139147. doi: 10.1101/139147

Conflict of interest:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article. The authors declared that they comply with the PCI rule of having no financial conflicts of interest in relation to the content of the article.

Reviews

Evaluation round #1

DOI or URL of the preprint: 10.1101/139147

Version of the preprint: 1

Author's Reply, 08 Nov 2017

Download author's reply https://doi.org/10.24072/pci.evolbiol.100067.ar1

Decision by Bruce Rannala, posted 30 Sep 2017

Here are my comments to the authors:

This is an important paper that just needs a few minor changes/clarifications. The authors should revise according to the recommendations of the two reviewers (myself and an anonymous reviewer). In particular, the anonymous reviewer and I both had some concerns about the uncertainty of the phylogeny. I would like to see a bit more analysis to determine whether incomplete lineage sorting may be a source of phylogenetic ambiguity for these data, and I would like to see the RAxML tree with branch lengths included in the paper (as suggested by the anonymous reviewer). Please also respond directly to the following comments by myself and the anonymous reviewer:

Page 12, Correlations of substitution rates/ratios and LHTs: I am not familiar with the COEVOL program but if it is producing the posterior distribution of the correlation coefficient why not provide the posterior mean and credible set rather than a p-value? (which is not a very Bayesian thing to do).

Tables 1 and 2: I think the legends must be reversed.

Figure 2: Are the points on this graph mean posterior dN/dS versus log_10(BM)? This should be stated in the legend.

l88: How strong are the reported correlations? l101: Same question.

l151: I am not sure what the meaning of this sentence is. Does this refer to a better reconstruction of ancestral LHT?

l169: Are you referring to phylogenetic inertia for leaf nodes? Meaning that there is no need to actually compute a correlation in actual species before proceeding to the inference.

l197: A few words on the "home made scripts" would be welcome. How do they filter out misaligned regions?

l209: Any justification for the log_10 transformation?

l261: Any insight on why this index rather than any other?

l289: Kr/Kc ratio... As it is not so standard, can you define it?

Table 1: What about reporting the median/mean/mode? Plots of the posterior densities would also be very informative regarding the strength and robustness of the estimations.
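The posterior summaries requested in the comments above (a posterior mean or median and a credible set for a correlation coefficient, rather than a p-value) could be computed from MCMC draws roughly as follows. This is a minimal sketch under stated assumptions: the draws are simulated stand-ins, no particular COEVOL output format is assumed, and the function name `summarize_posterior` is hypothetical.

```python
import random
import statistics


def summarize_posterior(samples, level=0.95):
    """Posterior mean, median, equal-tailed credible interval and P(value > 0)."""
    ordered = sorted(samples)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lo_idx = int(tail * (n - 1))      # index of the lower tail cut-off
    hi_idx = (n - 1) - lo_idx         # symmetric index for the upper tail
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "interval": (ordered[lo_idx], ordered[hi_idx]),
        "prob_positive": sum(s > 0 for s in ordered) / n,
    }


random.seed(1)
# Simulated stand-in for MCMC draws of a correlation coefficient (NOT real
# output); values are clamped to the valid range [-1, 1].
draws = [max(-1.0, min(1.0, random.gauss(0.4, 0.1))) for _ in range(5000)]

summary = summarize_posterior(draws)
print(summary)
```

Reporting the interval together with the posterior probability that the correlation is positive conveys both the strength and the uncertainty of the association, which a single p-value does not.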

https://doi.org/10.24072/pci.evolbiol.100067.d1

Reviewed by Bruce Rannala, 20 Jul 2017

This paper evaluates the statistical behavior of new methods for analyzing associations between life-history traits (LHTs) and rates of molecular evolution (dS and dN/dS). The basic idea is to study a group (Cetartiodactyla) with a fairly well resolved phylogeny and multiple fossil calibrations to evaluate whether the results seem sensible in this case. If so, that would provide some evidence that the results obtained in groups with poor fossil records might also be reasonable. The paper is well-written and the introduction does a very nice job of summarizing the LHT methods and the motivation for the study. The results (positive correlations between body mass, age at maturity and dN/dS) fit the predictions of the reduced Ne theory as does the negative correlation with GC3. It seems the method is producing reasonable results. I have a few concerns, some minor, some less so:

Page 9, phylogeny reconstruction: if dN/dS systematically varies across the group and the cause is a decreased Ne in larger species, this might create more uncertainty of relationships among small species than among large species -- I wonder whether this could be a source of bias? Have the authors considered trying a species tree inference method that accounts for incomplete lineage sorting (which would have more effect with larger Ne) to see whether the results are consistent with the tree from concatenated sequences? Later in the paper it is noted that some alternative topologies produce similar results for correlations between rates and LHTs but I am still curious.

Page 12, Correlations of substitution rates/ratios and LHTs: I am not familiar with the COEVOL program but if it is producing the posterior distribution of the correlation coefficient why not provide the posterior mean and credible set rather than a p-value? (which is not a very Bayesian thing to do).

Tables 1 and 2: I think the legends must be reversed.

Figure 2: Are the points on this graph mean posterior dN/dS versus log_10(BM)? This should be stated in the legend.

https://doi.org/10.24072/pci.evolbiol.100067.rev11

Reviewed by anonymous reviewer 1, 20 Sep 2017

The ms by Figuet et al. is a case study on the inference of ancestral Life History Traits (LHTs) using molecular markers, specifically dS, dN/dS and GC3. I found the ms scientifically sound, easy to follow and quite appealing. I have only a few minor points that could potentially help broaden the readership.

Since the authors aim at convincing paleontologists (l124), a special effort to make all the analyses crystal clear for non-specialists could be a good choice. As it is now, I am not sure paleontologists will be able to follow.

My main scientific question concerns the lack of coherence between the approaches: dN/dS suggests small body sizes (part 1 --substitution mapping--) but at the same time not (part 2 --coevol--). Conversely, dS is the main driver in the coevol approach but the lengths of internal branches from the RAxML tree are not shown. Finally, why has the GC3 signal not been included in the coevol approach to check its consistency with part 3? Although the three parts all point in the same direction, it would be nice to dedicate some discussion to why the metrics (dS, dN/dS and GC3) differ in their predictions when using different approaches. The lack of coherent signal using dN/dS in the coevol framework is especially puzzling.

It would not hurt to emphasize that part 1 is done on a classical molecular phylogenetic tree (i.e. not ultra-metric) whilst the second is performed on a calibrated ultra-metric tree. I am not sure about the third part. Calibration has its own issues that could be discussed in line with my previous comment.

Can the authors show the RAxML phylogeny? As it is used for the first part of the analysis, it would be nice to have a look at it.

I also have a list of minor points/questions that will be easily addressed. They are mostly of the same nature: the text is sometimes not self-sufficient; thus providing extra information on methods or choices may not hurt. Although interested specialists will likely know or read the cited literature, casual readers would benefit from extra pieces of information within this ms.

l88: How strong are the reported correlations? l101: Same question.

l151: I am not sure what the meaning of this sentence is. Does this refer to a better reconstruction of ancestral LHT?

l169: Are you referring to phylogenetic inertia for leaf nodes? Meaning that there is no need to actually compute a correlation in actual species before proceeding to the inference.

l197: A few words on the "home made scripts" would be welcome. How do they filter out misaligned regions?

l209: Any justification for the log_10 transformation?

l261: Any insight on why this index rather than any other?

l289: Kr/Kc ratio... As it is not so standard, can you define it?

Table 1: What about reporting the median/mean/mode? Plots of the posterior densities would also be very informative regarding the strength and robustness of the estimations.

https://doi.org/10.24072/pci.evolbiol.100067.rev12