The evolutionary puzzle of the host-parasite-endosymbiont Russian doll for apples and aphids
Large-scale geographic survey provides insights into the colonization history of a major aphid pest on its cultivated apple host in Europe, North America and North Africa
Recommendation: posted 22 October 2021, validated 26 October 2021
Each individual multicellular organism, each of our bodies, is a small universe. Every living surface (skin, cuticle, bark, mucosa) is home to billions of bacteria, fungi and viruses. They constitute our microbiota. Some of them are essential for certain organisms. Others could not live without their hosts. For many species, the relationship between host and microbiota is so close that their histories are inseparable. The recognition of this biological inextricability has led to the notion of the holobiont as the ensemble formed by the host and its microbiota. When individuals of a particular animal or plant species expand their geographical range, it is the holobiont that expands. These processes of migration, expansion and colonization are often accompanied by evolutionary and ecological innovations in interspecies relationships, at the macroscopic level (e.g. novel predator-prey or host-parasite interactions) and at the microscopic level (e.g. changes in microbiota composition). From the human point of view, these novel interactions can be economically disastrous if they involve and threaten important crop or cattle species. This is especially worrying in the present context of genetic standardization and intensification for mass production on the one hand, and of climate change on the other.
With this perspective, the international team led by Amandine Cornille presents a study aiming at understanding the evolutionary history of the rosy apple aphid Dysaphis plantaginea Passerini, a major pest of the cultivated apple tree Malus domestica Borkh (1). The apple tree was probably domesticated in Central Asia and later disseminated by humans over the world in different waves, and it was probably introduced into Europe by the Greeks. It is however unclear when and where D. plantaginea started parasitizing the cultivated apple tree. The ancestral D. plantaginea could already have parasitized the wild ancestor of current cultivated apple trees, but the aphid is not common in Central Asia. Alternatively, it may have gained access to the plant only later, possibly via a host jump from Pyrus to Malus that may have occurred in Asia Minor or in the Caucasus. In the present preprint, Olvera-Vázquez and coworkers have analysed over 650 D. plantaginea colonies from 52 orchards in 13 countries, in Western, Central and Eastern Europe as well as in Morocco and the USA. The authors have analysed the genetic diversity in the sampled aphids, and have also characterized the composition of the associated endosymbiont bacteria. The analyses detect substantial recent admixture, but allow the identification of aphid subpopulations that are slightly but significantly differentiated and isolated by distance, especially those in Morocco and the USA, as well as the detection of significant gene flow. This process of colonization associated with gene flow is most likely indirectly driven by human interactions. Very interestingly, the data show that this genetic diversity in the aphids is not reflected by a corresponding diversity in the associated microbiota, which is largely dominated by a few Buchnera aphidicola variants. In order to determine polarity in the evolutionary history of the aphid-tree association, the authors have applied approximate Bayesian computation and machine learning approaches.
Although promising, the results are not sufficiently robust to assess directionality or to confidently identify the origin of the crop pest. Despite the large effort reported here, the authors point to the lack of sufficient data (in terms of aphid isolates), especially originating from Central Asia. Such increased sampling will need to be implemented in the future in order to elucidate the origin and the demographic history of the interaction between the cultivated apple tree and the rosy apple aphid. This knowledge is needed to understand how this crop pest copes with the different seasonal and geographical selection pressures while maintaining high genetic diversity, conspicuous gene flow, differentiated populations and low endosymbiont diversity.
- Olvera-Vazquez SG, Remoué C, Venon A, Rousselet A, Grandcolas O, Azrine M, Momont L, Galan M, Benoit L, David GM, Alhmedi A, Beliën T, Alins G, Franck P, Haddioui A, Jacobsen SK, Andreev R, Simon S, Sigsgaard L, Guibert E, Tournant L, Gazel F, Mody K, Khachtib Y, Roman A, Ursu TM, Zakharov IA, Belcram H, Harry M, Roth M, Simon JC, Oram S, Ricard JM, Agnello A, Beers EH, Engelman J, Balti I, Salhi-Hannachi A, Zhang H, Tu H, Mottet C, Barrès B, Degrave A, Razmjou J, Giraud T, Falque M, Dapena E, Miñarro M, Jardillier L, Deschamps P, Jousselin E, Cornille A (2021) Large-scale geographic survey provides insights into the colonization history of a major aphid pest on its cultivated apple host in Europe, North America and North Africa. bioRxiv, 2020.12.11.421644, ver. 3 peer-reviewed and recommended by Peer Community in Evolutionary Biology. https://doi.org/10.1101/2020.12.11.421644
Ignacio Bravo (2021) The evolutionary puzzle of the host-parasite-endosymbiont Russian doll for apples and aphids. Peer Community in Evolutionary Biology, 100134. https://doi.org/10.24072/pci.evolbiol.100134
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article. The authors declared that they comply with the PCI rule of having no financial conflicts of interest in relation to the content of the article.
Evaluation round #2
DOI or URL of the preprint: https://www.biorxiv.org/content/10.1101/2020.12.11.421644v2
Version of the preprint: 2
Author's Reply, 27 Sep 2021
Decision by Ignacio Bravo, posted 23 Aug 2021
First of all, please accept my apologies for the long delay between the reviewers’ responses and this decision.
In the current revision of their manuscript, Olvera-Vázquez and coworkers have addressed most of the points raised during the first PCI review round. Most of the questions have been properly addressed, and I think the review process has helped clarify the message. Nevertheless, a number of points remain confusing to me and still need to be clarified, as detailed below:
-Treatment of admixed individuals.
The authors were unable to assign 175 individuals (a third of the total individuals analysed) to any of the five genetic clusters. These individuals were thus not included in any further analyses. The large number of admixed individuals raises an important concern about the pertinence and interpretation of subsequent analyses. This possible caveat needs to be properly identified, and a clear word of caution must be raised in the discussion and possibly in the abstract. The implications of the heterogeneous distribution of the admixed individuals for the RF-ABC approach also need to be explicitly stated.
I think an indication of the geographical distribution of these admixed individuals is needed. I pointed this out in my previous decision and the authors answered “We believe that the map Figure 2 is already presenting those results. The Western European and Spanish genetic clusters are the most admixed, as shown in the mean membership coefficient per site”. I am afraid I disagree with this answer: nothing in Figure 2 indicates the geographical distribution of the admixed individuals. From the data presented in Fig S6 it seems that the admixed individuals are not evenly distributed across sampling sites. For instance, samples collected in the USA, Morocco or Romania seem to contain fewer admixed individuals than samples from France. (Note: it may be that the admixed individuals have been included in the analyses depicted in Figure 1, but this is unclear.)
I suggest that the authors include in the pie charts in Figure 2c a sixth category corresponding to the admixed individuals. I would also suggest avoiding overlap between the pie charts (the precise location is given elsewhere) and making the size of each pie chart proportional to the total number of individuals analysed at the sampling site.
The 3-D PCA is unclear (as any 3-D representation is). I would suggest presenting instead two 2-D representations, PC1 vs PC2 and PC1 vs PC3. This may help highlight the apparently true isolation of the blue genetic cluster and may also help visualise the apparently intermediate location of admixed individuals between the green and red genetic clusters.
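To make the request about the pie charts concrete, here is a minimal sketch of how a sixth "admixed" category and per-site totals could be tabulated. Everything in it is illustrative: the function name `pie_categories`, the site labels, the toy membership coefficients and the 0.8 assignment threshold are assumptions of this sketch, not values taken from the manuscript.

```python
from collections import Counter, defaultdict

def pie_categories(memberships, threshold=0.8):
    """Count individuals per site and per category.

    memberships: iterable of (site, [membership coefficient per cluster]).
    An individual is labelled with its best cluster when the highest
    coefficient reaches the threshold, and "admixed" otherwise.
    The 0.8 threshold is a hypothetical value chosen for illustration.
    """
    counts = defaultdict(Counter)
    for site, coeffs in memberships:
        best = max(range(len(coeffs)), key=coeffs.__getitem__)
        label = f"cluster_{best + 1}" if coeffs[best] >= threshold else "admixed"
        counts[site][label] += 1
    return {site: dict(c) for site, c in counts.items()}

# Toy data (invented): three clusters, two sampling sites.
toy = [
    ("site_A", [0.90, 0.05, 0.05]),
    ("site_A", [0.40, 0.35, 0.25]),
    ("site_A", [0.45, 0.45, 0.10]),
    ("site_B", [0.02, 0.95, 0.03]),
    ("site_B", [0.10, 0.85, 0.05]),
]

result = pie_categories(toy)
# Total number of individuals per site, usable to scale each pie chart.
totals = {site: sum(c.values()) for site, c in result.items()}
print(result)
print(totals)
```

With such a table, each pie chart would show the admixed fraction explicitly, and the per-site totals could set the pie radius.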
-Distribution of samples used for 16S rDNA analyses.
I understand from the answer to my comment that the authors are aware of the lack of representation of certain geographical regions in these analyses. For the sake of clarity and to avoid generalisations, I would suggest stating explicitly in the discussion which geographical locations were undersampled for metagenomics, or not included at all, compared with the sampling for the aphid genome markers.
-Regarding isolation-by-distance analyses.
The Sp-based analyses have been performed using only the individuals allocated to one of the five genetic clusters, and the same seems to hold true for the results in Fig S11. However, this analysis should probably be performed using all individuals in each orchard, including the admixed ones. Please include in the figure the results of the fit that are reported in the text (F value, P value, R2). I would also recommend performing the linear fit without the most distant sampling sites in the USA, and probably also performing it for the European samples only. Please also specify the units of the x-axis.
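As an illustration of the subset fits requested above, the following minimal sketch fits genetic distance against geographic distance for all site pairs, for pairs excluding the USA, and for European pairs only. The pairwise table, the region labels and the helper names `linear_fit` and `fit_subset` are invented for this sketch, and it uses a plain least-squares fit rather than the authors' actual procedure. The P value is omitted because it requires the F distribution from a statistics package.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept, with R2 and F."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = (sxy * sxy) / (sxx * syy)
    # F statistic for one predictor, with 1 and n - 2 degrees of freedom.
    f_value = float("inf") if r2 >= 1.0 else r2 / (1.0 - r2) * (n - 2)
    return {"n": n, "slope": slope, "intercept": intercept, "r2": r2, "F": f_value}

# Hypothetical pairwise table (invented values):
# (region of site 1, region of site 2, geographic distance in km, genetic distance)
toy_pairs = [
    ("Europe", "Europe", 200.0, 0.010),
    ("Europe", "Europe", 900.0, 0.018),
    ("Europe", "Europe", 1500.0, 0.021),
    ("Europe", "Europe", 2400.0, 0.034),
    ("Europe", "Morocco", 1800.0, 0.040),
    ("Europe", "Morocco", 2600.0, 0.047),
    ("Europe", "USA", 6200.0, 0.052),
    ("Morocco", "USA", 6000.0, 0.060),
]

def fit_subset(pairs, keep):
    """Fit only the site pairs for which keep(region_1, region_2) is true."""
    rows = [p for p in pairs if keep(p[0], p[1])]
    return linear_fit([p[2] for p in rows], [p[3] for p in rows])

fits = {
    "all sites": fit_subset(toy_pairs, lambda a, b: True),
    "without USA": fit_subset(toy_pairs, lambda a, b: "USA" not in (a, b)),
    "Europe only": fit_subset(toy_pairs, lambda a, b: a == b == "Europe"),
}
for name, fit in fits.items():
    print(name, fit)
```

Reporting the three fits side by side, with the x-axis unit stated (km here, by assumption), would show whether the overall relationship is driven by the few very distant sites.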
Reviewed by Pedro Simões, 13 Jun 2021
Reviewed by anonymous reviewer, 18 Jun 2021
Evaluation round #1
DOI or URL of the preprint: 10.1101/2020.12.11.421644
Version of the preprint: 1
Author's Reply, 20 May 2021
Decision by Ignacio Bravo, posted 29 Jan 2021
In this text Olvera-Vázquez and coworkers present an exhaustive analysis of the genetic diversity of Dysaphis plantaginea, an economically important pest of the cultivated apple. The authors have targeted three loci by Sanger sequencing, have used 30 SSR markers and have sequenced a small fragment of the 16S rDNA of the aphids’ endosymbionts, on over 660 samples from North America, North Africa and Europe. The authors have then applied phylogenetic inference, population genetic analyses and random-forest algorithms to try to infer the evolutionary history of this species, including its geographic progression, the advance of colonisation and possible bottlenecks. Unfortunately, the authors were not able to obtain samples from Eastern and Central Asia, which has hampered a global response to the question of the origin and expansion routes of this aphid (but see below on the presence of Iranian samples in their dataset). Together with the two reviewers, I agree that the data and the approaches used are appropriate to address the question, which is interesting from a fundamental as well as from an applied perspective. Nevertheless, and also together with the reviewers, I identify some instances in which the logic of the question is unclear, while in other instances the data and methods are not fully exploited. I recommend that the authors address all the points raised by the reviewers, as well as those listed below.
The two Iranian Dysaphis samples are not monophyletic and are distant from all other D. plantaginea samples. I would suggest verifying that this sister position in the tree is consistent for all three markers used. Also, I am not sure I understand the authors’ choice not to genotype these two samples.
The authors say that “Those two samples were not included in the population genetics analyses using SSR, as we only had two representants (representatives) from this Caucasian region (Table S1).” But in Table S1, besides these two samples, there are seven samples taken in Iran, at a single location, that have been used for the bacterial 16S characterisation, and one of them also for the three-loci sequencing. I think the authors could/should include in the genetic analyses at least these seven samples, and most likely the two more divergent ones.
The authors use different algorithms for inferring genetic populations among their individuals. Of the 582 individuals genotyped, the authors are able to allocate 407 to one of five different genetic populations. The authors present a bimodal distribution of population assignment probabilities in Fig S10 to substantiate their choice of threshold for assigning an individual to a population. It is however unclear whether the presented values are all probabilities (i.e. 5*582) or only the maximum value for each individual. A clarification is needed, as the interpretation of the bimodality would differ. In their classification algorithm the authors no longer present the data regarding the geographical distribution of the “admixed” individuals. A description and a discussion of the homo/heterogeneity in the geographical distribution of the admixed individuals may be needed. Also, a presentation in the main text of (for instance) the K5 in Fig S6, before and after having masked the admixed individuals, may be needed and useful for interpretation.
The authors present in Fig 2D a PCA to display the relationships between the genotyped individuals. A proper description is needed here, because the nature and number of the variables displayed are missing. This information is essential, because the fraction of information explained by the two axes shown is very low (the third axis is displayed without this information). This suggests that the number of dimensions per individual is very large, but this is very unclear. These dimensions cannot be the probabilities assigned per individual to belong to each of the five populations (which could also be an appropriate representation). They also cannot be the 29 SSR used, which could be the other option. I guess they are the actual genetic data retrieved, but this needs clarification.
I then have a problem with the Fst table in Fig 2e. The values have been calculated for the five genetic populations identified, but I am not convinced that the level used to estimate Fst (the inferred five populations) is the appropriate one, instead of having used the locations as the level of integration to estimate genetic differentiation.
The authors have amplified and sequenced a small 251 bp stretch of the 16S rDNA for 175+3 aphid extracts. I have a problem with the distribution of the samples used for 16S sequencing: the authors claim that these 175 samples “represent the range of our sampling”, but I am not sure I agree. From Table S1 I understand for instance that no sample from Morocco has been submitted for 16S sequencing. Also, the authors claim that “92 % of the reads were assigned to a single B. aphidicola OTU, which was found associated with all D. plantaginea”, but this OTU is absent from five Iranian samples. Overall, I think that the analyses of endosymbiont diversity are not really exploited. An attempt to link the endosymbiont diversity to local clusters, e.g. the presence and meaning of Stenotrophomonas in four French samples, and the presence of Serratia in samples from France, Iran and Spain (incompletely described in the text in L713), or to the genetics of the host would be welcome here.
Additional comments.
In some instances the figures mentioned do not exist (e.g. L684, pointing to Figure 3c) or do not match the one referred to (e.g. L628, pointing to Fig S11). Also, in some instances the dimensions and the variables plotted on the different axes are unclear (e.g. “distance” in Fig S11, “proportion of assignation” in Fig S10).
L300-309: Please describe how manual corrections to chromatograms were introduced and how often they were required.
Please mention in the paragraph at line 292 that trpB was also sequenced.
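Returning to the question raised in this decision about Fig S10 (all assignment probabilities versus only the per-individual maximum), the small sketch below illustrates why the distinction matters. Because the coefficients of each individual sum to one, a histogram of all values necessarily piles up near zero, whereas a histogram of the maxima does not. The toy coefficients, the five-bin layout and the helper name `histogram` are assumptions made for illustration only.

```python
def histogram(values, bins=5):
    """Count values in equal-width bins over [0, 1]; 1.0 falls in the last bin."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v * bins), bins - 1)] += 1
    return counts

# Toy membership coefficients (invented): four individuals, five clusters,
# each row summing to one.
toy_q = [
    [0.92, 0.03, 0.02, 0.02, 0.01],  # clearly assigned
    [0.05, 0.88, 0.03, 0.02, 0.02],  # clearly assigned
    [0.45, 0.41, 0.05, 0.05, 0.04],  # admixed between two clusters
    [0.35, 0.30, 0.15, 0.10, 0.10],  # admixed among several clusters
]

all_values = [q for row in toy_q for q in row]  # 5 values per individual
max_values = [max(row) for row in toy_q]        # 1 value per individual

hist_all = histogram(all_values)
hist_max = histogram(max_values)
print("all coefficients:", hist_all)
print("maximum per individual:", hist_max)
```

In this toy case the histogram of all coefficients is dominated by the lowest bin, so a low mode there says little about admixture; only the histogram of maxima separates clearly assigned from admixed individuals.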