Many emerging diseases arise when parasites switch to new host species, while other parasites seem to remain with the same host lineage for very long periods of time, even over timescales where an ancestral host species splits into two or more new species. The ability to understand these dynamics would form an important part of our understanding of infectious disease.
Experiments are clearly important for understanding these processes, but so are comparative studies, investigating the variation that we find in nature. Such comparative data do show strong signs of non-randomness, and this suggests that the epidemiological and ecological processes might be predictable, at least in part. For example, when we map patterns of parasite presence/absence onto host phylogenies, we often find that certain host clades harbour many more parasites than expected, or that closely-related hosts harbour closely-related parasites. Nevertheless, it remains difficult to interpret these patterns to make inferences about ecological and epidemiological processes. This is partly because non-random associations can arise in multiple ways. For example, parasites might be inherited from the common ancestor of related hosts, or might switch to new hosts, but preferentially establish on novel hosts that are closely related to their existing host. Infection might also influence the shape of host phylogeny, either by increasing the rate of host extinction or, conversely, increasing the rate of speciation (as with manipulative symbionts that might induce reproductive isolation).
These various processes have, by and large, been studied in isolation, but the model introduced by Engelstädter and Fortuna makes an important first step towards studying them together. Without such combined analyses, we will not be able to tell if the processes have their own unique signatures, or whether the same sort of non-randomness can arise in multiple ways.
A major finding of the work is that the size of a host clade can be an important determinant of its overall infection level. This had been shown in previous work, assuming that the host phylogeny was fixed, but the current paper shows that it extends also to situations where host extinction and speciation take place at a rate comparable to that of host shifting. This finding, then, calls into question the natural assumption that a clade of host species that is highly parasite-ridden must have some genetic or ecological characteristic that makes its members particularly prone to infection, arguing that the clade size, rather than any characteristic of the clade members, might be the important factor. It will be interesting to see whether this prediction about clade size is borne out by comparative studies.
Another feature of the study is that the framework is naturally extendable to include further processes, such as the influence of parasite presence on extinction or speciation rates. No doubt extensions of this kind will form the basis of important future work.
Engelstädter J and Fortuna NZ. 2018. The dynamics of preferential host switching: host phylogeny as a key predictor of parasite prevalence and distribution. bioRxiv 209254 ver. 5 peer-reviewed by Peer Community In Evolutionary Biology. doi: 10.1101/209254
Please make some final typographical changes. Change the use of Figure and Fig. to be consistent. Figs in brackets except L273, 311, 317 etc.
I am very happy with the answers made by the authors to my remarks and suggestions, and I therefore warmly support the "recommendation" of this manuscript by PCI Evol Biol.
There are just a few minor corrections that should be made:
L85-87 -> the reference to the phylogenetic effect of parasites is a bit odd here (and contains a typo). I would move it to the discussion, L394, where other determinants of host shifts are listed.
L123-127 -> In the sentence starting with "In contrast [...]", a word is missing somewhere; it makes no sense as written.
L300 -> "in a with few" -> "in a few"
L368-369 -> When talking about the predictions of the model, it may be good to say that they hold in the presence of PDE: "Our model makes a number of predictions: all else being equal and in the presence of PDE, 1) [...]"
L432-433 -> The authors say that they do not model within-host speciation and loss, while they now do (cf. L168-171). This sentence should be removed, or revised to say that these processes could be explored in more depth.
Dear Lucy, Thank you for these comments. We have now made all the suggested changes and uploaded the new version on bioRxiv. (This version also has the PCI front page, no line numbers, embedded figures etc. as recommended in the email I received from PCI.) We're also in the process of finishing our R package and will put it on GitHub some time next week. Best wishes, Jan
This manuscript describes a new model, which takes into account host-parasite cospeciation events as well as biased host switching in the form of the "host phylogenetic distance effect", i.e. the ability of parasites to preferentially switch to and establish on closely related hosts. Importantly, the model makes predictions that should be relatively straightforward to test with comparative data.
Both reviewers thought the paper was very well written and had just minor comments to make on the structure, specifically about incorporating some of the supplementary figures into the text. I agree with them that it is hard to make it through the results without referring to the supplement. In particular, in agreement with one of the reviewers, I very much liked the section on the importance of host trees with specific examples, and wondered whether there was a way of including a cartoon schematic of Figures S2A-D without having the actual large figures moved to the main text. It would also be nice to see the final infection frequency of hosts with these trees since, to my eyes, it is not obvious.
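As a rough illustration of what a host phylogenetic distance effect means in practice, the sketch below draws the target of a host shift with a probability that declines with phylogenetic distance from the current host. The exponential form, the `decay` parameter, and the function names are assumptions made here for illustration only; they are not taken from the manuscript.

```python
import math
import random


def shift_weights(distances, decay):
    """Return normalised host-shift probabilities for candidate hosts.

    distances: phylogenetic distance from the current host to each candidate.
    decay: hypothetical strength of the distance effect (0 means no PDE).
    The exponential decay is an illustrative assumption, not the authors' model.
    """
    raw = [math.exp(-decay * d) for d in distances]
    total = sum(raw)
    return [w / total for w in raw]


def sample_new_host(candidates, distances, decay, rng=random):
    """Pick one candidate host, favouring close relatives when decay > 0."""
    weights = shift_weights(distances, decay)
    return rng.choices(candidates, weights=weights, k=1)[0]


# With decay = 0 every host is equally likely; with decay > 0 the
# closest relative receives the largest share of host shifts.
no_pde = shift_weights([1.0, 2.0, 4.0], decay=0.0)
with_pde = shift_weights([1.0, 2.0, 4.0], decay=1.0)
```

Setting `decay` to zero recovers unbiased switching, which gives a convenient baseline against which the effect of the bias can be compared.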
One of the reviewers also has some comments on how tree imbalance metrics affect infection frequencies and some further clarification points for the discussion.
In this manuscript entitled “The dynamics of preferential host switching: host phylogeny as a key predictor of parasite prevalence and distribution”, the authors propose a model to simulate the evolution of parasite species along a host phylogeny, taking into account the phylogenetic distance effect (PDE), i.e. the commonly observed fact that host shifts occur preferentially towards hosts that are phylogenetically closely related to the host of origin. They evaluate the impact of this PDE and other host-tree-related parameters (speciation rate, extinction rate, tree size, etc.) on the prevalence and distribution of parasites across the host species. They end up with important predictions under the model assumptions: we should see more parasites when the host trees display a few large clades than when they show many small ones; host species turnover increases parasite prevalence; and small host clades harbour fewer parasites than large ones. I had a great time reading this paper, which I found very clear and well written. I think that it is an important piece of work that clarifies many points and will be useful for future work, for comparing observations to predictions and better understanding host-parasite dynamics in biological systems. I only have some minor points that I would like to mention. First, the authors test the effect of PDE, which is nice because it is never considered in cophylogenetic methods, but at the same time they decide to leave out classical events such as failure-to-speciate or within-host speciation. The authors consider that “they do not expect this to affect their results qualitatively” (L. 426). I wonder why they did not include these events in the model from the start, given how simple it would be (or am I wrong?) to consider them. This should be justified.
Second, it was suggested and experimentally shown by de Vienne (2009) that a parasite may be better at infecting a host that its close relatives can infect, making the parasite phylogeny a predictor of host shifts as well. This should be mentioned. Third, the authors mention earlier work that estimated the impact of the shape of the tree (imbalance level) on the parasite-related parameters. Why didn't the authors look at the impact of this imbalance level (e.g. the Colless index) on the results? I think that none of the features considered relate to that, even indirectly. This may be interesting to add in order to better compare with previous work. Fourth, the authors consider that host shift and speciation are concomitant. In our 2009 paper, we showed that (with our model) congruence between host and parasite phylogenies was only obtained when the time between shift and speciation was dependent on the distance of the shift (large distance, small time to speciation). This is apparently not the case here. This could be noted. Finally, I think that it would make sense to move Figure S2 to the main text (at least one of the four panels), because it is discussed at length in the manuscript, and because the whole paper is on host and parasite trees and we don’t see any (apart from the sketch in Figure 1).
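For readers unfamiliar with the imbalance metric mentioned in the third point: the Colless index of a rooted binary tree is the sum, over all internal nodes, of the absolute difference between the numbers of tips in the two daughter subtrees. The sketch below computes it for a tree written as nested pairs. The nested-tuple representation and the function name are choices made here for illustration, not anything used in the manuscript.

```python
def colless_index(tree):
    """Colless imbalance of a rooted binary tree given as nested 2-tuples.

    A tip is any non-tuple value (e.g. a species name); an internal node
    is a tuple (left, right). This representation is an assumption made
    for the example.
    """
    def walk(node):
        # Returns (number of tips below node, imbalance summed below node).
        if not isinstance(node, tuple):
            return 1, 0
        left, right = node
        n_left, imb_left = walk(left)
        n_right, imb_right = walk(right)
        return n_left + n_right, imb_left + imb_right + abs(n_left - n_right)

    return walk(tree)[1]


balanced = (("A", "B"), ("C", "D"))        # fully symmetric four-tip tree
caterpillar = ((("A", "B"), "C"), "D")     # maximally unbalanced four-tip tree

print(colless_index(balanced))     # 0
print(colless_index(caterpillar))  # 3
```

A fully symmetric tree scores zero, while the caterpillar tree reaches the maximum for its size, so the index could in principle be recorded alongside infection frequency for each simulated host tree.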
This manuscript introduces a stochastic model predicting the effects of host phylogenetic distance on the distribution of (specialised) parasites across host clades. The authors generate testable predictions about the distribution and frequency of infection with and without this effect.
The manuscript tackles a genuine knowledge gap, as highlighted by de Vienne et al. (2007). It is well written, and its arguments follow a logical order, making its predictions and conclusions easy to interpret.
The model presented is fairly simple and makes some large assumptions: the most obvious being that parasites are host specific and that speciation only occurs when host speciation does. This limits the model's predictive power for a large swathe of parasites; however, the authors are upfront with the limitations of their model and its application to more host-specific parasites. This is not a strong criticism (I will spare you the George Box quotation).
One general comment is that there seems to be a strong reference to supplementary figures throughout the paper. I’m aware figure space is limited, but this may make reading the results a little difficult depending on the eventual format of the paper when published (some elaborated below).
Minor Comments:
- 236. Introduce abbreviation: “Most recent common ancestor (MRCA)”
- 249/Figure 2. I would like to see in the main text the comparison of Figure 2 to Figure S1, especially 2D vs S1D. The distribution of correlation coefficients between parasite and host with and without PDE seems like an important result that deserves to be in the main text.
- 309/Figure 5 (y-axis). Combination of 'parasite survival' and 'number of infected hosts'. Parasites must survive to be counted as infecting at the end of the simulation but are 'probability of survival' and 'number of hosts infected' the same thing or a combined measure? Please clarify or relabel.