
Unraveling the Complexities of Mitochondrial Inheritance in Macoma balthica: Insights from Doubly Uniparental Inheritance
Discordant population structure inferred from male- and female-type mtDNAs from Macoma balthica, a bivalve species characterized by doubly uniparental inheritance of mitochondria
Recommendation: posted 05 February 2025, validated 07 February 2025
Castilho, R. (2025) Unraveling the Complexities of Mitochondrial Inheritance in Macoma balthica: Insights from Doubly Uniparental Inheritance. Peer Community in Evolutionary Biology, 100540. 10.24072/pci.evolbiol.100540
Recommendation
The study by Le Cam and colleagues, entitled “Discordant population structure inferred from male- and female-type mtDNAs from Macoma balthica, a bivalve species characterized by doubly uniparental inheritance of mitochondria”, provides a fascinating exploration of the genetic structure and evolutionary dynamics of the Baltic tellin, Macoma balthica, a bivalve species exhibiting doubly uniparental inheritance (DUI) of mitochondria. This work is a significant contribution to the field of evolutionary biology, particularly in understanding how sex-specific mitochondrial inheritance can shape population structure and genetic diversity in marine organisms.
DUI is a remarkable exception to the typical maternal inheritance of mitochondria in metazoans, where both males and females can transmit their mitochondria, but through different routes. In species with DUI, females pass on their mitochondria to all offspring, while males transmit their mitochondria exclusively to their sons. This system results in males being heteroplasmic, carrying both maternal (F-type) and paternal (M-type) mitochondrial DNA (mtDNA). The study leverages this unique inheritance pattern to investigate the genetic diversity, divergence, and population structure of M. balthica across its distribution range, from the North Sea to the Gironde Estuary in Southern France.
One of the most striking findings of this study is the discordant population structure inferred from the male- and female-type mtDNAs. The authors sequenced the cox1 gene from both F-type and M-type mtDNA in 302 male individuals across 14 sampling sites. They found that the genetic differentiation between northern and southern populations was nearly three times higher for the M-type mtDNA compared to the F-type mtDNA. This discrepancy was further highlighted by the geographic localization of the strongest genetic break, which differed significantly between the two markers. For the F-type mtDNA, the break was located at the Finistère Peninsula, while for the M-type mtDNA, it was found at the Cotentin Peninsula, approximately 250 km apart. The authors propose several explanations for these differences, including a higher mutation rate, relaxed negative selection, and variations in effective population sizes for the M-type mtDNA. These factors could contribute to the observed divergence in genetic structure between the two mitochondrial types. Additionally, the study suggests that mito-nuclear genetic incompatibilities, arising from the interaction between mitochondrial and nuclear genes involved in oxidative phosphorylation and ATP production, could play a role in maintaining these genetic barriers.
The study also provides valuable insights into the phylogeographic history of M. balthica. The divergence times estimated for the F-type and M-type mtDNA clades suggest that the split between the northern and southern populations occurred before the last glacial maximum (LGM). This finding supports a scenario of pre-LGM vicariance rather than post-glacial primary intergradation. The authors' use of net divergence to estimate the timing of cladogenesis events within the Macoma species complex adds a robust temporal dimension to their phylogeographic analysis.
Another intriguing aspect of the study is the evidence of asymmetric introgression and hybrid zone dynamics. The authors observed that the genetic clines for the F-type and M-type mtDNAs were not only discordant in their geographic locations but also in their widths. The cline for the M-type mtDNA was significantly narrower than that for the F-type mtDNA, suggesting different selective pressures or migration balances acting on the two mitochondrial types. This finding raises important questions about the role of sex-specific selection and gene flow in shaping the genetic structure of populations with DUI.
The study also highlights the importance of considering sex-specific genetic markers in population genetic studies. One of the most intriguing aspects is the implication that M-type mtDNA may be evolving under different constraints than its female counterpart. The authors found higher nucleotide diversity and net divergence in M-type sequences suggestive of a relaxed selective regime or an increased mutation rate, which aligns with previous studies on DUI species. Given that M-type mitochondria function predominantly in sperm cells, the potential for oxidative damage, reduced purifying selection, and a higher mutation rate could explain these patterns. More broadly, this finding reinforces the idea that mitochondrial genomes, though typically constrained by purifying selection, can evolve along sex-specific trajectories when their inheritance is decoupled from maternal transmission.
While the study provides compelling evidence for the role of DUI in shaping the genetic structure of M. balthica, it also raises several questions for future research. For instance, the mechanisms underlying the higher mutation rate and relaxed selection in the M-type mtDNA remain to be fully elucidated. Additionally, the potential role of mito-nuclear incompatibilities in maintaining genetic barriers warrants further investigation. The authors suggest that future studies should explore the fitness consequences of inter-lineage crosses and the potential for asymmetric hybrid fitness in the context of DUI.
In conclusion, Le Cam and colleagues have made a significant contribution to our understanding of the evolutionary dynamics of mitochondrial inheritance in bivalves. Their findings not only shed light on the complex interplay between genetic, demographic, and selective factors in shaping population structure but also underscore the importance of considering sex-specific genetic markers in evolutionary studies. This work opens up new avenues for research into the role of DUI in speciation, adaptation, and the maintenance of genetic diversity in marine organisms.
References
Sabrina Le Cam, Julie Brémaud, Vanessa Becquet, Valerie Huet, Pascale Garcia, Amélia Viricel, Sophie Breton, Eric Pante (2025) Discordant population structure inferred from male- and female-type mtDNAs from Macoma balthica, a bivalve species characterized by doubly uniparental inheritance of mitochondria. bioRxiv, ver.4 peer-reviewed and recommended by PCI Evol Biol https://doi.org/10.1101/2022.02.28.479517
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article. The authors declared that they comply with the PCI rule of having no financial conflicts of interest in relation to the content of the article.
This work was funded by the ANR (DRIVE project, grant n. ANR-18-CE02-0004-01) and by the contrat de plan Etat-Région (CPER/FEDER) ECONAT (RPC DYPOMAR)
Evaluation round #2
DOI or URL of the preprint: https://doi.org/10.1101/2022.02.28.479517
Version of the preprint: Version 3
Author's Reply, 20 Jan 2025
Decision by Rita Castilho, posted 23 Oct 2024, validated 09 Sep 2022
Dear authors,
I am sorry to have taken so much time to process this round of reviews. The summer is always a bit tricky to get these things rolling!
As you can read, both reviewers have devoted time and attention to in-depth re-reviews. I think your manuscript merits the chance for another round of improvements, hopefully the last...
Regarding reviewer 1's suggestions, can you particularly address the circularity of estimating µ to get at Ne, since theta = 2Neµ, where here Ne represents the relative abundance of individuals carrying M and F genomes. Also, model testing should probably be included to compare the population structure of the two genomes. Should you decide not to include it, you should at least justify that choice.
Reviewer 2 has a lot to suggest, and I would highlight the possibility of estimating expansion time. Please also address the observation that the analysis of diversity statistics is largely oriented to comparison with the expectation under neutral equilibrium, which does not seem to apply here.
Of course, there are many more instances deserving your attention and consideration, and I hope you can address them all in a justifiable and thoughtful manner.
With best wishes,
Rita
Reviewed by John Wares, 02 Aug 2022
I was glad to read the latest version of "Discordant population structure..." by Le Cam et al. I note the update on taxonomy, which explains my confusion in the last version.
I still find it problematic how they are attempting to estimate µ to get at Ne; it is all a bit circular because of course theta = 2Neµ, where here Ne represents the relative abundance of individuals carrying M and F genomes. So by estimating µ not from a distant relative - a concern I had previously - but by the greatest divergence within their data set, they are incorporating the population structure that they note varies between the two genomes. So they avoid the saturation problem of using phylogenetic substitution patterns, but run into a very messy problem in trying to evaluate the ratio of Ne of the two mitogenomes using haplotype diversity. Effectively, the ratio of theta - which can be estimated with pi or with haplotype diversity - is sufficient, and not much more can be gleaned from these data in that regard because the diversity itself is not representative of a single panmictic population.
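The point that the ratio of theta is what the data can support may be illustrated with a minimal sketch. It assumes aligned, equal-length sequences and computes nucleotide diversity (pi) and haplotype diversity per marker. The toy sequences and function names are hypothetical and are not data or code from the manuscript.

```python
from collections import Counter
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences (pi)."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


def haplotype_diversity(seqs):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


# Hypothetical toy alignments, for illustration only
f_type = ["AAAA", "AAAA", "AAAT", "AAAA"]
m_type = ["AAAA", "AATA", "AAAT", "ATAA"]

pi_f = nucleotide_diversity(f_type)
pi_m = nucleotide_diversity(m_type)

# The ratio of pi estimates the ratio of theta (M over F); it cannot by itself
# separate a difference in Ne from a difference in mutation rate.
theta_ratio = pi_m / pi_f
```

As the review notes, such a ratio is only interpretable as far as the pooled sample approximates a single panmictic population.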
It does appear that for their haplotype diversity and other contrasts they are using trimmed mitochondrial sequence from the male data, so that is useful. The population structure story overall, and how it varies between the two data partitions, is still quite interesting. However, beyond the results themselves, the authors spend a lot of time attempting 'post hoc' description of how the data fit different models, when that could be more rigorously tested.
This is a valuable contribution; however, it feels like the authors are trying to extend the inference beyond what is feasible with the available data.
Reviewed by Risto Väinölä, 24 Aug 2022
I am pleased to see the authors have taken practically all the suggestions into account and dealt with the problem of relating the results to previously published data, and the associated biases. This is exemplary.
Pending those corrections, in the previous review I did not touch much on the population genetic/statistical analyses of the data, and stressed that those should be considered in the evolutionary/phylogenetic framework or context where the variation is supposed to have been molded. As the main directions to proceed from now: There is space and need to clarify the implicated biogeographical background; on the other hand some of the presented analyses do not appear very illustrative or sensible in this framework.
Introduction, M & M
[There is a marked copy of the beginning [only] of the pdf text containing minor linguistic remarks/suggestions to clarify the message, as well as comments.] To pick up some:
About the overall distribution of M. balthica, as far as I know the range in NW Atlantic does not extend to USA (now “Virginia”), but the previous “M.b.” there are currently attributed to M. petalum.
M&M, from the explanation on 154 ff, it remains unclear to me which sequences/taxa were used for the primer design of M. balthica M genome. Capt et al. (2020) see below?
For the previous evidence of DUI in Macoma, reference is made (only) to Pante et al. (2017). That paper however presented only a hypothesis, no compelling evidence? There would be more appropriate and compelling subsequent papers to cite? Capt et al. (2020) https://doi.org/10.1038/s41598-020-57975-y is missed by mistake?
Study context, scenarios and analyses.
The context is about a hybrid zone / contact zone. It should be most important to explain or speculate more closely on the nature and history of the French contact zone in the beginning.
On 97-99 two contact zones are mentioned. The nature of the North Sea-Baltic zone as an Atlantic-Pacific inter-subspecies contact might be somehow decipherable, but the nature of the other, focal zone is now hardly explained at the outset. So please make explicit that (i) there is an inter-subspecies zone (not dealt with) and an intra-subspecies zone analysed [suggestions in text]. (ii) explain what is the hypothesis on the origin/nature of the latter zone [the major context for interpreting data]. (iii) present the results (also) in the conventional ways of transition zone analysis.
In general when studying hybrid zones we would imply either a secondary or primary transition, and by reference to a “contact” zone we’d assume secondary contact is the working hypothesis. If not, please directly put forward the alternatives (you only return on this on ln 465), and if yes, please base your examination on this scenario, be explicit about it and, in any instance, use the standard presentations of presenting clinal variation and hybrid zones.
Supposing a secondary contact framework, among the most essential phenomena shaping the variation would have been isolation in allopatry, and introgression/expansion after that. The analyses however focus discussing three other relevant phenomena, population size, mutation, and purifying selection, largely not putting them in the (spatio)temporal framework of the isolation and introgression history. Sorry to say but without that framework it is hard and frustrating to evaluate the meaning of and interest in the mutation patterns.
The isolation/contact scenario is necessarily tied to time, which the present treatment tries to cancel out from the analysis (ignoring rates and ages of events). While there are uncertainties, there do exist concepts of these ages (“post-glacial”, “prior to last glacial cycle”, “one or more cycles”) and even published estimates of molecular rates (for F type), which, even if not taken literally, do give the order of magnitude and basis for evaluation of what time scale the observed divergences and the generation/loss of singleton mutations do represent. There is also the scale of divergence between the M and F genomes, entering MK analysis, which is still not commented on. Finally, in conventional population genetics a mutation is a convenient ‘neutral’ unit of time, if calibration goes out of question.
Since the value of the results in documenting the discordance of variation (in space and time) in the transition zone (cline), I would very strongly also recommend using some conventional ways of presentation:
1. The standard way is to plot frequencies and statistics against geographical (shoreline) distance. Please do this for haplogroup frequencies, and other relevant statistics when illustrative (e.g. diversity); separate panels on top of each other for F and M. [now there is only Fig 4 using this presentation, but for a very obscure statistic.]
2. If implying an isolation-contact cycle, state or speculate (present a hypothesis) on the correspondence, or lack thereof, of the mtDNA haplogroups and refugia (are these refugial lineages – as implied in earlier work).
3. Make the analysis/discussion of differences in introgression (mixing) of haplogroups following contact [or before it] a part of your treatment.
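As an illustration of the conventional presentation suggested above, the following sketch evaluates a simple sigmoid cline of haplogroup frequency against shoreline distance. The logistic form, the function name and all numbers (centres, widths, site positions) are hypothetical placeholders, not estimates from the manuscript.

```python
import math


def cline_frequency(x_km, centre_km, width_km):
    """Logistic cline: frequency rises from 0 to 1 around `centre_km`.

    `width_km` is 1 / maximum slope, the usual hybrid-zone definition of width.
    """
    return 1.0 / (1.0 + math.exp(-4.0 * (x_km - centre_km) / width_km))


# Hypothetical clines for the two markers (placeholder values only)
clines = {
    "F-type": {"centre_km": 500.0, "width_km": 300.0},
    "M-type": {"centre_km": 750.0, "width_km": 100.0},
}

# Hypothetical sampling positions along the shoreline, south to north
sites_km = [0, 250, 500, 750, 1000]

for marker, params in clines.items():
    freqs = [round(cline_frequency(x, **params), 3) for x in sites_km]
    print(marker, freqs)
```

Plotting such curves for F and M in stacked panels, with observed haplogroup frequencies overlaid, would show differences in both cline centre and cline width at a glance.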
Comments/suggestions on the current analyses and illustrations (partly technical)
Considering the series of pie diagrams for M vs F haplotypes in Fig. 1 and Fig. 2, it is confusing that F is on the left in one fig and on the right in the other.
As to the scales of these figures (conventionally, circle area corresponds to haplotype frequency), evidently in Fig. 1 the scale icon does not correspond to the scale in the actual network. In Fig. 2 the scale in turn is not according to area but to diameter, please check and harmonize.
I appreciate the choice of not directly implying a (historical) correspondence between the major F & M haplogroups and to use “a neutral” indexing for M haplogroups. Nevertheless, it is kind of tedious to compare patterns if using index 1 for the southern F types but index 3 for the southern M type, instead of starting consistently 1,2,3 from the south (now hard to read Fig. 1 and 3).
The diagrams in Fig 1 (center panel) and Fig 3 are in a way non-standard. While they do express a sense of discrepancy, they are not self-explanatory, and (i) at least for me the procedure of constructing the Fig. 1 diagram remains unclear and there should be an explanation of this ‘algorithm’ in the legend. And Fig. 3 too. (ii) the information of discrepancy in those two diagrams is largely the same (altho more detailed in Fig. 1); (iii) as regards the discrepancy of M-F haplotype distribution in individuals, the “analysis” in Fig. 3 is not very informative if it is not contrasted with any expectation (which evidently would depend on the geographical/contact context), and (iv) if intra-population correlations are not distinguished from those caused by the pooling of differentiated populations.
Indeed in the Results l 296 ff intra-population disequilibria are reported, a most interesting finding, also as there is evidently the looming idea in the background of signals of incompatibility in the M-F associations. But, in reference to this and the previous paragraph, there seems to be nothing about these analyses in the Material and Methods section! Please detail the analyses, and their justification, there as appropriate.
The Discussion contains much thoughtful consideration of the observations (statistics) and of the nature of the contact zone, including references to hypotheses on refugial history and introgression, very relevant and clever. I will not go to that, but return to the ways data are treated and presented in the first place, which make the discussion unnecessarily tedious.
In general the treatment and analysis of diversity statistics (Tables and further, see below) is largely oriented in comparing them with the expectation under neutral equilibrium. Unfortunately in this instance such ‘tests’ can appear misplaced and irrelevant. This is because of the context, in which such an equilibrium null hypothesis is not reasonable: then it is waste of space to discuss (indirectly through the statistics) why results do not fit such models. And worse, when there is not enough statistical power the null hypotheses are accepted even if the geographical and haplotree patterns would point to the opposite.
A null hypothesis is of a unique independent population in mutation-drift balance (a state which cannot be attained in a postglacial time scale) and not affected by gene flow/introgression. But interpreting your populations in this context, and deviations in terms of mutation/Ne/selection, is insufficient or unnecessary, since gene flow and time (unbalance) are equally important actors. While the interest is in the clines and contact, it (only) makes sense to examine the data in that context from the start.
You want to keep the MK test since that is a commonly used test; yet as applied it is not very informative. Particularly, comparing outcomes of tests from northern and southern populations separately (Table) does not make sense given the relative time scales of M vs. F divergence compared to North vs. South. They probably represent >100 fold different time scales, and no meaningful difference can be thought to arise in the latest 1% of time. The more general aim of seeing whether selection affects differently in M vs. F also is not addressed by this test, but the outcome just tells there is selection on nonsyn sites, which is kind of trivial. I suppose the analysis of differences in strength/pattern of selection is topic of other studies, not only of the 500 bp here.
Further, the approach to the Nef/Nem ratio (215-234) appears a strange combination of misunderstanding and misinterpretation. Considering the formulae, please note that this is (only) about the basic equilibrium expectation of IAM for haploid data, i.e. we expect diversity [theta] = Nu, and more complicated formulae here are just spurious. [the effective no. alleles is *by definition* ne = 1-1/[theta] (or 1-1/H). Substituting that is just to make the formula look more complicated, it is not needed!]. At equilibrium we expect the observed diversity H=theta and you would have Nf/Nm= Hf/Hm (or Nf/Nm=Hf/Hm*u/u).
(i) now note, we cannot assume any of the (individual or combined) populations to be in mutation-drift equilibrium, so the approach does not fit here for estimating Ne.
(ii) the justification to using the u/u term was that there would have been differences in the F vs M mutation rates (ln 225). I would not deny that (see below), but confusingly you do yourself, stating that there were no differences (254). Why then this game.
(iii) the reasoning for the equal mutation rates can hardly be followed, and also does not make sense to me. Further, the nature and implementation of the “relative rate test” remains unclear here [did not check the github though]. [is it that in this single instance you take the deepest divergence in the network (“TMRCA”) as the sole measure of divergence and divergence rate, assuming it should be of the same age in both?]
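For reference, the equilibrium relationships under discussion can be written out explicitly. This is a sketch of the textbook infinite-alleles expectations for a haploid, uniparentally inherited marker, taking theta = 2Neµ as in the first review; the subscripts F and M and the symbol n_e for the effective number of alleles are notation assumed here.

```latex
H = \frac{\theta}{1+\theta}, \qquad
n_e = \frac{1}{1-H} = 1+\theta, \qquad
\theta = 2 N_e \mu

\frac{\theta_F}{\theta_M} = \frac{N_{e,F}\,\mu_F}{N_{e,M}\,\mu_M}
\quad\Longrightarrow\quad
\frac{N_{e,F}}{N_{e,M}} = \frac{\theta_F}{\theta_M}\cdot\frac{\mu_M}{\mu_F}
```

For small theta, H is approximately theta, which gives the shorthand Nf/Nm = Hf/Hm when the two mutation rates are equal. None of these expressions applies away from mutation-drift equilibrium, which is the objection raised in point (i).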
Instead of spurious testing, it could be more reasonable to use some common “at face value” approaches to phylogeographic haplotype data in evaluating the diversity results:
“Both F&M haplotype networks involve three core haplotypes, each associated with a set of one-step satellites (singletons, mostly).”
Without making explicit matches of F vs M cores, we see that the F cores are 3-9 mutations from each other, while M cores are 1-2 steps. Supposing that at least part of the inter-core differences represent refugial lineages, it seems that F lineages either evolved faster or are older. [yes there is an odd more distant M haplotype (isn’t it M. b. balthica?); whatever, involving it in age/rate comparisons should need a separate discussion/justification]. You may use other distance estimates, qualitatively the result should be the same. But using population diversity estimates such as FST, they will be strongly affected by population mixing and introgression that confound the results. - This appears very different from the message in the Abstract?
In phylogeography, it is a useful and justified approach to interpret the core-satellite star phylogenies as indication of population expansion after a bottleneck. That involves an extreme non-equilibrium situation. Then (as also implemented in mismatch analyses) you may roughly estimate expansion time (in mutation units) from the frequency of the satellites (~frequency of singletons). Conversely, if you have estimate of mutation rate, you can estimate expansion time. I recommend this approach. There should be many examples; Laakkonen et al (2013) BMC Evol Biol comes to mind (end of page 8, and 13-14). - Here, pooling the cores and satellites, you will have c 10% mutated haplotypes in F and 15-20% in M, implying that either effective (neutral) mutation rate is 1.5-2 fold in M, or M expansion was older. Clearly this is in contrast to the result from inter-core differences. (it is close to the ratio of singletons).
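The rough calculation described above can be sketched as follows. It assumes a star phylogeny with Poisson-distributed mutations, so that the share of lineages still carrying the core haplotype is exp(-u*t), where u is the per-sequence mutation rate and t the time since expansion. This Poisson shortcut and the function name are assumptions made for illustration; the percentages are the approximate figures quoted in the review.

```python
import math


def expansion_time_in_mutation_units(mutated_fraction):
    """Expected mutations per lineage since expansion (u * t).

    Assumes a star phylogeny and Poisson mutation, so the share of
    lineages still carrying the core haplotype is exp(-u * t).
    """
    if not 0.0 <= mutated_fraction < 1.0:
        raise ValueError("mutated_fraction must be in [0, 1)")
    return -math.log(1.0 - mutated_fraction)


# Approximate satellite (mutated haplotype) frequencies quoted in the review
ut_f = expansion_time_in_mutation_units(0.10)
ut_m_low = expansion_time_in_mutation_units(0.15)
ut_m_high = expansion_time_in_mutation_units(0.20)

ratio_low = ut_m_low / ut_f
ratio_high = ut_m_high / ut_f
print(f"M/F ratio of u*t: {ratio_low:.2f} to {ratio_high:.2f}")
```

Converting u*t into years would additionally require an estimate of the per-sequence mutation rate, which is not assumed here.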
Combine this with the geographical approach of plotting clines/statistics on shoreline distance, and pick up the peculiarity that introgression of the b2a to the Bay of Biscay was, exceptionally, mainly by single b2b haplotype (potential indication of selection).
Evaluation round #1
DOI or URL of the preprint: https://doi.org/10.1101/2022.02.28.479517
Author's Reply, 11 Jul 2022
Decision by Rita Castilho, posted 29 Apr 2022
Dear Authors,
Thank you for submitting your work for a recommendation at PCI Evol Biol.
Two reviewers and I have assessed your work. The recommendations from the reviewers are level and well balanced and, while agreeing on several points, also point out distinctive aspects. We all agree that the data presented have interesting aspects: comparative DUI studies of geographical variation are scarce, and the chosen model allows pinpointing parallels and contrasts in other European marine taxa. This study will significantly contribute to the marine spatial pop gen and DUI literature if the issues raised by both reviewers are dealt with in detail.
Both reviewers’ constructive concerns are valid and well-argued, and therefore should be fully rebutted.
You have written a very nice and broad introduction that will help the general reader be up and running on the DUI-system peculiarities. The most problematic points are methodological:
1. The reviewers and I question the presentation (or the estimation) of some population genetic statistics in the context of the manuscript (MK test, divergence rate and population size estimates). Authors must reconsider focusing the results on the sex-specific spatial breaks, which are the most robust part of this paper.
2. Correctness and standardization of the taxonomic denomination (L. b. rubra vs L. balthica?) and the haplotype designation (see reviewer 1 comments on those issues).
3. Correspondence of the lineages b1, b2, b3 and d between the F vs. M genomes seems to constitute a major source of lack of clarity between this study and previous studies.
4. The particulars of the calibration ages used seem inappropriate and must be re-addressed.
There are many more points raised by the reviewers, but these stand out as the most relevant and somewhat impact the interpretation and conclusion of the study.
Best wishes,
Rita Castilho
Reviewed by Risto Väinölä, 25 Apr 2022
The manuscript describes contrasting population structures in distinct female and male inherited mitochondrial lineages of the intertidal bivalve Limecola balthica (Macoma balthica) along the West European coast. These are interesting data in several respects. While DUI has turned out to be unexpectedly widespread in bivalves, and levels of variability and evolutionary rates between lineages have been reported, comparative studies of geographical variation are rare, and the data here are unusual and significant already per se. On the other hand, the L. balthica complex turns out to have a complicated biogeographical history, which finds both parallels and contrasts in other European marine taxa.
The potential of the system in elucidating the evolution of isolation mechanism between hybridizing lineages or taxa is presented as an argument for the importance and interest of the data, whereas at this point any inferences about this importance cannot be made.
I would see that the raw data on the haplotype diversity and on the discordant transition zones as such could be important results worth publishing. In addition the ms presents a series of standard (and non-standard) population genetic statistics on these data, with their supposed evolutionary interpretations (e.g. MK test, divergence rate and population size estimates…), which to me however appear largely misplaced, irrelevant or erroneous in this context, and their presentation should be reconsidered. The point is that interpreting these issues from the data are (should be) based on a model (scenario) of the genealogical and biogeographical history of the lineage variation (isolation and invasion events and their ages), but the scenarios underlying the current treatment seem to be ad hoc, unclear and likely erroneous; this is related to ignoring parts of the published record and hypotheses of mtDNA variation in Limecola.
The variation of F mtDNA has been well explored by several research groups also previously. As noted in the introduction, there are two main lineages & taxa in Europe, L. b. rubra (b-lineage sensu Nikula et al.) and L. b. balthica (d lineage). L. b. rubra is thought to have been resident in Europe through most of the Pleistocene (2-3 Myr), whereas M. b. balthica = d lineage is a post-glacial invader from the Pacific (c. 10 ky). The variation dealt with in this study is only that of the rubra lineage at least as concerns the F genome [and that should be clearly stated, and even acknowledged in the taxonomic denomination of the study subject, L. b. rubra rather than L. balthica]. The haplogroups (F) I, II and III in this study correspond to H4, H3 and H1+2 of Becquet et al. (2012, 2013) and to the sublineages b3, b2, b1 of Nikula et al. (2007), and it would be fair to the reader to use uniform nomenclature for clarity. The genealogy of these lineages is probably most clearly depicted in Fig. 3 of Nikula et al (2007); while that tree is based on cox3 haplotypes, the identity of each (sub)lineage is also reported in terms of cox1 and thus cross-validated with the current and other reports (of Luttikhuizen, Becquet, Layton, etc.).
The main problem arising here and likely undermining a large part of the “inferences” in this paper is the interpretation of the correspondence of the lineages b1, b2, b3 and d between the F vs. M genomes. From the published record (of other research groups) the “Pacific” d lineage (or C group of Luttikhuizen) is common within the Baltic Sea (+Barents & White), but absent from the European Atlantic coast. In the Baltic Sea it is mixed with b1+2+3, but towards the north (the Umeå area) it becomes absolutely dominant (> 95 %, Nikula et al. Fig. 1; also Luttikhuizen 2003). The unexplained / unreported problem is that in this study, and in the authors’ previous larger Baltic and Barents Sea material (Becquet et al 2012, 2013), the dominant Baltic/Barents d lineage is not recognized or reported at all. Why this is so is not, but should be, very clearly reported, and implications of the discrepancy be explained. Now indeed in this study there are only a couple of F haplotypes from the inner Baltic Umeå site, neither of them d lineage! (But possibly a larger number were sequenced but not reported, of females?) A critical question is: if the authors have failed to detect/report the dominant Baltic F d lineage (M. b. balthica), on what grounds is it assumed that they would also consistently miss the (unknown) M d lineage of M. b. balthica; and why is that question not raised? From genotypic data (Nikula et al. 2008) there are no significant interlocus / mitonuclear disequilibria in the Northern Baltic and there is no reason to think the M and F haplotypes should be associated. Now it should be noted that the putative M I haplogroup here is almost exclusively reported from the Baltic (save one site in North Sea, info in Fig vs. Table is unclear about which site actually), a pattern rather to be expected for a d lineage. Given the data available, it would seem a more reasonable hypothesis that haplogroup M I phylogeographically corresponds to the d lineage (M. b. balthica) rather than to F I / b3, and the two main core haplotypes within haplogroup M IIa rather correspond to F I & II of current data (=b2+b3). This should of course be easily checked from pure Pacific M. b. balthica data. That cannot be required, but if not verified, the basis for most other statistics comparing the F vs. M variation in this ms will be lost.
If no credible data on the phylogeographic identity of the M haplogroups can be given, the situation should anyway be acknowledged. The option might remain then to leave out the three Baltic data points entirely (and thus the M I / putative d lineage), and restrict the report to the Atlantic/North Sea data and transition zones.
Another confusing issue of interpreting genealogies appears in the estimation of M vs F mutation or divergence rates through a comparison to corresponding DUI sequences in a pair of rather distantly related clams, Donax. The logic of the procedure and even of comparisons involved remain unclear, and should be explained by depicting the genealogy (tree) of all sequences involved and the branch lengths estimated. (What is the age of DUI in Limecola/Macoma by the way?) But it should be immediately evident also that using calibration ages c. 100 times older than the ingroup branches is not a viable approach in general, and that the substitution model used K2p (implied with uniform rates) will not be appropriate for such calibration but would yield rates an order-of-magnitude off the point. Indeed the ms involves three separate estimations of the substitution model which provide vastly discordant results, from nearly uncorrected K2p uniform to an extreme rate-heterogeneous model (gamma parameter 0.11-0.15), and these models do not correspond logically to the depths (ages) of the genealogy from which they were inferred, and are in no way commented.
At the same time, it goes unmentioned that there exist alternative calibration approaches in the literature and on a more relevant time-frame based on the main phylogeographic scenario of trans-arctic invasions (putting the d-vs-b lineage split at 2-3.5 Mya): any rate and age estimates should be also compared to those [or rather those could be used exclusively]. (Luttikhuizen went wrong here, and dismissed the alternative calibration points from Mytilus and Acropsis, which are congruent with the Macoma trans-Arctic scenario and time scale also).
As noted, many of the statistics in the tables now would not be biologically meaningful, in reference to the confusion of the phylogeographic scenarios / genealogies discussed above, and it makes no sense to touch or elaborate on them until the data basis and relevance of comparisons have been reconsidered.
Reviewed by John Wares, 17 Apr 2022
Review of Le Cam et al for PCI. Very interesting manuscript, well considered though I have some concerns they can address. They used M and F transmitted mtDNA to understand variation in structure and Ne, higher divergence in male, basically higher mutation, and maybe relaxed selection.
Overall, a super interesting system – and they suggest may lead to greater rates of barriers to gene flow arising, eg speciation. A great introduction, and interesting that they are doing this in my old friend Macoma (now Limecola), no I’ve never published work on L. balthica but have encountered it. Nice introduction to remind everybody of the many quirks of DUI as well as structure in L. balthica.
I’m going to point out here that the mechanisms they used for estimating mutation rate µ (1) should be moved up in the manuscript, as they refer to ‘known’ µ well before the contrasts with Donax are listed (both species are Tellinids) and (2) are somewhat problematic in my opinion, as the divergence of Limecola and Donax are considered to be on the order of 100mya, but mitochondrial COI is thought to saturate mutations at 3rd position sites for even Miocene divergences, an order of magnitude less. Thus, I agree that the M rates of mutation appear to be twice as high (likely) but suspect the rates themselves are not as useful as the authors would like; perhaps separating by codon position would be valuable for considering the rate at 1st/2nd.
The authors sexed each under microscope. Found that male mitochondria only found in gonad, though they sampled mantle for somatic. I guess this is possible in Tellinids unlike Mytilids. Was not clear where the M and F primers came from, are they Limecola specific or would they work in other bivalves?
Basic methods for sequencing, quality, haplotype frequency, networks, calculated H, pi, phiST, AMOVA, Taj D – and they estimated the ratio of Ne between the two using a peculiar older “effective number of alleles” Crow & Kimura – seems that a Hill number more appropriate?
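For readers comparing these diversity measures, a minimal sketch (with hypothetical function names and toy haplotype labels, not data from the manuscript) shows how they relate. The Crow & Kimura effective number of alleles is 1/Σp², which is numerically the same as the Hill number of order 2 (inverse Simpson), while haplotype diversity is Hd = n/(n−1)·(1 − Σp²); all three are functions of the same sum of squared frequencies.

```python
import math
from collections import Counter

def haplotype_frequencies(haplotypes):
    """Relative frequency of each distinct haplotype label in the sample."""
    counts = Counter(haplotypes)
    n = len(haplotypes)
    return [c / n for c in counts.values()]

def effective_number_of_alleles(freqs):
    """Crow & Kimura effective number of alleles: 1 / sum(p_i^2)."""
    return 1.0 / sum(p * p for p in freqs)

def haplotype_diversity(freqs, n):
    """Unbiased haplotype diversity: n/(n-1) * (1 - sum(p_i^2))."""
    return n / (n - 1) * (1.0 - sum(p * p for p in freqs))

def hill_number(freqs, q):
    """Hill number of order q; q = 1 is the exponential of Shannon entropy."""
    if q == 1:
        return math.exp(-sum(p * math.log(p) for p in freqs if p > 0))
    return sum(p ** q for p in freqs) ** (1.0 / (1.0 - q))

if __name__ == "__main__":
    sample = ["A"] * 5 + ["B"] * 3 + ["C"] * 2  # toy labels, not real data
    freqs = haplotype_frequencies(sample)
    print("n_e (Crow & Kimura):", effective_number_of_alleles(freqs))
    print("Hill number, q = 2: ", hill_number(freqs, 2))
    print("Hd:                 ", haplotype_diversity(freqs, len(sample)))
```

Hill numbers of other orders (q = 0 for haplotype richness, q = 1 for the Shannon-based measure) weight rare haplotypes differently, which matters when many haplotypes are singletons.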
How did they estimate male µ to be twice as high? This is not listed (line 227). Method (ii) listed is basically dependent on a Hill number, as Hd is inverse Simpson, modified a bit. That method does not rely on µ, to my knowledge? Line 236: not sure I like this, as it is very indirect; we all know there is strong selection on COI (though on male COI I am not sure how it differs).
The skyline plots hmmm an indirect way of estimating Ne from gene tree shape, fine – but they will be confounded by data that deviate strongly from neutral expectations of course!
OK now they get at µ on line 247, estimated between Limecola and Donax, with a divergence time of 90-140 My – I don’t like this much, as COI tends to saturate at Miocene divergence. Maybe good enough for nonsynonymous; no separation of 1st, 2nd, 3rd.
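The suggested separation by codon position can be sketched as follows. This is an illustrative helper only: the function name and the toy sequences are assumptions, and the alignment is assumed to be in frame, starting at a first codon position. If third positions approach saturation while first and second positions remain low, rates estimated from all sites combined will be unreliable.

```python
def p_distance_by_codon_position(seq_a, seq_b):
    """Uncorrected p-distance at codon positions 1, 2 and 3.

    Both sequences must be aligned, of equal length, and in frame
    (index 0 is assumed to be a first codon position). Sites where
    either sequence has a gap or ambiguity code are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = [0, 0, 0]
    sites = [0, 0, 0]
    for i, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper())):
        if a not in "ACGT" or b not in "ACGT":
            continue
        pos = i % 3
        sites[pos] += 1
        if a != b:
            diffs[pos] += 1
    return [d / s if s else float("nan") for d, s in zip(diffs, sites)]

if __name__ == "__main__":
    # Toy in-frame sequences that differ only at third codon positions
    first = "ATGGCCAAATTT"
    second = "ATAGCTAAGTTC"
    for position, dist in enumerate(p_distance_by_codon_position(first, second), 1):
        print("codon position", position, "p-distance:", dist)
```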
Before I read results, it seems structure will be straightforward to see if distinct between the two for the same animals. The diversity is what is harder to struggle with – what mechanisms lead to distinct levels or patterns of diversity? – in any case they assume the male mitotype originated prior to the TMRCA, I don’t know how typical the ‘reset’ in DUI has happened in Tellinids, seems more variable in Mytilids.
Results:
Far more haplotypes in cox1m-long than cox1f, but *it is longer* – here they should compare cox1m; by the same measure there can be more divergence among haplotypes as the sequence is 200 nt longer. However, there are still more haplotypes for cox1m (the shorter version), so the result is robust, but that distinction has to be considered.
Line 290: divergence rates in substitutions per site (s/s) is not standard notation; again, I think the clock evidence should be presented earlier – there does seem to be about 2-fold more divergence in cox1m, but again see my concerns above. The actual rate is not known because the time of divergence is not known with certainty, but the 2-fold ratio I can buy.
Super intriguing to get distinct patterns between the two sex-associated mitogenomes…also interesting to see some combinations rare or absent, perhaps genomic conflict; and those odd haplotype combinations were in the hybrid zone sampling sites – cool!
Huh but π is greater in the F (lines 334-342), and D is more strongly negative in M than F. I would agree from Table 1 it is a consistent trend to more negative D values, so more rare alleles, thus the higher number of haplotypes seen above.
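The link between an excess of rare variants and a negative D can be illustrated with a minimal sketch of Tajima's D (standard formula, computed from the number of segregating sites S and the mean number of pairwise differences k). The function name and the toy alignments are hypothetical and are not data from the manuscript.

```python
import math
from itertools import combinations

def tajimas_d(sequences):
    """Tajima's D for a list of aligned, equal-length sequences (n >= 4)."""
    n = len(sequences)
    if n < 4:
        raise ValueError("Tajima's D needs at least 4 sequences")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) sites
    seg_sites = sum(1 for col in zip(*sequences) if len(set(col)) > 1)
    if seg_sites == 0:
        return float("nan")

    # k: mean number of pairwise differences
    pairs = list(combinations(sequences, 2))
    k = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (k - seg_sites / a1) / math.sqrt(variance)

if __name__ == "__main__":
    singletons = ["AAAA", "AAAA", "AAAA", "AAAT", "CAAA"]  # two singleton variants
    balanced = ["AAAA", "AAAA", "TAAA", "TAAA", "TAAA"]    # one intermediate-frequency variant
    print("D, excess of singletons:", tajimas_d(singletons))
    print("D, intermediate frequency:", tajimas_d(balanced))
```

In the first toy sample every variant is a singleton, so k falls below S/a1 and D is negative; in the second, the variant is at intermediate frequency and D is positive.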
Lines 363-372. Remember, we know there is strong purifying selection, but with ancient divergence you get many fixed nonsynonymous mutations between M and F.
I still really don’t get the Ne approaches, relative µ seems higher in M but I think estimated problematically and that could influence this. The higher number of singletons/rare alleles will of course suggest “exponential increase” but could also be stronger constraints on M.
Discussion
As noted in lines 406-409, the male-type mtDNA diverges more rapidly and is far more polymorphic, “with more haplotypes, more haplotypes represented by a single male, and more segregating sites” – exactly: more singletons and rare mutations, which is what Tajima’s D has picked up. Perhaps this reflects the higher mutation rate (still under the significant constraints on COI; e.g., see work by Dave Rand).
Overall because I think there are concerns both about the estimation of µ and that estimators of Ne will falter when there is evolution strongly affected by non-neutral processes, I don’t put much weight behind the estimators of Ne. The sex-specific spatial breaks are the most robust part of this paper, with pairwise Fst often being larger because of the greater number of private alleles I suspect.
Overall, I really like the paper and think with some consideration of these issues it would be a great contribution both to the marine spatial pop gen literature as well as the really fascinating DUI literature.