Identification of a gene cluster amplification associated with organophosphate insecticide resistance: from the diversity of the resistance allele complex to an efficient field detection assay

Stephanie Bedhomme based on reviews by Diego Ayala and 2 anonymous reviewers

A recommendation of:
Julien Cattel, Chloé Haberkorn, Fréderic Laporte, Thierry Gaude, Tristan Cumer, Julien Renaud, Ian W. Sutherland, Jeffrey C. Hertz, Jean-Marc Bonneville, Victor Arnaud, Camille Noûs, Bénédicte Fustec, Sébastien Boyer, Sébastien Marcombe, Jean-Philippe David. A genomic amplification affecting a carboxylesterase gene cluster confers organophosphate resistance in the mosquito Aedes aegypti: from genomic characterization to high-throughput field detection (2020), bioRxiv, 2020.06.08.139741, ver. 4 peer-reviewed and recommended by Peer Community in Evolutionary Biology. 10.1101/2020.06.08.139741
Submitted: 09 June 2020, Recommended: 03 November 2020
Cite this recommendation as:
Stephanie Bedhomme (2020) Identification of a gene cluster amplification associated with organophosphate insecticide resistance: from the diversity of the resistance allele complex to an efficient field detection assay. Peer Community in Evolutionary Biology, 100114. 10.24072/pci.evolbiol.100114

The emergence and spread of insecticide resistance compromises the efficiency of insecticides as a prevention tool against the transmission of insect-transmitted diseases (Moyes et al. 2017). In this context, understanding the genetic mechanisms of resistance and the way resistant alleles spread in insect populations is necessary to design resistance management policies. A common and important mechanism of insecticide resistance is gene amplification, in particular amplification of insecticide detoxification genes, which leads to the overexpression of these genes (Bass & Field 2011). Cattel and coauthors (2020) adopt a combination of experimental approaches to study the role of gene amplification in resistance to organophosphate insecticides in the mosquito Aedes aegypti and its occurrence in populations of South East Asia, and to develop a molecular test to track resistance alleles.
Their first approach consists in performing artificial selection on laboratory Ae. aegypti populations started with individuals collected in Laos. In the selected population, an initial 90% mortality by adult exposure to the organophosphate insecticide malathion is imposed. This population shows a steep increase in resistance to malathion and other organophosphate insecticides, which is absent in the paired control population. The transcriptomic patterns of the control and the evolved populations, as well as of a reference sensitive population, reveal, among other differences, the over-expression of five carboxy/choline esterase (CCE) genes in the insecticide-selected population. These five genes happen to be clustered in the Ae. aegypti genome, and whole genome sequencing of a highly resistant population combined with qPCR tests on genomic DNA showed that the overexpression of these genes is due to gene amplification. Although it would have been more elegant to have replicate selected and control populations and to perform the transcriptomic and genomic analyses directly on the experimental populations, the authors gather a set of experimental evidence which, combined with previous knowledge on the function of the amplified and over-expressed genes and on their implication in organophosphate insecticide resistance in other species, makes it possible to discard the possibility that this gene amplification spread by drift in the selected population.
In a second part of the paper, copy number variation for CCE genes is checked in field-sampled populations. This test reveals the presence of resistance alleles in half of the fourteen South East Asian populations sampled. Very interestingly, it also reveals a high level of complexity and diversity among the resistance alleles: it shows first the existence, both in the experimental and the field populations, of at least two amplified alleles (differing in the number of genes amplified) and second a high variation in the copy number of amplified genes. This indicates that gene amplification as a molecular resistance mechanism has actually led to a high diversity of resistance alleles. These alleles are likely to differ both in the level of resistance conferred and in the fitness cost imposed in the absence of the insecticide, and these two values affect the evolution of their frequency in the field and ultimately the spread of resistance.
The last part of the paper is devoted to the development of a high-throughput TaqMan assay which allows rapid determination of the copy number of one of the esterase genes amplified in the resistance alleles described earlier. This assay is nicely validated and will definitely be a useful tool to determine the occurrence of these resistance alleles in field populations. The fact that it gives access to the copy number will also make it possible to follow copy number across time and gain insight into the complexity of resistance evolution by gene amplification.
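As an illustration of how a qPCR-based assay can yield a copy number, the sketch below uses the standard comparative Ct (2^-ΔΔCt) calculation. This is an assumption made for illustration only: the text above does not describe how the authors' assay converts its signal into a copy number. The function name, the Ct values and the assumption of roughly 100% amplification efficiency are all hypothetical.

```python
def copy_number_from_ct(ct_target, ct_reference,
                        ct_target_cal, ct_reference_cal,
                        calibrator_copies=2.0):
    """Estimate gene copy number with the comparative Ct method.

    Assumes ~100% PCR efficiency (one doubling per cycle) and a calibrator
    sample whose copy number for the target gene is known (2 for a
    non-amplified diploid individual). All names here are hypothetical.
    """
    # Normalise the target gene to a single-copy reference gene.
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_cal - ct_reference_cal
    # Difference between the sample and the calibrator.
    delta_delta_ct = delta_sample - delta_calibrator
    # Each cycle gained corresponds to a two-fold increase in template.
    return calibrator_copies * 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses the threshold 2 cycles earlier
# in the sample than in the calibrator, i.e. a four-fold excess of template.
estimate = copy_number_from_ct(22.0, 24.0, 24.0, 24.0)
print(estimate)  # 8.0
```

With real data, replicate measurements and an efficiency correction would normally be needed before interpreting such an estimate.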
To sum up, this paper studies the implication of carboxy/choline esterase gene amplification in organophosphate resistance evolution in Ae. aegypti, reveals the diversity of this resistance mechanism among individuals and populations, due to variation both in the identity of the genes amplified and in their copy number, and sets up a fast and efficient tool to detect and follow the spread of these resistance alleles in the field. Additionally, the different experimental approaches adopted have generated genomic and transcriptomic data, of which only the part related to CCE gene amplification has been exploited. These data are very likely to reveal other genomic and expression determinants of resistance that will give access to an extra degree of complexity in organophosphate insecticide resistance determinism and evolution.

References

Bass C, Field LM (2011) Gene amplification and insecticide resistance. Pest Management Science, 67, 886–890. https://doi.org/10.1002/ps.2189
Cattel J, Haberkorn C, Laporte F, Gaude T, Cumer T, Renaud J, Sutherland IW, Hertz JC, Bonneville J-M, Arnaud V, Nous C, Fustec B, Boyer S, Marcombe S, David J-P (2020) A genomic amplification affecting a carboxylesterase gene cluster confers organophosphate resistance in the mosquito Aedes aegypti: from genomic characterization to high-throughput field detection. bioRxiv, 2020.06.08.139741, ver. 4 peer-reviewed and recommended by PCI Evolutionary Biology. https://doi.org/10.1101/2020.06.08.139741
Moyes CL, Vontas J, Martins AJ, Ng LC, Koou SY, Dusfour I, Raghavendra K, Pinto J, Corbel V, David J-P, Weetman D (2017) Contemporary status of insecticide resistance in the major Aedes vectors of arboviruses infecting humans. PLOS Neglected Tropical Diseases, 11, e0005625. https://doi.org/10.1371/journal.pntd.0005625


Revision round #2

2020-10-13

Dear authors,

The three reviewers and I have acknowledged the changes made in response to the comments on the first version of the manuscript, and the four of us are convinced that this paper has to be published and recommended. However, before I recommend it, I would like the authors to make the wording changes strongly suggested by two of the reviewers. These changes are necessary to make the interpretation of your data and of the interest of the new method more cautious and realistic.

Stephanie Bedhomme

Additional requirements of the managing board:
As indicated in the 'How does it work?' section and in the code of conduct, please make sure that:
-Data are available to readers, either in the text or through an open data repository such as Zenodo (free), Dryad or some other institutional repository. Data must be reusable, thus metadata or accompanying text must carefully describe the data.
-Details on quantitative analyses (e.g., data treatment and statistical scripts in R, bioinformatic pipeline scripts, etc.) and details concerning simulations (scripts, codes) are available to readers in the text, as appendices, or through an open data repository, such as Zenodo, Dryad or some other institutional repository. The scripts or codes must be carefully described so that they can be reused.
-Details on experimental procedures are available to readers in the text or as appendices.
-Authors have no financial conflict of interest relating to the article. The article must contain a "Conflict of interest disclosure" paragraph before the reference section containing this sentence: "The authors of this preprint declare that they have no financial conflict of interest with the content of this article." If appropriate, this disclosure may be completed by a sentence indicating that some of the authors are PCI recommenders: “XXX is one of the PCI XXX recommenders.”

Reviewed by anonymous reviewer, 2020-10-02 16:13


The manuscript has been improved and, for the most part, my concerns/questions have been sufficiently addressed.

I do have one last comment about the wording describing the utility of the high-throughput field detection assay. In their response to reviewer comments, the authors explain that "the high-throughput diagnostic assay developed was not presented as a molecular assay allowing to track organophosphate resistance in this mosquito species but rather as a new tool to efficiently track this specific resistant mechanism." However, in the manuscript, they are a bit less reserved about the implications of the assay: "will improve the tracking and management of organophosphate resistance in natural mosquito populations". If CCE amplification is not the dominant mechanism of organophosphate resistance in SEA, it may not be a useful assay for understanding general resistance patterns. I assume that the frequency of this mechanism relative to others is not known, and so I think the authors need to be more cautious in their statements about its utility for general resistance tracking and management.

Reviewed by anonymous reviewer, 2020-09-28 13:17


This remains a nice paper that should be published. My comment about the strong interpretation and the claim of "confirmation" is one that I stand by (for reasons that I explain below), but an alternative to re-wording this term would be to at least argue against the possibility of drift in the manuscript (see below).

The authors' reply was the following:

"We agree that despite the large number of mosquitoes used for selection, our experimental selection design using a single replicate does not allow to fully control for genetic drift effects. However, the fact that this same CCE gene amplification was previously associated with organophosphate resistance in natural Ae. aegypti populations (see Faucon et al 2015, 2017, Poupardin et al 2014, Gouindin et al 2017, Marcombe et al 2009, ...) but also in Aedes albopictus (Grigoraki et al 2016 and 2017) clearly refutes the role of genetic drift in its increase frequency in our selected line. We agree that it would have been interesting to compare the frequency of this marker in dead and survivors of our laboratory lines following insecticide exposure but this would still not have constituted a proof of its role in resistance as a marker could have no functional role in resistance but still be segregated between dead and survivors as a consequence of being genetically linked with another resistance locus. A better way to perform such genotype-phenotype association study would be to use individuals or lines obtained from controlled crosses with a susceptible strain (at least F2) in order to break (by recombination) genetic links between markers and segregate individuals with different insecticide doses before investigating their CCE amplification genotype. While we such experiment will likely confirm the association of this marker with resistance, we thought that performing such heavy experiment was not necessary given the multiple evidences (including functional validation) supporting its role in organophosphate resistance. In conclusion, while we agree that our study does not constitute an irrefutable proof of the role of this CCE gene duplication in resistance we believe that the data we generated together with the published literature on this subject constitute enough evidence to support the role of this marker in organophosphate resistance in this mosquito species."

I disagree that previous evidence of the amplification being associated with organophosphate resistance in natural populations "refutes" the role of drift in their selected line, as the essence of the argument is circular: the known importance of the amplification is used to conclude that the results must be due to selection on the amplification, and these results are then used to claim "confirmation" of the importance of the amplification. If the importance of this amplification is so absolutely certain that it is unimaginable that it would not increase in frequency in the selected line, then the experiment has no value (we already knew the answer before it was conducted), the results add nothing to existing knowledge, and no "confirmation" is necessary. If on the other hand it was possible that selection would not lead to an increase in amplification frequency, then it is also feasible that the amplification could increase in frequency through drift. While the observed increase in frequency in a single replicate line with only two time points is indeed probably due to the importance of the amplification in resistance, it does not "confirm" it. While I would still prefer it if the word "confirm" were replaced with alternatives, such as "support" or "add weight to" or "are in accordance with", I think an alternative would be for the authors to instead add a few sentences arguing why the results are unlikely to be due to drift. For example, the increased expression in those same amplified genes that was seen after selection is unlikely to be selectively neutral, suggesting that drift is unlikely. Also, the fact that at least two independent amplification haplotypes have increased in frequency makes drift less likely. In either case, I would also change "confers" to "conferring" in the title, to remove the implication that the resistance association is a novel finding of this paper. I am happy for the editor to decide whether or not to recommend these changes.

I would also like to revisit my comment about line 515 of the original manuscript (now line 549). My comment was:

line 515: "no false negative was observed" is redundant given the first part of this sentence. Also, could you comment on whether any false positives were observed?

and the reply from the authors was:

As stated in the manuscript, although the sybrgreen qPCR assay slightly over-estimated copy numbers, all individuals identified as positive using this assay were also found positive using the TaqMan assay (no false positives). Reciprocally, all individuals identified as negative using the sybrgreen qPCR assay were also negative using the TaqMan assay (no false negatives).

The way this is phrased in the manuscript is still confusing in my opinion. The fact that "all individuals identified as positive using [the qPCR] assay were also found positive using the TaqMan assay" is an indication that the TaqMan assay had no false NEGATIVES (i.e., no cases where the TaqMan assay was negative when the qPCR gold standard was positive). On the other hand, the fact that "all individuals identified as negative using the sybrgreen qPCR assay were also negative using the TaqMan assay" is an indication that the TaqMan assay had no false POSITIVES (no cases where the TaqMan assay was positive when it should have been negative). Thus, the manuscript as currently phrased says twice that there were no false negatives, and does not say that there were no false positives.
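The distinction can be made concrete with a small tally that treats the qPCR calls as the reference and the TaqMan calls as the test being evaluated. This is only an illustrative sketch: the function name and the example calls are hypothetical and are not data from the manuscript.

```python
def compare_to_reference(reference_calls, test_calls):
    """Tally test results against a reference ("gold standard") assay.

    Both arguments are sequences of booleans (True = amplification detected),
    one entry per individual, in the same order.
    """
    if len(reference_calls) != len(test_calls):
        raise ValueError("both assays must be scored on the same individuals")
    counts = {"true_pos": 0, "false_pos": 0, "true_neg": 0, "false_neg": 0}
    for reference, test in zip(reference_calls, test_calls):
        if reference and test:
            counts["true_pos"] += 1
        elif reference and not test:
            # Reference positive but test negative: a false NEGATIVE.
            counts["false_neg"] += 1
        elif not reference and test:
            # Reference negative but test positive: a false POSITIVE.
            counts["false_pos"] += 1
        else:
            counts["true_neg"] += 1
    return counts


# Hypothetical calls for five individuals (not data from the manuscript).
qpcr_calls = [True, True, False, False, True]
taqman_calls = [True, True, False, False, True]
print(compare_to_reference(qpcr_calls, taqman_calls))
```

In this tally, "every reference positive is also test positive" is equivalent to `false_neg == 0`, and "every reference negative is also test negative" is equivalent to `false_pos == 0`.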

Minor points and typos:

lines 51-52: In response to my previous comment, the authors explained what they mean here by "their usefulness", but the sentence in the manuscript remains unchanged and, in my opinion, unclear. First, it is not grammatically clear whether "their" refers to insecticides or amplifications. The authors replied to clarify that it referred to amplifications, but have not made the text clearer. Also, the meaning of the statement in the manuscript is odd: it is basically saying that resistance alleles (amplifications in this case) are useful for monitoring resistance alleles. Again, the authors explained what they meant in their reply, but the sentence in the manuscript is no less confusing than before.

line 54: developing -> to develop

line 75: Either change "such" to "this" or make "mechanism" plural (such mechanisms have been shown to be major drivers...)

line 86: "later" -> "latter"

line 111: such tool -> such a tool

line 360: The authors say that they have modified this sentence to remove the word "invalidating", but the sentence has remained unchanged.

line 438: This has been clarified by the authors in their reply, but remains unclear in the manuscript. It still comes across that 33% of G5-Mala individuals have haplotype A and 67% of G5-Mala individuals have haplotype B. In fact, as clarified by the authors in their reply, it is 33% of the amplified haplotypes that were A and 67% that were B. I think it would be clearer if the authors included exact numbers, as they did in their reply, or rephrased this to make it clear that 33% and 67% are not percentages of the G5-Mala population. Also, since it seems that the B haplotype cannot be detected in an AB heterozygote (because the presence of A masks the presence of B), I suggest clarifying that the 67% is a minimum value (some of the 33% that have A might also have had B).

line 577: Change "aiming at preventing" to either "aiming to prevent" or "aimed at preventing"

line 683: increase of the frequency -> increase in the frequency

line 684: comma after "selection"
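To illustrate the comment on line 438 above, the following sketch shows how the same counts give different percentages depending on whether the denominator is the set of amplified individuals or the whole sample. The counts are hypothetical, chosen only to reproduce a one-third/two-thirds split; they are not the numbers given in the authors' reply.

```python
def haplotype_summary(n_individuals, n_scored_a, n_scored_b):
    """Express haplotype counts against two different denominators.

    n_scored_a: individuals scored as carrying haplotype A (which may mask B).
    n_scored_b: individuals scored as carrying haplotype B only.
    All counts are hypothetical illustration values.
    """
    n_amplified = n_scored_a + n_scored_b
    return {
        "A_among_amplified": n_scored_a / n_amplified,
        "B_among_amplified": n_scored_b / n_amplified,
        "A_in_whole_sample": n_scored_a / n_individuals,
        # Lower bound: some individuals scored as A may also carry B.
        "B_in_whole_sample_min": n_scored_b / n_individuals,
    }


# Hypothetical sample: 30 individuals, 5 scored A and 10 scored B.
summary = haplotype_summary(30, 5, 10)
for name, value in summary.items():
    print(f"{name}: {value:.0%}")
```

Reporting the raw counts alongside the percentages, as the reviewer suggests, removes the ambiguity about which denominator is meant.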

Reviewed by Diego Ayala, 2020-10-11 15:44


Dear Editor,

I acknowledge the effort made by the authors in this new version. I have carefully gone through the changes to the manuscript and the answers to the reviewers and editor, and I have been impressed by the quality of the details. The editor has raised an important issue about the possibility that other genes left aside in the present manuscript could also be involved in the resistant phenotype. Although this is a potential caveat that has to be considered, I have to say that I agree with the more conservative approach taken by the authors and with their explanations.

The manuscript will help to better understand insecticide resistance in this major vector and to follow its evolution across the world.


Revision round #1

2020-08-17

Dear authors,

Three reviewers have now read your manuscript entitled “A novel assay for tracking carboxylesterase gene amplifications conferring organophosphate resistance in the mosquito Aedes aegypti: from experimental evolution to field application”. All three acknowledged the quality and interest of your work, but all three also pointed out some problems, related both to the content and to the clarity of the manuscript.

I agree with the vast majority of the points they raised and would like to add that the manuscript lacks a bit of focus: the title announces a methodological paper (“A novel assay…”) but the paper actually combines field sampling and population analysis, artificial selection, gene expression and genomics, and finally the development of a novel assay. It is not always easy to follow the logic of the choice of the populations for each analysis (e.g. why not use the “experimentally evolved population” for the whole genome sequencing?) and it gives the impression that a lot of data have been acquired but not exploited. For example, (1) the whole genome sequencing is only used for the coverage data and not for the sequences themselves, when the sequences would probably give access to a more complete picture of the resistance mechanisms at work (SNPs for example), (2) among the genes that present a differential expression, a lot are left aside and the study then focuses on three of them that had previously been identified as involved in insecticide resistance. There is thus a high risk of ignoring other genes involved in resistance.

All these points have to be addressed before I can decide whether this manuscript can be recommended in PCI Evolutionary Biology.


Reviewed by anonymous reviewer, 2020-07-29 06:10


I enjoyed reading this interesting study that identifies and characterizes an evolutionary route to insecticide resistance in the mosquito Ae. aegypti. I have some suggestions, both general and specific, that I hope will help improve the manuscript.

First, I found it a little difficult to keep track of all the populations/individuals, both environmental and evolved, and which analysis was performed on which. I'd suggest going through the manuscript and making sure that this is very clear. In some cases, it appears that multiple naming conventions are used. For example, which population is NakR (see Fig 3)? Is this the Thai resistant population? Also, is the “Laos resistant population” the same as the experimentally evolved population G5-Mala? It’s also not clear to me why some of the analyses appear to be done on G5 evolved populations and other analyses are done on G6 populations. This isn’t outlined in the methods. Can you clarify what was done here?

My second general suggestion/question concerns how comprehensive the TaqMan multiplex assay that the authors have developed is at detecting resistance in the field. The authors provide evidence that the assay does a great job of detecting individuals with CNV in three CCE genes identified as being important. However, are there other evolutionary routes to resistance occurring in these environmental populations that will go undetected? The authors have sampled a large number of populations in SEA and assayed for CNV in CCE genes, but how many of those mosquitoes that don’t have CNV in CCE genes are also resistant?

Some more specific comments:
Line 164: What does “calibrated” mean in this context?
Figure 1: Did the G5-NS increase significantly compared to G1? Was there a statistical test performed here?
Paragraph starting at line 336: The authors focus on over-transcribed genes that are potentially associated with known resistance mechanisms. Is there any possibility that under-transcribed genes could have an effect on resistance? Also, were there any genes among the candidates identified here that also had mutations that could be important? (Later, on line 670, the authors suggest that SNPs have been shown to be important for resistance in other studies – is this a possible additional mechanism in this study as well?)
Figure 4B: We would expect to see both haplotypes in G1, correct? Type B must just be at very low frequency and so just wasn’t detected in the sample. Or was haplotype B the result of a de novo mutation?
Lines 480-482: “No significant differences were observed between field populations and the different lines (G1, G5-NS, G5-Mala), suggesting that insecticide selection rather select for positive individuals than for individuals carrying a higher number of copies.” Maybe copy number does matter and the data reflect balancing selection for an intermediate copy number, not an absence of selection on copy number.
Line 543: “integrated strategies aiming at preventing” should be changed to “integrated strategies aimed at preventing”
Line 545: “threaten their efficacy.” Should be changed to “threatens their efficacy.”
Line 595: “including AAEL00113” Shouldn’t this be “including AAEL005113”?
Line 631-634: “… our experimental evolution approach demonstrated that the frequency of these resistance alleles increases rapidly in populations submitted to insecticide selection pressure while a susceptible allele is favoured in absence of selection.” I don’t think the data actually demonstrate this. As far as I can tell there is no significant difference between the control and the G5-NS.
Lines 672-673: “Further work is required to precise the interplay…” change to something like “Further work is required to precisely understand the interplay…”
Line 680: “In term of…” -> “In terms of…”

Reviewed by Diego Ayala, 2020-08-05 16:15


The manuscript by Cattel et al. investigates the gene amplification profiles of carboxylesterase genes related to insecticide resistance in Aedes aegypti, a major vector of arboviruses. This is an original and comprehensive study of the evolution of metabolic resistance to organophosphates, which are used worldwide for vector control. The authors carried out compelling work combining fieldwork, laboratory experiments and molecular work. Finally, the authors provide a new molecular tool to monitor insecticide resistance associated with copy number variation of a carboxylesterase gene (AAEL023844).

Overall, the study is well written and the main ideas are clearly structured and addressed. I found no major issues and I enjoyed reading the manuscript. I really appreciated the discussion section, which examines different evolutionary aspects of the CCE, such as the origin of the duplication or the evolutionary dynamics of the amplifications. I mention this because, reading the title, one would expect “only a novel assay”. Moreover, I do not think that the term “experimental evolution” can be applied to this work. To my knowledge, the authors applied the standard method to select for insecticide resistance in mosquitoes, and they do not follow CNV across generations with different malathion doses, or carry out fitness experiments.

Here, I list a number of minor points that would help to improve the clarity of the manuscript. I refer to the bioRxiv preprint version, doi.org/10.1101/2020.06.08.139741:

.- line 100: CCEae3A -> CCEAE3A
.- line 131: Ae -> Aedes
.- lines 131-132: How many breeding sites per site? Could a low number of breeding sites limit the genetic diversity of each population due to the presence of siblings in the same breeding site? Could the authors include this information in Table S1?
.- line 139: How were the adult females collected? As larvae? As adults?
.- line 145: Could you define ‘population’? Does it mean you had already pooled the larvae of the different breeding sites? For how many generations?
.- line 150: ‘thirty-three-days’: should it be ‘three days’?
.- lines 154-155: If the females freely mated before the insecticide selection assays, could it affect the dynamics of malathion resistance? Why do you not expose the males? Are they less resistant?
.- line 305: Why are the malathion results for G5-Mala not equal between Fig 1 and Fig 2?
.- line 401: Could you provide the % of each haplotype? In Fig 4, it seems to be 100% haplotype A.
.- lines 401-403: Could you provide the % of the two haplotypes for G5-NS?
.- line 452: “Thai population resistant”, missing reference
.- line 480: I guess that it is with the SYBR Green method?
.- line 539: Why different numbers between TaqMan and SYBR Green?

.- Discussion:
- Do the authors consider that all the duplications are closely located such as in An. gambiae (Assogba et al., 2015)? If it is not the case, could it affect their results?
- As organophosphates are less and less used for vector control and are being replaced by pyrethroids, how do the authors explain the presence of the amplifications across the sampled SEA populations (lines 635-636)? Are they involved in other detoxification mechanisms that could explain their persistence?
- The authors say that individuals “carrying high gene copy number are not preferentially selected” (lines 653-654), but looking at Figure 6, the copy numbers are higher in natural populations than in G5-Mala.

Tables and Figures
.- Figure 6: If possible, change the colors, since they are the same as those used for haplotypes in the other figures.
.- Table S1: Include geographical coordinates of each village

Reviewed by anonymous reviewer, 2020-07-23 16:53


This study investigates the importance of amplifications in CCE genes for resistance against organophosphate insecticides. The authors first find that selection with malathion in a line of Aedes aegypti mosquitoes leads to an increase in resistance over 5 generations. Using RNAseq, they then identify a set of seven detoxification genes over-expressed in the selected line and focus the study on a cluster of upregulated CCE genes, finding that these genes are the subject of amplifications in wild populations from South East Asia and in their selection lines, with the frequency of the amplified haplotypes increasing after selection. As part of this work, the authors have developed a TaqMan assay to detect the presence of the CNV.

There are some nice results in this paper, but I find that throughout the manuscript the results are interpreted with too much conviction. I think the CNVs which the authors have found probably are as important as they claim, but this is mainly based on accumulated evidence from previous research. One of the main claims of the paper is that it "confirms" the importance of the CCE genes and their associated CNVs in insecticide resistance. This conclusion is too strong given the results. While I do believe that the CNVs are indeed associated with resistance, the results of a selection experiment with just one replicate for each line do not constitute proof. The possibility that the CNVs increased in frequency due to drift (given the strong bottleneck between generations) cannot be excluded. A better test would be to genotype the mosquitoes used in the bioassays to see whether the CNVs are at a higher frequency among the survivors. This would provide much more convincing evidence that the CNVs are indeed associated with resistance. This could even be done for the other organophosphates (assuming the sample size is large enough), which would then provide evidence that the CNVs are at least partly responsible for the cross-resistance observed in the study. Since this would provide clearer evidence of an association with resistance, it is not clear to me why this was not done.
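The drift concern above can be explored with a minimal Wright-Fisher sketch: a neutral allele is resampled each generation from a small number of breeders, and one counts how often it reaches a high frequency by chance alone. Every parameter below (starting frequency, number of breeders, threshold, number of runs) is a hypothetical placeholder and not a value taken from the manuscript, and the model ignores selection entirely.

```python
import random


def simulate_drift(initial_freq, n_breeders, n_generations, seed):
    """Neutral Wright-Fisher trajectory of one allele in a diploid population."""
    rng = random.Random(seed)
    n_gene_copies = 2 * n_breeders
    freq = initial_freq
    trajectory = [freq]
    for _ in range(n_generations):
        # Binomial sampling: each gene copy in the next generation carries
        # the allele with probability equal to its current frequency.
        carriers = sum(rng.random() < freq for _ in range(n_gene_copies))
        freq = carriers / n_gene_copies
        trajectory.append(freq)
    return trajectory


def fraction_reaching(threshold, n_runs, initial_freq, n_breeders, n_generations):
    """Share of independent neutral runs whose final frequency >= threshold."""
    hits = 0
    for seed in range(n_runs):
        final_freq = simulate_drift(initial_freq, n_breeders, n_generations, seed)[-1]
        if final_freq >= threshold:
            hits += 1
    return hits / n_runs


# Hypothetical scenario: allele at 10%, 20 breeders per generation,
# 5 generations, asking how often drift alone reaches 50% or more.
print(fraction_reaching(0.5, 200, 0.1, 20, 5))
```

A fraction of this kind is what replicate selected lines would estimate empirically; with a single line, only one trajectory is observed, which is the limitation the reviewer points to.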

Other points:

line 52: I don't understand the second half of this sentence. What does "their" refer to? Duplications? Insecticides? Why are either useful for monitoring resistance alleles? One is a resistant allele itself, and the other exerts the selective pressure to maintain the allele in the population.

line 53: Replace "aimed at characterising" with either "aims to characterise" or "aimed to characterise".

line 56: I would replace "confirmed" with "was consistent with". Given that there was only one replicate of each of the selected and unselected lines, it is very difficult to prove that the increase in frequency of the CNV was not the result of random drift.

line 59: What does "large" mean in this case? Are you referring to the size of the genome covered by the amplification? Or to the number of extra copies? Is a CNV covering three genes really that large?

line 60: This sentence uses the word "amplification" in one place and "duplication" in another. Is the distinction important to the meaning of the sentence? If so, please clarify this distinction.

lines 61-62: What do you mean by "high copy number variations"?

line 65: change "shall" to "will"

lines 80-90: You could also add the neofunctionalisation of detox genes (Zimmer et al. 2018).

line 87: change "mosquitoes genera" to "mosquito genera"

lines 100 and 103: Is there a reason why CCEae3A is capitalised differently in these two sentences? It seems to me that in both cases, the names refer to genes (not proteins) and so should be written Cceae3a. However, I don't think it really matters as long as it is consistent (or there is an important distinction underlying the different capitalisation). On the other hand, on line 106, I think this is referring to the protein, and so it is fine to leave it fully capitalised.

line 108: "this CCE amplification" is confusing here because it was last mentioned at the start of the paragraph, between which there are two more sentences referring to another species.

line 117: Again, this wording is too strong. Firstly, as before, with just a single replicate of each line, it cannot be excluded that the increase in frequency of these genes was a result of drift. Second, even if that is not the case and these CNVs are associated with malathion resistance, it does not mean that they are the cause of the cross-resistance. Other resistance alleles could have also been selected, which could have caused the cross-resistance.

line 156: Here and elsewhere: correct spelling is "Three-day-old" when used as an adjective.

line 156: also "non-blood-fed"

line 168: change "other insecticide" to "other insecticides".

line 175: what are "calibrated" third instar larvae?

line 215: Please clarify that these 100 individuals were pooled into a single library for sequencing.

line 225: remove "were considered".

line 235: change "prior amplification" to "prior to amplification"

lines 236-237: why were no technical replicates run for individual samples?

lines 237-238: commas used for decimal points here, but points used elsewhere.

line 247: 2.5 is an oddly specific threshold. On what grounds was it chosen?

line 247: It's not clear to me how the two different structural variations were distinguished. Is it that if all three genes were amplified, then haplotype A was assigned, but if only the first two genes were amplified, then haplotype B was assigned? If so, what would happen if, say, only one gene was found to be amplified, but not the other two?
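
To show what I am asking, here is a sketch of the calling rule as I have guessed it. This is my reading, not the authors' stated method: the gene order, the use of 2.5 as a per-gene copy-number cut-off, and the choice of `>=` at the threshold are all assumptions, and the function name is hypothetical.

```python
THRESHOLD = 2.5  # assumed to be a per-gene copy-number cut-off


def call_haplotype(copy_numbers, threshold=THRESHOLD):
    """Call a haplotype from the copy numbers of three genes, in cluster order."""
    amplified = tuple(cn >= threshold for cn in copy_numbers)
    if amplified == (True, True, True):
        return "A"  # all three genes amplified
    if amplified == (True, True, False):
        return "B"  # only the first two genes amplified
    if not any(amplified):
        return "none"
    return "unassigned"  # every other pattern, e.g. a single amplified gene


print(call_haplotype((4.0, 2.0, 2.0)))  # unassigned
```

Under this reading, five of the eight possible amplification patterns fall into the last branch, so the manuscript should state how such individuals were treated.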

line 340: What are the "candidate genes"?

line 351: "Invalidating" is a strong word here. I suggest "and may thus not be specifically associated with malathion resistance".

Figure 3B: Using raw coverage is not ideal here since the library size (total number of reads obtained from sequencing) may have been different for the two lines. Coverage should be normalised to the genome-wide average to properly compare these two plots.
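
A minimal sketch of the normalisation I have in mind follows. The depth values are invented for illustration and the function name is hypothetical. Each window's depth is divided by the genome-wide mean depth of its own library, so that the two lines are plotted on the same relative scale.

```python
def normalise_coverage(window_depths, genome_wide_mean):
    """Express per-window read depth relative to the library's genome-wide mean."""
    if genome_wide_mean <= 0:
        raise ValueError("genome-wide mean depth must be positive")
    return [depth / genome_wide_mean for depth in window_depths]


# Invented example: line B was sequenced twice as deeply as line A,
# but both carry the same three-fold amplification at the locus.
line_a = normalise_coverage([20, 60, 60], genome_wide_mean=20)
line_b = normalise_coverage([40, 120, 120], genome_wide_mean=40)
print(line_a == line_b)  # True
```

In raw coverage these two libraries would look different at the locus. After normalisation they are identical, which is the comparison the figure is meant to support.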

line 382: change CEE to CCE

line 387: remove comma.

line 390: What is the evidence for this "slight elevation"? Was it significant?

lines 392-393: But this was not significant.

line 397: But the CNV is present, so how can we explain the lack of over-expression?

line 399: I would say "at least two"

line 403: How did you distinguish the haplotypes? What were the criteria for calling haplotype A and haplotype B? How are the frequencies expressed? Are they the allele frequencies? If so, then the sum of the two haplotypes (33 + 66) should equal the cumulated frequency (84). Or are they the proportion of individuals that carry at least one copy of the haplotype, in which case this should be specified. If the presence of haplotype B is determined based on the absence of a CNV in Cceae1a, is it possible to identify A/B heterozygotes? Or will the presence of A necessarily mask the presence of B? If it is not possible to identify both haplotypes in the same individual, then I again don't understand how the cumulated frequency can be smaller than the sum of the two haplotypes.
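
The arithmetic behind this question can be sketched as follows, assuming the three figures are percentages of individuals carrying each haplotype (my assumption, since the manuscript does not say). By inclusion-exclusion, the reported values would then imply a specific proportion of individuals carrying both haplotypes.

```python
def implied_double_carriers(carriers_a, carriers_b, carriers_any):
    """Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|."""
    return carriers_a + carriers_b - carriers_any


# Percentages as quoted in my comment on line 403.
both = implied_double_carriers(33, 66, 84)
print(both)  # 15
```

So the numbers are only coherent if about 15% of individuals were scored as carrying both A and B, which in turn requires that A/B heterozygotes can be identified. If they cannot, the cumulated frequency should be 99, not 84.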

line 476: I can't find Figure S1 in the supplementary materials. Maybe I'm looking in the wrong place, in which case I apologise.

lines 480-482: I don't understand this sentence. Which comparison revealed no significant difference (also, please show the results of the statistical test, at least the p-value)? The different lines were very different to each other, as were the populations, so how is it possible that NONE of the possible comparisons between field populations and the lines were significant? Also, if selection is not selecting for higher numbers of copies, it seems very surprising that copy number could ever reach 80.

line 515: "no false negative was observed" is redundant given the first part of this sentence. Also, could you comment on whether any false positives were observed?

line 554: As above, "confirmed" is too strong a word here.

lines 567-568: This does not follow from the previous sentences, where CCEs were not mentioned.

lines 570-573: How can the level of resistance to temephos be compared to the other insecticides? Since the assays used are necessarily different (larval vs adult) with different concentrations of different insecticides, it is not clear whether a lower mortality against one insecticide can be equated with "higher" resistance. This is particularly the case since the baseline mortality to temephos was already much lower than it was to the other insecticides. Thus, what would a similar level of resistance look like in terms of a change in mortality?

line 578: It could still have a key role in resistance while offering cross-resistance to other insecticides.

line 605: Why is "loci" italicised?

line 630: change "is" to "might be".

lines 632-633: I would again say that this is too strongly worded. Only one replicate line has shown that the frequency of the CCEs increased. Certainly the second claim of the sentence, stating that the wild-type allele is selected in the absence of insecticidal selection, cannot be made, since the difference between the unselected line and the parental generation was not significant.

line 635: As above, the conclusion of "significant fitness costs" is not supported. At best, it can be written as a speculative suggestion.

line 654: I don't understand the logic. If there is a balance between the fitness cost of high copy number and the benefit of increased resistance, you would still expect to see lower copy number in unselected samples, since the cost of higher copies would be higher and there would be no benefit.

line 681: change "term" to "terms"

line 682: change "such CNV marker" to "such a CNV marker" or "a CNV marker such as this"