Gillingham, Mark A. F., Montero, B. Karina, Wilhelm, Kerstin, Grudzus, Kara, Sommer, Simone and Santos, Pablo S. C.
<p>Genotyping novel complex multigene systems is particularly challenging in non-model organisms. Target primers frequently amplify multiple loci simultaneously, leading to high rates of PCR and sequencing artefacts such as chimeras and allele amplification bias. Most next-generation sequencing genotyping pipelines have been validated in non-model systems, in which the true genotype is unknown and the generation of artefacts may be highly repeatable. Further hindering accurate genotyping, the relationship between artefacts and copy number variation (CNV) within a PCR remains poorly described. Here we investigate this relationship by experimentally combining multiple known major histocompatibility complex (MHC) haplotypes of a model organism (chicken, Gallus gallus; 43 artificial genotypes with 2-13 alleles per amplicon). In addition to well-defined “optimal” primers, we simulated a non-model species situation by designing “naive” primers using sequence data from closely related Galliform species. We applied a novel open-source genotyping pipeline (ACACIA) to the data and compared its performance with that of another, previously published pipeline. ACACIA yielded very high allele calling accuracy (>98%). Non-chimeric artefacts increased linearly with increasing CNV, but chimeric artefacts levelled off when more than 4-6 alleles were amplified. As expected, we found heterogeneous amplification efficiency among allelic variants when co-amplifying multiple loci. Using our validated ACACIA pipeline and the example data of this study, we discuss in detail the pitfalls researchers should avoid in order to genotype complex multigene systems reliably. ACACIA and the datasets used in this study are publicly available at GitLab and FigShare (https://gitlab.com/psc_santos/ACACIA and https://figshare.com/projects/ACACIA/66485).</p>
open-source genotyping pipeline, ACACIA, next-generation sequencing, amplicon genotyping, allele dropout, PCR amplification bias, sequencing bias, multigene family, MHC
Bioinformatics & Computational Biology, Evolutionary Ecology, Genome Evolution, Molecular Evolution
Helena Westerdahl, Sebastian Ernesto Ramos-Onsins, Paul J. McMurdie, Arnaud Estoup, Vincent Segura, Jacek Radwan, Torbjørn Rognes, William Stutz, Kevin Vanneste, Thomas Bigot, Jill A. Hollenbach, Wieslaw Babik, Marie-Christine Le Paslier, A. Murat Eren, Miguel Alcaide, Morgan G. I. Langille, Melissa Gymrek