716

mtDNA "Nomenclutter" and its Consequences on the Interpretation of Genetic Data
Vladimir Bajić, Vanessa Hava Schulmann, Katja Nowick
2024
<p style="text-align: justify;">Population-based studies of human mitochondrial genetic diversity often require the classification of mitochondrial DNA (mtDNA) haplotypes into more than 5400 described haplogroups, and the further grouping of those into hierarchically higher haplogroups. Such secondary haplogroup groupings (e.g., “macro-haplogroups”) vary across studies, as they depend on sample quality, technical factors of haplogroup calling, the aims of the study, and the researchers' understanding of the mtDNA haplogroup nomenclature. Retention of historical nomenclature, coupled with a growing number of newly described mtDNA lineages, results in an increasingly complex and inconsistent nomenclature that does not reflect phylogeny well. This “clutter” leaves room for grouping errors and inconsistencies across scientific publications, especially when haplogroup names are used as a proxy for secondary groupings, and is a source of scientific misinterpretation.</p>
<p style="text-align: justify;">Here we explore the effects of phylogenetically insensitive secondary mtDNA haplogroup groupings, and of the lack of standardized secondary haplogroup groupings, on downstream analyses and the interpretation of genetic data. We demonstrate that frequency-based analyses produce inconsistent results when different secondary mtDNA groupings are applied, and thus allow for vastly different interpretations of the same genetic data. The lack of guidelines and recommendations on how to choose appropriate secondary haplogroup groupings presents an issue for the interpretation of results, as well as for their comparison and reproducibility across studies.</p>
<p style="text-align: justify;">To reduce biases originating from arbitrarily defined secondary nomenclature-based groupings, we suggest that future updates of mtDNA phylogenies intended for use in mtDNA haplogroup nomenclature should also provide well-defined and standardized sets of phylogenetically meaningful, algorithm-based secondary haplogroup groupings such as “macro-haplogroups”, “meso-haplogroups”, and “micro-haplogroups”. Ideally, each of the secondary haplogroup grouping levels should be informative about different events in human population history. Those phylogenetically informative levels of haplogroup groupings can be easily defined using TreeCluster, and then implemented in haplogroup callers such as HaploGrep3. This would foster reproducibility across studies, provide a grouping standard for population-based studies, and reduce errors associated with haplogroup nomenclatures in future studies.</p>
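The claim that frequency-based results depend on the chosen secondary grouping can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the haplogroup calls are toy labels (not data from the preprint), `by_first_letter` stands in for a purely name-based grouping, and `CLADE_PREFIXES` is a simplified, hypothetical lookup table rather than a grouping produced by TreeCluster or HaploGrep3.

```python
from collections import Counter

# Hypothetical haplogroup calls for one toy sample set (not data from the preprint).
calls = ["H1a", "HV1", "V7", "U5a", "L3e", "L0d", "M7b", "N1a"]


def by_first_letter(hg):
    """Name-based grouping: use only the first character of the label."""
    return hg[0]


# Illustrative clade lookup; the assignments are a simplified assumption.
CLADE_PREFIXES = {
    "HV": "R", "H": "R", "V": "R", "U": "R",
    "N": "N", "M": "M", "L3": "L3", "L0": "L0",
}


def by_clade(hg):
    """Lookup-based grouping: the longest matching prefix decides the group."""
    for prefix in sorted(CLADE_PREFIXES, key=len, reverse=True):
        if hg.startswith(prefix):
            return CLADE_PREFIXES[prefix]
    return "other"


def frequencies(labels, grouping):
    """Relative frequency of each secondary group under a grouping function."""
    counts = Counter(grouping(hg) for hg in labels)
    total = len(labels)
    return {group: n / total for group, n in counts.items()}


name_based = frequencies(calls, by_first_letter)
clade_based = frequencies(calls, by_clade)

print("name-based: ", name_based)
print("clade-based:", clade_based)
```

In this toy case the name-based grouping files `HV1` under "H" and lumps `L0d` and `L3e` into a single "L" group, while the lookup-based grouping keeps `L0` and `L3` apart and places `H`, `HV`, `V`, and `U` in one group, so the same eight calls yield two different frequency tables.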
Data: https://doi.org/10.5281/zenodo.10156923
Scripts: https://doi.org/10.5281/zenodo.10156923
Code: https://doi.org/10.5281/zenodo.10156923
Mitochondria, mtDNA, mtDNA nomenclature, haplogroup, macro-haplogroup, meso-haplogroup, micro-haplogroup, classification, grouping, PhyloTree, HaploGrep
None
Bioinformatics & Computational Biology, Human Evolution, Other, Phylogenetics / Phylogenomics, Phylogeography & Biogeography, Population Genetics / Genomics
Hansi Weissensteiner (hansi.weissensteiner@i-med.ac.at), Antonio Torroni (torroni@ipvgen.unipv.it), Alessandro Achilli (alessandro.achilli@unipv.it), Anna Olivieri (anna.olivieri@unipv.it), Maria Pala (M.Pala@hud.ac.uk), Sebastian Schönherr (sebastian.schoenherr@i-med.ac.at), Francesca Gandini (francesca.gandini01@universitadipavia.it), Carina M. Schlebusch (carina.schlebusch@ebc.uu.se), Gabriel Renaud (gabre@dtu.dk), Joshua Daniel Rubin (jdru@dtu.dk), Alberto Gómez-Carballa (Alberto.Gomez.Carballa@sergas.es; suggested by Hansi Weissensteiner), Nicola Rambaldi Migliore (nicola.rambaldi01@universitadipavia.it; suggested by Hansi Weissensteiner), Anna Juras (anna.juras@amu.edu.pl; suggested by Maciej Chyleński, maciej.ch@amu.edu.pl), Martina Lari (martina.lari@unifi.it; suggested by Stefania Vai, stefania.vai@unifi.it)
2023-11-20 11:16:36
Torsten Günther