Recommendation

Combining molecular information on chromatin organisation with eQTLs and evolutionary conservation provides strong candidates for the evolution of gene regulation in mammalian brains

Marc Robinson-Rechavi based on reviews by Marc Robinson-Rechavi and Charles Danko

A recommendation of:

Evolutionary analysis of candidate non-coding elements regulating neurodevelopmental genes in vertebrates

Francisco J. Novo (2017), bioRxiv, 150482, ver. 4 peer-reviewed and recommended by Peer Community in Evolutionary Biology https://doi.org/10.1101/150482

Read preprint in preprint server. Now published in Peer Community Journal.

Abstract

Evolutionary analysis of candidate non-coding elements regulating neurodevelopmental genes in vertebrates

Many non-coding regulatory elements conserved in vertebrates regulate the expression of genes involved in development and play an important role in the evolution of morphology through the rewiring of developmental gene networks. Available biological datasets allow the identification of non-coding regulatory elements with high confidence; furthermore, chromatin conformation data can be used to confirm enhancer-promoter interactions in specific tissue types and developmental stages. We have devised an analysis pipeline that integrates datasets about gene expression, enhancer activity, chromatin accessibility, epigenetic marks, and Hi-C contact frequencies in various brain tissues and developmental stages, leading to the identification of eight non-coding elements that might regulate the expression of three genes with important roles in brain development in vertebrates. We have then performed comparative sequence and microsynteny analyses in order to reconstruct the evolutionary history of the regulatory landscape around these genes; we observe a general pattern of ancient regulatory elements conserved across most vertebrate lineages, together with younger elements that appear to be mammal and primate innovations.

evolution of developmental enhancers; vertebrate brain evolution;

Submission: posted 29 June 2017
Recommendation: posted 06 October 2017, validated 06 October 2017

Cite this recommendation as:
Robinson-Rechavi, M. (2017) Combining molecular information on chromatin organisation with eQTLs and evolutionary conservation provides strong candidates for the evolution of gene regulation in mammalian brains. Peer Community in Evolutionary Biology, 100035. https://doi.org/10.24072/pci.evolbiol.100035

Recommendation

In this manuscript [1], Francisco J. Novo proposes candidate non-coding genomic elements regulating neurodevelopmental genes.

What is very nice about this study is the way in which public molecular data, including physical interaction data, are used to leverage recent advances in our understanding of the molecular mechanisms of gene regulation in an evolutionary context. More specifically, evolutionarily conserved non-coding sequences are combined with enhancers from the FANTOM5 project, DNase hypersensitive sites, chromatin segmentation, ChIP-seq of transcription factors and of p300, gene expression and eQTLs from GTEx, and physical interactions from several Hi-C datasets. The candidate regulatory regions thus identified are linked to candidate regulated genes, and the author shows their potential implication in brain development.

While the results are focused on a small number of genes, this allows features of these candidates to be verified in great detail. This study shows how functional genomics is increasingly allowing us to fulfill the promises of Evo-Devo: understanding the molecular mechanisms of conservation and differences in morphology.

References

[1] Novo, FJ. 2017. Evolutionary analysis of candidate non-coding elements regulating neurodevelopmental genes in vertebrates. bioRxiv, 150482, ver. 4 of Sept 29th, 2017. doi: 10.1101/150482

Conflict of interest:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article. The authors declared that they comply with the PCI rule of having no financial conflicts of interest in relation to the content of the article.

Reviews

Reviewed by Charles Danko, 22 Sep 2017

Francisco Novo appears to have made changes in his manuscript that address comments raised during my first review. I am happy to recommend his manuscript, which I believe will be of interest to reviewers in the field.

https://doi.org/10.24072/pci.evolbiol.100035.rev21

Reviewed by Marc Robinson-Rechavi, 28 Sep 2017

The revised manuscript has taken all remarks into account. Notably, the revised title, abstract and discussion are much clearer and reflect better the results.

https://doi.org/10.24072/pci.evolbiol.100035.rev22

Evaluation round #1

DOI or URL of the preprint: 10.1101/150482

Version of the preprint: 2

Author's Reply, 31 Aug 2017

Dear Dr. Robinson-Rechavi,

I would like to thank you and Dr. Danko for your comments and suggestions in the reviews of this manuscript. In the following paragraphs I answer those points and explain how I have incorporated them in the new version of the manuscript that has been uploaded to bioRxiv. I hope this new version can be considered suitable for recommendation by PCI Evol Biol.

As a general remark about the extent and goals of this work, it is important to understand that it represents an attempt to gain functional knowledge about conserved putative neurodevelopmental regulatory elements by accruing information from a variety of computational and experimental datasets. This strategy ranks non-coding elements according to the likelihood that they behave as regulatory elements of particular gene(s) in specific tissues. It is only partially true that the selection of these regions was “largely based” on the frequency of HiC contacts: without additional data supporting the functionality of a region in a tissue (histone marks, chromatin accessibility, TF binding, functional variants in the vicinity, etc), they would not be selected. The trickiest point is the assignment of an enhancer to one of the genes in its vicinity in specific tissues, and HiC contact frequency was chiefly used to make such assignments; as Dr. Danko rightly points out, this rationale does not always hold, but in the absence of other data it remains the best way to infer enhancer-gene pairs. In fact, Fulco et al. found that quantitative measures of chromatin state and chromosome conformation are strongly predictive of enhancer functionality, correctly ranking 6 out of 7 distal MYC enhancers in their study. I have mentioned this in the revised version of the manuscript (both in Results and Discussion). Out of potential dozens of regions that show the required features, I have settled only on those which can be assigned a potential functional role with high confidence. This is the reason why only 8 regions have been selected for evolutionary analysis; this low number might look disappointing, but on the other hand it ensures that those regions are very strong candidates. 
As I mention in the Discussion, validation of these predictions will require complex functional studies to show that the deletion or mutation of these sequences changes the expression of their putative target genes in specific brain areas and developmental stages and, furthermore, alters neurodevelopmental pathways. Such studies should ideally be performed in model animals representative of several vertebrate lineages. This is a huge task that will take years to complete, so any effort at prioritizing the regions/genes to analyse could be of great help. The aim of this work was to characterize a set of regions with a high likelihood of behaving as enhancers of neurodevelopmental genes in vertebrates, so that other researchers who have the technology to validate them can do so with minimal waste of time and resources. I believe that this goal has been accomplished as far as available datasets permit.
Dr. Robinson-Rechavi rightly points out that I depict evolution as an anagenetic process. I took it for granted that potential readers would understand that when I refer to “earlier than lamprey”, for instance, I am referring to ancestral species living before the lineage leading to present-day lampreys split from the vertebrate tree. However, I understand that such language might lead to confusion and have corrected the manuscript accordingly.
Dr. Robinson-Rechavi suggests that blastn might miss some distant orthologs. Since we are dealing with non-coding regions, there is no obvious alternative to this approach. I have used an E-value cut-off of 10^-6 for blastn, which is the standard procedure. I have now mentioned in Methods that this might miss some very divergent orthologous sequences.
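
To illustrate the E-value cut-off mentioned above, here is a minimal Python sketch that filters blastn hits at 10^-6. It assumes the hits were saved in the standard BLAST+ tabular layout (`-outfmt 6`), in which the E-value is the 11th tab-separated column. The function name `filter_hits` and the two sample rows are hypothetical and are not taken from the manuscript; this is not the author's actual pipeline.

```python
def filter_hits(tabular_lines, max_evalue=1e-6):
    """Keep (query, subject, evalue) for hits at or below max_evalue.

    Assumes BLAST+ tabular output (-outfmt 6) with default columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    hits = []
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        evalue = float(fields[10])  # 11th column holds the E-value
        if evalue <= max_evalue:
            hits.append((fields[0], fields[1], evalue))
    return hits


# Hypothetical rows, invented for illustration only.
example_rows = [
    "BRE1\tscaffold_A\t78.5\t200\t40\t3\t1\t200\t5000\t5199\t2e-30\t130",
    "BRE1\tscaffold_B\t70.0\t60\t18\t0\t10\t69\t300\t359\t0.004\t35",
]
print(filter_hits(example_rows))
```

With this cut-off the second, weaker hit is discarded, which is the situation in which a very divergent ortholog could go undetected.
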
Dr. Robinson-Rechavi is surprised by the omission of references to teleost-specific whole-genome duplication. Although I have mentioned in several places that teleosts show additional copies of some BREs, I decided not to go into that issue because it did not affect the main results and conclusions of the work and it could distract readers from the main message. However, I have now added a few lines in Results (BRE1) explaining that the extra copy of that region seen in zebrafish, fugu, tetraodon and stickleback is in keeping with the fact that teleost fish genomes have undergone one additional whole-genome duplication (on top of the two WGD common to all other vertebrates), and added a recent reference on the subject.
As for other remarks by Dr. Robinson-Rechavi, the chance of finding three random genes in a particular order and orientation is 1 in 48 (0.021). In the case mentioned, in fact, there are four genes (including RBMS1), so the random probability of this specific arrangement is 1/384 (0.003). I have added this to the manuscript. I have also rephrased the allusion to the classical view of promoter-enhancer interactions to make it sound less aggressive.
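
One counting argument that reproduces these figures is to assume that n genes can appear in any of n! orders and that each gene independently lies on either strand (2^n orientation combinations), all equally likely, so that one specific arrangement has probability 1/(n! × 2^n). The sketch below works this out. The null model and the function name `arrangement_probability` are assumptions made for illustration; the reply does not state how the numbers were derived.

```python
from fractions import Fraction
from math import factorial


def arrangement_probability(n_genes: int) -> Fraction:
    """Probability of one specific order and orientation of n genes.

    Assumed null model: all n! gene orders and all 2**n strand
    combinations are equally likely.
    """
    return Fraction(1, factorial(n_genes) * 2 ** n_genes)


for n in (3, 4):
    p = arrangement_probability(n)
    print(n, p, round(float(p), 3))
```

For three genes this gives 1/(6 × 8) = 1/48, and for four genes 1/(24 × 16) = 1/384.
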
Dr. Danko raises an interesting point about the background signal of virtual 4C plots and the fact that nearby regions tend to show high contact frequencies. I had already taken this into account when selecting putative enhancer-gene associations, giving more weight to distant peaks than to nearby peaks and doing “reverse” 4C plots (fixing the anchoring point either on the enhancer or the promoter to see if the interaction is seen in both cases; compare Figure 2 and supplementary Figure S1, for instance). I have made this clearer in the revised version (in Methods).
Dr. Danko asks whether the ncRNAs overlapping some of the BREs might represent enhancer RNAs (eRNAs). Most active enhancers are known to produce bidirectional short eRNAs, but they are unlikely to be identical to the ncRNAs annotated in GENCODE since these are usually longer and undergo splicing (which is not a feature of eRNAs; see https://doi.org/10.1146/annurev-genet-110711-155459 for a recent review on the subject). Other studies have suggested that some enhancers act as promoters of ncRNAs (or vice versa), but this is a complex issue that is still unresolved, and I did not want to delve too deeply into it so as not to distract readers from the main message.
As mentioned in #1 above, the functional validation of these conserved elements will be difficult and time-consuming, especially if it is going to be done across most vertebrate lineages. I have tried to gather all published functional information about these enhancers in other species (mostly mouse, chicken and zebrafish), but very little is known about the genes they regulate in specific brain regions during development. Following Dr. Danko’s recommendation, I have changed the title and toned down the claims of functional causality.

I would like to thank you again for taking the time to read this manuscript critically. I am sure that it has improved substantially following your comments and recommendations.

Best regards,

https://doi.org/10.24072/pci.evolbiol.100087.ar1

Decision by Marc Robinson-Rechavi, posted 02 Aug 2017

Dear Francisco Novo,

Thank you for submitting your manuscript for recommendation at PCI Evol Biol. We are aware that this is a very new concept, and we appreciate that you are giving it a chance. The process is also new for us, so please do not hesitate to give us feedback; our common aim must be to make the best possible science available.

As you will see, the expert reviewer I invited and I both found your approach interesting, but we also found problems with your interpretation of the data. Thus I am not proposing a public recommendation of your manuscript at present. But I hope that the two reviews will help you to improve the work and the manuscript, and then either to re-request a recommendation at PCI Evol Biol or to submit an improved manuscript directly to a classical journal.

Best regards,
Marc

https://doi.org/10.24072/pci.evolbiol.100087.d1

Reviewed by Marc Robinson-Rechavi, 02 Aug 2017

In this manuscript, FJ Novo used genome-wide "epigenetic" marks (histone modifications, DNA methylation, chromatin accessibility, transcription factor binding) with chromatin contacts and gene expression data, to detect putative regulatory elements in the human brain. The evolution of these elements was then studied by comparative genomics.

I am very sympathetic to the aims of this paper, and the starting point of integrating functional genomics in one species with comparative genomics is sound. But I was disappointed both by the results and by the writing. I recommend to the author

I was disappointed that all the functional genomics integration led to the study of only 3 genes. Moreover, while correlative evidence is sufficient to discuss large-scale patterns, I expect stronger evidence than that presented on page 8 to specifically infer the function of a regulatory element, especially given the "manual inspection" step, which means that the analysis cannot be reproduced and is inherently subjective. On page 10, the link with educational attainment is interesting, but it should be noted that such complex phenotypes, like size or life expectancy, can be affected by an extremely high number of pathways. Thus this does not necessarily imply a role in the brain, in itself.

The manuscript systematically represents evolution as a progress from "lamprey or earlier species" to fishes, to "chicken onwards", which is erroneous. These are all present day species, which have evolved for the same time. We do not have evidence of functional genomics of the ancestral "earlier" species. It is possible and interesting to infer some of their characteristics from comparative data in a phylogenomic framework, but that is not done here.

"BRE1 is a vertebrate innovation appearing in Gnathostomes": since homology was determined by Blastn, it is possible that other species have an ortholog, but which is too divergent for detection. For protein sequences, it is not unusual that Blastp fails to detect true orthologs, which are detected by psi-Blast.

"We observed that coelacanth, spotted gar and elephant shark have orthologs for TANK, PSDM14 and TBR1 in the same order and orientation than mammals": how does this compare to an expectation from 3 random genes?

It is surprising that the manuscript discusses a duplication in teleost fishes (pp 11-12) without mentioning the teleost fish genome duplication, and the enrichment in transcription factors and in brain-expressed genes among the retained genes.

"The classical and largely outdated view of promoter-enhancer interactions suggested that a regulatory element would most likely regulate the activity of the closest gene": reference needed, or you risk attacking a straw man.

https://doi.org/10.24072/pci.evolbiol.100087.rev11

Reviewed by Charles Danko, 30 Jul 2017

The manuscript by Francisco Novo, Identification and evolutionary analysis of eight non-coding genomic elements regulating neurodevelopmental genes, describes a detailed evolutionary analysis of candidate non-coding regulatory elements. Eight regulatory elements were selected based on their proximity to three genes – TBR1, EMX2, and LMO4 – which encode transcription factors likely to play roles in nervous system development. The bulk of the study describes an analysis of publicly available genomic data to identify the location of regulatory elements, combined with an effort to characterize the evolutionary origin of these candidate enhancers using a number of sequence based analyses. Overall this study is well done and will be of substantial interest to researchers in the field.

Comments:

(1) The candidate enhancers selected for detailed analysis were largely chosen based on the frequency of contacts in Hi-C data collected from human fetal brain. Novo makes the assumption that these regulatory elements, which bear the marks associated with enhancers and form loop interactions with the target genes of interest, regulate the transcription of these genes. Although there is mounting evidence to support the notion that these enhancers are more likely to regulate expression of the candidate genes (see especially Fulco CP et al. (2016) Science, PMID# 27708057), there are undoubtedly exceptions to this assumption and no direct functional validation is available for most of the regulatory elements in the present study. The manuscript would benefit from toning down the language that implies a causal relationship between candidate enhancers and the genes of interest (including in the title). I also think that some discussion of the limitations of Hi-C data for this task, mostly noting that it is not a direct functional validation of enhancer activity, would be useful.

(2) Special care should be taken when interpreting contact frequencies in the Virtual 4C plots that are near the anchor points (shown in Figures 2, 4 and Supplementary Fig. S6), especially near EMX2 (Fig. 4). Hi-C data, and indeed all chromosome conformation capture data, have a high signal in nearby regions that lie along the “diagonal” of a Hi-C heatmap. Many authors interpret this as “background”. The Y-axis reads “Hi-C read value”, which I take to mean the un-normalized contact frequencies between two loci. It would be useful for readers to make it clear whether normalization was applied to correct for the decay as a function of distance that is commonly found in Hi-C contact frequencies. In either case, it is possible that these contacts are biologically relevant, but this limitation should be considered carefully, and noted in the text, when interpreting the biological function of these putative loop interactions.

(3) Many enhancers in mammals recruit RNA polymerase II, which transcribes short, unstable non-coding RNAs (Kim et al. (2010) Nature, PMID# 20393465). Could the poorly characterized non-coding RNAs overlapping several of the BREs reflect enhancer-templated RNAs transcribed from the enhancer itself?

(4) The author tracks the evolutionary origin of DNA sequences that are identified as candidate enhancers using experiments in either human or mouse. Many enhancers that have an orthologous sequence in the genome of another species are not conserved at the functional level (see a variety of work by Duncan Odom’s lab, as well as others). While Novo is careful throughout the manuscript not to imply that DNA sequence conservation reflects functional conservation, adding an explicit note to the text that there is a major disconnect between conservation at these two levels would be useful for readers.

In addition, several of the enhancers described herein are conserved at the DNA sequence level in both human and mouse. In these cases a direct comparison between publicly available data in human and mouse may help to sort this out.

(5) Fig. 4 would be easier to read if the position of BREs near EMX2 were included.
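
The distance-decay concern raised in comment (2) can be illustrated with a small Python sketch of observed/expected normalization, one common way to correct Hi-C contact frequencies for genomic distance: each contact is divided by the mean contact frequency of all bin pairs separated by the same distance. The toy matrix, the bin indices and the function names are hypothetical; the manuscript's figures may have been produced with a different procedure or with no such correction.

```python
def expected_by_distance(matrix):
    """Mean contact frequency for each bin separation d (the d-th diagonal)."""
    n = len(matrix)
    expected = []
    for d in range(n):
        values = [matrix[i][i + d] for i in range(n - d)]
        expected.append(sum(values) / len(values))
    return expected


def observed_over_expected(matrix):
    """Divide every contact by the mean contact at the same separation."""
    expected = expected_by_distance(matrix)
    n = len(matrix)
    return [
        [
            matrix[i][j] / expected[abs(i - j)] if expected[abs(i - j)] > 0 else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]


def virtual_4c(matrix, anchor_bin):
    """Contact profile seen from one anchor bin (a single row of the matrix)."""
    return list(matrix[anchor_bin])


# Hypothetical symmetric contact matrix for five consecutive bins.
contacts = [
    [10, 6, 3, 4, 1],
    [6, 10, 6, 3, 1],
    [3, 6, 10, 6, 3],
    [4, 3, 6, 10, 6],
    [1, 1, 3, 6, 10],
]

raw_profile = virtual_4c(contacts, 0)
normalized_profile = virtual_4c(observed_over_expected(contacts), 0)
print(raw_profile)
print(normalized_profile)
```

In the raw profile anchored at bin 0, the adjacent bin (6) outscores the distant bin 3 (4). After normalization the adjacent bin sits at the expected value of 1.0 while bin 3 stands at 1.6, which is the kind of distance-corrected peak the reviewer asks to be distinguished from near-diagonal background.
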

https://doi.org/10.24072/pci.evolbiol.100087.rev12