GC content alone is associated with distinct functional classes of human enhancers.
Because enhancers can be located hundreds of kilobases away from their target genes, it can be challenging to accurately predict their functions. A new report in GENETICS uses sequence composition to distinguish two enhancer classes that have distinct functions and spatial organization in humans.
Enhancers are regulatory DNA sequences that aid in transcription initiation. In some ways, enhancers are like promoters, since both are bound by transcription factors as part of transcription initiation. Unlike promoters, which are located near the transcriptional start site of the genes they regulate, enhancers can lie far from their targets along the linear DNA sequence, typically coming into long-distance contact with gene promoters via 3D DNA looping. Since it is difficult to identify enhancers through sequence information alone, our understanding of them is less developed than our understanding of other DNA regulatory elements.
Lecellier, Wasserman, and Mathelier were interested in classifying enhancers based on their sequences. The percentage of a given sequence that is guanine and cytosine (the GC content or %GC) can be used to classify promoters, so they investigated whether a similar approach could be useful for enhancer classification. To perform this analysis, they took advantage of the FANTOM5 project, which recently cataloged tens of thousands of enhancers across the human genome.
The enhancers were divided into two simple groups: those with higher %GC and those with lower %GC than the median overall. The authors compared the properties of the two groups, finding that different transcription factors were predicted to be associated with each group. Each group was also associated with different DNA shapes (e.g. bending) and distinct localization in chromatin loops, suggesting that the enhancer sequence composition is linked to the 3D architecture of the chromatin.
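The median split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function names, the toy sequences, and the choice to ignore ambiguous bases and to leave out enhancers whose %GC equals the median are all hypothetical.

```python
from statistics import median


def gc_percent(seq: str) -> float:
    """Return the percentage of G and C among the A/C/G/T bases of seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / total


def split_by_median_gc(enhancers: dict[str, str]) -> tuple[float, list[str], list[str]]:
    """Split enhancers into higher-%GC and lower-%GC groups around the median.

    Assumption: enhancers exactly at the median are placed in neither group.
    """
    gc = {name: gc_percent(seq) for name, seq in enhancers.items()}
    cutoff = median(gc.values())
    higher = sorted(name for name, value in gc.items() if value > cutoff)
    lower = sorted(name for name, value in gc.items() if value < cutoff)
    return cutoff, higher, lower


# Toy sequences invented for illustration; real enhancers are far longer.
toy = {
    "e1": "GCGCGCAT",  # 75 %GC
    "e2": "ATATATGC",  # 25 %GC
    "e3": "GGCCATAT",  # 50 %GC
    "e4": "GCGGCCGC",  # 100 %GC
}
cutoff, higher, lower = split_by_median_gc(toy)
print(cutoff, higher, lower)
```

With real data, the sequences would come from the enhancer catalog rather than a hand-written dictionary, and each group would then be compared for predicted transcription factor binding, DNA shape, and position in chromatin loops.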
The authors then examined whether the two groups of enhancers had distinct biological functions. By consolidating previous reports, they compiled lists of thousands of genes predicted to be targets of each class of enhancer, and they analyzed these genes as proxies for the biological functions of the enhancers across different cell and tissue types. They found that enhancers with a higher %GC were associated with ubiquitous gene expression, whereas enhancers with a lower %GC were associated with specific patterns of expression in particular subsets of cells.
In particular, lower %GC enhancers were linked to immune response genes. To test this association experimentally, the authors used data obtained from dendritic cells infected with Mycobacterium tuberculosis. These data tracked changes in chromatin accessibility, which can be mediated by enhancer activity. They found that lower %GC enhancers were significantly more activated in infected cells, providing experimental support for their observations.
CITATION:
Human enhancers harboring specific sequence composition, activity, and genome organization are linked to the immune response