NAR Top Articles - Computational Biology

Computational Biology

March 2015


Predicting enhancer transcription and activity from chromatin modifications
Zhu, Y; Sun, L; Chen, Z; Whitaker, JW; Wang, T; Wang, W
Nucleic Acids Res. 2013, 41, 10032-10043
Free Full Text
Enhancers play a pivotal role in regulating the transcription of distal genes. Although certain chromatin features, such as the histone acetyltransferase P300 and the histone modification H3K4me1, indicate the presence of enhancers, only a fraction of enhancers are functionally active. Individual chromatin marks, such as H3K27ac and H3K27me3, have been identified to distinguish active from inactive enhancers. However, the systematic identification of the most informative single modification, or combination thereof, is still lacking. Furthermore, the discovery of enhancer RNAs (eRNAs) provides an alternative approach to directly predicting enhancer activity. However, it remains challenging to link chromatin modifications to eRNA transcription. Herein, we develop a logistic regression model to unravel the relationship between chromatin modifications and eRNA synthesis. We perform a systematic assessment of 24 chromatin modifications in fetal lung fibroblasts and demonstrate that a combination of four modifications is sufficient to accurately predict eRNA transcription. Furthermore, we compare the ability of eRNAs and H3K27ac to discriminate enhancer activity...
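
As a rough illustration of the kind of model this abstract describes, the following is a minimal sketch of logistic regression fitted by gradient descent. It is not the authors' implementation: the four input features stand in for four unnamed chromatin-modification signals, the data are synthetic, and every name (`fit_logistic`, `predict_proba`, `true_w`) is hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for one enhancer: predicted probability minus label
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Probability that an enhancer with signal vector x is transcribed."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Synthetic toy data (assumption): 4 hypothetical mark signals per enhancer,
# with a binary label meaning "eRNA detected" drawn from a known model.
rng = random.Random(0)
true_w = [1.5, -1.0, 0.8, 0.0]
X, y = [], []
for _ in range(200):
    x = [rng.gauss(0, 1) for _ in range(4)]
    p = sigmoid(sum(a * c for a, c in zip(true_w, x)))
    X.append(x)
    y.append(1 if rng.random() < p else 0)

w, b = fit_logistic(X, y)
accuracy = sum(
    (predict_proba(w, b, x) >= 0.5) == (yi == 1) for x, yi in zip(X, y)
) / len(y)
print("weights:", [round(v, 2) for v in w], "training accuracy:", round(accuracy, 2))
```

In a real analysis the rows of `X` would be measured modification signals at candidate enhancers and performance would be assessed on held-out data; the sketch only shows how such a model maps several marks to a single probability.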

A high-resolution network model for global gene regulation in Mycobacterium tuberculosis
Peterson, EJR; Reiss, DJ; Turkarslan, S; Minch, KJ; Rustad, T; Plaisier, CL; Longabaugh, WJR; Sherman, DR; Baliga, NS
Nucleic Acids Res. 2014, 42, 11291-11303
Free Full Text
The resilience of Mycobacterium tuberculosis (MTB) is largely due to its ability to effectively counteract and even take advantage of the hostile environments of a host. In order to accelerate the discovery and characterization of these adaptive mechanisms, we have mined a compendium of 2325 publicly available transcriptome profiles of MTB to decipher a predictive, systems-scale gene regulatory network model. The resulting modular organization of 98% of all MTB genes within this regulatory network was rigorously tested using two independently generated datasets: a genome-wide map of 7248 DNA-binding locations for 143 transcription factors (TFs) and global transcriptional consequences of over-expressing 206 TFs. This analysis has discovered specific TFs that mediate conditional co-regulation of genes within 240 modules across 14 distinct environmental contexts. In addition to recapitulating previously characterized regulons, we discovered 454 novel mechanisms for gene regulation during stress, cholesterol utilization and dormancy...

Enriching the gene set analysis of genome-wide data by incorporating directionality of gene expression and combining statistical hypotheses and methods
Väremo, L; Nielsen, J; Nookaew, I
Nucleic Acids Res. 2013, 41, 4378-4391
Free Full Text
Gene set analysis (GSA) is used to elucidate genome-wide data, in particular transcriptome data. A multitude of methods have been proposed for this step of the analysis, and many of them have been compared and evaluated. Unfortunately, there is no consolidated opinion regarding what methods should be preferred, and the variety of available GSA software and implementations poses a difficulty for the end-user who wants to try out different methods. To address this, we have developed the R package Piano that collects a range of GSA methods into the same system, for the benefit of the end-user. Furthermore, we refine the GSA workflow by using modifications of the gene-level statistics. This enables us to divide the resulting gene set P-values into three classes, describing different aspects of gene expression directionality at gene set level. We use our fully implemented workflow to investigate the impact of the individual components of GSA by using microarray and RNA-seq data. The results show that the evaluated methods are globally similar and the major separation correlates well with our defined directionality classes...

Interplay of microRNAs, transcription factors and target genes: linking dynamic expression changes to function
Nazarov, PV; Reinsbach, SE; Muller, A; Nicot, N; Philippidou, D; Vallar, L; Kreis, S
Nucleic Acids Res. 2013, 41, 2817-2831
Free Full Text
MicroRNAs (miRNAs) are ubiquitously expressed small non-coding RNAs that, in most cases, negatively regulate gene expression at the post-transcriptional level. miRNAs are involved in fine-tuning fundamental cellular processes such as proliferation, cell death and cell cycle control and are believed to confer robustness to biological responses. Here, we investigated simultaneously the transcriptional changes of miRNA and mRNA expression levels over time after activation of the Janus kinase/Signal transducer and activator of transcription (Jak/STAT) pathway by interferon-gamma stimulation of melanoma cells. To examine global miRNA and mRNA expression patterns, time-series microarray data were analysed. We observed delayed responses of miRNAs (after 24-48 h) with respect to mRNAs (12-24 h) and identified biological functions involved at each step of the cellular response. Inference of the upstream regulators allowed for identification of transcriptional regulators involved in cellular reactions to interferon-gamma stimulation...

Profiling the transcription factor regulatory networks of human cell types
Zhang, SH; Tian, DC; Tran, NH; Choi, KP; Zhang, LX
Nucleic Acids Res. 2014, 42, 12380-12387
Free Full Text
Neph et al. (2012) (Circuitry and dynamics of human transcription factor regulatory networks. Cell, 150: 1274-1286) reported the transcription factor (TF) regulatory networks of 41 human cell types using the DNaseI footprinting technique. This provides a valuable resource for uncovering regulation principles in different human cells. In this paper, the architectures of the 41 regulatory networks and the distributions of housekeeping and specific regulatory interactions are investigated. The TF regulatory networks of different human cell types demonstrate similar global three-layer (top, core and bottom) hierarchical architectures, which are greatly different from the yeast TF regulatory network. However, they have distinguishable local organizations, as suggested by the fact that wiring patterns of only a few TFs are enough to distinguish cell identities. The TF regulatory network of human embryonic stem cells (hESCs) is dense and enriched with interactions that are unseen in the networks of other cell types. The examination of specific regulatory interactions suggests that specific interactions play important roles in hESCs.

Mechanistic insight into ligand binding to G-quadruplex DNA
Di Leva, FS; Novellino, E; Cavalli, A; Parrinello, M; Limongelli, V
Nucleic Acids Res. 2014, 42, 5447-5455
Free Full Text
Specific guanine-rich regions in the human genome can form higher-order DNA structures called G-quadruplexes, which regulate many relevant biological processes. For instance, the formation of G-quadruplexes at telomeres can alter cellular functions, inducing apoptosis. Thus, developing small molecules that are able to bind and stabilize the telomeric G-quadruplexes represents an attractive strategy for antitumor therapy. An example is 3-(benzo[d]thiazol-2-yl)-7-hydroxy-8-((4-(2-hydroxyethyl)piperazin-1-yl)methyl)-2H-chromen-2-one (compound 1), recently identified as a potent ligand of the G-quadruplex [d(TGGGGT)]₄ with promising in vitro antitumor activity. The experimental observations are suggestive of a complex binding mechanism that, despite efforts, has defied full characterization. Here, we provide through metadynamics simulations a comprehensive understanding of the binding mechanism of 1 to the G-quadruplex [d(TGGGGT)]₄. In our calculations, the ligand explores all the available binding sites on the DNA structure and the free-energy landscape of the whole binding process is computed...

TEMP: a computational method for analyzing transposable element polymorphism in populations
Zhuang, JL; Wang, J; Theurkauf, W; Weng, ZP
Nucleic Acids Res. 2014, 42, 6826-6838
Free Full Text
Insertions and excisions of transposable elements (TEs) affect both the stability and variability of the genome. Studying the dynamics of transposition at the population level can provide crucial insights into the processes and mechanisms of genome evolution. Pooling genomic materials from multiple individuals followed by high-throughput sequencing is an efficient way of characterizing genomic polymorphisms in a population. Here we describe a novel method named TEMP, specifically designed to detect TE movements present with a wide range of frequencies in a population. By combining the information provided by paired-end reads and split reads, TEMP is able to identify both the presence and absence of TE insertions in genomic DNA sequences derived from heterogeneous samples, accurately estimate the frequencies of transposition events in the population, and pinpoint junctions of high-frequency transposition events at nucleotide resolution. Simulation data indicate that TEMP outperforms other algorithms such as PoPoolationTE, RetroSeq, VariationHunter and GASVPro. TEMP also performs well on whole-genome human data derived from the 1000 Genomes Project...

A novel reannotation strategy for dissecting DNA methylation patterns of human long intergenic non-coding RNAs in cancers
Zhi, H; Ning, SW; Li, X; Li, YY; Wu, W; Li, X
Nucleic Acids Res. 2014, 42, 8258-8270
Free Full Text
Despite growing consensus that long intergenic non-coding ribonucleic acids (lincRNAs) are modulators of cancer, the knowledge about the deoxyribonucleic acid (DNA) methylation patterns of lincRNAs in cancers remains limited. In this study, we constructed DNA methylation profiles for 4629 tumors and 705 normal tissue samples from 20 different types of human cancer by reannotating data of DNA methylation arrays. We found that lincRNAs had different promoter methylation patterns in cancers. We classified 2461 lincRNAs into two categories and three subcategories, according to their promoter methylation patterns in tumors. LincRNAs with resistant methylation patterns in tumors had conserved transcriptional regulation regions and were ubiquitously expressed across normal tissues. By integrating cancer subtype data and patient clinical information, we identified lincRNAs with promoter methylation patterns that were associated with cancer status, subtype or prognosis for several cancers. Network analysis of aberrantly methylated lincRNAs in cancers showed that lincRNAs with aberrant methylation patterns might be involved in cancer development and progression...

Molecular evolutionary and structural analysis of the cytosolic DNA sensor cGAS and STING
Wu, XM; Wu, FH; Wang, XQ; Wang, LL; Siedow, JN; Zhang, WG; Pei, ZM
Nucleic Acids Res. 2014, 42, 8243-8257
Free Full Text
Cyclic GMP-AMP (cGAMP) synthase (cGAS) was recently identified as a cytosolic DNA sensor and generates a non-canonical cGAMP that contains G(2',5')pA and A(3',5')pG phosphodiester linkages. cGAMP activates STING, which triggers innate immune responses in mammals. However, the evolutionary functions and origins of cGAS and STING remain largely elusive. Here, we carried out comprehensive evolutionary analyses of the cGAS-STING pathway. Phylogenetic analysis of cGAS and STING families showed that their origins could be traced back to the choanoflagellate Monosiga brevicollis. Modern cGAS and STING may have acquired structural features, including the zinc-ribbon domain and critical amino acid residues for DNA binding in cGAS as well as the carboxy-terminal tail domain for transducing signals in STING, only recently in vertebrates. In invertebrates, cGAS homologs may not act as DNA sensors. Both proteins cooperate extensively, have similar evolutionary characteristics, and thus may have co-evolved during metazoan evolution. cGAS homologs and a prokaryotic dinucleotide cyclase for canonical cGAMP share conserved secondary structures and catalytic residues...

Predicting DNA methylation level across human tissues
Ma, BS; Wilker, EH; Willis-Owen, SAG; Byun, HM; Wong, KCC; Motta, V; Baccarelli, AA; Schwartz, J; Cookson, WOCM; Khabbaz, K; Mittleman, MA; Moffatt, MF; Liang, LM
Nucleic Acids Res. 2014, 42, 3515-3528
Free Full Text
Differences in methylation across tissues are critical to cell differentiation and are key to understanding the role of epigenetics in complex diseases. In this investigation, we found that locus-specific methylation differences between tissues are highly consistent across individuals. We developed a novel statistical model to predict locus-specific methylation in target tissue based on methylation in surrogate tissue. The method was evaluated in publicly available data and in two studies using the latest Illumina BeadChips: a childhood asthma study with methylation measured in both peripheral blood leukocytes (PBL) and lymphoblastoid cell lines; and a study of postoperative atrial fibrillation with methylation in PBL, atrium and artery. We found that our method can greatly improve accuracy of cross-tissue prediction at CpG sites that are variable in the target tissue [R² increases from 0.38 (original R² between tissues) to 0.89 for PBL-to-artery prediction; from 0.39 to 0.95 for PBL-to-atrium; and from 0.81 to 0.98 for lymphoblastoid cell line-to-PBL based on cross-validation...
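
The accuracy figures quoted in this abstract are R² values. As a small sketch of how such a number can be computed, the function below returns the squared Pearson correlation between observed and predicted methylation levels. Treating R² as squared correlation is an assumption here, since the truncated abstract does not define it, and the function name and toy values are purely illustrative, not data from the study.

```python
def r_squared(observed, predicted):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("R^2 is undefined when one sequence is constant")
    return cov * cov / (var_o * var_p)


# Toy methylation levels (fractions between 0 and 1) at five hypothetical CpG sites
target_tissue = [0.10, 0.35, 0.50, 0.80, 0.95]
surrogate_tissue = [0.30, 0.20, 0.60, 0.55, 0.90]
model_prediction = [0.12, 0.30, 0.55, 0.78, 0.93]

print("surrogate vs target:", r_squared(target_tissue, surrogate_tissue))
print("prediction vs target:", r_squared(target_tissue, model_prediction))
```

Comparing the two printed values mirrors the comparison in the abstract: the raw agreement between surrogate and target tissue versus the agreement after applying a predictive model.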