NAR Top Articles - Computational Biology
A modular framework for gene set analysis integrating multilevel omics data
Sass, S; Buettner, F; Mueller, NS; Theis, FJ
Nucleic Acids Res. 2013, 41, 9622-9633
Free Full Text
Modern high-throughput methods allow the investigation of biological functions across multiple 'omics' levels. Levels include mRNA and protein expression profiling as well as additional knowledge on, for example, DNA methylation and microRNA regulation. The reason for this interest in multi-omics is that actual cellular responses to different conditions are best explained mechanistically when taking all omics levels into account. To map gene products to their biological functions, public ontologies like Gene Ontology are commonly used. Many methods have been developed to identify terms in an ontology, overrepresented within a set of genes. However, these methods are not able to appropriately deal with any combination of several data types. Here, we propose a new method to analyse integrated data across multiple omics-levels to simultaneously assess their biological meaning. We developed a model-based Bayesian method for inferring interpretable term probabilities in a modular framework. Our Multi-level ONtology Analysis (MONA) algorithm performed significantly better than conventional analyses of individual levels and yields best results even for sophisticated models including mRNA fine-tuning by microRNAs...
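For context, the conventional single-level analysis that the abstract contrasts with MONA is often a term over-representation test. The following is a minimal sketch of a one-sided hypergeometric test, not the MONA algorithm itself; the function name and the example counts are hypothetical and chosen only for illustration.

```python
from math import comb


def enrichment_p_value(population, annotated, selected, overlap):
    """One-sided hypergeometric tail P(X >= overlap).

    population: number of genes in the background
    annotated:  background genes annotated with the ontology term
    selected:   size of the gene set of interest
    overlap:    genes in the set that carry the term
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Illustrative counts only (not taken from the paper).
p = enrichment_p_value(population=10, annotated=4, selected=3, overlap=2)
print(f"p = {p:.4f}")
```

A test of this kind treats one data type at a time, which is the limitation the abstract says a multilevel model is meant to address.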
Interplay of microRNAs, transcription factors and target genes: linking dynamic expression changes to function
Nazarov, PV; Reinsbach, SE; Muller, A; Nicot, N; Philippidou, D; Vallar, L; Kreis, S
Nucleic Acids Res. 2013, 41, 2817-2831
Free Full Text
MicroRNAs (miRNAs) are ubiquitously expressed small non-coding RNAs that, in most cases, negatively regulate gene expression at the post-transcriptional level. miRNAs are involved in fine-tuning fundamental cellular processes such as proliferation, cell death and cell cycle control and are believed to confer robustness to biological responses. Here, we investigated simultaneously the transcriptional changes of miRNA and mRNA expression levels over time after activation of the Janus kinase/Signal transducer and activator of transcription (Jak/STAT) pathway by interferon-gamma stimulation of melanoma cells. To examine global miRNA and mRNA expression patterns, time-series microarray data were analysed. We observed delayed responses of miRNAs (after 24-48 h) with respect to mRNAs (12-24 h) and identified biological functions involved at each step of the cellular response. Inference of the upstream regulators allowed for identification of transcriptional regulators involved in cellular reactions to interferon-gamma stimulation...
Predicting enhancer transcription and activity from chromatin modifications
Zhu, Y; Sun, L; Chen, Z; Whitaker, JW; Wang, T; Wang, W
Nucleic Acids Res. 2013, 41, 10032-10043
Free Full Text
Enhancers play a pivotal role in regulating the transcription of distal genes. Although certain chromatin features, such as the histone acetyltransferase P300 and the histone modification H3K4me1, indicate the presence of enhancers, only a fraction of enhancers are functionally active. Individual chromatin marks, such as H3K27ac and H3K27me3, have been identified to distinguish active from inactive enhancers. However, the systematic identification of the most informative single modification, or combination thereof, is still lacking. Furthermore, the discovery of enhancer RNAs (eRNAs) provides an alternative approach to directly predicting enhancer activity. However, it remains challenging to link chromatin modifications to eRNA transcription. Herein, we develop a logistic regression model to unravel the relationship between chromatin modifications and eRNA synthesis. We perform a systematic assessment of 24 chromatin modifications in fetal lung fibroblast and demonstrate that a combination of four modifications is sufficient to accurately predict eRNA transcription. Furthermore, we compare the ability of eRNAs and H3K27ac to discriminate enhancer activity...
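The modelling idea in this abstract, a logistic regression from chromatin modification signals to a binary eRNA label, can be sketched as follows. This is a minimal illustration under stated assumptions and not the authors' implementation: the two features, the toy rows, the learning rate and the plain gradient-descent fit are all hypothetical.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, x):
    # Probability that an enhancer with mark vector x is transcribed.
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


def fit_logistic(rows, labels, lr=0.5, steps=2000):
    # Batch gradient descent on the mean logistic loss.
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(steps):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = predict_proba(weights, bias, x) - y
            for j, v in enumerate(x):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


# Toy data (hypothetical): each row is (mark_a, mark_b) presence at an
# enhancer; the label is 1 if eRNA is detected. Here mark_a alone decides.
rows = [(1, 0), (1, 1), (0, 1), (0, 0)]
labels = [1, 1, 0, 0]
weights, bias = fit_logistic(rows, labels)
print(predict_proba(weights, bias, (1, 0)), predict_proba(weights, bias, (0, 1)))
```

In a sketch like this, the fitted weights indicate which marks carry predictive signal, which is the sense in which a small combination of modifications can be called sufficient.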
Enriching the gene set analysis of genome-wide data by incorporating directionality of gene expression and combining statistical hypotheses and methods
Varemo, L; Nielsen, J; Nookaew, I
Nucleic Acids Res. 2013, 41, 4378-4391
Free Full Text
Gene set analysis (GSA) is used to elucidate genome-wide data, in particular transcriptome data. A multitude of methods have been proposed for this step of the analysis, and many of them have been compared and evaluated. Unfortunately, there is no consolidated opinion regarding what methods should be preferred, and the variety of available GSA software and implementations poses a difficulty for the end-user who wants to try out different methods. To address this, we have developed the R package Piano that collects a range of GSA methods into the same system, for the benefit of the end-user. Furthermore, we refine the GSA workflow by using modifications of the gene-level statistics. This enables us to divide the resulting gene set P-values into three classes, describing different aspects of gene expression directionality at gene set level. We use our fully implemented workflow to investigate the impact of the individual components of GSA by using microarray and RNA-seq data. The results show that the evaluated methods are globally similar and the major separation correlates well with our defined directionality classes...
TEMP: a computational method for analyzing transposable element polymorphism in populations
Zhuang, JL; Wang, J; Theurkauf, W; Weng, ZP
Nucleic Acids Res. 2014, 42, 6826-6838
Free Full Text
Insertions and excisions of transposable elements (TEs) affect both the stability and variability of the genome. Studying the dynamics of transposition at the population level can provide crucial insights into the processes and mechanisms of genome evolution. Pooling genomic materials from multiple individuals followed by high-throughput sequencing is an efficient way of characterizing genomic polymorphisms in a population. Here we describe a novel method named TEMP, specifically designed to detect TE movements present with a wide range of frequencies in a population. By combining the information provided by pair-end reads and split reads, TEMP is able to identify both the presence and absence of TE insertions in genomic DNA sequences derived from heterogeneous samples; accurately estimate the frequencies of transposition events in the population and pinpoint junctions of high frequency transposition events at nucleotide resolution. Simulation data indicate that TEMP outperforms other algorithms such as PoPoolationTE, RetroSeq, VariationHunter and GASVPro. TEMP also performs well on whole-genome human data derived from the 1000 Genomes Project...
Surprisingly extensive mixed phylogenetic and ecological signals among bacterial Operational Taxonomic Units
Koeppel, AF; Wu, M
Nucleic Acids Res. 2013, 41, 5175-5188
Free Full Text
The lack of a consensus bacterial species concept greatly hampers our ability to understand and organize bacterial diversity. Operational taxonomic units (OTUs), which are clustered on the basis of DNA sequence identity alone, are the most commonly used microbial diversity unit. Although it is understood that OTUs can be phylogenetically incoherent, the degree and the extent of the phylogenetic inconsistency have not been explicitly studied. Here, we tested the phylogenetic signal of OTUs in a broad range of bacterial genera from various phyla. Strikingly, we found that very few OTUs were monophyletic, and many showed evidence of multiple independent origins. Using previously established bacterial habitats as benchmarks, we showed that OTUs frequently spanned multiple ecological habitats. We demonstrated that ecological heterogeneity within OTUs is caused by their phylogenetic inconsistency, and not merely due to 'lumping' of taxa resulting from using relaxed identity cut-offs. We argue that ecotypes, as described by the Stable Ecotype Model, are phylogenetically and ecologically more consistent than OTUs and therefore could serve as an alternative unit for bacterial diversity studies...
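To make "clustered on the basis of DNA sequence identity alone" concrete, here is a minimal sketch of greedy centroid clustering on pre-aligned, equal-length sequences. It is not the procedure used in the paper; the function names, the toy sequences and the threshold are assumptions (0.97 is a common convention, and 0.9 is used below only so that the short toy sequences group).

```python
def identity(a, b):
    # Fraction of matching positions between two aligned sequences.
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs, threshold=0.97):
    # Assign each sequence to the first centroid it matches at or above
    # the threshold; otherwise it founds a new OTU. Order-dependent.
    centroids = []
    clusters = []
    for name, seq in seqs.items():
        for i, centroid in enumerate(centroids):
            if identity(seq, centroid) >= threshold:
                clusters[i].append(name)
                break
        else:
            centroids.append(seq)
            clusters.append([name])
    return clusters


# Hypothetical toy sequences, ten positions each.
toy = {
    "s1": "AAAAAAAAAA",
    "s2": "AAAAAAAAAT",
    "s3": "CCCCCCCCCC",
}
print(greedy_otus(toy, threshold=0.9))
```

Because membership depends only on identity to a centroid, nothing in such a procedure forces the members of an OTU to form a single clade, which is the kind of inconsistency the abstract examines.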
A high-resolution network model for global gene regulation in Mycobacterium tuberculosis
Peterson, EJR; Reiss, DJ; Turkarslan, S; Minch, KJ; Rustad, T; Plaisier, CL; Longabaugh, WJR; Sherman, DR; Baliga, NS
Nucleic Acids Res. 2014, 42, 11291-11303
Free Full Text
The resilience of Mycobacterium tuberculosis (MTB) is largely due to its ability to effectively counteract and even take advantage of the hostile environments of a host. In order to accelerate the discovery and characterization of these adaptive mechanisms, we have mined a compendium of 2325 publicly available transcriptome profiles of MTB to decipher a predictive, systems-scale gene regulatory network model. The resulting modular organization of 98% of all MTB genes within this regulatory network was rigorously tested using two independently generated datasets: a genome-wide map of 7248 DNA-binding locations for 143 transcription factors (TFs) and global transcriptional consequences of over-expressing 206 TFs. This analysis has discovered specific TFs that mediate conditional co-regulation of genes within 240 modules across 14 distinct environmental contexts. In addition to recapitulating previously characterized regulons, we discovered 454 novel mechanisms for gene regulation during stress, cholesterol utilization and dormancy...
Mapping overlapping functional elements embedded within the protein-coding regions of RNA viruses
Nucleic Acids Res. 2014, 42, 12425-12439
Free Full Text
Identification of the full complement of genes and other functional elements in any virus is crucial to fully understand its molecular biology and guide the development of effective control strategies. RNA viruses have compact multifunctional genomes that frequently contain overlapping genes and non-coding functional elements embedded within protein-coding sequences. Overlapping features often escape detection because it can be difficult to disentangle the multiple roles of the constituent nucleotides via mutational analyses, while high-throughput experimental techniques are often unable to distinguish functional elements from incidental features. However, RNA viruses evolve very rapidly so that, even within a single species, substitutions rapidly accumulate at neutral or near-neutral sites providing great potential for comparative genomics to distinguish the signature of purifying selection. Computationally identified features can then be efficiently targeted for experimental analysis. Here we analyze alignments of protein-coding virus sequences to identify regions where there is a statistically significant reduction in the degree of variability at synonymous sites...
miRDeep*: an integrated application tool for miRNA identification from RNA sequencing data
An, JY; Lai, J; Lehman, ML; Nelson, CC
Nucleic Acids Res. 2013, 41, 727-737
Free Full Text
miRDeep and its variants are widely used to quantify known and novel microRNA (miRNA) from small RNA sequencing (RNAseq). This article describes miRDeep*, our integrated miRNA identification tool, which is modeled on miRDeep but improves the precision of detecting novel miRNAs by introducing new strategies to identify precursor miRNAs. miRDeep* has a user-friendly graphic interface and accepts raw data in FastQ and Sequence Alignment Map (SAM) or the binary equivalent (BAM) format. Known and novel miRNA expression levels, as measured by the number of reads, are displayed in an interface, which shows each RNAseq read relative to the pre-miRNA hairpin. The secondary pre-miRNA structure and read locations for each predicted miRNA are shown and kept in a separate figure file. Moreover, the target genes of known and novel miRNAs are predicted using the TargetScan algorithm, and the targets are ranked according to the confidence score. miRDeep* is an integrated standalone application where sequence alignment, pre-miRNA secondary structure calculation and graphical display are purely Java coded. This application tool can be executed using a normal personal computer with 1.5 GB of memory...
Molecular evolutionary and structural analysis of the cytosolic DNA sensor cGAS and STING
Wu, XM; Wu, FH; Wang, XQ; Wang, LL; Siedow, JN; Zhang, WG; Pei, ZM
Nucleic Acids Res. 2014, 42, 8243-8257
Free Full Text
Cyclic GMP-AMP (cGAMP) synthase (cGAS) was recently identified as a cytosolic DNA sensor and generates a non-canonical cGAMP that contains G(2',5')pA and A(3',5')pG phosphodiester linkages. cGAMP activates STING, which triggers innate immune responses in mammals. However, the evolutionary functions and origins of cGAS and STING remain largely elusive. Here, we carried out comprehensive evolutionary analyses of the cGAS-STING pathway. Phylogenetic analysis of cGAS and STING families showed that their origins could be traced back to the choanoflagellate Monosiga brevicollis. Modern cGAS and STING may have acquired structural features, including zinc-ribbon domain and critical amino acid residues for DNA binding in cGAS as well as carboxy terminal tail domain for transducing signals in STING, only recently in vertebrates. In invertebrates, cGAS homologs may not act as DNA sensors. Both proteins cooperate extensively, have similar evolutionary characteristics, and thus may have co-evolved during metazoan evolution. cGAS homologs and a prokaryotic dinucleotide cyclase for canonical cGAMP share conserved secondary structures and catalytic residues...