
NAR Top Articles - Methods Online

Methods Online


December 2013

Discovering pathways by orienting edges in protein interaction networks
Gitter, A; Klein-Seetharaman, J; Gupta, A; Bar-Joseph, Z
Nucleic Acids Res. (2011) 39 (4): e22
Modern experimental technology enables the identification of the sensory proteins that interact with the cells' environment or various pathogens. Expression and knockdown studies can determine the downstream effects of these interactions. However, when attempting to reconstruct the signaling networks and pathways between these sources and targets, one faces a substantial challenge. Although pathways are directed, high-throughput protein interaction data are undirected. In order to utilize the available data, we need methods that can orient protein interaction edges and discover high-confidence pathways that explain the observed experimental outcomes. We formalize the orientation problem in weighted protein interaction graphs as an optimization problem and present three approximation algorithms based on either weighted Boolean satisfiability solvers or probabilistic assignments. We use these algorithms to identify pathways in yeast. Our approach recovers twice as many known signaling cascades as a recent unoriented signaling pathway prediction technique and over 13 times as many as an existing network orientation algorithm...

Efficient design and assembly of custom TALEN and other TAL effector-based constructs for DNA targeting
Cermak, T; Doyle, EL; Christian, M; Wang, L; Zhang, Y; Schmidt, C; Baller, JA; Somia, NV; Bogdanove, AJ; Voytas, DF
Nucleic Acids Res. (2011) 39 (12): e82
TALENs are important new tools for genome engineering. Fusions of transcription activator-like (TAL) effectors of plant pathogenic Xanthomonas spp. to the FokI nuclease, TALENs bind and cleave DNA in pairs. Binding specificity is determined by customizable arrays of polymorphic amino acid repeats in the TAL effectors. We present a method and reagents for efficiently assembling TALEN constructs with custom repeat arrays. We also describe design guidelines based on naturally occurring TAL effectors and their binding sites. Using software that applies these guidelines, in nine genes from plants, animals and protists, we found candidate cleavage sites on average every 35 bp. Each of 15 sites selected from this set was cleaved in a yeast-based assay with TALEN pairs constructed with our reagents. We used two of the TALEN pairs to mutate HPRT1 in human cells and ADH1 in Arabidopsis thaliana protoplasts. Our reagents include a plasmid construct for making custom TAL effectors and one for TAL effector fusions to additional proteins of interest...

A novel method for the efficient and selective identification of 5-hydroxymethylcytosine in genomic DNA
Robertson, AB; Dahl, JA; Vagbo, CB; Tripathi, P; Krokan, HE; Klungland, A
Nucleic Acids Res. (2011) 39 (8): e55
Recently, 5-hydroxymethylcytosine (5hmC) was identified in mammalian genomic DNA. The biological role of this modification remains unclear; however, identifying the genomic location of this modified base will assist in elucidating its function. We describe a method for the rapid and inexpensive identification of genomic regions containing 5hmC. This method involves the selective glucosylation of 5hmC residues by the beta-glucosyltransferase from T4 bacteriophage creating beta-glucosyl-5-hydroxymethylcytosine (beta-glu-5hmC). The beta-glu-5hmC modification provides a target that can be efficiently and selectively pulled down by J-binding protein 1 coupled to magnetic beads. DNA that is precipitated is suitable for analysis by quantitative PCR, microarray or sequencing. Furthermore, we demonstrate that the J-binding protein 1 pull down assay identifies 5hmC at the promoters of developmentally regulated genes in human embryonic stem cells. The method described here will allow for a greater understanding of the temporal and spatial effects that 5hmC may have on epigenetic regulation at the single gene level.

Integration of DNA into bacterial chromosomes from plasmids without a counter-selection marker
Heap, JT; Ehsaan, M; Cooksley, CM; Ng, YK; Cartman, ST; Winzer, K; Minton, NP
Nucleic Acids Res. (2012) 40 (8): e59
Most bacteria can only be transformed with circular plasmids, so robust DNA integration methods for these rely upon selection of single-crossover clones followed by counter-selection of double-crossover clones. To overcome the limited availability of heterologous counter-selection markers, here we explore novel DNA integration strategies that do not employ them, and instead exploit (i) activation or inactivation of genes leading to a selectable phenotype, and (ii) asymmetrical regions of homology to control the order of recombination events. We focus here on the industrial biofuel-producing bacterium Clostridium acetobutylicum, which previously lacked robust integration tools, but the approach we have developed is broadly applicable. Large sequences can be delivered in a series of steps, as we demonstrate by inserting the chromosome of phage lambda (minus a region apparently unstable in Escherichia coli in our cloning context) into the chromosome of C. acetobutylicum in three steps. This work should open the way to reliable integration of DNA including large synthetic constructs in diverse microorganisms.

Summarizing and correcting the GC content bias in high-throughput sequencing
Benjamini, Y; Speed, TP
Nucleic Acids Res. (2012) 40 (10): e72
GC content bias describes the dependence between fragment count (read coverage) and GC content found in Illumina sequencing data. This bias can dominate the signal of interest for analyses that focus on measuring fragment abundance within a genome, such as copy number estimation (DNA-seq). The bias is not consistent between samples, and there is no consensus as to the best methods to remove it in a single sample. We analyze regularities in the GC bias patterns, and find a compact description for this unimodal curve family. It is the GC content of the full DNA fragment, not only the sequenced read, that most influences fragment count. This GC effect is unimodal: both GC-rich fragments and AT-rich fragments are underrepresented in the sequencing results. This empirical evidence strengthens the hypothesis that PCR is the most important cause of the GC bias. We propose a model that produces predictions at the base pair level, allowing strand-specific GC-effect correction regardless of the downstream smoothing or binning. These GC modeling considerations can inform other high-throughput sequencing analyses such as ChIP-seq and RNA-seq.
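
As a rough illustration of the quantity this abstract is about, the sketch below computes the GC fraction of whole fragments and averages fragment counts within GC bins, which is one simple way to look at a GC-versus-coverage curve. It is not the authors' model: the function names, the `(sequence, count)` input format, and the equal-width binning are all assumptions made for this example.

```python
from collections import defaultdict


def gc_fraction(seq):
    """Return the fraction of G/C among the A/C/G/T bases of seq, or None if there are none."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return None  # e.g. a fragment made only of N's
    return (seq.count("G") + seq.count("C")) / acgt


def mean_count_by_gc(fragments, n_bins=10):
    """Average the observed count per GC bin.

    fragments: iterable of (sequence, count) pairs (hypothetical input format).
    Returns a dict mapping bin index (0 .. n_bins - 1) to the mean count in that bin.
    """
    totals = defaultdict(lambda: [0.0, 0])  # bin -> [sum of counts, number of fragments]
    for seq, count in fragments:
        gc = gc_fraction(seq)
        if gc is None:
            continue
        # A GC fraction of exactly 1.0 is placed in the last bin.
        bin_index = min(int(gc * n_bins), n_bins - 1)
        totals[bin_index][0] += count
        totals[bin_index][1] += 1
    return {b: total / n for b, (total, n) in sorted(totals.items())}
```

With real data, a unimodal curve of the kind described above would show lower mean counts in both the lowest and the highest GC bins than in the middle ones.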

Predicting the functional impact of protein mutations: application to cancer genomics
Reva, B; Antipin, Y; Sander, C
Nucleic Acids Res. (2011) 39 (17): e118
As large-scale re-sequencing of genomes reveals many protein mutations, especially in human cancer tissues, prediction of their likely functional impact becomes an important practical goal. Here, we introduce a new functional impact score (FIS) for amino acid residue changes using evolutionary conservation patterns. The information in these patterns is derived from aligned families and sub-families of sequence homologs within and between species using a combinatorial entropy formalism. The score performs well on a large set of human protein mutations in separating disease-associated variants (~19 200), assumed to be strongly functional, from common polymorphisms (~35 600), assumed to be weakly functional (area under the receiver operating characteristic curve of ~0.86). In cancer, using recurrence, multiplicity and annotation for ~10 000 mutations in the COSMIC database, the method does well in assigning higher scores to more likely functional mutations ('drivers'). To guide experimental prioritization, we report a list of about 1000 top human cancer genes frequently mutated in one or more cancer types ranked by likely functional impact...
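
The evaluation figure quoted above is an area under the ROC curve. As a minimal sketch of what that number measures, the function below computes the AUC directly as the probability that a randomly chosen positive (for instance, a disease-associated variant) receives a higher score than a randomly chosen negative (a common polymorphism), counting ties as one half. The function name and the toy scores are assumptions for illustration and have nothing to do with the actual FIS values.

```python
def roc_auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores.

    Equivalent to the normalized Mann-Whitney U statistic; ties count as 0.5.
    Runs in O(len(pos_scores) * len(neg_scores)), which is fine for a small example.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))
```

An AUC of 0.5 corresponds to a score that does not separate the two groups at all, and 1.0 to perfect separation.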

Rational, modular adaptation of enzyme-free DNA circuits to multiple detection methods
Li, BL; Ellington, AD; Chen, X
Nucleic Acids Res. (2011) 39 (16): e110
Signal amplification is a key component of molecular detection. Enzyme-free signal amplification is especially appealing for the development of low-cost, point-of-care diagnostics. It has been previously shown that enzyme-free DNA circuits with signal-amplification capacity can be designed using a mechanism called 'catalyzed hairpin assembly'. However, it is unclear whether the efficiency and modularity of such circuits are suitable for multiple analytical applications. We have therefore designed and characterized a simplified DNA circuit based on catalyzed hairpin assembly, and applied it to multiple different analytical formats, including fluorescent, colorimetric, and electrochemical signaling. By optimizing the design of previous hairpin-based catalytic assemblies we found that our circuit has almost zero background and a high catalytic efficiency, with a k_cat value above 1 min⁻¹. The inherent modularity of the circuit allowed us to readily adapt our circuit to detect both RNA and small molecule analytes. Overall, these data demonstrate that catalyzed hairpin assembly is suitable for analyte detection and signal amplification in a 'plug-and-play' fashion.

Whole-transcriptome RNAseq analysis from minute amount of total RNA
Tariq, MA; Kim, HJ; Jejelowo, O; Pourmand, N
Nucleic Acids Res. (2011) 39 (18): e120
RNA sequencing approaches to transcriptome analysis require a large amount of input total RNA to yield sufficient mRNA using either poly-A selection or depletion of rRNA. This feature makes it difficult to miniaturize transcriptome analysis for greater efficiency. To address this challenge, we devised and validated a simple procedure for the preparation of whole-transcriptome cDNA libraries from a minute amount (500 pg) of total RNA. We compared a single-sample library prepared by this Ovation® RNA-Seq system with two available methods of mRNA enrichment (TruSeq™ poly-A enrichment and RiboMinus™ rRNA depletion). Using the Ovation® preparation method for a set of eight mouse tissue samples, the RNA sequencing data obtained from two different next-generation sequencing platforms (SOLiD and Illumina Genome Analyzer IIx) yielded negligible rRNA reads (<3.5%) while retaining transcriptome sequencing fidelity. We further validated the Ovation® amplification technique by examining the resulting library complexity, reproducibility, evenness of transcript coverage, 5' and 3' bias and platform-specific biases...

MATS: a Bayesian framework for flexible detection of differential alternative splicing from RNA-Seq data
Shen, SH; Park, JW; Huang, J; Dittmar, KA; Lu, ZX; Zhou, Q; Carstens, RP; Xing, Y
Nucleic Acids Res. (2012) 40 (8): e61
Ultra-deep RNA sequencing has become a powerful approach for genome-wide analysis of pre-mRNA alternative splicing. We develop MATS (multivariate analysis of transcript splicing), a Bayesian statistical framework for flexible hypothesis testing of differential alternative splicing patterns on RNA-Seq data. MATS uses a multivariate uniform prior to model the between-sample correlation in exon splicing patterns, and a Markov chain Monte Carlo (MCMC) method coupled with a simulation-based adaptive sampling procedure to calculate the P-value and false discovery rate (FDR) of differential alternative splicing. Importantly, the MATS approach is applicable to almost any type of null hypothesis of interest, providing the flexibility to identify differential alternative splicing events that match a given user-defined pattern. We evaluated the performance of MATS using simulated and real RNA-Seq data sets. In the RNA-Seq analysis of alternative splicing events regulated by the epithelial-specific splicing factor ESRP1, we obtained a high RT-PCR validation rate of 86% for differential exon skipping events with a MATS FDR of <10%...

Mixture of differentially tagged Tol2 transposons accelerates conditional disruption of a broad spectrum of genes in mouse embryonic stem cells
Mayasari, NI; Mukougawa, K; Shigeoka, T; Kawakami, K; Kawaichi, M; Ishida, Y
Nucleic Acids Res. (2012) 40 (13): e97
Among the insertional mutagenesis techniques used in the current international knockout mouse project (KOMP) on the inactivation of all mouse genes in embryonic stem (ES) cells, random gene trapping has been playing a major role. Gene-targeting experiments have also been performed to individually and conditionally knock out the remaining 'difficult-to-trap' genes. Here, we show that transcriptionally silent genes in ES cells are severely underrepresented among the randomly trapped genes in KOMP. Our conditional poly(A)-trapping vector with a common retroviral backbone also has a strong bias to be integrated into constitutively transcribed genome loci. Most importantly, conditional gene disruption could not be successfully accomplished by using the retrovirus vector because of the frequent development of intra-vector deletions/rearrangements. We found that one of the cut-and-paste-type DNA transposons, Tol2, can serve as an ideal platform for gene-trap vectors that ensures identification and conditional disruption of a broad spectrum of genes in ES cells...
