
NAR Top Articles - Methods Online

Methods Online


January 2015

A quality control system for profiles obtained by ChIP sequencing
M. A. Mendoza-Parra, W. Van Gool, M. A. Mohamed Saleem, D. G. Ceschin and H. Gronemeyer
Nucleic Acids Res. (2013) 41 (21): e196
Free Full Text
The absence of a quality control (QC) system is a major weakness for the comparative analysis of genome-wide profiles generated by next-generation sequencing (NGS). This concerns particularly genome binding/occupancy profiling assays like chromatin immunoprecipitation (ChIP-seq) but also related enrichment-based studies like methylated DNA immunoprecipitation/methylated DNA binding domain sequencing, global run on sequencing or RNA-seq. Importantly, QC assessment may significantly improve multidimensional comparisons that have great promise for extracting information from combinatorial analyses of the global profiles established for chromatin modifications, the bindings of epigenetic and chromatin-modifying enzymes/machineries, RNA polymerases and transcription factors and total, nascent or ribosome-bound RNAs. Here we present an approach that associates global and local QC indicators to ChIP-seq data sets as well as to a variety of enrichment-based studies by NGS. This QC system was used to certify >5600 publicly available data sets, hosted in a database for data mining and comparative QC analyses.

Multiplex CRISPR/Cas9-based genome engineering from a single lentiviral vector
A. M. Kabadi, D. G. Ousterout, I. B. Hilton and C. A. Gersbach
Nucleic Acids Res. (2014) 42 (19): e147
Free Full Text
Engineered DNA-binding proteins that manipulate the human genome and transcriptome have enabled rapid advances in biomedical research. In particular, the RNA-guided CRISPR/Cas9 system has recently been engineered to create site-specific double-strand breaks for genome editing or to direct targeted transcriptional regulation. A unique capability of the CRISPR/Cas9 system is multiplex genome engineering by delivering a single Cas9 enzyme and two or more single guide RNAs (sgRNAs) targeted to distinct genomic sites. This approach can be used to simultaneously create multiple DNA breaks or to target multiple transcriptional activators to a single promoter for synergistic enhancement of gene induction. To address the need for uniform and sustained delivery of multiplex CRISPR/Cas9-based genome engineering tools, we developed a single lentiviral system to express a Cas9 variant, a reporter gene and up to four sgRNAs from independent RNA polymerase III promoters that are incorporated into the vector by a convenient Golden Gate cloning method. Each sgRNA is efficiently expressed and can mediate multiplex gene editing and sustained transcriptional activation in immortalized and primary human cells...

Demonstration of CRISPR/Cas9/sgRNA-mediated targeted gene modification in Arabidopsis, tobacco, sorghum and rice
W. Jiang, H. Zhou, H. Bi, M. Fromm, B. Yang and D. P. Weeks
Nucleic Acids Res. (2013) 41 (20): e188
Free Full Text
The type II CRISPR/Cas system from Streptococcus pyogenes and its simplified derivative, the Cas9/single guide RNA (sgRNA) system, have emerged as potent new tools for targeted gene knockout in bacteria, yeast, fruit fly, zebrafish and human cells. Here, we describe adaptations of these systems leading to successful expression of the Cas9/sgRNA system in two dicot plant species, Arabidopsis and tobacco, and two monocot crop species, rice and sorghum. Agrobacterium tumefaciens was used for delivery of genes encoding Cas9, sgRNA and a non-functional, mutant green fluorescent protein (GFP) to Arabidopsis and tobacco. The mutant GFP gene contained target sites in its 5' coding regions that were successfully cleaved by a Cas9/sgRNA complex that, along with error-prone DNA repair, resulted in creation of functional GFP genes. DNA sequencing confirmed Cas9/sgRNA-mediated mutagenesis at the target site. Rice protoplast cells transformed with Cas9/sgRNA constructs targeting the promoter region of the bacterial blight susceptibility genes, OsSWEET14 and OsSWEET11, were confirmed by DNA sequencing to contain mutated DNA sequences at the target sites...

Evaluation of general 16S ribosomal RNA gene PCR primers for classical and next-generation sequencing-based diversity studies
A. Klindworth, E. Pruesse, T. Schweer, J. Peplies, C. Quast, M. Horn and F. O. Glöckner
Nucleic Acids Res. (2013) 41 (1): e1
Free Full Text
16S ribosomal RNA gene (rDNA) amplicon analysis remains the standard approach for the cultivation-independent investigation of microbial diversity. The accuracy of these analyses depends strongly on the choice of primers. The overall coverage and phylum spectrum of 175 primers and 512 primer pairs were evaluated in silico with respect to the SILVA 16S/18S rDNA non-redundant reference dataset (SSURef 108 NR). Based on this evaluation a selection of 'best available' primer pairs for Bacteria and Archaea for three amplicon size classes (100-400, 400-1000, ≥1000 bp) is provided. The most promising bacterial primer pair (S-D-Bact-0341-b-S-17/S-D-Bact-0785-a-A-21), with an amplicon size of 464 bp, was experimentally evaluated by comparing the taxonomic distribution of the 16S rDNA amplicons with 16S rDNA fragments from directly sequenced metagenomes. The results of this study may be used as a guideline for selecting primer pairs with the best overall coverage and phylum spectrum for specific applications, therefore reducing the bias in PCR-based microbial diversity studies.
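As a rough illustration of what an in silico primer-pair coverage check involves, the sketch below counts the fraction of reference sequences in which a forward primer matches and the reverse primer's binding site lies downstream of it. It is not the authors' pipeline: it assumes exact matching with IUPAC degenerate bases and no mismatches, and the function names, primers and sequences are hypothetical toy values.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_regex(primer):
    """Compile a degenerate primer into a regex over A/C/G/T."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def reverse_complement(seq):
    """Reverse-complement a sequence, including IUPAC degenerate codes."""
    table = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")
    return seq.upper().translate(table)[::-1]


def pair_coverage(forward, reverse, sequences):
    """Fraction of sequences matched by both primers (assumption: exact
    matches only; only the first forward-primer site is considered)."""
    fwd = primer_regex(forward)
    # The reverse primer anneals to the opposite strand, so on the
    # reference strand we look for its reverse complement.
    rev = primer_regex(reverse_complement(reverse))
    hits = 0
    for seq in sequences:
        seq = seq.upper()
        match = fwd.search(seq)
        if match and rev.search(seq, match.end()):
            hits += 1
    return hits / len(sequences) if sequences else 0.0


# Hypothetical toy data, not real 16S primers or SILVA sequences.
toy_reference = [
    "GGACGTCCCCGAATT",
    "GGACGACCCAAATT",
    "GGACGCCCCCGAATT",
    "TTTTTTTT",
]
coverage = pair_coverage("ACGW", "TTYG", toy_reference)
```

A realistic evaluation would also need to handle allowed mismatches, alignment positions and per-phylum breakdowns, none of which this sketch attempts.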

Easy quantitative assessment of genome editing by sequence trace decomposition
E. K. Brinkman, T. Chen, M. Amendola and B. van Steensel
Nucleic Acids Res. (2014) 42 (22): e168
Free Full Text
The efficacy and the mutation spectrum of genome editing methods can vary substantially depending on the targeted sequence. A simple, quick assay to accurately characterize and quantify the induced mutations is therefore needed. Here we present TIDE, a method for this purpose that requires only a pair of PCR reactions and two standard capillary sequencing runs. The sequence traces are then analyzed by a specially developed decomposition algorithm that identifies the major induced mutations in the projected editing site and accurately determines their frequency in a cell population. This method is cost-effective and quick, and it provides much more detailed information than current enzyme-based assays. An interactive web tool for automated decomposition of the sequence traces is available. TIDE greatly facilitates the testing and rational design of genome editing strategies.

PaperClip: rapid multi-part DNA assembly from existing libraries
M. Trubitsyna, G. Michlewski, Y. Cai, A. Elfick and C. E. French
Nucleic Acids Res. (2014) 42 (20): e154
Free Full Text
Assembly of DNA 'parts' to create larger constructs is an essential enabling technique for bioengineering and synthetic biology. Here we describe a simple method, PaperClip, which allows flexible assembly of multiple DNA parts from currently existing libraries cloned in any vector. No restriction enzymes, mutagenesis of internal restriction sites, or reamplification to add end homology are required. Order of assembly is directed by double stranded oligonucleotides ('Clips'). Clips are formed by ligation of pairs of oligonucleotides corresponding to the ends of each part. PaperClip assembly can be performed by polymerase chain reaction or by cell extract-mediated recombination. Once multi-use Clips have been prepared, assembly of at least six DNA parts in any order can be accomplished with high efficiency within several hours.

svaseq: removing batch effects and other unwanted noise from sequencing data
J. T. Leek
Nucleic Acids Res. (2014) 42 (21): e161
Free Full Text
It is now known that unwanted noise and unmodeled artifacts such as batch effects can dramatically reduce the accuracy of statistical inference in genomic experiments. These sources of noise must be modeled and removed to accurately measure biological variability and to obtain correct statistical inference when performing high-throughput genomic analysis. We introduced surrogate variable analysis (sva) for estimating these artifacts by (i) identifying the part of the genomic data only affected by artifacts and (ii) estimating the artifacts with principal components or singular vectors of the subset of the data matrix. The resulting estimates of artifacts can be used in subsequent analyses as adjustment factors to correct analyses. Here I describe a version of the sva approach specifically created for count data or FPKMs from sequencing experiments based on appropriate data transformation. I also describe the addition of supervised sva (ssva) for using control probes to identify the part of the genomic data only affected by artifacts. I present a comparison between these versions of sva and other methods for batch effect estimation on simulated data, real count-based data and FPKM-based data...

Computational analysis of bacterial RNA-Seq data
R. McClure, D. Balasubramanian, Y. Sun, M. Bobrovskyy, P. Sumby, C. A. Genco, C. K. Vanderpool and B. Tjaden
Nucleic Acids Res. (2013) 41 (14): e140
Free Full Text
Recent advances in high-throughput RNA sequencing (RNA-seq) have enabled tremendous leaps forward in our understanding of bacterial transcriptomes. However, computational methods for analysis of bacterial transcriptome data have not kept pace with the large and growing data sets generated by RNA-seq technology. Here, we present new algorithms, specific to bacterial gene structures and transcriptomes, for analysis of RNA-seq data. The algorithms are implemented in an open source software system called Rockhopper that supports various stages of bacterial RNA-seq data analysis, including aligning sequencing reads to a genome, constructing transcriptome maps, quantifying transcript abundance, testing for differential gene expression, determining operon structures and visualizing results. We demonstrate the performance of Rockhopper using 2.1 billion sequenced reads from 75 RNA-seq experiments conducted with Escherichia coli, Neisseria gonorrhoeae, Salmonella enterica, Streptococcus pyogenes and Xenorhabdus nematophila. We find that the transcriptome maps generated by our algorithms are highly accurate when compared with focused experimental data from E. coli and N. gonorrhoeae...

Impact of sequencing depth in ChIP-seq experiments
Y. L. Jung, L. J. Luquette, J. W. Ho, F. Ferrari, M. Tolstorukov, A. Minoda, R. Issner, C. B. Epstein, G. H. Karpen, M. I. Kuroda and P. J. Park
Nucleic Acids Res. (2014) 42 (9): e74
Free Full Text
In a chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) experiment, an important consideration in experimental design is the minimum number of sequenced reads required to obtain statistically significant results. We present an extensive evaluation of the impact of sequencing depth on identification of enriched regions for key histone modifications (H3K4me3, H3K36me3, H3K27me3 and H3K9me2/me3) using deep-sequenced datasets in human and fly. We propose to define sufficient sequencing depth as the number of reads at which detected enrichment regions increase <1% for an additional million reads. Although the required depth depends on the nature of the mark and the state of the cell in each experiment, we observe that sufficient depth is often reached at <20 million reads for fly. For human, there are no clear saturation points for the examined datasets, but our analysis suggests 40-50 million reads as a practical minimum for most marks. We also devise a mathematical model to estimate the sufficient depth and total genomic coverage of a mark...
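The saturation criterion described above (fewer than 1% more enriched regions per additional million reads) can be sketched as a small calculation over a subsampling series. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name, the input layout and the example numbers are all hypothetical.

```python
def sufficient_depth(depths_millions, region_counts, threshold=0.01):
    """Return the smallest sampled depth (in millions of reads) at which
    the relative gain in detected regions per additional million reads
    drops below `threshold`, or None if no such point is observed.

    Assumes `depths_millions` is sorted in increasing order and that
    `region_counts[i]` is the number of enriched regions called at
    `depths_millions[i]`.
    """
    for i in range(len(depths_millions) - 1):
        d0, d1 = depths_millions[i], depths_millions[i + 1]
        n0, n1 = region_counts[i], region_counts[i + 1]
        if n0 <= 0 or d1 <= d0:
            continue  # skip degenerate steps rather than divide by zero
        # Relative increase in regions, normalised per million reads.
        gain_per_million = (n1 - n0) / n0 / (d1 - d0)
        if gain_per_million < threshold:
            return d0
    return None


# Hypothetical subsampling series: depth in millions, regions detected.
depths = [5, 10, 15, 20, 25]
regions = [10000, 14000, 15500, 16000, 16100]
saturation_point = sufficient_depth(depths, regions)

# A series that keeps growing steadily never meets the criterion.
no_saturation = sufficient_depth([10, 20, 30], [1000, 2000, 4000])
```

With the toy numbers, the gain per million reads falls from 8% to about 2.1% and then to about 0.6%, so the criterion is first met at the 15 million read sample. The second series gains 10% per million reads at every step and returns `None`, mirroring the case where no clear saturation point is seen.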

Broad-host-range vector system for synthetic biology and biotechnology in cyanobacteria
A. Taton, F. Unglaub, N. E. Wright, W. Y. Zeng, J. Paz-Yepes, B. Brahamsha, B. Palenik, T. C. Peterson, F. Haerizadeh, S. S. Golden and J. W. Golden
Nucleic Acids Res. (2014) 42 (17): e136
Free Full Text
Inspired by the developments of synthetic biology and the need for improved genetic tools to exploit cyanobacteria for the production of renewable bioproducts, we developed a versatile platform for the construction of broad-host-range vector systems. This platform includes the following features: (i) An efficient assembly strategy in which modules released from 3 to 4 donor plasmids or produced by polymerase chain reaction are assembled by isothermal assembly guided by short GC-rich overlap sequences. (ii) A growing library of molecular devices categorized in three major groups: (a) replication and chromosomal integration; (b) antibiotic resistance; (c) functional modules. These modules can be assembled in different combinations to construct a variety of autonomously replicating plasmids and suicide plasmids for gene knockout and knockin. (iii) A web service, the CYANO-VECTOR assembly portal, which was built to organize the various modules, facilitate the in silico construction of plasmids, and encourage the use of this system...
