February 2015

antiSMASH 2.0--a versatile platform for genome mining of secondary metabolite producers
Blin, K; Medema, MH; Kazempour, D; Fischbach, MA; Breitling, R; Takano, E; Weber, T
Nucleic Acids Res. 2013, 41, W204-W212
Microbial secondary metabolites are a potent source of antibiotics and other pharmaceuticals. Genome mining of their biosynthetic gene clusters has become a key method to accelerate their identification and characterization. In 2011, we developed antiSMASH, a web-based analysis platform that automates this process. Here, we present the highly improved antiSMASH 2.0 release. For the new version, antiSMASH was entirely re-designed using a plug-and-play concept that allows easy integration of novel predictor or output modules. antiSMASH 2.0 now supports input of multiple related sequences simultaneously (multi-FASTA/GenBank/EMBL), which allows the analysis of draft genomes comprising multiple contigs. Moreover, direct analysis of protein sequences is now possible. antiSMASH 2.0 has also been equipped with the capacity to detect additional classes of secondary metabolites, including oligosaccharide antibiotics, phenazines, thiopeptides, homoserine lactones, phosphonates and furans. The algorithm for predicting the core structure of the cluster end product is now also covering lantipeptides...

SWISS-MODEL: modelling protein tertiary and quaternary structure using evolutionary information
Biasini, M; Bienert, S; Waterhouse, A; Arnold, K; Studer, G; Schmidt, T; Kiefer, F; Cassarino, TG; Bertoni, M; Bordoli, L; Schwede, T
Nucleic Acids Res. 2014, 42, W252-W258
Protein structure homology modelling has become a routine technique to generate 3D models for proteins when experimental structures are not available. Fully automated servers such as SWISS-MODEL with user-friendly web interfaces generate reliable models without the need for complex software packages or downloading large databases. Here, we describe the latest version of the SWISS-MODEL expert system for protein structure modelling. The SWISS-MODEL template library provides annotation of quaternary structure and essential ligands and co-factors to allow for building of complete structural models, including their oligomeric structure. The improved SWISS-MODEL pipeline makes extensive use of model quality estimation for selection of the most suitable templates and provides estimates of the expected accuracy of the resulting models. The accuracy of the models generated by SWISS-MODEL is continuously evaluated by the CAMEO system. The new web site allows users to interactively search for templates, cluster them by sequence similarity, structurally compare alternative templates...

CHOPCHOP: a CRISPR/Cas9 and TALEN web tool for genome editing
Montague, TG; Cruz, JM; Gagnon, JA; Church, GM; Valen, E
Nucleic Acids Res. 2014, 42, W401-W407
Major advances in genome editing have recently been made possible with the development of the TALEN and CRISPR/Cas9 methods. The speed and ease of implementing these technologies has led to an explosion of mutant and transgenic organisms. A rate-limiting step in efficiently applying TALEN and CRISPR/Cas9 methods is the selection and design of targeting constructs. We have developed an online tool, CHOPCHOP, to expedite the design process. CHOPCHOP accepts a wide range of inputs (gene identifiers, genomic regions or pasted sequences) and provides an array of advanced options for target selection. It uses efficient sequence alignment algorithms to minimize search times, and rigorously predicts off-target binding of single-guide RNAs (sgRNAs) and TALENs. Each query produces an interactive visualization of the gene with candidate target sites displayed at their genomic positions and color-coded according to quality scores. In addition, for each possible target site, restriction sites and primer candidates are visualized, facilitating a streamlined pipeline of mutant generation and validation. The ease-of-use and speed of CHOPCHOP make it a valuable tool for genome engineering.

Deciphering key features in protein structures with the new ENDscript server
Robert, X; Gouet, P
Nucleic Acids Res. 2014, 42, W320-W324
ENDscript 2 is a friendly Web server for extracting and rendering a comprehensive analysis of primary to quaternary protein structure information in an automated way. This major upgrade has been fully re-engineered to enhance speed, accuracy and usability with interactive 3D visualization. It takes advantage of the new version 3 of ESPript, our well-known sequence alignment renderer, improved to handle a large number of data with reduced computation time. From a single PDB entry or file, ENDscript produces high quality figures displaying multiple sequence alignment of proteins homologous to the query, colored according to residue conservation. Furthermore, the experimental secondary structure elements and a detailed set of relevant biophysical and structural data are depicted. All this information and more are now mapped on interactive 3D PyMOL representations. Thanks to its adaptive and rigorous algorithm, beginner to expert users can modify settings to fine-tune ENDscript to their needs. ENDscript has also been upgraded as an open platform for the visualization of multiple biochemical and structural data coming from external biotool Web servers, with both 2D and 3D representations.

RBPmap: a web server for mapping binding sites of RNA-binding proteins
Paz, I; Kosti, I; Ares, M; Cline, M; Mandel-Gutfreund, Y
Nucleic Acids Res. 2014, 42, W361-W367
Regulation of gene expression is executed in many cases by RNA-binding proteins (RBPs) that bind to mRNAs as well as to non-coding RNAs. RBPs recognize their RNA target via specific binding sites on the RNA. Predicting the binding sites of RBPs is known to be a major challenge. We present a new webserver, RBPmap, freely accessible online, for accurate prediction and mapping of RBP binding sites. RBPmap has been developed specifically for mapping RBPs in human, mouse and Drosophila melanogaster genomes, though it supports other organisms too. RBPmap enables the users to select motifs from a large database of experimentally defined motifs. In addition, users can provide any motif of interest, given as either a consensus or a PSSM. The algorithm for mapping the motifs is based on a Weighted-Rank approach, which considers the clustering propensity of the binding sites and the overall tendency of regulatory regions to be conserved. In addition, RBPmap incorporates a position-specific background model, designed uniquely for different genomic regions, such as splice sites, 5' and 3' UTRs, non-coding RNA and intergenic regions...
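
As a rough illustration of what scanning a sequence with a PSSM involves, the sketch below scores every window of an RNA sequence against a small position-specific scoring matrix. It is a generic sliding-window scan and not RBPmap's Weighted-Rank algorithm: it ignores site clustering, conservation and the position-specific background model. The function names and the toy matrix are hypothetical.

```python
def score_window(pssm, window):
    """Sum the per-position scores of `window` under `pssm`.

    `pssm` is a list with one dict per motif position, mapping each
    base to its score (a hypothetical, illustrative representation).
    """
    return sum(pssm[i][base] for i, base in enumerate(window))


def scan(pssm, seq):
    """Return (start, score) for every full-width window of `seq`."""
    width = len(pssm)
    return [
        (start, score_window(pssm, seq[start:start + width]))
        for start in range(len(seq) - width + 1)
    ]


def best_hit(pssm, seq):
    """Return the (start, score) pair with the highest score."""
    return max(scan(pssm, seq), key=lambda hit: hit[1])


# Toy two-position motif favouring "AU" (scores are made up).
TOY_PSSM = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "U": -1.0},
    {"A": -1.0, "C": -1.0, "G": -1.0, "U": 1.0},
]
```

Calling `best_hit(TOY_PSSM, "GAUAC")` picks out the window starting at index 1, the only exact "AU" match in that toy sequence.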

NetVenn: an integrated network analysis web platform for gene lists
Wang, Y; Thilmony, R; Gu, YQ
Nucleic Acids Res. 2014, 42, W161-W166
Many lists containing biological identifiers, such as gene lists, have been generated in various genomics projects. Identifying the overlap among gene lists can enable us to understand the similarities and differences between the data sets. Here, we present an interactome network-based web application platform named NetVenn for comparing and mining the relationships among gene lists. NetVenn contains interactome network data publicly available for several species and supports a user upload of customized interactome network data. It has an efficient and interactive graphic tool that provides a Venn diagram view for comparing two to four lists in the context of an interactome network. NetVenn also provides a comprehensive annotation of genes in the gene lists by using enriched terms from multiple functional databases. In addition, it allows for mapping the gene expression data, providing information of transcription status of genes in the network. The power graph analysis tool is integrated in NetVenn for simplified visualization of gene relationships in the network. NetVenn is freely available.
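
The overlap computation behind a Venn diagram view of gene lists can be sketched as follows. This is only a minimal illustration of partitioning genes into Venn regions, not NetVenn's implementation, and it leaves out the interactome network entirely. The function name and the example gene identifiers are hypothetical.

```python
def venn_regions(gene_lists):
    """Partition genes into Venn regions.

    `gene_lists` maps a list name to an iterable of gene identifiers.
    The result maps each non-empty region, identified by the frozenset
    of list names that share it, to the set of genes in that region.
    """
    sets = {name: set(genes) for name, genes in gene_lists.items()}
    regions = {}
    for gene in set().union(*sets.values()):
        # A gene belongs to exactly one region: the lists that contain it.
        members = frozenset(name for name, s in sets.items() if gene in s)
        regions.setdefault(members, set()).add(gene)
    return regions


# Made-up identifiers for three lists.
EXAMPLE_LISTS = {
    "a": ["g1", "g2", "g3"],
    "b": ["g2", "g3", "g4"],
    "c": ["g3", "g5"],
}
```

With `EXAMPLE_LISTS`, the region keyed by `frozenset({"a", "b"})` holds the genes found in lists a and b but not in c, matching the corresponding two-way overlap in a three-set Venn diagram.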

PELE web server: atomistic study of biomolecular systems at your fingertips
Madadkar-Sobhani, A; Guallar, V
Nucleic Acids Res. 2013, 41, W322-W328
PELE, Protein Energy Landscape Exploration, our novel technology based on protein structure prediction algorithms and a Monte Carlo sampling, is capable of modelling the all-atom protein-ligand dynamical interactions in an efficient and fast manner, with two orders of magnitude reduced computational cost when compared with traditional molecular dynamics techniques. PELE's heuristic approach generates trial moves based on protein and ligand perturbations followed by side chain sampling and global/local minimization. The collection of accepted steps forms a stochastic trajectory. Furthermore, several processors may be run in parallel towards a collective goal or defining several independent trajectories; the whole procedure has been parallelized using the Message Passing Interface. Here, we introduce the PELE web server, designed to make the whole process of running simulations easier and more practical by minimizing input file demand, providing user-friendly interface and producing abstract outputs (e.g. interactive graphs and tables). The web server has been implemented in C++ using Wt and MySQL.

DMINDA: an integrated web server for DNA motif identification and analyses
Ma, Q; Zhang, HY; Mao, XZ; Zhou, C; Liu, BQ; Chen, X; Xu, Y
Nucleic Acids Res. 2014, 42, W12-W19
DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important to elucidation of the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular.

NetworkAnalyst - integrative approaches for protein-protein interaction network analysis and visual exploration
Xia, JG; Benner, MJ; Hancock, REW
Nucleic Acids Res. 2014, 42, W167-W174
Biological network analysis is a powerful approach to gain systems-level understanding of patterns of gene expression in different cell types, disease states and other biological/experimental conditions. Three consecutive steps are required - identification of genes or proteins of interest, network construction and network analysis and visualization. To date, researchers have to learn to use a combination of several tools to accomplish this task. In addition, interactive visualization of large networks has been primarily restricted to locally installed programs. To address these challenges, we have developed NetworkAnalyst, taking advantage of state-of-the-art web technologies, to enable high performance network analysis with rich user experience. NetworkAnalyst integrates all three steps and presents the results via a powerful online network visualization framework. Users can upload gene or protein lists, single or multiple gene expression datasets to perform comprehensive gene annotation and differential expression analysis. Significant genes are mapped to our manually curated protein-protein interaction database to construct relevant networks...

LINCS Canvas Browser: interactive web app to query, browse and interrogate LINCS L1000 gene expression signatures
Duan, QN; Flynn, C; Niepel, M; Hafner, M; Muhlich, JL; Fernandez, NF; Rouillard, AD; Tan, CM; Chen, EY; Golub, TR; Sorger, PK; Subramanian, A; Ma'ayan, A
Nucleic Acids Res. 2014, 42, W449-W460
For the Library of Integrated Network-based Cellular Signatures (LINCS) project many gene expression signatures using the L1000 technology have been produced. The L1000 technology is a cost-effective method to profile gene expression in large scale. LINCS Canvas Browser (LCB) is an interactive HTML5 web-based software application that facilitates querying, browsing and interrogating many of the currently available LINCS L1000 data. LCB implements two compacted layered canvases, one to visualize clustered L1000 expression data, and the other to display enrichment analysis results using 30 different gene set libraries. Clicking on an experimental condition highlights gene-sets enriched for the differentially expressed genes from the selected experiment. A search interface allows users to input gene lists and query them against over 100 000 conditions to find the top matching experiments. The tool integrates many resources for an unprecedented potential for new discoveries in systems biology and systems pharmacology.

