
NAR Top Articles - Genomics

Genomics


June 2015


Optimization of scarless human stem cell genome editing
Yang, LH; Guell, M; Byrne, S; Yang, JL; De Los Angeles, A; Mali, P; Aach, J; Kim-Kiselak, C; Briggs, AW; Rios, X; Huang, PY; Daley, G; Church, G
Nucleic Acids Res. 2013, 41, 9049-9061
Free Full Text
Efficient strategies for precise genome editing in human induced pluripotent stem cells (hiPSCs) will enable sophisticated genome engineering for research and clinical purposes. The development of programmable sequence-specific nucleases such as Transcription Activator-Like Effector Nucleases (TALENs) and Cas9-gRNA allows genetic modifications to be made more efficiently at targeted sites of interest. However, many opportunities remain to optimize these tools and to enlarge their spheres of application. We present several improvements: First, we developed functional re-coded TALEs (reTALEs), which not only enable simple one-pot TALE synthesis but also allow TALE-based applications to be performed using lentiviral vectors. We then compared genome-editing efficiencies in hiPSCs mediated by 15 pairs of reTALENs and Cas9-gRNA targeting CCR5 and optimized ssODN design in conjunction with both methods for introducing specific mutations. We found Cas9-gRNA achieved 7-8x higher non-homologous end joining efficiencies (3%) than reTALENs (0.4%)...

metaseq: a Python package for integrative genome-wide analysis reveals relationships between chromatin insulators and associated nuclear mRNA
Dale, RK; Matzat, LH; Lei, EP
Nucleic Acids Res. 2014, 42, 9158-9170
Free Full Text
Here we introduce metaseq, a software library written in Python, which enables loading multiple genomic data formats into standard Python data structures and allows flexible, customized manipulation and visualization of data from high-throughput sequencing studies. We demonstrate its practical use by analyzing multiple datasets related to chromatin insulators, which are DNA-protein complexes proposed to organize the genome into distinct transcriptional domains. Recent studies in Drosophila and mammals have implicated RNA in the regulation of chromatin insulator activities. Moreover, the Drosophila RNA-binding protein Shep has been shown to antagonize gypsy insulator activity in a tissue-specific manner, but the precise role of RNA in this process remains unclear. Better understanding of chromatin insulator regulation requires integration of multiple datasets, including those from chromatin-binding, RNA-binding, and gene expression experiments. We use metaseq to integrate RIP- and ChIP-seq data for Shep and the core gypsy insulator protein Su(Hw) in two different cell types, along with publicly available ChIP-chip and RNA-seq data...

The effect of tRNA levels on decoding times of mRNA codons
Dana, A; Tuller, T
Nucleic Acids Res. 2014, 42, 9171-9181
Free Full Text
The possible effect of transfer ribonucleic acid (tRNA) concentrations on codon decoding time is a fundamental biomedical research question; however, owing to the large number of variables affecting this process and the indirect relations between them, a conclusive answer to this question has so far eluded researchers in the field. In this study, we perform a novel analysis of the ribosome profiling data of four organisms which enables ranking the decoding times of different codons while filtering translational phenomena such as experimental biases, extreme ribosomal pauses and ribosome traffic jams. Based on this filtering, we show for the first time that there is a significant correlation between tRNA concentrations and the estimated codon decoding times both in prokaryotes and in eukaryotes in natural conditions (-0.38 to -0.66, all P values <0.006); in addition, we show that when considering tRNA concentrations, codon decoding times are not correlated with aminoacyl-tRNA levels. The reported results support the conjecture that translation efficiency is directly influenced by the tRNA levels in the cell. Thus, they should help to understand the evolution of synonymous aspects of coding sequences via the adaptation of their codons to the tRNA pool.
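
As an illustration of the kind of statistic reported above, the following minimal sketch computes a Spearman rank correlation between tRNA levels and codon decoding times. It is not the authors' pipeline: the abstract does not state which correlation coefficient was used, so the choice of Spearman is an assumption, and the function names and the toy values are hypothetical.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = shared_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy data (not from the paper): one value per codon.
trna_levels = [10.0, 8.0, 6.0, 3.0, 1.0]
decoding_times = [0.8, 1.0, 1.1, 1.9, 2.5]

rho = spearman(trna_levels, decoding_times)
print(f"Spearman rho = {rho:.2f}")
```

A negative coefficient, as in the range quoted in the abstract, means that codons read by more abundant tRNAs tend to have shorter estimated decoding times.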

Stability, delivery and functions of human sperm RNAs at fertilization
Sendler, E; Johnson, GD; Mao, SH; Goodrich, RJ; Diamond, MP; Hauser, R; Krawetz, SA
Nucleic Acids Res. 2013, 41, 4104-4117
Free Full Text
Increasing attention has focused on the significance of RNA in sperm, in light of its contribution to the birth and long-term health of a child, role in sperm function and diagnostic potential. As the composition of sperm RNA is in flux, assigning specific roles to individual RNAs presents a significant challenge. For the first time RNA-seq was used to characterize the population of coding and non-coding transcripts in human sperm. Examining RNA representation as a function of multiple methods of library preparation revealed unique features indicative of very specific and stage-dependent maturation and regulation of sperm RNA, illuminating their various transitional roles. Correlation of sperm transcript abundance with epigenetic marks suggested roles for these elements in the pre- and post-fertilization genome. Several classes of non-coding RNAs including lncRNAs, CARs, pri-miRNAs, novel elements and mRNAs have been identified which, based on factors including relative abundance, integrity in sperm, available knockout data of embryonic effect and presence or absence in the unfertilized human oocyte, are likely to be essential male factors critical to early post-fertilization...

Large-scale analysis of tandem repeat variability in the human genome
Duitama, J; Zablotskaya, A; Gemayel, R; Jansen, A; Belet, S; Vermeesch, JR; Verstrepen, KJ; Froyen, G
Nucleic Acids Res. 2014, 42, 5728-5741
Free Full Text
Tandem repeats are short DNA sequences that are repeated head-to-tail with a propensity to be variable. They constitute a significant proportion of the human genome, also occurring within coding and regulatory regions. Variation in these repeats can alter the function and/or expression of genes, allowing organisms to swiftly adapt to novel environments. Importantly, some repeat expansions have also been linked to certain neurodegenerative diseases. Therefore, accurate sequencing of tandem repeats could contribute to our understanding of common phenotypic variability and might uncover missing genetic factors in idiopathic clinical conditions. However, despite long-standing evidence for the functional role of repeats, they are largely ignored because of technical limitations in sequencing, mapping and typing. Here, we report on a novel capture technique and data filtering protocol that allowed simultaneous sequencing of thousands of tandem repeats in the human genomes of a three-generation family using GS-FLX-plus Titanium technology. Our results demonstrated that up to 7.6% of tandem repeats in this family (4% in coding sequences) differ from the reference sequence...

Mapping of six somatic linker histone H1 variants in human breast cancer cells uncovers specific features of H1.2
Millan-Arino, L; Islam, AMMK; Izquierdo-Bouldstridge, A; Mayor, R; Terme, JM; Luque, N; Sancho, M; Lopez-Bigas, N; Jordan, A
Nucleic Acids Res. 2014, 42, 4474-4493
Free Full Text
Seven linker histone H1 variants are present in human somatic cells with distinct prevalence across cell types. Although they are key structural components of chromatin, it is not known whether the different variants have specific roles in the regulation of nuclear processes or are differentially distributed throughout the genome. Using variant-specific antibodies to H1 and hemagglutinin (HA)-tagged recombinant H1 variants expressed in breast cancer cells, we have investigated the distribution of six H1 variants in promoters and genome-wide. H1 depletion at promoters depends on the promoter's transcriptional status and differs between variants. Notably, H1.2 is less abundant than other variants at the transcription start sites of inactive genes, and promoters enriched in H1.2 are different from those enriched in other variants and tend to be repressed. Additionally, H1.2 is enriched at chromosomal domains characterized by low guanine-cytosine (GC) content and is associated with lamina-associated domains. Meanwhile, other variants are associated with higher GC content, CpG islands and gene-rich domains...

A comprehensive survey of non-canonical splice sites in the human transcriptome
Parada, GE; Munita, R; Cerda, CA; Gysling, K
Nucleic Acids Res. 2014, 42, 10564-10578
Free Full Text
We uncovered the diversity of non-canonical splice sites in the human transcriptome using deep transcriptome profiling. We mapped a total of 3.7 billion human RNA-seq reads and developed a set of stringent filters to avoid false non-canonical splice site detections. We identified 184 splice sites with non-canonical dinucleotides and U2/U12-like consensus sequences. We selected 10 of the herein identified U2/U12-like non-canonical splice site events and successfully validated 9 of them via reverse transcriptase-polymerase chain reaction and Sanger sequencing. Analyses of the 184 U2/U12-like non-canonical splice sites indicate that 51% of them are not annotated in GENCODE. In addition, 28% of them are conserved in mouse and 76% are involved in alternative splicing events, some of them with tissue-specific alternative splicing patterns. Interestingly, our analysis identified some U2/U12-like non-canonical splice sites that are converted into canonical splice sites by RNA A-to-I editing. Moreover, the U2/U12-like non-canonical splice sites have a differential distribution of splicing regulatory sequences, which may contribute to their recognition and regulation...

The light-induced transcriptome of the zebrafish pineal gland reveals complex regulation of the circadian clockwork by light
Ben-Moshe, Z; Alon, S; Mracek, P; Faigenbloom, L; Tovin, A; Vatine, GD; Eisenberg, E; Foulkes, NS; Gothilf, Y
Nucleic Acids Res. 2014, 42, 3750-3767
Free Full Text
Light constitutes a primary signal whereby endogenous circadian clocks are synchronized ('entrained') with the day/night cycle. The molecular mechanisms underlying this vital process are known to require gene activation, yet are incompletely understood. Here, the light-induced transcriptome in the zebrafish central clock organ, the pineal gland, was characterized by messenger RNA (mRNA) sequencing (mRNA-seq) and microarray analyses, resulting in the identification of multiple light-induced mRNAs. Interestingly, a considerable portion of the molecular clock (14 genes) is light-induced in the pineal gland. Four of these genes, encoding the transcription factors dec1, reverbb1, e4bp4-5 and e4bp4-6, differentially affected clock- and light-regulated promoter activation, suggesting that light-input is conveyed to the core clock machinery via diverse mechanisms. Moreover, we show that dec1, as well as the core clock gene per2, is essential for light-entrainment of rhythmic locomotor activity in zebrafish larvae...

Sensitive, multiplex and direct quantification of RNA sequences using a modified RASL assay
Larman, HB; Scott, ER; Wogan, M; Oliveira, G; Torkamani, A; Schultz, PG
Nucleic Acids Res. 2014, 42, 9146-9157
Free Full Text
A sensitive and highly multiplex method to directly measure RNA sequence abundance without requiring reverse transcription would be of value for a number of biomedical applications, including high throughput small molecule screening, pathogen transcript detection and quantification of short/degraded RNAs. RNA Annealing, Selection and Ligation (RASL) assays, which are based on RNA template-dependent oligonucleotide probe ligation, have been developed to meet this need, but technical limitations have impeded their adoption. Whereas DNA ligase-based RASL assays suffer from extremely low and sequence-dependent ligation efficiencies that compromise assay robustness, Rnl2 can join a fully DNA donor probe to a 3'-diribonucleotide-terminated acceptor probe with high efficiency on an RNA template strand. Rnl2-based RASL exhibits sub-femtomolar transcript detection sensitivity, and permits the rational tuning of probe signals for optimal analysis by massively parallel DNA sequencing (RASL-seq). A streamlined Rnl2-based RASL-seq protocol was assessed in a small molecule screen using 77 probe sets designed to monitor complex human B cell phenotypes...

Transcriptional landscape and essential genes of Neisseria gonorrhoeae
Remmele, CW; Xian, YB; Albrecht, M; Faulstich, M; Fraunholz, M; Heinrichs, E; Dittrich, MT; Muller, T; Reinhardt, R; Rudel, T
Nucleic Acids Res. 2014, 42, 10579-10595
Free Full Text
The WHO has recently classified Neisseria gonorrhoeae as a super-bacterium due to the rapid spread of antibiotic-resistant derivatives and an overall dramatic increase in infection incidence. Genome sequencing has identified potential genes; however, little is known about the transcriptional organization and the presence of non-coding RNAs in gonococci. We performed RNA sequencing to define the transcriptome and the transcriptional start sites of all gonococcal genes and operons. Numerous new transcripts, including 253 potentially non-coding RNAs transcribed from intergenic regions or antisense to coding genes, were identified. Strikingly, strong antisense transcription, which may have regulatory functions, was detected for the phase-variable opa genes coding for a family of adhesins and invasins in pathogenic Neisseria. Based on the defined transcriptional start sites, promoter motifs were identified. We further generated and sequenced a high-density Tn5 transposon library to predict a core of 827 gonococcal essential genes, 133 of which have no known function...
