December 2013

The methylomes of six bacteria
Murray, IA; Clark, TA; Morgan, RD; Boitano, M; Anton, BP; Luong, K; Fomenkov, A; Turner, SW; Korlach, J; Roberts, RJ
Nucleic Acids Res. (2012) 40 (22): 11450-11462
Six bacterial genomes, Geobacter metallireducens GS-15, Chromohalobacter salexigens, Vibrio breoganii 1C-10, Bacillus cereus ATCC 10987, Campylobacter jejuni subsp. jejuni 81-176 and C. jejuni NCTC 11168, all of which had previously been sequenced using other platforms, were re-sequenced using single-molecule, real-time (SMRT) sequencing specifically to analyze their methylomes. In every case a number of new N6-methyladenine (m6A) and N4-methylcytosine (m4C) methylation patterns were discovered and the DNA methyltransferases (MTases) responsible for those methylation patterns were assigned. In 15 cases, it was possible to match MTase genes with MTase recognition sequences without further sub-cloning. Two Type I restriction systems required sub-cloning to differentiate their recognition sequences, while four MTase genes that were not expressed in the native organism were sub-cloned to test for viability and recognition sequences. Two of these proved active. No attempt was made to detect 5-methylcytosine (m5C) recognition motifs from the SMRT® sequencing data because this modification produces weaker signals using current methods...

Cell type-specific genomics of Drosophila neurons
Henry, GL; Davis, FP; Picard, S; Eddy, SR
Nucleic Acids Res. (2012) 40 (19): 9691-9704
Many tools are available to analyse genomes but are often challenging to use in a cell type-specific context. We have developed a method similar to the isolation of nuclei tagged in a specific cell type (INTACT) technique [Deal,R.B. and Henikoff,S. (2010) A simple method for gene expression and chromatin profiling of individual cell types within a tissue. Dev. Cell, 18, 1030-1040; Steiner,F.A., Talbert,P.B., Kasinathan,S., Deal,R.B. and Henikoff,S. (2012) Cell-type-specific nuclei purification from whole animals for genome-wide expression and chromatin profiling. Genome Res., doi:10.1101/gr.131748.111], first developed in plants, for use in Drosophila neurons. We profile gene expression and histone modifications in Kenyon cells and octopaminergic neurons in the adult brain. In addition to recovering known gene expression differences, we also observe significant cell type-specific chromatin modifications. In particular, a small subset of differentially expressed genes exhibits a striking anti-correlation between repressive and activating histone modifications. These genes are enriched for transcription factors, recovering those known to regulate mushroom body identity and predicting analogous regulators of octopaminergic neurons...

Insights into the evolution of Archaea and eukaryotic protein modifier systems revealed by the genome of a novel archaeal group
Nunoura, T; Takaki, Y; Kakuta, J; Nishi, S; Sugahara, J; Kazama, H; Chee, GJ; Hattori, M; Kanai, A; Atomi, H; Takai, K; Takami, H
Nucleic Acids Res. (2011) 39 (8): 3204-3223
The domain Archaea has historically been divided into two phyla, the Crenarchaeota and Euryarchaeota. Although regarded as members of the Crenarchaeota based on small subunit rRNA phylogeny, environmental genomics and efforts for cultivation have recently revealed two novel phyla/divisions in the Archaea: the 'Thaumarchaeota' and 'Korarchaeota'. Here, we show the genome sequence of Candidatus 'Caldiarchaeum subterraneum' that represents an uncultivated crenarchaeotic group. A composite genome was reconstructed from a metagenomic library previously prepared from a microbial mat at a geothermal water stream of a sub-surface gold mine. The genome was found to be clearly distinct from those of the known phyla/divisions, Crenarchaeota (hyperthermophiles), Euryarchaeota, Thaumarchaeota and Korarchaeota. The unique traits suggest that this crenarchaeotic group can be considered as a novel archaeal phylum/division. Moreover, C. subterraneum harbors a ubiquitin-like protein modifier system consisting of Ub, E1, E2 and small Zn RING finger family protein...

A comparison of RNA-Seq and high-density exon array for detecting differential gene expression between closely related species
Liu, S; Lin, L; Jiang, P; Wang, D; Xing, Y
Nucleic Acids Res. (2011) 39 (2): 578-588
RNA-Seq has emerged as a revolutionary technology for transcriptome analysis. In this article, we report a systematic comparison of RNA-Seq and high-density exon array for detecting differential gene expression between closely related species. On a panel of human/chimpanzee/rhesus cerebellum RNA samples previously examined by the high-density human exon junction array (HJAY) and real-time qPCR, we generated 48.68 million RNA-Seq reads. Our results indicate that RNA-Seq has significantly improved gene coverage and increased sensitivity for differentially expressed genes compared with the high-density HJAY array. Meanwhile, we observed a systematic increase in the RNA-Seq error rate for lowly expressed genes. Specifically, between-species DEGs detected by array/qPCR but missed by RNA-Seq were characterized by relatively low expression levels, as indicated by lower RNA-Seq read counts, lower HJAY array expression indices and higher qPCR raw cycle threshold values. Furthermore, this issue was not unique to between-species comparisons of gene expression...

Rapid incorporation kinetics and improved fidelity of a novel class of 3'-OH unblocked reversible terminators
Gardner, AF; Wang, JC; Wu, WD; Karouby, J; Li, H; Stupi, BP; Jack, WE; Hersh, MN; Metzker, ML
Nucleic Acids Res. (2012) 40 (15): 7404-7415
Recent developments of unique nucleotide probes have expanded our understanding of DNA polymerase function, providing many benefits to techniques involving next-generation sequencing (NGS) technologies. The cyclic reversible termination (CRT) method depends on efficient base-selective incorporation of reversible terminators by DNA polymerases. Most terminators are designed with 3'-O-blocking groups but are incorporated with low efficiency and fidelity. We have developed a novel class of 3'-OH unblocked nucleotides, called Lightning Terminators™, which have a terminating 2-nitrobenzyl moiety attached to hydroxymethylated nucleobases. A key structural feature of this photocleavable group displays a 'molecular tuning' effect with respect to single-base termination and improved nucleotide fidelity. Using Therminator™ DNA polymerase, we demonstrate that these 3'-OH unblocked terminators exhibit superior enzymatic performance compared to two other reversible terminators, 3'-O-amino-TTP and 3'-O-azidomethyl-TTP...

Sequence and expression analysis of gaps in human chromosome 20
Minocherhomji, S; Seemann, S; Mang, Y; El-schich, Z; Bak, M; Hansen, C; Papadopoulos, N; Josefsen, K; Nielsen, H; Gorodkin, J; Tommerup, N; Silahtaroglu, A
Nucleic Acids Res. (2012) 40 (14): 6660-6672
The finished human genome assemblies comprise several hundred un-sequenced euchromatic gaps, which may be rich in long polypurine/polypyrimidine stretches. Human chromosome 20 (chr 20) currently has three unfinished gaps remaining on its q-arm. All three gaps are within gene-dense regions and/or overlap disease-associated loci, including the DLGAP4 locus. In this study, we sequenced ~99% of all three unfinished gaps on human chr 20, determined their complete genomic sizes and assessed epigenetic profiles using a combination of Sanger sequencing, mate pair paired-end high-throughput sequencing and chromatin, methylation and expression analyses. We found histone 3 trimethylated at Lysine 27 to be distributed across all three gaps in immortalized B-lymphocytes. In one gap, five novel CpG islands were predominantly hypermethylated in genomic DNA from peripheral blood lymphocytes and human cerebellum. One of these CpG islands was differentially methylated and paternally hypermethylated. We found all chr 20 gaps to comprise structured non-coding RNAs (ncRNAs) and to be conserved in primates...

Genome-wide transcriptome analysis of the plant pathogen Xanthomonas identifies sRNAs with putative virulence functions
Schmidtke, C; Findeiss, S; Sharma, CM; Kuhfuss, J; Hoffmann, S; Vogel, J; Stadler, PF; Bonas, U
Nucleic Acids Res. (2012) 40 (5): 2020-2031
The Gram-negative plant-pathogenic bacterium Xanthomonas campestris pv. vesicatoria (Xcv) is an important model to elucidate the mechanisms involved in the interaction with the host. To gain insight into the transcriptome of the Xcv strain 85-10, we took a differential RNA sequencing (dRNA-seq) approach. Using a novel method to automatically generate comprehensive transcription start site (TSS) maps we report 1421 putative TSSs in the Xcv genome. Genes in Xcv exhibit a poorly conserved -10 promoter element and no consensus Shine-Dalgarno sequence. Moreover, 14% of all mRNAs are leaderless and 13% of them have unusually long 5'-UTRs. Northern blot analyses confirmed 16 intergenic small RNAs and seven cis-encoded antisense RNAs in Xcv. Expression of eight intergenic transcripts was controlled by HrpG and HrpX, key regulators of the Xcv type III secretion system. More detailed characterization identified sX12 as a small RNA that controls virulence of Xcv by affecting the interaction of the pathogen and its host plants...

Identification of novel NRF2-regulated genes by ChIP-Seq: influence on retinoid X receptor alpha
Chorley, BN; Campbell, MR; Wang, XT; Karaca, M; Sambandan, D; Bangura, F; Xue, P; Pi, JB; Kleeberger, SR; Bell, DA
Nucleic Acids Res. (2012) 40 (15): 7416-7429
Cellular oxidative and electrophilic stress triggers a protective response in mammals regulated by NRF2 (nuclear factor (erythroid-derived) 2-like; NFE2L2) binding to deoxyribonucleic acid-regulatory sequences near stress-responsive genes. Studies using Nrf2-deficient mice suggest that hundreds of genes may be regulated by NRF2. To identify human NRF2-regulated genes, we conducted chromatin immunoprecipitation (ChIP)-sequencing experiments in lymphoid cells treated with the dietary isothiocyanate, sulforaphane (SFN) and carried out follow-up biological experiments on candidates. We found 242 high-confidence NRF2-bound genomic regions and 96% of these regions contained NRF2-regulatory sequence motifs. The majority of binding sites were near potential novel members of the NRF2 pathway. Validation of selected candidate genes using parallel ChIP techniques and in NRF2-silenced cell lines indicated that the expression of about two-thirds of the candidates is likely to be directly NRF2-dependent, including retinoid X receptor alpha (RXRA). NRF2 regulation of RXRA has implications for response to retinoid treatments...

Genome-wide Runx2 occupancy in prostate cancer cells suggests a role in regulating secretion
Little, GH; Noushmehr, H; Baniwal, SK; Berman, BP; Coetzee, GA; Frenkel, B
Nucleic Acids Res. (2012) 40 (8): 3538-3547
Runx2 is a metastatic transcription factor (TF) increasingly expressed during prostate cancer (PCa) progression. Using PCa cells conditionally expressing Runx2, we previously identified Runx2-regulated genes with known roles in epithelial-mesenchymal transition, invasiveness, angiogenesis, extracellular matrix proteolysis and osteolysis. To map Runx2-occupied regions (R2ORs) in PCa cells, we first analyzed regions predicted to bind Runx2 based on the expression data, and found that recruitment to sites upstream of the KLK2 and CSF2 genes was cyclical over time. Genome-wide ChIP-seq analysis at a time of maximum occupancy at these sites revealed 1603 high-confidence R2ORs, enriched with cognate motifs for RUNX, GATA and ETS TFs. The R2ORs were distributed with little regard to annotated transcription start sites (TSSs), mainly in introns and intergenic regions. Runx2-upregulated genes, however, displayed enrichment for R2ORs within 40 kb of their TSSs. The main annotated functions enriched in 98 Runx2-upregulated genes with nearby R2ORs were related to invasiveness and membrane trafficking/secretion...

Analysis of genomic variation in non-coding elements using population-scale sequencing data from the 1000 Genomes Project
Mu, XJ; Lu, ZJ; Kong, Y; Lam, HYK; Gerstein, MB
Nucleic Acids Res. (2011) 39 (16): 7058-7076
In the human genome, it has been estimated that considerably more sequence is under natural selection in non-coding regions [such as transcription-factor binding sites (TF-binding sites) and non-coding RNAs (ncRNAs)] compared to protein-coding ones. However, less attention has been paid to them. To study selective pressure on non-coding elements, we use next-generation sequencing data from the recently completed pilot phase of the 1000 Genomes Project, which, compared to traditional methods, allows for the characterization of a full spectrum of genomic variations, including single-nucleotide polymorphisms (SNPs), short insertions and deletions (indels) and structural variations (SVs). We develop a framework for combining these variation data with non-coding elements, calculating various population-based metrics to compare classes and subclasses of elements, and developing element-aware aggregation procedures to probe the internal structure of an element. Overall, we find that TF-binding sites and ncRNAs are less selectively constrained for SNPs than coding sequences (CDSs), but more constrained than a neutral reference...

