-
1.
Comparison of methods for quantification of global DNA methylation in human cells and tissues.
Lisanti, S, Omar, WA, Tomaszewski, B, De Prins, S, Jacobs, G, Koppen, G, Mathers, JC, Langie, SA
PloS one. 2013;(11):e79044
Abstract
DNA methylation is a key epigenetic modification which, in mammals, occurs mainly at CpG dinucleotides. Most of the CpG methylation in the genome is found in repetitive regions, rich in dormant transposons and endogenous retroviruses. Global DNA hypomethylation, which is a common feature of several conditions such as ageing and cancer, can cause the undesirable activation of dormant repeat elements and lead to altered expression of associated genes. DNA hypomethylation can cause genomic instability and may contribute to mutations and chromosomal recombinations. Various approaches for quantification of global DNA methylation are widely used. Several of these approaches measure a surrogate for total genomic methyl cytosine and there is uncertainty about the comparability of these methods. Here we have applied 3 different approaches (luminometric methylation assay, pyrosequencing of the methylation status of the Alu repeat element and of the LINE1 repeat element) for estimating global DNA methylation in the same human cell and tissue samples and have compared these estimates with the "gold standard" of methyl cytosine quantification by HPLC. Next to HPLC, the LINE1 approach shows the smallest variation between samples, followed by Alu. Pearson correlations and Bland-Altman analyses confirmed that global DNA methylation estimates obtained via the LINE1 approach corresponded best with HPLC-based measurements. Although we did not find compelling evidence that the gold standard measurement by HPLC could be substituted with confidence by any of the surrogate assays for detecting global DNA methylation investigated here, the LINE1 assay seems likely to be an acceptable surrogate in many cases.
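The two agreement statistics named in this abstract, Pearson correlation and Bland-Altman analysis, can be illustrated with a minimal sketch. The function names and the toy values are hypothetical and are not taken from the paper; the 1.96 multiplier is the conventional choice for 95% limits of agreement, assuming roughly normal differences.

```python
from statistics import mean, stdev


def bland_altman(reference, surrogate):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of (surrogate - reference); the limits are
    bias +/- 1.96 * sample SD of the differences.
    """
    if len(reference) != len(surrogate) or len(reference) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    diffs = [s - r for r, s in zip(reference, surrogate)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# Toy values for illustration only (not data from the study).
hplc_like = [4.1, 3.8, 4.4, 3.9, 4.0]
surrogate_like = [4.3, 3.9, 4.8, 4.0, 4.3]
print(bland_altman(hplc_like, surrogate_like))
print(pearson_r(hplc_like, surrogate_like))
```

A high correlation alone does not show that two assays agree, which is why the limits of agreement are reported alongside it: a surrogate can track the reference closely while being systematically offset from it.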
-
2.
Coverage-based consensus calling (CbCC) of short sequence reads and comparison of CbCC results to identify SNPs in chickpea (Cicer arietinum; Fabaceae), a crop species without a reference genome.
Azam, S, Thakur, V, Ruperao, P, Shah, T, Balaji, J, Amindala, B, Farmer, AD, Studholme, DJ, May, GD, Edwards, D, et al
American journal of botany. 2012;(2):186-92
Abstract
PREMISE OF THE STUDY: Next-generation sequencing (NGS) technologies are frequently used for resequencing and mining of single nucleotide polymorphisms (SNPs) by comparison to a reference genome. In crop species such as chickpea (Cicer arietinum) that lack a reference genome sequence, NGS-based SNP discovery is a challenge. Therefore, unlike probability-based statistical approaches for consensus calling and by comparison with a reference sequence, a coverage-based consensus calling (CbCC) approach was applied and two genotypes were compared for SNP identification. METHODS: A CbCC approach is used in this study with four commonly used short read alignment tools (Maq, Bowtie, Novoalign, and SOAP2) and 15.7 and 22.1 million Illumina reads for chickpea genotypes ICC4958 and ICC1882, together with the chickpea transcriptome assembly (CaTA). KEY RESULTS: A nonredundant set of 4543 SNPs was identified between two chickpea genotypes. Experimental validation of 224 randomly selected SNPs showed superiority of Maq among individual tools, as 50.0% of SNPs predicted by Maq were true SNPs. For combinations of two tools, greatest accuracy (55.7%) was reported for Maq and Bowtie, with a combination of Bowtie, Maq, and Novoalign identifying 61.5% true SNPs. SNP prediction accuracy generally increased with increasing read depth. CONCLUSIONS: This study provides a benchmark comparison of tools as well as read depths for four commonly used tools for NGS SNP discovery in a crop species without a reference genome sequence. In addition, a large number of SNPs have been identified in chickpea that would be useful for molecular breeding.
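The abstract does not spell out the CbCC procedure, so the following is only one possible reading of "coverage-based consensus calling": call the majority base at each position when enough reads cover it, then report positions where the two genotypes' consensus bases differ. The pileup representation, the function names, and the `min_depth` and `min_fraction` thresholds are all assumptions made for illustration.

```python
from collections import Counter


def consensus(pileup, min_depth=5, min_fraction=0.8):
    """Call a consensus base per position from read coverage.

    pileup maps a position to the string of bases observed there.
    Positions with too few reads, or without a clear majority base,
    are left uncalled. Both thresholds are hypothetical.
    """
    calls = {}
    for pos, bases in pileup.items():
        if len(bases) < min_depth:
            continue
        base, count = Counter(bases).most_common(1)[0]
        if count / len(bases) >= min_fraction:
            calls[pos] = base
    return calls


def candidate_snps(pileup_a, pileup_b, **thresholds):
    """Positions called in both genotypes whose consensus bases differ."""
    a = consensus(pileup_a, **thresholds)
    b = consensus(pileup_b, **thresholds)
    shared = sorted(a.keys() & b.keys())
    return {pos: (a[pos], b[pos]) for pos in shared if a[pos] != b[pos]}


# Toy pileups for two genotypes (illustrative only).
genotype_a = {1: "AAAAA", 2: "CCCCC", 3: "GG"}
genotype_b = {1: "AAAAA", 2: "TTTTT", 3: "TT"}
print(candidate_snps(genotype_a, genotype_b))
```

Position 3 is skipped because its depth falls below the threshold, which mirrors the abstract's observation that prediction accuracy generally improved with read depth.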
-
3.
Comparison of bisulfite sequencing PCR with pyrosequencing for measuring differences in DNA methylation.
Reed, K, Poulin, ML, Yan, L, Parissenti, AM
Analytical biochemistry. 2010;(1):96-106
Abstract
DNA methylation strongly affects chromatin structure and the regulation of gene expression. For many years, bisulfite sequencing PCR (BSP) has served as the "gold standard" for measuring DNA methylation. However, with the evolution of pyrosequencing as a tool to evaluate DNA methylation, the need arises to compare the relative efficiencies of the two techniques in measuring DNA methylation. We provide for the first time a direct assessment of BSP and pyrosequencing to detect and quantify hypomethylation, hypermethylation, and mixed methylation of the ABCB1 promoter in various drug-sensitive and drug-resistant MCF-7 breast cancer cell lines through head-to-head experimentation. Our findings indicate that although both methods can reliably detect increased, decreased, and mixed methylation of DNA, BSP appears to be more sensitive than pyrosequencing at detecting strong hypermethylation of DNA. However, we also observed greater variability in the methylation of CpG sites by BSP, possibly due to the additional bacterial cloning step required by BSP over pyrosequencing. BSP and pyrosequencing equally detected hypomethylation and mixed methylation of DNA. The ability of pyrosequencing to reliably detect differences in DNA methylation across cell populations without requiring the cloning of bisulfite-treated DNA into bacterial expression vectors was seen as a major advantage of this technique.
-
4.
Genomic convergence analysis of schizophrenia: mRNA sequencing reveals altered synaptic vesicular transport in post-mortem cerebellum.
Mudge, J, Miller, NA, Khrebtukova, I, Lindquist, IE, May, GD, Huntley, JJ, Luo, S, Zhang, L, van Velkinburgh, JC, Farmer, AD, et al
PloS one. 2008;(11):e3625
Abstract
Schizophrenia (SCZ) is a common, disabling mental illness with high heritability but complex, poorly understood genetic etiology. As the first phase of a genomic convergence analysis of SCZ, we generated 16.7 billion nucleotides of short read, shotgun sequences of cDNA from post-mortem cerebellar cortices of 14 patients and six matched controls. A rigorous analysis pipeline was developed for analysis of digital gene expression studies. Sequences aligned to approximately 33,200 transcripts in each sample, with average coverage of 450 reads per gene. Following adjustments for confounding clinical, sample and experimental sources of variation, 215 genes differed significantly in expression between cases and controls. Golgi apparatus, vesicular transport, membrane association, zinc binding and regulation of transcription were over-represented among differentially expressed genes. Twenty-three genes with altered expression and involvement in presynaptic vesicular transport, Golgi function and GABAergic neurotransmission define a unifying molecular hypothesis for dysfunction in cerebellar cortex in SCZ.
-
5.
Comparative analysis of regulatory motif discovery tools for transcription factor binding sites.
Wei, W, Yu, XD
Genomics, proteomics & bioinformatics. 2007;(2):131-42
Abstract
In the post-genomic era, identification of specific regulatory motifs or transcription factor binding sites (TFBSs) in non-coding DNA sequences, which is essential to elucidate transcriptional regulatory networks, has emerged as an obstacle that frustrates many researchers. Consequently, numerous motif discovery tools and correlated databases have been applied to solving this problem. However, these existing methods, based on different computational algorithms, show diverse motif prediction efficiency in non-coding DNA sequences. Therefore, understanding the similarities and differences of computational algorithms and enriching the motif discovery literature are important for users to choose the most appropriate one among the online available tools. Moreover, credible criteria for assessing motif discovery tools, and instructions to help researchers choose the best tool for their own projects, are still lacking. Thus integration of the related resources might be a good approach to improve accuracy of the application. Recent studies integrate regulatory motif discovery tools with experimental methods to offer a complementary approach for researchers, and also provide a much-needed model for current research on transcriptional regulatory networks. Here we present a comparative analysis of regulatory motif discovery tools for TFBSs.
-
6.
Optimization of quantitative MGMT promoter methylation analysis using pyrosequencing and combined bisulfite restriction analysis.
Mikeska, T, Bock, C, El-Maarri, O, Hübner, A, Ehrentraut, D, Schramm, J, Felsberg, J, Kahl, P, Büttner, R, Pietsch, T, et al
The Journal of molecular diagnostics : JMD. 2007;(3):368-81
Abstract
Resistance to chemotherapy is a major complication during treatment of cancer patients. Hypermethylation of the MGMT gene alters DNA repair and is associated with longer survival of glioblastoma patients treated with alkylating agents. Therefore, MGMT promoter methylation plays an important role as a predictive biomarker for chemotherapy resistance. To adopt this established correlation into a molecular diagnosis procedure, we compared and optimized three experimental techniques [combined bisulfite restriction analysis, a primer extension- and denaturing high-performance liquid chromatography-based method named SIRPH (SNuPE ion pair-reverse phase high-performance liquid chromatography), and pyrosequencing] with regard to their accuracy of detecting MGMT promoter methylation. Initially, bisulfite sequencing was used to obtain a comprehensive methylation profile of the MGMT promoter region in 22 glioblastoma samples and in three normal brain controls. Next, we statistically identified CpG sites that best discriminate between methylated and unmethylated MGMT promoters. These results were then used to design optimal combined bisulfite restriction analysis, SIRPH, and pyrosequencing assays for accurate and cost-efficient assessment of MGMT promoter methylation. We compared all three techniques with regard to their reliability and reproducibility on well-characterized tumor samples. The optimized pyrosequencing assay performed best and provides a sensitive, robust, and easy-to-use method for quantitative assessment of MGMT methylation, for both snap-frozen and paraffin-embedded specimens.
-
7.
BioOptimizer: a Bayesian scoring function approach to motif discovery.
Jensen, ST, Liu, JS
Bioinformatics (Oxford, England). 2004;(10):1557-64
Abstract
MOTIVATION: Transcription factors (TFs) bind directly to short segments on the genome, often within hundreds to thousands of base pairs upstream of gene transcription start sites, to regulate gene expression. The experimental determination of TF binding sites is expensive and time-consuming. Many motif-finding programs have been developed, but no program is clearly superior in all situations. Practitioners often find it difficult to judge which of the motifs predicted by these algorithms are more likely to be biologically relevant. RESULTS: We derive a comprehensive scoring function based on a full Bayesian model that can handle unknown site abundance, unknown motif width and two-block motifs with variable-length gaps. An algorithm called BioOptimizer is proposed to optimize this scoring function so as to reduce noise in the motif signal found by any motif-finding program. The accuracy of BioOptimizer, which can be used in conjunction with several existing programs, is shown to be superior to using any of these motif-finding programs alone when evaluated by both simulation studies and application to sets of co-regulated genes in bacteria. In addition, this scoring function formulation enables us to compare objectively different predicted motifs and select the optimal ones, effectively combining the strengths of existing programs. AVAILABILITY: BioOptimizer is available for download at www.fas.harvard.edu/~junliu/BioOptimizer/
-
8.
Background rareness-based iterative multiple sequence alignment algorithm for regulatory element detection.
Narasimhan, C, LoCascio, P, Uberbacher, E
Bioinformatics (Oxford, England). 2003;(15):1952-63
Abstract
MOTIVATION: Experimental methods capable of generating sets of co-regulated genes have become commonplace; however, recognizing the regulatory motifs responsible for this regulation remains difficult. As a result, computational detection of transcription factor binding sites in such data sets has been an active area of research. Most approaches have utilized either Gibbs sampling or greedy strategies to identify such elements in sets of sequences. These existing methods have varying degrees of success depending on the strength and length of the signals and the number of available sequences. We present a new deterministic iterative algorithm for regulatory element detection based on a Markov chain background. As in other methods, sequences in the entire genome and the training set are taken into account in order to discriminate against commonly occurring signals and produce patterns that are significant in the training set. RESULTS: The results of the algorithm compare favorably with existing tools on previously known and newly compiled data sets. The iteration-based search appears rather rigorous, not only finding the binding sites, but also showing how the binding site stands out from genomic background. The approach used to score the results is critical and a discussion of various scoring schemes and options is also presented. Benchmarking of several methods shows that while most tools are good at detecting strong signals, Gibbs sampling algorithms give inconsistent results when the regulatory element signal becomes weak. A Markov chain-based background model alleviates the drawbacks of MAP (maximum a posteriori log likelihood) scores. AVAILABILITY: Available on request from the authors. SUPPLEMENTARY INFORMATION: Data and the results presented in this paper are available on the web at http://compbio.ornl.gov/mira/index.html
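The role of a Markov chain background can be illustrated with a generic sketch: estimate first-order transition probabilities from background sequence, then score how probable a candidate word is under that background, so that rare words stand out. This is not the authors' iterative algorithm; the function names and the pseudocount are hypothetical, and input is assumed to contain only the letters A, C, G and T.

```python
import math

ALPHABET = "ACGT"


def train_background(sequences, pseudocount=1.0):
    """Estimate a first-order Markov background model.

    Returns (start_p, trans_p): single-base frequencies and
    transition probabilities P(next | previous), both smoothed
    with a pseudocount so unseen events keep nonzero probability.
    """
    start = {a: pseudocount for a in ALPHABET}
    trans = {a: {b: pseudocount for b in ALPHABET} for a in ALPHABET}
    for seq in sequences:
        for base in seq:
            start[base] += 1
        for prev, nxt in zip(seq, seq[1:]):
            trans[prev][nxt] += 1
    total = sum(start.values())
    start_p = {a: start[a] / total for a in ALPHABET}
    trans_p = {
        a: {b: trans[a][b] / sum(trans[a].values()) for b in ALPHABET}
        for a in ALPHABET
    }
    return start_p, trans_p


def background_log_prob(word, model):
    """Log-probability of a word under the background model."""
    start_p, trans_p = model
    logp = math.log(start_p[word[0]])
    for prev, nxt in zip(word, word[1:]):
        logp += math.log(trans_p[prev][nxt])
    return logp


# Toy background (illustrative only): AC repeats dominate.
model = train_background(["ACACACAC"])
print(background_log_prob("ACAC", model))  # common under this background
print(background_log_prob("GTGT", model))  # rare under this background
```

A word with a low background log-probability that nevertheless recurs across a set of co-regulated sequences is a stronger motif candidate than one the background produces frequently by chance.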
-
9.
DNA polymorphism: a comparison of force fields for nucleic acids.
Reddy, SY, Leclerc, F, Karplus, M
Biophysical journal. 2003;(3):1421-49
Abstract
Improvements in force fields and more accurate treatment of long-range interactions are providing more reliable molecular dynamics simulations of nucleic acids. The abilities of certain nucleic acid force fields to represent the structural and conformational properties of nucleic acids in solution are compared. The force fields are AMBER 4.1, BMS, CHARMM22, and CHARMM27; the comparison of the latter two is the primary focus of this paper. The performance of each force field is evaluated first on its ability to reproduce the B-DNA decamer d(CGATTAATCG)₂ in solution with simulations in which the long-range electrostatics were treated by the particle mesh Ewald method; the crystal structure determined by Quintana et al. (1992) is used as the starting point for all simulations. A detailed analysis of the structural and solvation properties shows how well the different force fields can reproduce sequence-specific features. The results are compared with data from experimental and previous theoretical studies.
-
10.
A powerful non-homology method for the prediction of operons in prokaryotes.
Moreno-Hagelsieb, G, Collado-Vides, J
Bioinformatics (Oxford, England). 2002;:S329-36
Abstract
MOTIVATION: The prediction of the transcription unit organization of genomes is an important clue in the inference of functional relationships of genes, the interpretation and evaluation of transcriptome experiments, and the overall inference of the regulatory networks governing the expression of genes in response to the environment. Though several methods have been devised to predict operons, most require the analysed genome to be well characterized. Log-likelihoods derived from inter-genic distance distributions work surprisingly well to predict operons in Escherichia coli and are available for any genome as soon as the gene sets are predicted. RESULTS: Here we provide evidence that the very same method is applicable to any prokaryotic genome. First, the method has the same efficiency when evaluated using a collection of experimentally known operons of Bacillus subtilis. Second, operons among most if not all prokaryotes seem to have the same tendencies to keep short distances between their genes, the most frequent distances being the overlaps of four and one base pairs. The universality of this structural feature allows us to predict the organization of transcription units in all prokaryotes. Third, predicted operons contain a higher proportion of genes with related phylogenetic profiles and conservation of adjacency than predicted borders of transcription units.
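A log-likelihood derived from inter-genic distance distributions can be sketched as the log ratio of how often a distance occurs between adjacent genes in the same operon versus at transcription unit borders. The 10 bp bin width, the pseudocount, and the function names below are assumptions for illustration; the abstract does not give these details. Negative distances represent overlapping genes.

```python
import math
from collections import Counter


def distance_bin(distance, width=10):
    """Map an inter-genic distance (bp, negative = overlap) to a bin index."""
    return math.floor(distance / width)


def train_log_likelihoods(operon_distances, border_distances,
                          width=10, pseudocount=1.0):
    """Per-bin log ratio: P(bin | same operon) / P(bin | border)."""
    op = Counter(distance_bin(d, width) for d in operon_distances)
    bd = Counter(distance_bin(d, width) for d in border_distances)
    bins = set(op) | set(bd)
    n_op = len(operon_distances) + pseudocount * len(bins)
    n_bd = len(border_distances) + pseudocount * len(bins)
    return {
        b: math.log(((op[b] + pseudocount) / n_op)
                    / ((bd[b] + pseudocount) / n_bd))
        for b in bins
    }


def same_operon_score(distance, table, width=10):
    """Positive favours 'same operon'; unseen bins score a neutral 0.0."""
    return table.get(distance_bin(distance, width), 0.0)


# Toy training distances (illustrative only, not from the paper).
within_operon = [-4, -1, 5, 12, 8]
at_borders = [150, 220, 95, 300, 180]
table = train_log_likelihoods(within_operon, at_borders)
print(same_operon_score(-4, table), same_operon_score(150, table))
```

Because the score needs only gene coordinates, it can be computed for any genome once genes are predicted, which is the property the abstract highlights.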