-
1.
Protein Remote Homology Detection and Fold Recognition Based on Sequence-Order Frequency Matrix.
Liu, B, Chen, J, Guo, M, Wang, X
IEEE/ACM transactions on computational biology and bioinformatics. 2019;(1):292-300
Abstract
Protein remote homology detection and fold recognition are two critical tasks for the studies of protein structures and functions. Currently, the profile-based methods achieve the state-of-the-art performance in these fields. However, the widely used sequence profiles, like position-specific frequency matrix (PSFM) and position-specific scoring matrix (PSSM), ignore the sequence-order effects along protein sequence. In this study, we have proposed a novel profile, called sequence-order frequency matrix (SOFM), to extract the sequence-order information of neighboring residues from multiple sequence alignment (MSA). Combined with two profile feature extraction approaches, top-n-grams and the Smith-Waterman algorithm, the SOFMs are applied to protein remote homology detection and fold recognition, and two predictors called SOFM-Top and SOFM-SW are proposed. Experimental results show that SOFM contains more information content than other profiles, and these two predictors outperform other state-of-the-art methods. It is anticipated that SOFM will become a very useful profile in the studies of protein structures and functions.
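For readers unfamiliar with the profiles named in this abstract, the following is a minimal sketch of how a position-specific frequency matrix (PSFM) can be computed from a multiple sequence alignment. It illustrates only the conventional PSFM. The abstract does not define SOFM, so it is not reproduced here. The function name `psfm`, the gap handling (gaps and non-standard symbols are skipped), and the toy alignment are assumptions made for illustration.

```python
from collections import Counter

# The 20 standard amino acids; any other symbol (e.g. '-' for a gap) is skipped.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def psfm(msa):
    """Return one {amino_acid: frequency} dict per alignment column.

    `msa` is a list of equal-length aligned sequences (hypothetical input format).
    Frequencies in a column are normalised over the non-gap residues only.
    """
    if not msa:
        raise ValueError("alignment is empty")
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("aligned sequences must have equal length")

    matrix = []
    for col in range(length):
        counts = Counter(seq[col] for seq in msa if seq[col] in AMINO_ACIDS)
        total = sum(counts.values())
        matrix.append(
            {aa: (counts[aa] / total if total else 0.0) for aa in AMINO_ACIDS}
        )
    return matrix


# Toy alignment of three sequences (illustrative data, not from the paper).
toy_msa = ["ACD", "ACE", "A-E"]
toy_profile = psfm(toy_msa)
```

Each column is treated independently of its neighbours, which is the limitation the abstract attributes to PSFM and PSSM.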
-
2.
Predicting the functional and structural consequences of nsSNPs in human methionine synthase gene using computational tools.
Desai, M, Chauhan, JB
Systems biology in reproductive medicine. 2019;(4):288-300
Abstract
Methionine synthase, encoded by the MTR gene, is one of the key enzymes involved in the SAM (S-adenosyl methionine) cycle, catalyzing the conversion of homocysteine to methionine. Methionine plays an important role in the methylation of DNA, RNA, proteins, phospholipids, and neurotransmitters. It also maintains serum homocysteine level and indirectly regulates de novo nucleotide synthesis and repair. The current study predicted the functional consequences of nsSNPs in the human MTR gene using SIFT, PolyPhen2, PROVEAN, SNAP2, PMut, nsSNPAnalyzer, PhD-SNP, SNPs&GO, I-Mutant, MuPro, and iPTREE-STAB. The PTM sites within the protein were predicted using ModPred, and the phylogenetic conservation of amino acids and the conserved domains of the protein were predicted using ConSurf and the NCBI conserved domain search tool, respectively. The protein 3D structure was generated using SPARKS-X and analyzed using RAMPAGE. Structural deviation was analyzed using TM-Score. STRING analysis was performed to predict protein-protein interactions. D621G, G682D, V744L, V766E, and R1027W were predicted to be the most deleterious nsSNPs in MTR. R1027 was predicted to have three PTM sites, and G682 and V744 were predicted to be highly conserved residues. D621G, G682D, V744L, V776E, and R1027W were predicted to be within conserved domains of methionine synthase. G682D, V744L, V776E, and R1027W were predicted to alter the protein 3D structure. STRING predicted that methionine synthase interacts with 10 different proteins. The present study predicted D621G, G682D, V744L, V766E, and R1027W as functionally and structurally significant nsSNPs in the human MTR gene and provides significant information for further experimental analysis.
Abbreviations: cblG: methylcobalamin deficiency G; MTR: 5-methyl tetrahydrofolate-homocysteine methyl transferase; MS: methionine synthase; SAM: S-adenosyl methionine; nsSNPs: non-synonymous single nucleotide polymorphisms; OMIM: online mendelian inheritance in man; NCBI: national center for biotechnology information; SIFT: sorting intolerant from tolerant; PolyPhen2: polymorphism phenotyping 2; PROVEAN: protein variation effect analyzer; SNPs&GO: single nucleotide polymorphisms and gene ontology; PhD-SNP: predictor of human deleterious single nucleotide polymorphisms; RI: reliability index; PTM: post translational modification; SPDBV: Swiss PDB viewer; PDB: protein data bank; RMSD: root mean square deviation; STRING: search tool for the retrieval of interacting proteins.
-
3.
Exploring Protein Supersecondary Structure Through Changes in Protein Folding, Stability, and Flexibility.
Pires, DEV, Rodrigues, CHM, Albanaz, ATS, Karmakar, M, Myung, Y, Xavier, J, Michanetzi, EM, Portelli, S, Ascher, DB
Methods in molecular biology (Clifton, N.J.). 2019;:173-185
Abstract
The ability to predict how mutations affect protein structure, folding, and flexibility can elucidate the molecular mechanisms leading to disruption of supersecondary structures and the emergence of phenotypes, as well as guiding rational protein engineering. The advent of fast and accurate computational tools has enabled us to comprehensively explore the landscape of mutation effects on protein structures, prioritizing mutations for rational experimental validation. Here we describe the use of two complementary web-based in silico methods, DUET and DynaMut, developed to infer the effects of mutations on folding, stability, and flexibility, and how they can be used to explore and interpret these effects on protein supersecondary structures.
-
4.
QTY code designed thermostable and water-soluble chimeric chemokine receptors with tunable ligand affinity.
Qing, R, Han, Q, Skuhersky, M, Chung, H, Badr, M, Schubert, T, Zhang, S
Proceedings of the National Academy of Sciences of the United States of America. 2019;(51):25668-25676
Abstract
Chemokine receptors are of great interest as they play a critical role in many immunological and pathological processes. The ability to study chemokine receptors in aqueous solution without detergent would be significant because natural receptors require detergents to become soluble. We previously reported using the QTY code to design detergent-free chemokine receptors. We here report the design of 2 detergent-free chimeric chemokine receptors that were experimentally unattainable in detergent solution. We designed chimeric receptors by switching the N terminus and 3 extracellular (EC) loops between different receptors. Specifically, we replaced the N terminus and 3 EC loops of CCR5QTY with the N terminus and 3 EC loops of CXCR4. The ligand for CXCR4, namely CXCL12, binds to the chimeric receptor CCR5QTY (7TM)-CXCR4 (N terminus+3 EC loops), but with lower affinity compared to CXCR4; the CCL5 ligand of CCR5 binds the chimeric receptor with ∼20× lower affinity. The chimeric design helps to elucidate the mechanism of native receptor-ligand interaction. We also show that all detergent-free QTY-designed chemokine receptors, expressed in Escherichia coli, bind to their respective chemokines with affinities in the nanomolar (nM) range, similar to the affinities of native receptors and SF9-produced QTY variants. These QTY-designed receptors exhibit remarkable thermostability in the presence of arginine and retain ligand-binding activity after heat treatment at 60 °C for 4 h and 24 h, and at 100 °C for 10 min. Our design approach enables affordable scale-up production of detergent-free QTY variant chemokine receptors with tunable functionality for various uses.
-
5.
Prediction of folding mechanisms for Ig-like beta sandwich proteins based on inter-residue average distance statistics methods.
Aumpuchin, P, Kikuchi, T
Proteins. 2019;(2):120-135
Abstract
To understand the folding mechanism of a protein is one of the goals in bioinformatics study. Nowadays, it is enigmatic and difficult to extract folding information from amino acid sequence using standard bioinformatics techniques or even experimental protocols which can be time consuming. To overcome these problems, we aim to extract the initial folding unit for titin protein (Ig and fnIII domains) by means of inter-residue average distance statistics, Average Distance Map (ADM) and contact frequency analysis (F-value). TI I27 and TNfn3 domains are used to represent the Ig-domain and fnIII-domain, respectively. Beta-strands 2, 3, 5, and 6 are significant for the initial folding processes of TI I27. The central strands of TNfn3 were predicted as a primary folding segment. Known 3D structure and unknown 3D structure domains were investigated by structure or non-structure based multiple sequence alignment, respectively, to learn the conserved hydrophobic residues and predicted compact region relevant to evolution. Our results show good correspondence to experimental data, phi-value and protection factor from H-D exchange experiments. The significance of conserved hydrophobic residues near F-value peaks for structural stability using hydrophobic packing is confirmed. Our prediction methods once again could extract a folding mechanism only knowing the amino acid sequence.
-
6.
Microbiome 101: Studying, Analyzing, and Interpreting Gut Microbiome Data for Clinicians.
Allaband, C, McDonald, D, Vázquez-Baeza, Y, Minich, JJ, Tripathi, A, Brenner, DA, Loomba, R, Smarr, L, Sandborn, WJ, Schnabl, B, et al
Clinical gastroenterology and hepatology : the official clinical practice journal of the American Gastroenterological Association. 2019;(2):218-230
Abstract
Advances in technical capabilities for reading complex human microbiomes are leading to an explosion of microbiome research, leading in turn to intense interest among clinicians in applying these techniques to their patients. In this review, we discuss the content of the human microbiome, including intersubject and intrasubject variability, considerations of study design including important confounding factors, and different methods in the laboratory and on the computer to read the microbiome and its resulting gene products and metabolites. We highlight several common pitfalls for clinicians, including the expectation that an individual's microbiome will be stable, that diet can induce rapid changes that are large compared with the differences among subjects, that everyone has essentially the same core stool microbiome, and that different laboratory and computational methods will yield essentially the same results. We also highlight the current limitations and future promise of these techniques, with the expectation that an understanding of these considerations will help accelerate the path toward routine clinical application of these techniques developed in research settings.
-
7.
EvolStruct-Phogly: incorporating structural properties and evolutionary information from profile bigrams for the phosphoglycerylation prediction.
Chandra, AA, Sharma, A, Dehzangi, A, Tsunoda, T
BMC genomics. 2019;(Suppl 9):984
Abstract
BACKGROUND Post-translational modification (PTM), which is a biological process, tends to modify the proteome, leading to changes in normal cell biology and pathogenesis. In recent times, many PTMs have been reported. Among these modifications, phosphoglycerylation has become a particular subject of interest. The experimental procedure for identification of phosphoglycerylated residues continues to be an expensive, inefficient and time-consuming effort, even with a large number of proteins that are sequenced in the post-genomic period. Computational methods are therefore being anticipated in order to effectively predict phosphoglycerylated lysines. Even though there are predictors available, the ability to detect phosphoglycerylated lysine residues still remains inadequate. RESULTS We have introduced a new predictor in this paper named EvolStruct-Phogly that uses structural and evolutionary information relating to amino acids to predict phosphoglycerylated lysine residues. Benchmark data containing experimentally identified phosphoglycerylated and non-phosphoglycerylated lysines are employed. We then extracted three types of structural information (accessible surface area of amino acids, backbone torsion angles, and local structure conformations of amino acids) together with profile bigrams of position-specific scoring matrices. CONCLUSION EvolStruct-Phogly showed a noteworthy improvement in performance when compared with the previous predictors. The performance metrics obtained are as follows: sensitivity 0.7744, specificity 0.8533, precision 0.7368, accuracy 0.8275, and Matthews correlation coefficient of 0.6242. The software package and data of this work can be obtained from https://github.com/abelavit/EvolStruct-Phogly or www.alok-ai-lab.com.
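The five performance metrics reported in this abstract all derive from the four counts of a binary confusion matrix. As a minimal sketch, the standard definitions can be written as below. The function name `binary_metrics` and the example counts are illustrative assumptions; they are not the authors' code or data, and the counts are not chosen to reproduce the published values.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. A ratio with a zero denominator is reported as 0.0.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": ratio(tp, tp + fn),          # recall on positives
        "specificity": ratio(tn, tn + fp),          # recall on negatives
        "precision": ratio(tp, tp + fp),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),   # Matthews correlation coefficient
    }


# Hypothetical counts, used only to exercise the formulas.
example = binary_metrics(tp=8, fp=2, tn=7, fn=3)
```

Unlike accuracy, the Matthews correlation coefficient uses all four counts in both numerator and denominator, which is why it is commonly reported alongside accuracy when the positive and negative classes are imbalanced.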
-
8.
SIMLIN: a bioinformatics tool for prediction of S-sulphenylation in the human proteome based on multi-stage ensemble-learning models.
Wang, X, Li, C, Li, F, Sharma, VS, Song, J, Webb, GI
BMC bioinformatics. 2019;(1):602
Abstract
BACKGROUND S-sulphenylation is a ubiquitous protein post-translational modification (PTM) where an S-hydroxyl (-SOH) bond is formed via the reversible oxidation on the Sulfhydryl group of cysteine (C). Recent experimental studies have revealed that S-sulphenylation plays critical roles in many biological functions, such as protein regulation and cell signaling. State-of-the-art bioinformatic advances have facilitated high-throughput in silico screening of protein S-sulphenylation sites, thereby significantly reducing the time and labour costs traditionally required for the experimental investigation of S-sulphenylation. RESULTS In this study, we have proposed a novel hybrid computational framework, termed SIMLIN, for accurate prediction of protein S-sulphenylation sites using a multi-stage neural-network based ensemble-learning model integrating both protein sequence derived and protein structural features. Benchmarking experiments against the current state-of-the-art predictors for S-sulphenylation demonstrated that SIMLIN delivered competitive prediction performance. The empirical studies on the independent testing dataset demonstrated that SIMLIN achieved 88.0% prediction accuracy and an AUC score of 0.82, which outperforms currently existing methods. CONCLUSIONS In summary, SIMLIN predicts human S-sulphenylation sites with high accuracy thereby facilitating biological hypothesis generation and experimental validation. The web server, datasets, and online instructions are freely available at http://simlin.erc.monash.edu/ for academic purposes.
-
9.
In silico identification of epitopes present in human heat shock proteins (HSPs) overexpressed by tumour cells.
Marchan, J
Journal of immunological methods. 2019;:34-45
Abstract
Although many heat shock proteins (HSPs) are crucial in homeostasis due to their role in maintaining cellular proteostasis by the integration of two pivotal processes, folding and degradation, several decades of cancer proteomics suggest that HSPs may improve cancer establishment and progression. Therefore, it is imperative to explore how these molecules impact patient outcomes and whether their interaction with the immune system improves the protumour or antitumour environment. Here, the most probable epitopes from ten HSPs (HSP90α, HSP90β, HSPA1A, HSPA1L, HSPA2, HSPA5, HSPA6, HSPB1, HSPB5 and HSP60/HSP10) were investigated using an immunoinformatics approach. To achieve this aim, antigenicity, immunogenicity (prediction of continuous and discontinuous B cell epitopes, binding peptides to HLA class I and HLA class II, and overlapping epitopes), analysis of conservancy and population coverage, and prediction of IgE epitopes were evaluated. According to the physicochemical properties used for their prediction (hydrophilicity, flexibility, accessibility and antigenicity propensity), ten continuous epitopes (one per HSP) were considered the best, and several regions of each molecule were identified as discontinuous B cell epitopes. Interestingly, peptides of HSP90β, HSPA2, HSPB1, and HSPB5 were predicted as both continuous and discontinuous B cell epitopes. Potential overlapping epitopes ("NTFYSNKEI", "TTYSCVGVF", "TADRWRVSL", "VKHFSPEEL" and "CEFQDAYVL") were identified for all the HSPs evaluated. Moreover, these peptides were negative for IgE epitopes and showed a large coverage in the human population (HLA-A*02, HLA-B*15, HLA-C*03, and HLA-C*12). Taken together, these data indicate that such epitopes may activate both the humoral and cell-mediated response, and thus serve as therapeutic targets for cancer. However, their efficacy and safety must be assessed in vitro and in vivo before their translation to clinical trials.
-
10.
Protein-Protein Interaction Interface Residue Pair Prediction Based on Deep Learning Architecture.
Zhao, Z, Gong, X
IEEE/ACM transactions on computational biology and bioinformatics. 2019;(5):1753-1759
Abstract
MOTIVATION Proteins usually fulfill their biological functions by interacting with other proteins. Although some methods have been developed to predict the binding sites of a monomer protein, these are not sufficient for prediction of the interaction between two monomer proteins. The correct prediction of interface residue pairs from two monomer proteins is still an open question and has great significance for practical experimental applications in the life sciences. We hope to build a method for the prediction of interface residue pairs that is suitable for those applications. RESULTS Here, we developed a novel deep network architecture called the multi-layered Long-Short Term Memory networks (LSTMs) approach for the prediction of protein interface residue pairs. First, we created three new descriptions and used six other existing characterizations to describe an amino acid, then we employed these features to discriminate between interface residue pairs and non-interface residue pairs. Second, we used two thresholds to select residue pairs that are more likely to be interface residue pairs. This step increases the proportion of interface residue pairs and reduces the influence of imbalanced data. Third, we built deep network architectures based on the Long-Short Term Memory networks algorithm to organize and refine the prediction of interface residue pairs by employing the features mentioned above. We trained the deep networks on dimers in the unbound state in the international Protein-protein Docking Benchmark version 3.0. The updated data sets in versions 4.0 and 5.0 were used as the validation set and test set, respectively. For our best model, the accuracy rate was over 62 percent when we chose the top 0.2 percent of pairs of every dimer in the test set as predictions, which will be very helpful for the understanding of protein-protein interaction mechanisms and for guidance in biological experiments.