-
1.
SecProCT: In Silico Prediction of Human Secretory Proteins Based on Capsule Network and Transformer.
Du, W, Zhao, X, Sun, Y, Zheng, L, Li, Y, Zhang, Y
International journal of molecular sciences. 2021;(16)
Abstract
Identifying secretory proteins from blood, saliva or other body fluids has become an effective method of diagnosing diseases. Existing secretory protein prediction methods are mainly based on conventional machine learning algorithms and are highly dependent on the feature set from the protein. In this article, we propose a deep learning model based on the capsule network and transformer architecture, SecProCT, to predict secretory proteins using only amino acid sequences. The proposed model was validated using cross-validation and achieved 0.921 and 0.892 accuracy for predicting blood-secretory proteins and saliva-secretory proteins, respectively. Meanwhile, the proposed model was validated on an independent test set and achieved 0.917 and 0.905 accuracy for predicting blood-secretory proteins and saliva-secretory proteins, respectively, which are better than the results of conventional machine learning methods and other deep learning methods for biological sequence analysis. The main contributions of this article are as follows: (1) a deep learning model based on a capsule network and transformer architecture is proposed for predicting secretory proteins. The results of this model are better than those of existing conventional machine learning methods and deep learning methods for biological sequence analysis; (2) only amino acid sequences are used in the proposed model, which overcomes the high dependence of existing methods on the annotated protein features; (3) the proposed model can accurately predict most experimentally verified secretory proteins and cancer protein biomarkers in blood and saliva.
-
2.
Cyclodextrins based delivery systems for macro biomolecules.
Liu, J, Ding, X, Fu, Y, Xiang, C, Yuan, Y, Zhang, Y, Yu, P
European journal of medicinal chemistry. 2021;:113105
Abstract
Macro biomolecules are of vital importance in regulating biofunctions in organisms; among them, proteins (including peptides when mentioned below) and nucleic acids (NAs) are the most important. Therefore, these proteins and NAs can be applied as "drugs" to restore abnormal biofunctions to normal. For both proteins and NAs, the main challenge is to avoid biodegradation or physicochemical degradation before they reach the targeted location, so that they can function as complete functional structures. Hence, appropriate delivery systems that can protect them from such degradation are very important. Cyclodextrin (CD)-based delivery systems have achieved great success due to their outstanding pharmaceutical properties, and there have been several recent reviews on CD-based small-molecule drug delivery systems. However, for biomolecules, which are becoming increasingly important for modern therapies, very few reviews systematically summarize and analyze CD-based macro biomolecule delivery systems, especially for proteins. This review summarizes notable examples of CD-based delivery of macro biomolecules (proteins and NAs). For proteins, the review covers the delivery of insulin, lysozyme, bovine serum albumin (BSA), green fluorescent protein (GFP), IgGs and others in slow-release, stimuli-responsive-release or targeted-release manners. For NAs, the review summarizes cationic CD polymers and CD-cluster monomers as NA carriers, notably including multicomponent targeting CD-based carriers and siRNA carriers based on the virus-like RNA assembly method.
-
3.
Protein-Protein Interactions Prediction Based on Graph Energy and Protein Sequence Information.
Xu, D, Xu, H, Zhang, Y, Chen, W, Gao, R
Molecules (Basel, Switzerland). 2020;(8)
Abstract
Identification of protein-protein interactions (PPIs) plays an essential role in the understanding of protein functions and cellular biological activities. However, the traditional experiment-based methods are time-consuming and laborious. Therefore, developing new reliable computational approaches has great practical significance for the identification of PPIs. In this paper, a novel prediction method is proposed for predicting PPIs using graph energy, named PPI-GE. Particularly, in the process of feature extraction, we designed two new feature extraction methods, the physicochemical graph energy based on the ionization equilibrium constant and isoelectric point and the contact graph energy based on the contact information of amino acids. The dipeptide composition method was used for order information of amino acids. After multi-information fusion, principal component analysis (PCA) was implemented for eliminating noise and a robust weighted sparse representation-based classification (WSRC) classifier was applied for sample classification. The prediction accuracies based on the five-fold cross-validation of the human, Helicobacter pylori (H. pylori), and yeast data sets were 99.49%, 97.15%, and 99.56%, respectively. In addition, in five independent data sets and two significant PPI networks, the comparative experimental results also demonstrate that PPI-GE obtained better performance than the compared methods.
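The dipeptide composition step mentioned in the abstract above can be illustrated with a minimal Python sketch. It assumes the common definition (the frequency of each of the 400 ordered residue pairs in a sequence); the function name, the handling of non-standard residues and the normalization are illustrative assumptions, not details confirmed for PPI-GE.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def dipeptide_composition(sequence):
    """Return a 400-element list of dipeptide frequencies for one sequence.

    Assumption: pairs containing a non-standard residue are skipped, and
    frequencies are normalized by the number of counted pairs.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = 0
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        # Sequence too short (or no valid pairs): return an all-zero vector.
        return [0.0] * len(pairs)
    return [counts[p] / total for p in pairs]

# Hypothetical example sequence, not taken from the paper's data sets.
features = dipeptide_composition("MKVLAAGIVK")
```

The resulting fixed-length vector is the kind of sequence-order feature that could then be fused with other descriptors before dimensionality reduction and classification.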
-
4.
Site-selective covalent reactions on proteinogenic amino acids.
So, WH, Zhang, Y, Kang, W, Wong, CTT, Sun, H, Xia, J
Current opinion in biotechnology. 2017;:220-227
Abstract
To achieve precise control of the signaling events or to achieve unmistakable synthesis of biomolecules, nature has evolved organic reactions involving proteinogenic amino acids with unparalleled site selectivity. For example, dedicated enzymes accurately dictate the site of post-translational modifications in signaling proteins, and ribosomes precisely link the C-terminal carboxylic acid of one unprotected amino acid with the N-terminal amino group of the other amino acid through spatially confined proximity. For many years, chemists have been striving to achieve site selectivity on biomolecules by mimicking nature. Driven by the development of chemoselective protein conjugation reactions, enzymology and protein-protein interactions, the past decade has witnessed a boom in site-selective protein conjugation reactions. (In this review, a site-selective protein conjugation reaction is defined as an organic reaction that targets a single amino acid instead of a kind of amino acids in a protein or a proteome under physiological conditions, for example, a single cysteine residue among all of the cysteines.) In this review, we summarize the recent advancements of bioconjugation reactions that demonstrate this feature of precise site selectivity, focusing on the reactions of the proteinogenic amino acids (excluding those at non-coded or non-proteinogenic amino acids that are introduced to proteins through genetic manipulations).
-
5.
Association of fat-mass and obesity-associated gene FTO rs9939609 polymorphism with the risk of obesity among children and adolescents: a meta-analysis.
Quan, LL, Wang, H, Tian, Y, Mu, X, Zhang, Y, Tao, K
European review for medical and pharmacological sciences. 2015;(4):614-23
Abstract
OBJECTIVE To elucidate the association of fat-mass and obesity-associated gene (FTO) rs9939609 polymorphism with obesity among children and adolescents. METHODS A literature search was conducted in PubMed, MEDLINE, Springer, and Google Scholar to identify eligible studies. The pooled odds ratios (ORs) with 95% confidence intervals (CIs) were used for four models: co-dominant model (AA vs. TT, AT vs. TT), dominant model (AA + AT vs. TT), recessive model (AA vs. AT + TT), and allelic model (A vs. T). Subgroup analyses stratified by ethnicity (Caucasian, others) and participants (children, children and adolescents) were assessed under the allelic model. The heterogeneity and publication bias were examined. RESULTS This meta-analysis included 12 eligible studies comprising 5,000 cases and 9,853 controls. The results revealed that FTO rs9939609 polymorphism was significantly associated with the increased risk of obesity in the co-dominant model (AA vs. TT: OR = 1.91, 95% CI: 1.47-2.48, p < 0.01; AT vs. TT: OR = 1.18, 95% CI: 1.02-1.38, p = 0.03), dominant model (AA + AT vs. TT: OR = 1.47, 95% CI: 1.35-1.59, p < 0.01), recessive model (AA vs. AT + TT: OR = 1.79, 95% CI: 1.47-2.17, p < 0.01), and allelic model (A vs. T: OR = 1.39, 95% CI: 1.22-1.58, p < 0.01). Similar results were obtained for the subgroup analyses stratified by ethnicity and participants under the allelic model. CONCLUSIONS FTO rs9939609 polymorphism is associated with the increased risk of obesity among children and adolescents, especially the homozygous carriers.
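As a rough illustration of the statistic reported above, the following Python sketch computes an odds ratio and its 95% confidence interval for a single 2x2 table using the standard log-odds (Woolf) approximation. The counts are hypothetical, and the sketch covers one study only; it does not reproduce the pooling across studies, whose weighting model the abstract does not state.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald-type confidence interval for a 2x2 table.

    a: cases carrying the risk genotype      b: cases without it
    c: controls carrying the risk genotype   d: controls without it
    z=1.96 corresponds to a 95% interval under the normal approximation.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under Woolf's approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical single-study counts for a dominant model (AA + AT vs. TT).
or_value, ci_low, ci_high = odds_ratio_ci(a=120, b=80, c=100, d=130)
```

An interval that excludes 1 corresponds to a statistically significant association at the chosen level, which is how the ORs in the abstract are read.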
-
6.
DNA-Encoded Dynamic Combinatorial Chemical Libraries.
Reddavide, FV, Lin, W, Lehnert, S, Zhang, Y
Angewandte Chemie (International ed. in English). 2015;(27):7924-8
Abstract
Dynamic combinatorial chemistry (DCC) explores the thermodynamic equilibrium of reversible reactions. Its application in the discovery of protein binders is largely limited by difficulties in the analysis of complex reaction mixtures. DNA-encoded chemical library (DECL) technology allows the selection of binders from a mixture of up to billions of different compounds; however, experimental results often show a low signal-to-noise ratio and poor correlation between enrichment factor and binding affinity. Herein we describe the design and application of DNA-encoded dynamic combinatorial chemical libraries (EDCCLs). Our experiments have shown that the EDCCL approach can be used not only to convert monovalent binders into high-affinity bivalent binders, but also to cause remarkably enhanced enrichment of potent bivalent binders by driving their in situ synthesis. We also demonstrate the application of EDCCLs in DNA-templated chemical reactions.
-
7.
Prediction and Analysis of Post-Translational Pyruvoyl Residue Modification Sites from Internal Serines in Proteins.
Jiang, Y, Li, BQ, Zhang, Y, Feng, YM, Gao, YF, Zhang, N, Cai, YD
PloS one. 2013;(6):e66678
Abstract
Most pyruvoyl-dependent proteins observed in prokaryotes and eukaryotes are critical regulatory enzymes, which are primary targets of inhibitors for anti-cancer and anti-parasitic therapy. These proteins undergo an autocatalytic, intramolecular self-cleavage reaction in which a covalently bound pyruvoyl group is generated on a conserved serine residue. Traditionally, the modified serine sites are detected by experimental approaches, which are often labor-intensive and time-consuming. In this study, we made an initial attempt at computational prediction of such serine sites using feature selection based on a Random Forest. Since only a small number of experimentally verified pyruvoyl-modified proteins are collected in the current version of the protein database, we used only a small dataset in this study. After removing proteins with sequence identities >60%, a non-redundant dataset was generated, which contained only 46 proteins, with one pyruvoyl serine site for each protein. Several types of features were considered in our method, including PSSM conservation scores, disorder, secondary structure, solvent accessibility, amino acid factors and amino acid occurrence frequencies. As a result, good performance was achieved on our dataset: the best results were 100.00% accuracy and an MCC of 1.0000 on the training dataset, and 93.75% accuracy and an MCC of 0.8441 on the testing dataset. The optimal feature set contained 9 features. Analysis of the optimal feature set indicated the important roles of some specific features in determining the pyruvoyl-group-serine sites, which were consistent with several results of earlier experimental studies. These selected features may shed some light on the mechanism of the post-translational self-maturation process, providing guidelines for experimental validation. Future work should be carried out as more pyruvoyl-modified proteins are found, and the method should be evaluated on larger datasets. Finally, the prediction software can be downloaded from http://www.nkbiox.com/sub/pyrupred/index.html.
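The two metrics quoted in the abstract above, accuracy and the Matthews correlation coefficient (MCC), can both be computed from a binary confusion matrix. The Python sketch below uses the standard formulas; the counts are hypothetical and are not the study's actual confusion matrix.

```python
import math

def accuracy_and_mcc(tp, tn, fp, fn):
    """Accuracy and Matthews correlation coefficient from confusion counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when any marginal sum is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return accuracy, mcc

# Hypothetical counts for illustration only.
acc, mcc = accuracy_and_mcc(tp=40, tn=45, fp=5, fn=10)
```

Unlike accuracy, MCC accounts for all four cells of the confusion matrix, which makes it more informative when positive and negative sites are imbalanced. A perfect classifier yields an accuracy of 1.0 and an MCC of 1.0.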
-
8.
A facile microdialysis interface for on-line desalting and identification of proteins by nano-electrospray ionization mass spectrometry.
Sun, L, Duan, J, Tao, D, Liang, Z, Zhang, W, Zhang, L, Zhang, Y
Rapid communications in mass spectrometry : RCM. 2008;(15):2391-7
Abstract
The adverse effect of salts, especially inorganic salts, on electrospray ionization mass spectrometry (ESI-MS) is one of the most serious obstacles that might limit its application. Among the numerous desalting approaches, the microdialysis technique is favorable for large molecules, such as proteins. In this work, employing a hollow fiber membrane of cellulose acetate (MWCO 3000 Da), a simple, facile and efficient microdialysis interface with a dead volume of less than 1 μL was constructed for the on-line desalting and identification of proteins dissolved in high salt concentration buffer by nano-ESI-MS. Furthermore, with counterflow added, the desalting procedure was accelerated, and could be finished within 1 min. This system was successfully applied to the analysis of myoglobin dissolved in either high concentration ammonium acetate or sodium chloride buffer. The experimental results showed that, by using such a microdialysis interface, the salt concentration, even as high as 1 M, could be decreased by at least 2 orders of magnitude, while sample loss was less than 10%, demonstrating the potential of such an interface in broadening the application of nano-ESI-MS in the analysis of large molecules.