1.
ConnecTF: A platform to integrate transcription factor-gene interactions and validate regulatory networks.
Brooks, MD, Juang, CL, Katari, MS, Alvarez, JM, Pasquino, A, Shih, HJ, Huang, J, Shanks, C, Cirrone, J, Coruzzi, GM
Plant physiology. 2021;(1):49-66
-
-
Free full text
-
Abstract
Deciphering gene regulatory networks (GRNs) is both a promise and challenge of systems biology. The promise lies in identifying key transcription factors (TFs) that enable an organism to react to changes in its environment. The challenge lies in validating GRNs that involve hundreds of TFs with hundreds of thousands of interactions with their genome-wide targets experimentally determined by high-throughput sequencing. To address this challenge, we developed ConnecTF, a species-independent, web-based platform that integrates genome-wide studies of TF-target binding, TF-target regulation, and other TF-centric omic datasets and uses these to build and refine validated or inferred GRNs. We demonstrate the functionality of ConnecTF by showing how integration within and across TF-target datasets uncovers biological insights. Case study 1 uses integration of TF-target gene regulation and binding datasets to uncover TF mode-of-action and identify potential TF partners for 14 TFs in abscisic acid signaling. Case study 2 demonstrates how genome-wide TF-target data and automated functions in ConnecTF are used in precision/recall analysis and pruning of an inferred GRN for nitrogen signaling. Case study 3 uses ConnecTF to chart a network path from NLP7, a master TF in nitrogen signaling, to direct secondary TF2s and to its indirect targets in a Network Walking approach. The public version of ConnecTF (https://ConnecTF.org) contains 3,738,278 TF-target interactions for 423 TFs in Arabidopsis, 839,210 TF-target interactions for 139 TFs in maize (Zea mays), and 293,094 TF-target interactions for 26 TFs in rice (Oryza sativa). The database and tools in ConnecTF will advance the exploration of GRNs in plant systems biology applications for model and crop species.
2.
Rice_Phospho 1.0: a new rice-specific SVM predictor for protein phosphorylation sites.
Lin, S, Song, Q, Tao, H, Wang, W, Wan, W, Huang, J, Xu, C, Chebii, V, Kitony, J, Que, S, et al
Scientific reports. 2015;:11940
Abstract
Experimentally-determined or computationally-predicted protein phosphorylation sites for distinctive species are becoming increasingly common. In this paper, we compare the predictive performance of a novel classification algorithm with different encoding schemes to develop a rice-specific protein phosphorylation site predictor. Our results imply that the combination of Amino acid occurrence Frequency with Composition of K-Spaced Amino Acid Pairs (AF-CKSAAP) provides the best description of relevant sequence features that surround a phosphorylation site. A support vector machine (SVM) using AF-CKSAAP achieves the best performance in classifying rice protein phophorylation sites when compared to the other algorithms. We have used SVM with AF-CKSAAP to construct a rice-specific protein phosphorylation sites predictor, Rice_Phospho 1.0 (http://bioinformatics.fafu.edu.cn/rice_phospho1.0). We measure the Accuracy (ACC) and Matthews Correlation Coefficient (MCC) of Rice_Phospho 1.0 to be 82.0% and 0.64, significantly higher than those measures for other predictors such as Scansite, Musite, PlantPhos and PhosphoRice. Rice_Phospho 1.0 also successfully predicted the experimentally identified phosphorylation sites in LOC_Os03g51600.1, a protein sequence which did not appear in the training dataset. In summary, Rice_phospho 1.0 outputs reliable predictions of protein phosphorylation sites in rice, and will serve as a useful tool to the community.