Department of Computer and Information Sciences, University of Delaware, Newark, Delaware 19716, USA
Key Words: protein–protein interaction; phylogenetic vectors; least-squares support vector machines
Address for correspondence: Li Liao, Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716. Voice: 302-831-3500; fax: 302-831-8458. lliao{at}cis.udel.edu
Predicting protein–protein interactions has become a key step in reverse-engineering biological networks to better understand cellular functions. Experimental methods for determining protein–protein interactions are time-consuming and costly, which has motivated vigorous development of computational approaches for predicting them. A set of recently developed bioinformatics methods utilizes coevolutionary information about the interacting partners, for example, as exhibited in correlations between distance matrices, where, for each protein, a matrix stores the pairwise distances between the protein and its orthologs in a group of reference genomes. We propose a novel method that accounts for the intra-matrix correlations to improve predictive accuracy. The distance matrices for a pair of proteins are transformed and concatenated into a phylogenetic vector. A least-squares support vector machine is trained and tested on pairs of proteins, represented as phylogenetic vectors, whose interactions are known. The intra-matrix correlations are accounted for by introducing a weighted linear kernel, which determines the dot product of two phylogenetic vectors. The performance, measured as the receiver operating characteristic (ROC) score in cross-validation experiments, shows significant improvement of our method (ROC score 0.928) over that obtained by Pearson correlations (0.659).
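The pipeline described above (distance matrices, phylogenetic vector, weighted linear kernel, least-squares SVM) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions: the "transformation" is taken to be flattening the upper triangle of each symmetric distance matrix, the kernel weights are a hypothetical non-negative per-feature vector `weights` (the abstract does not say how they are derived), and the classifier uses the standard LS-SVM linear system with a hypothetical regularization parameter `gamma`. All function names are illustrative.

```python
import numpy as np

def phylo_vector(dist_a, dist_b):
    """Concatenate the upper triangles of two same-sized distance matrices.

    Assumption: both matrices are symmetric and built over the same
    reference genomes, so the upper triangle holds all the information.
    """
    iu = np.triu_indices_from(dist_a, k=1)
    return np.concatenate([dist_a[iu], dist_b[iu]])

def weighted_linear_kernel(X, Z, weights):
    """K[i, j] = sum_k weights[k] * X[i, k] * Z[j, k].

    `weights` is a hypothetical non-negative per-feature weight vector;
    with all weights equal to 1 this reduces to the plain dot product.
    """
    return (X * weights) @ Z.T

def lssvm_train(K, y, gamma=10.0):
    """Solve the LS-SVM classifier system for (alpha, b), labels y in {-1, +1}.

        [ 0   y^T               ] [ b     ]   [ 0 ]
        [ y   Omega + I / gamma ] [ alpha ] = [ 1 ]

    where Omega[i, j] = y[i] * y[j] * K[i, j].
    """
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = np.outer(y, y) * K + np.eye(n) / gamma
    rhs = np.concatenate([[0.0], np.ones(n)])
    sol = np.linalg.solve(A, rhs)
    return sol[1:], sol[0]

def lssvm_decision(K_test_train, y_train, alpha, b):
    """Decision values f(x) = sum_i alpha_i * y_i * K(x, x_i) + b."""
    return K_test_train @ (alpha * y_train) + b
```

In use, each protein pair with a known interaction label would be turned into one row of a feature matrix via `phylo_vector`, the kernel matrix would be computed with `weighted_linear_kernel`, and the sign of `lssvm_decision` would give the predicted class. Ranking held-out pairs by their decision values is one way an ROC score could be computed in cross-validation.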