Predicting interactions between pathogen and human proteins based on the relation between sequence length and amino acid composition

Saud Alguwaizani, Shulei Ren, De Shuang Huang, Kyungsook Han

Research output: Contribution to journal / Article / peer-review

3 Scopus citations

Abstract

Aim: Both bacterial infection and viral infection involve a large number of protein-protein interactions (PPIs) between a pathogen and its target host. Background: So far, many computational methods have focused on predicting PPIs within the same species rather than PPIs across different species. Methods: From an extensive analysis of PPIs between Yersinia pestis bacteria and humans, we recently discovered an interesting relation: a linear relation between amino acid composition and sequence length was observed in many proteins involved in PPIs. We have built a support vector machine (SVM) model, which predicts PPIs between humans and bacteria using two feature types derived from the relation. The two feature types used in the SVM are the amino acid composition group (AACG) and the difference in amino acid composition between host and pathogen proteins. Results: The SVM model achieved high performance in predicting bacteria-human PPIs. The model showed an accuracy of 96%, sensitivity of 94%, and specificity of 98% in predicting PPIs between humans and Yersinia pestis, in which there is a strong relation between amino acid composition and sequence length. The SVM model was also tested in predicting PPIs between humans and viruses, including Ebola, HCV, and SARS-CoV-2, and showed good performance. Conclusion: The feature types identified in our study are simple yet powerful in predicting pathogen-human PPIs. Although preliminary, our method will be useful for finding unknown target host proteins or pathogen proteins and for designing in vitro or in vivo experiments.
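
As an illustration of the quantities named in the abstract, the following minimal Python sketch computes the amino acid composition of a protein, the composition difference between a host and a pathogen protein, and the three reported evaluation measures. It is not the authors' implementation. The abstract does not define how the amino acid composition group (AACG) is formed, so the sketch uses plain per-residue composition instead; the function names, the toy sequences, and the confusion-matrix counts are all hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid, in AMINO_ACIDS order."""
    counts = Counter(sequence.upper())
    # Non-standard symbols (e.g. X) are ignored when normalizing.
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def composition_difference(host_seq, pathogen_seq):
    """Per-residue difference in composition (host minus pathogen)."""
    host = amino_acid_composition(host_seq)
    pathogen = amino_acid_composition(pathogen_seq)
    return [h - p for h, p in zip(host, pathogen)]


def binary_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    host_protein = "MKTAYIAKQRQISFVKSHFSRQ"
    pathogen_protein = "MSDNELLKKAGITVDGK"
    features = composition_difference(host_protein, pathogen_protein)
    print(len(features))  # one value per standard amino acid

    # Hypothetical counts, not taken from the paper.
    print(binary_metrics(tp=94, tn=98, fp=2, fn=6))
```

A feature vector of this kind for each host-pathogen protein pair could then be passed to any SVM implementation; the kernel and parameters used in the study are not stated in the abstract.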

Original language: English
Pages (from-to): 799-806
Number of pages: 8
Journal: Current Bioinformatics
Volume: 16
Issue number: 6
State: Published - 2021

Bibliographical note

Publisher Copyright:
© 2021 Bentham Science Publishers.

Keywords

  • Ebola
  • HCV
  • Machine learning
  • Pathogen-host interaction
  • Protein-protein interaction
  • SARS-CoV-2
  • Y. pestis
