Authors: İRFAN KÖSESOY, MURAT GÖK, TAMER KAHVECİ
Abstract: Knowledge of pathogen-host interactions between species is essential for developing strategies against infectious diseases. In vitro methods take a long time to detect interactions and reveal only a small fraction of the possible interaction pairs. Hence, modelling interactions between proteins has necessitated the development of computational methods. The main aim of this paper is to integrate the known protein interactions of the host and pathogen organisms in order to improve the prediction success rate of unknown pathogen-host interactions. This integration was expected to increase the true positive rate of the predictions. To study this extensively, several protein-encoding methods and learning algorithms were tested. With human as the host organism, two different pathogen organisms were used in the experiments. For each combination of protein-encoding and prediction method, the original prediction algorithm was first tested using only pathogen-host interactions, and the same method was then tested again after integrating the known protein interactions within each organism. The effect of merging the pathogen-host interaction networks of different species on the prediction performance of state-of-the-art methods was also observed. Success was measured in terms of the Matthews correlation coefficient, precision, recall, F1 score, and accuracy. Empirical results showed that integrating the host and pathogen interactions consistently yields better performance in almost all experiments.
Keywords: Infectious diseases, host-pathogen interactions, protein-protein interactions, protein networks, machine learning, bioinformatics