Abstract
Automatically inferring the function of unknown proteins is a challenging task in proteomics. Computational protein function prediction poses two major problems: the choice of the protein representation and the choice of the classification algorithm. There are several ways of extracting features from a protein, and the choice of feature representation might be as important as the choice of classification algorithm. These problems are aggravated in hierarchical protein function prediction, where a hierarchy of classifiers is built and the construction of each classifier must address both selection problems. In this paper we address these problems by employing three alternative selective hierarchical classification approaches: (a) selecting the best classifier given a fixed representation; (b) selecting the best representation given a fixed classifier; and (c) selecting the best classifier and representation simultaneously, in a synergistic fashion. The analysis of the results shows that the selective representation approach almost always ranks first when compared against the different fixed representations, and that the selective classifier approach is not able to surpass using only the best classifier for the target problem.
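As a rough illustration of the three selection strategies (a), (b), and (c), the sketch below picks a winner for a single node of the class hierarchy from a table of validation scores. It is only a minimal sketch under stated assumptions: the representation names, classifier names, score values, and function names are all hypothetical and are not taken from the paper, and how the scores would be estimated is left open.

```python
from typing import Dict, Tuple

# Maps (representation, classifier) to a validation score for ONE node
# of the class hierarchy. All names and values below are hypothetical.
Scores = Dict[Tuple[str, str], float]


def select_classifier(scores: Scores, representation: str) -> str:
    """Strategy (a): best classifier for a fixed representation."""
    candidates = {clf: s for (rep, clf), s in scores.items() if rep == representation}
    return max(candidates, key=candidates.get)


def select_representation(scores: Scores, classifier: str) -> str:
    """Strategy (b): best representation for a fixed classifier."""
    candidates = {rep: s for (rep, clf), s in scores.items() if clf == classifier}
    return max(candidates, key=candidates.get)


def select_both(scores: Scores) -> Tuple[str, str]:
    """Strategy (c): best (representation, classifier) pair jointly."""
    return max(scores, key=scores.get)


# Hypothetical validation scores for a single node.
node_scores: Scores = {
    ("amino_acid_composition", "naive_bayes"): 0.61,
    ("amino_acid_composition", "knn"): 0.66,
    ("sequence_motifs", "naive_bayes"): 0.72,
    ("sequence_motifs", "knn"): 0.64,
}

best_clf = select_classifier(node_scores, "amino_acid_composition")
best_rep = select_representation(node_scores, "naive_bayes")
best_pair = select_both(node_scores)
```

In a hierarchical setting, a selection of this kind would be repeated at every node where a classifier is trained, so different nodes may end up with different classifiers, representations, or both.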
