Abstract
Feature selection is one of the key problems in machine learning and data mining. It aims to identify a subset of the most useful features that yields results comparable to those obtained with the original full feature set. Feature selection can reduce the dimensionality of the data, speed up the learning process, and produce comprehensible models with good generalization performance. Recently, ensemble techniques have been applied to feature selection by integrating multiple base feature selectors into a single ensemble model. In this paper, to improve the efficiency of feature selection on large-scale, high-dimensional, and imbalanced problems, we propose Min-Max Ensemble Feature Selection (M2-EFS), which is based on balanced data partitioning and a min-max ensemble strategy. Experimental results demonstrate that M2-EFS outperforms other classical ensemble methods in most cases, especially on large-scale, high-dimensional, and imbalanced data.
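The general scheme named in the abstract can be illustrated with a minimal sketch. The code below is an assumption-laden illustration, not the paper's algorithm (which is behind the access wall): it partitions the majority class into minority-sized chunks to form balanced subsets, scores features on each subset with a hypothetical base selector (absolute class-mean difference), and combines the per-partition scores with a conservative minimum, one plausible reading of a "min-max" style combination; the function names `balanced_partitions`, `score_features`, and `m2_efs_sketch` are all invented for illustration.

```python
import numpy as np


def balanced_partitions(X, y, rng):
    """Yield balanced subsets: the full minority class paired with
    successive minority-sized chunks of the majority class."""
    pos = np.where(y == 1)[0]
    neg = np.where(y == 0)[0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    majority = majority.copy()
    rng.shuffle(majority)
    k = max(1, len(majority) // len(minority))
    for chunk in np.array_split(majority, k):
        idx = np.concatenate([minority, chunk])
        yield X[idx], y[idx]


def score_features(X, y):
    """Hypothetical base selector: absolute difference of per-class
    feature means (larger = more discriminative)."""
    return np.abs(X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0))


def m2_efs_sketch(X, y, top_k, seed=0):
    """Sketch of a min-style ensemble: a feature's combined score is its
    worst (minimum) score across the balanced partitions, so a selected
    feature must be useful on every partition."""
    rng = np.random.default_rng(seed)
    scores = [score_features(Xi, yi)
              for Xi, yi in balanced_partitions(X, y, rng)]
    combined = np.min(np.vstack(scores), axis=0)
    return np.argsort(combined)[::-1][:top_k]
```

The min-combination here covers only half of a true min-max rule; the paper's exact combination of base selectors is not reproduced.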
