Abstract
A new stopping criterion for active learning SVMs is proposed. It exploits the low-density separation idea, which is widely used in semi-supervised and unsupervised learning methods. When the separating hyperplane of an active learning SVM (ALSVM) lies in a sparse region of the feature space, its generalization error is expected to have reached a local minimum, and active learning can stop at that point. An improved Normalized Cut algorithm, Stochastic Search Normalized Cut (SSNcut), is employed to measure whether the SVM's separating hyperplane lies in a low-density region. In each active learning iteration, the label vector predicted by the SVM is compared with the one obtained from SSNcut. If the two label vectors are highly similar, the SVM's separating hyperplane can be considered to lie in a sparse region. When computing these label vectors, only the points nearest to the current separating hyperplane of the ALSVM are considered, which makes Normalized Cut less susceptible to outliers. Experiments show that the proposed stopping criterion strikes a good balance between generalization performance and labeling cost.
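The comparison step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the `margin` band used to select points near the hyperplane, and the agreement threshold are all assumptions; the SSNcut labels are taken as a given input rather than computed here.

```python
import numpy as np

def should_stop(decision_values, svm_labels, ncut_labels,
                margin=1.0, agreement_threshold=0.95):
    """Illustrative stopping check (names and thresholds are assumed).

    decision_values : signed distances of unlabeled points to the SVM hyperplane
    svm_labels      : SVM-predicted labels in {-1, +1}
    ncut_labels     : labels in {-1, +1} from a Normalized-Cut-style bipartition
    """
    # Consider only points nearest the current separating hyperplane,
    # which is what makes the cut less sensitive to outliers.
    near = np.abs(decision_values) < margin
    if near.sum() == 0:
        return False
    # Cluster labels are defined only up to a sign flip, so take the
    # better of the two possible alignments.
    agree = np.mean(svm_labels[near] == ncut_labels[near])
    agreement = max(agree, 1.0 - agree)
    # High agreement suggests the hyperplane lies in a low-density region.
    return agreement >= agreement_threshold
```

In practice the label agreement would be recomputed each iteration after the SVM is retrained on the newly queried labels, and active learning halts the first time the check passes.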
