Abstract
Determining the optimal number of clusters (NC) is a critical step in cluster analysis. Many cluster validity indices have been proposed for this purpose, such as the Silhouette index and the In-group proportion index. However, these indices have high time complexity. This paper proposes a new internal cluster validity index, based on sample geometry, for determining the optimal NC. The new index can evaluate the clustering quality of a given clustering algorithm and determine the optimal NC for many kinds of data sets, including synthetic, benchmark, and real data sets. Theoretical analysis and experimental results show that the proposed index is more effective and more efficient than many well-known validity indices.
