Abstract
Data clustering is an important data exploration technique with many applications in engineering, including parts family formation in group technology and segmentation in image processing. One of the most popular data clustering methods is K-means clustering because of its simplicity and computational efficiency. The main problem with this clustering method is its tendency to coverge at a local minimum. In this paper, the cause of this problem is explained and an existing solution involving a cluster centre jumping operation is examined. The jumping technique alleviates the problem with local minima by enabling cluster centres to move in such a radical way as to reduce the overall cluster distortion. However, the method is very sensitive to errors in estimating distortion. A clustering scheme that is also based on distortion reduction through cluster centre movement but is not so sensitive to inaccuracies in distortion estimation is proposed in this paper. The scheme, which is an incremental version of the K-means algorithm, involves adding cluster centres one by one as clusters are being formed. The paper presents test results to demonstrate the efficacy of the proposed algorithm.