Abstract
Stability techniques widely used in bioinformatics research evaluate clusterings with a predefined number of clusters. However, the complex nature of bio-molecular data calls for extending these techniques to validate the whole cluster hierarchy without fixing the number of clusters beforehand. In this paper we propose HClusterV, a stability-based algorithm for evaluating the individual clusters of a dendrogram. It is based on repeated construction of the cluster hierarchy followed by calculation of the original consensus matrix. The proposed algorithm overcomes the deficiency of the previous approach and improves the reliability of the stability indices. Experiments on two simulated datasets and a further comparative analysis confirmed the advantages of our approach. HClusterV was also evaluated on two real microarray datasets and gave results consistent with the corresponding non-hierarchical stability-based methods and with relevant biological knowledge.
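The general idea of a consensus matrix built from repeated clusterings can be sketched as follows. This is not the HClusterV algorithm, whose details are not given here; it is a minimal illustration of the generic fixed-k stability technique that the abstract takes as its starting point. The single-linkage rule, the 80% subsampling fraction, the number of runs, and all function names (`single_linkage`, `consensus_matrix`, `cluster_stability`) are assumptions made for illustration only.

```python
import random


def single_linkage(points, k):
    """Agglomerative single-linkage clustering cut at k clusters (assumed linkage rule)."""
    clusters = [[i] for i in range(len(points))]

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(points[a], points[b])) ** 0.5

    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters[j]
        del clusters[j]

    labels = [0] * len(points)
    for c, members in enumerate(clusters):
        for m in members:
            labels[m] = c
    return labels


def consensus_matrix(points, k, n_runs=30, frac=0.8, seed=0):
    """Fraction of subsampled runs in which each pair of items was co-clustered."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]  # times i and j shared a cluster
    sampled = [[0] * n for _ in range(n)]   # times i and j were drawn together
    size = max(k, int(frac * n))
    for _ in range(n_runs):
        idx = rng.sample(range(n), size)
        labels = single_linkage([points[i] for i in idx], k)
        for a in range(size):
            for b in range(size):
                i, j = idx[a], idx[b]
                sampled[i][j] += 1
                if labels[a] == labels[b]:
                    together[i][j] += 1
    return [
        [together[i][j] / sampled[i][j] if sampled[i][j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


def cluster_stability(consensus, members):
    """Mean consensus over all distinct pairs inside one candidate cluster."""
    pairs = [(i, j) for i in members for j in members if i < j]
    return sum(consensus[i][j] for i, j in pairs) / len(pairs)


# Toy data: two well-separated groups of four points each (illustrative only).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
consensus = consensus_matrix(points, k=2)
print(cluster_stability(consensus, [0, 1, 2, 3]))
```

A per-cluster score close to 1 means the members were almost always grouped together across subsamples, while a score near 0 means the grouping rarely recurred. A hierarchical validation scheme would need such a score for every node of the dendrogram rather than for a single cut, which is the extension the abstract describes.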
Keywords
