Abstract
The original self-organizing map (SOM) was proposed for processing numeric data. Previous studies proposed an extended SOM that incorporates a data structure, the distance hierarchy, to facilitate handling of categorical values. Via distance hierarchies, the model can take into account the semantics embedded in categorical values. In addition to manual construction by domain experts, an approach to learning distance hierarchies from datasets has been devised. However, that approach was based on supervised learning, which requires the presence of a class attribute in the dataset. In real-world applications, a class attribute may not be available, in which case the supervised approach is inapplicable. In this article, we present several methods for unsupervised learning of distance hierarchies, so that neither a class attribute nor domain experts are required to measure the degree of similarity between categorical values. We then integrate the learned distance hierarchies with the extended SOM to facilitate its application to datasets without a class attribute. We conduct experiments to verify the feasibility and compare the performance of the proposed unsupervised learning methods.
