This paper proposes a novel approach that improves the visualization capabilities of self-organizing maps and facilitates the identification of the resulting clusters. Unlike other clustering algorithms, self-organizing maps do not allow the number of clusters to be specified in advance, and the cluster boundaries are not explicitly represented on the map. The main advantage of the proposed approach is that the desired number of clusters can be selected. The experimental investigation was performed using four datasets with different characteristics. Several similarity distances were used in the improved visualization to assess their impact on performance. The clustering results of the proposed approach were compared with those of the well-known k-means and hierarchical clustering methods, both of which allow the desired number of clusters to be selected. Additionally, the visualization results obtained by the proposed approach were compared with those produced by the Orange Data Mining tool, in which the u-matrix is applied to visualize a self-organizing map, and the advantage of our approach over the u-matrix visualization is highlighted. For datasets in which the clusters are predefined, the performance of the clustering algorithms was measured as the ratio of data items correctly assigned to clusters. The results showed that the most effective similarity distances are the cosine and correlation distances, which help to correctly detect the predefined clusters in the visualization of self-organizing maps.
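As an illustration of the quantities named above, the following minimal sketch computes the cosine and correlation distances between two vectors and a ratio of correctly assigned data items. It is not the implementation used in this paper. The function names are hypothetical, the vectors are assumed to be non-zero (and non-constant for the correlation distance), and the rule that maps each cluster to its majority predefined class is an assumption, since the mapping is not specified here.

```python
from collections import Counter

import numpy as np


def cosine_distance(u, v):
    """Cosine distance: 1 minus the cosine of the angle between u and v.

    Assumes both vectors are non-zero.
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))


def correlation_distance(u, v):
    """Correlation distance: cosine distance between mean-centered vectors.

    Assumes neither vector is constant.
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return cosine_distance(u - u.mean(), v - v.mean())


def correct_assignment_ratio(true_labels, cluster_ids):
    """Ratio of items whose cluster's majority class equals their own class.

    Assumption: each cluster is mapped to the predefined class that occurs
    most often among its members.
    """
    correct = 0
    for cluster in set(cluster_ids):
        members = [t for t, c in zip(true_labels, cluster_ids) if c == cluster]
        # Count the members that belong to the majority class of this cluster.
        correct += Counter(members).most_common(1)[0][1]
    return correct / len(true_labels)
```

For example, with predefined classes `["a", "a", "a", "b", "b", "b"]` and cluster assignments `[0, 0, 1, 1, 1, 1]`, five of the six items fall in a cluster whose majority class matches their own, so the ratio is 5/6.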
This work is licensed under a Creative Commons Attribution 4.0 International License.