

Supervised Clustering based on a Multi-objective Genetic Algorithm

Vipa Thananant and Surapong Auwatanamongkol

Pertanika Journal of Science & Technology, Volume 27, Issue 1, January 2019

Keywords: Crowding genetic algorithm, data sampling, multi-objective optimization, Pareto optimal solutions, supervised clustering

Published on: 24 Jan 2019

Supervised clustering groups data instances into clusters using both the similarities between the instances and their class labels. It pursues several objectives at once: compactness of clusters, homogeneity of the data in each cluster with respect to class labels, and separateness of clusters. With these objectives in mind, this paper proposes SC-MOGA, a new supervised clustering algorithm based on a multi-objective crowding genetic algorithm. The algorithm searches for the optimal clustering solution that simultaneously achieves the three objectives mentioned above. SC-MOGA performs very well on a small dataset, but on a large dataset it may fail to converge to an optimal solution or may take a very long time to converge. Hence, a data sampling method based on the Bisecting K-Means algorithm is also introduced to find representatives for supervised clustering. This method groups the data instances of a dataset into small clusters, each containing data instances with the same class label; data representatives are then randomly selected from each cluster. The experimental results show that SC-MOGA with the proposed data sampling method is very effective: it outperforms three previously proposed supervised clustering algorithms, namely SRIDHCR, LK-Means and SCEC, in terms of four cluster validity indexes. The results also show that the proposed data sampling method not only reduces the number of data instances to be clustered by SC-MOGA, but also enhances the quality of the clustering results.
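
The data sampling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the stopping rule or the number of representatives per cluster, so the sketch assumes that a cluster is bisected (with plain 2-means) while it mixes class labels or exceeds a hypothetical `max_size`, and that `per_cluster` representatives are drawn at random from each final cluster. All function and parameter names are illustrative.

```python
import random


def _dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def bisect(idx, X, rng, iters=10):
    """Split the index list idx (len >= 2) into two parts with 2-means."""
    c0, c1 = (X[i] for i in rng.sample(idx, 2))
    for _ in range(iters):
        left = [i for i in idx if _dist2(X[i], c0) <= _dist2(X[i], c1)]
        right = [i for i in idx if _dist2(X[i], c0) > _dist2(X[i], c1)]
        if not left or not right:
            # Degenerate case (e.g. identical points): split in half
            # so that the procedure always terminates.
            half = len(idx) // 2
            return idx[:half], idx[half:]
        c0 = _mean([X[i] for i in left])
        c1 = _mean([X[i] for i in right])
    return left, right


def sample_representatives(X, y, max_size=5, per_cluster=1, seed=0):
    """Return (representative indices, final clusters as index lists).

    Assumed stopping rule: a cluster is final once all of its
    instances share one class label and it has at most max_size members.
    """
    rng = random.Random(seed)
    pending = [list(range(len(X)))] if X else []
    done = []
    while pending:
        idx = pending.pop()
        pure = len({y[i] for i in idx}) == 1
        if pure and len(idx) <= max_size:
            done.append(idx)
        else:
            pending.extend(bisect(idx, X, rng))
    reps = []
    for idx in done:
        # Random selection of representatives within each small cluster.
        reps.extend(rng.sample(idx, min(per_cluster, len(idx))))
    return sorted(reps), done
```

Calling `sample_representatives(X, y, max_size=3)` on a list of coordinate tuples `X` and a parallel list of class labels `y` returns the indices of the chosen representatives together with the small single-label clusters they were drawn from; only the representatives would then be passed on to the clustering stage.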

ISSN 0128-7680

e-ISSN 2231-8526

Article ID: JST-1047-2017