Review of Context-Based Similarity Measure for Categorical Data

Nurul Adzlyana, M. S., Rosma, M. D. and Nurazzah, A. R.

Pertanika Journal of Science & Technology, Volume 25, Issue 2, April 2017

Published: 27 Apr 2017

Data mining processes such as clustering, classification, regression and outlier detection are built on the similarity between two objects. Data mining of categorical data is found to be most challenging. Earlier similarity measures are context-free. In recent years, researchers have proposed context-sensitive similarity measures based on the relationships between objects. This paper provides an in-depth review of context-based similarity measures. The algorithms of four context-based similarity measures, namely the Association-based similarity measure, DILCA, CBDL and the hybrid context-based similarity measure, are described. The advantages and limitations of each measure are identified and explained. Context-based similarity measures are highly recommended for data mining tasks on categorical data. The findings of this paper will help data miners choose appropriate similarity measures to achieve more accurate classification or clustering results.

ISSN 0128-7702

e-ISSN 2231-8534

Article ID: JST-S0061-2016
