A Fuzzy Centroids Clustering Algorithm with Between-cluster Information for Categorical Data
Abstract:
This study presents a new fuzzy centroids clustering algorithm for categorical data. The objective function of the fuzzy k-modes algorithm is modified by adding between-cluster information, so that within-cluster dispersion is minimized and between-cluster separation is enhanced simultaneously. Because hard centroids can lead to misclassification, the algorithm uses fuzzy centroids together with the between-cluster information. At the feature level, the dissimilarity between an object and a centroid is defined as 1 minus the frequency of the object's feature value. Experiments on several real data sets from the UCI repository show that the proposed algorithm is effective and outperforms its counterpart with hard-type centroids.
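The feature-level dissimilarity described above can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so several details here are assumptions: the function names `fuzzy_centroid` and `dissimilarity`, the fuzziness exponent `m`, and the choice to compute each value's "frequency" as a membership-weighted relative frequency within the cluster. The between-cluster term of the objective function is not modeled.

```python
from collections import defaultdict


def fuzzy_centroid(objects, memberships, m=2.0):
    """Build one cluster's fuzzy centroid (hypothetical helper).

    For every attribute, the centroid stores the membership-weighted
    relative frequency of each category value. Weighting by u**m is an
    assumption; the abstract only speaks of "frequency".
    """
    n_attrs = len(objects[0])
    centroid = [defaultdict(float) for _ in range(n_attrs)]
    total = sum(u ** m for u in memberships)
    for obj, u in zip(objects, memberships):
        for j, value in enumerate(obj):
            centroid[j][value] += (u ** m) / total
    return centroid


def dissimilarity(obj, centroid):
    """Sum over attributes of 1 minus the frequency of the object's value."""
    # .get() avoids inserting unseen values into the centroid.
    return sum(1.0 - centroid[j].get(value, 0.0)
               for j, value in enumerate(obj))


# Toy data: three objects with two categorical attributes each,
# and their (assumed) membership degrees in a single cluster.
objects = [("red", "small"), ("red", "large"), ("blue", "small")]
memberships = [0.9, 0.8, 0.1]

centroid = fuzzy_centroid(objects, memberships)
print(dissimilarity(("red", "small"), centroid))
print(dissimilarity(("blue", "large"), centroid))
```

Under these assumptions, an object whose values are common among high-membership objects gets a small dissimilarity, while a value never seen in the cluster contributes the maximum of 1 for that attribute.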
How to cite this article
Wang Li-Na, Liu Qian and Zhou Yuan, 2013. A Fuzzy Centroids Clustering Algorithm with Between-cluster Information for Categorical Data. Information Technology Journal, 12: 5482-5486.