Information Technology Journal

Year: 2013 | Volume: 12 | Issue: 20 | Page No.: 5482-5486
DOI: 10.3923/itj.2013.5482.5486
A Fuzzy Centroids Clustering Algorithm with Between-cluster Information for Categorical Data
Wang Li-Na, Liu Qian and Zhou Yuan

Abstract: This study presents a new fuzzy centroids clustering algorithm for categorical data. The objective function of the fuzzy k-modes algorithm is modified by adding between-cluster information, so that it simultaneously minimizes the within-cluster dispersion and enhances the between-cluster separation. Because hard centroids can lead to misclassification, the algorithm uses fuzzy centroids together with the between-cluster information. The dissimilarity between an object and a centroid at the feature level is defined as 1 minus the frequency of the object's feature value in that centroid. On several real data sets from the UCI repository, the proposed algorithm is effective and outperforms its counterpart with hard-type centroids.
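
The feature-level dissimilarity described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `fuzzy_centroid` and `dissimilarity`, the weighting of each object by its membership raised to a fuzzifier `m`, and the summation over features are all assumptions made for illustration, and the between-cluster term of the objective function is not modelled here.

```python
from collections import defaultdict


def fuzzy_centroid(objects, memberships, m=1.5):
    """Build a fuzzy centroid: for each feature, a weighted frequency
    of every category value seen in the cluster.

    Assumption: each object contributes membership**m, normalised so the
    frequencies of one feature sum to 1.
    """
    weights = [u ** m for u in memberships]
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one membership must be positive")

    n_features = len(objects[0])
    centroid = [defaultdict(float) for _ in range(n_features)]
    for obj, w in zip(objects, weights):
        for j, value in enumerate(obj):
            centroid[j][value] += w / total
    return centroid


def dissimilarity(obj, centroid):
    """Sum over features of (1 - frequency of the object's value in the centroid).

    Values never seen in the cluster have frequency 0, so they cost 1.
    """
    return sum(1.0 - centroid[j].get(value, 0.0) for j, value in enumerate(obj))


if __name__ == "__main__":
    # Hypothetical toy data: two categorical features per object.
    data = [("red", "small"), ("red", "large"), ("blue", "large")]
    u = [1.0, 1.0, 0.0]  # memberships of each object in one cluster
    c = fuzzy_centroid(data, u, m=2.0)
    for obj in data:
        print(obj, dissimilarity(obj, c))
```

With a hard centroid (a single mode per feature) the feature-level cost can only be 0 or 1; in this sketch it varies continuously with how common the value is in the cluster, which is the property the abstract attributes to fuzzy centroids.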

How to cite this article
Wang Li-Na, Liu Qian and Zhou Yuan, 2013. A Fuzzy Centroids Clustering Algorithm with Between-cluster Information for Categorical Data. Information Technology Journal, 12: 5482-5486.

Keywords: Fuzzy clustering, fuzzy centroids, between-cluster information, categorical data
