Journal of Software Engineering
  Year: 2011 | Volume: 5 | Issue: 4 | Page No.: 116-126
DOI: 10.3923/jse.2011.116.126
Distributed K-means based-on Soft Constraints
Y.C. Yu, J.D. Wang, G.S. Zheng and Y. Jiang

Pairwise constraints can effectively improve clustering results. However, noisy constraints seriously degrade clustering performance. To improve distributed clustering with constraints, this paper presents distributed k-means based on soft constraints, in which constraint violations are handled effectively. Because distributed clustering is limited by communication cost, data privacy and similar concerns, the proposed method uses only positive constraints, expressed as chunklets. To simplify the treatment of constrained data points, the mean value of each chunklet is used as its representative point. The positive constraints within a chunklet are then approximately transformed into pairwise positive constraints between each data point in the chunklet and the mean value. Thus, the cluster label of each mean value is regarded as the label estimate for the data points in that chunklet. Based on this approximation, a new measure of partition cost is defined to deal with constraint violations. For unconstrained data points, the within-cluster sum of squared distances is minimized. For constrained data points, the sum of the distances between the data points and their corresponding centroids, together with the cost of constraint violations, is minimized as well. The experimental results showed that the proposed method reduces the computational complexity of handling constraint violations and achieves higher clustering accuracy than hard-constrained distributed clustering.
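The assignment step described in the abstract can be illustrated with a minimal single-site sketch. It is not the authors' implementation: the abstract does not give the exact partition cost, so the additive `penalty` term, the function names, and the toy data below are all assumptions made for illustration. The sketch takes the mean of each chunklet as its representative point, uses the cluster nearest to that mean as the label estimate, and charges a fixed penalty when a constrained point is assigned to a different cluster.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)), key=lambda k: sq_dist(point, centroids[k]))


def assign_soft(points, chunklets, centroids, penalty):
    """Assign points to clusters with soft positive constraints.

    chunklets: lists of point indices that should share a cluster.
    penalty:   hypothetical cost of violating the estimated chunklet label
               (an assumption; the paper's actual cost measure may differ).
    """
    # Unconstrained points simply go to their nearest centroid.
    labels = [nearest(p, centroids) for p in points]
    for chunklet in chunklets:
        # The chunklet mean acts as the representative point.
        rep = mean_point([points[i] for i in chunklet])
        # Its nearest cluster is the label estimate for the whole chunklet.
        est = nearest(rep, centroids)
        for i in chunklet:
            # Cost = squared distance plus a penalty for disagreeing with est.
            labels[i] = min(
                range(len(centroids)),
                key=lambda k: sq_dist(points[i], centroids[k])
                + (0.0 if k == est else penalty),
            )
    return labels


# Toy data (made up): points 0, 1 and 2 form one chunklet; point 3 is unconstrained.
points = [(1.0, 0.0), (2.0, 0.0), (6.0, 0.0), (9.0, 0.0)]
chunklets = [[0, 1, 2]]
centroids = [(0.0, 0.0), (10.0, 0.0)]

no_penalty = assign_soft(points, chunklets, centroids, penalty=0.0)
with_penalty = assign_soft(points, chunklets, centroids, penalty=30.0)
print(no_penalty)    # [0, 0, 1, 1]
print(with_penalty)  # [0, 0, 0, 1]
```

With a zero penalty the third point leaves its chunklet for the closer centroid; with a sufficiently large penalty it stays with the label estimated from the chunklet mean. Because the constraint is a cost rather than a hard rule, a noisy constraint can still be overridden when the distance term dominates. The distributed part of the method (exchange of information between sites) is not covered by this sketch.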
How to cite this article:

Y.C. Yu, J.D. Wang, G.S. Zheng and Y. Jiang, 2011. Distributed K-means based-on Soft Constraints. Journal of Software Engineering, 5: 116-126.
