Information Technology Journal (ISSN 1812-5638, 1812-5646), Asian Network for Scientific Information
T. Velmurugan and T. Santhanam, 2011, volume 10, issue 3, DOI 10.3923/itj.2011.478.484

Clustering is one of the most important research areas in
the field of data mining. Clustering means creating groups of objects based
on their features in such a way that objects belonging to the same group
are similar and those belonging to different groups are dissimilar. Clustering
is an unsupervised learning technique. Data clustering is the subject of active
research in several fields such as statistics, pattern recognition and machine
learning. From a practical perspective, clustering plays a prominent role
in data mining applications in many domains. The main advantage of clustering
is that interesting patterns and structures can be found directly from very
large data sets with little or no background knowledge. Clustering
algorithms can be applied in many areas, for instance marketing, biology, libraries,
insurance, city planning, earthquake studies and web document classification.
Data mining adds to clustering the complications of very large datasets with
many attributes of different types. This imposes unique computational requirements
on relevant clustering algorithms. A variety of algorithms have recently emerged
that meet these requirements and have been successfully applied to real-life data
mining problems. They are the subject of this survey. This survey also explores
the behavior of some partition-based clustering algorithms and their
