Journal of Applied Sciences
  Year: 2009 | Volume: 9 | Issue: 20 | Page No.: 3739-3745
DOI: 10.3923/jas.2009.3739.3745

A Survey of Distributed Classification Based Ensemble Data Mining Methods

D. Mokeddem and H. Belbachir

Distributed classification is a distributed data mining task that predicts whether a data instance belongs to a predefined class. It serves two different objectives: the first is to scale algorithms up to large data sets by distributing the data, which increases overall efficiency; the second is to process data that are inherently distributed and autonomous. Ensemble learning methods, which are very promising in terms of accuracy and also offer a distributed aspect, can be adapted to distributed data mining. This study surveys approaches that use ensemble learning methods for distributed classification, with decision tree algorithms as base classifiers. In line with the two objectives mentioned above, most of the work reported in the literature addresses the problem using one of two techniques: adapting ensemble learning methods to disjoint data sets, in the context of mining inherently distributed data, or parallelizing ensemble learning methods, in the context of scalability. The survey suggests that the work done from these two perspectives (scaling up data mining algorithms and mining inherently distributed data) could be complementary. Some open questions, current debates and future directions are also discussed.
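As an illustration of the first technique, adapting an ensemble to disjoint data sets, the sketch below trains one base classifier per site and combines the site models by majority vote. It is a minimal example under stated assumptions, not a method taken from the surveyed papers: the base learner is a depth-1 decision tree (a decision stump) written in plain Python, and the names `make_site`, `train_stump`, `train_ensemble` and `predict`, as well as the synthetic data, are hypothetical.

```python
from collections import Counter


def make_site(offset, n=20):
    """Build a synthetic local data set (assumption: label is 1 when x0 > 0.5)."""
    rows = []
    for i in range(n):
        x0 = (i + offset) / n
        x1 = ((i * 7) % n) / n  # second feature, unrelated to the label
        rows.append(((x0, x1), int(x0 > 0.5)))
    return rows


def train_stump(rows):
    """Fit a depth-1 decision tree on rows of (features, label)."""
    best = None  # (errors, feature index, threshold, left label, right label)
    n_features = len(rows[0][0])
    for f in range(n_features):
        for threshold in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] <= threshold]
            right = [y for x, y in rows if x[f] > threshold]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    _, f, threshold, left_label, right_label = best
    return lambda x: left_label if x[f] <= threshold else right_label


def train_ensemble(partitions):
    """Train one model per disjoint partition; raw data never leaves its site."""
    return [train_stump(rows) for rows in partitions]


def predict(ensemble, x):
    """Combine the site models by unweighted majority vote."""
    votes = Counter(model(x) for model in ensemble)
    return votes.most_common(1)[0][0]


# Three disjoint "sites", each holding its own local sample.
sites = [make_site(0.1), make_site(0.4), make_site(0.7)]
ensemble = train_ensemble(sites)
print(predict(ensemble, (0.9, 0.1)))
print(predict(ensemble, (0.1, 0.9)))
```

Only the fitted models are combined here, which is what makes this style of ensemble a natural fit for inherently distributed data. Because the per-site training calls are independent, the same loop could also be run in parallel, which corresponds to the scalability perspective.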
How to cite this article:

D. Mokeddem and H. Belbachir, 2009. A Survey of Distributed Classification Based Ensemble Data Mining Methods. Journal of Applied Sciences, 9: 3739-3745.

DOI: 10.3923/jas.2009.3739.3745





