Chunquan Liang
College of Mechanical and Electronic Engineering, Northwest A and F University, Yangling, 712100, Shaanxi, People's Republic of China
Yang Zhang
College of Mechanical and Electronic Engineering, Northwest A and F University, Yangling, 712100, Shaanxi, People's Republic of China
Shaojun Hu
College of Information Engineering, Northwest A and F University, Yangling, 712100, Shaanxi, People's Republic of China
ABSTRACT
This study addresses the problem of classifying uncertain data streams. Building on the CVFDT algorithm, we propose a novel algorithm, uCVFDTc, that learns very fast decision trees from uncertain data streams with concept drift. In the training phase, uCVFDTc uses the Hoeffding bound to build reasonable decision trees quickly. In the classification phase, it applies Uncertain Naive Bayes (UNB) classifiers at the tree leaves to improve classification performance. Experimental results show that uCVFDTc learns effectively from uncertain data streams and copes with concept drift, and that using UNB at the tree leaves improves its performance, especially its ability to handle concept drift.
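The abstract does not state the bound itself, so the following is only a minimal sketch of the standard Hoeffding-bound split test used by VFDT-style learners, not the authors' uCVFDTc implementation. The function names, the `tie_threshold` parameter, and the example numbers are assumptions made for illustration; the handling of uncertain attribute values and the UNB leaf classifiers are not shown.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability 1 - delta, the true mean of a
    variable with the given range lies within epsilon of the mean of n samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float, n_classes: int,
                 delta: float, n: int, tie_threshold: float = 0.05) -> bool:
    """Decide whether a leaf has seen enough examples to split.

    Assumes the split measure is information gain, whose range is
    log2(n_classes). `tie_threshold` is a hypothetical tie-breaking
    parameter: when epsilon falls below it, the two best attributes are
    treated as equally good and the leaf splits anyway.
    """
    value_range = math.log2(n_classes)
    eps = hoeffding_bound(value_range, delta, n)
    return (best_gain - second_gain) > eps or eps < tie_threshold


if __name__ == "__main__":
    # Illustrative numbers only: two classes, 1000 examples at the leaf.
    eps = hoeffding_bound(1.0, 1e-7, 1000)
    print(f"epsilon after 1000 examples: {eps:.4f}")
    print("split on a clear winner:", should_split(0.30, 0.10, 2, 1e-7, 1000))
    print("split on a near tie:   ", should_split(0.30, 0.25, 2, 1e-7, 1000))
```

Because epsilon shrinks in proportion to 1/sqrt(n), a leaf that cannot yet separate its two best attributes simply waits for more examples, which is what lets this family of learners process each stream example once.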
How to cite this article
Chunquan Liang, Yang Zhang and Shaojun Hu, 2013. Learning Decision Trees from Time-changing Uncertain Data Streams. Information Technology Journal, 12: 8469-8475.
DOI: 10.3923/itj.2013.8469.8475
URL: https://scialert.net/abstract/?doi=itj.2013.8469.8475