Yan Zhang
School of Computer Science and Technology, Shandong University, Jinan, 250014, China
Qingzhong Li
School of Computer Science and Technology, Shandong University, Jinan, 250014, China
Rui Zhang
Shandong Provincial Institute of Electronic Products Supervision and Inspection, Jinan, 250014, China
Peiguang Lin
School of Computer Science and Technology, Shandong University of Finance and Economics, Jinan, 250014, China
ABSTRACT
With the rapidly increasing number of Web databases (WDBs), a core issue in Web data integration is selecting the most appropriate combination of databases to query, so that more targeted data can be obtained at a smaller cost. In this study, to reduce redundant data retrieved from different sources, we propose a novel method of Web database redundancy computing for selecting suitable Web data sources for given keywords. We first propose a Web database feature representation model; then, based on sample data from the sources, we put forward a deep Web redundancy computing method that considers three data types: text attributes, numeric attributes and categorical attributes. Experiments show that the method achieves the desired objectives and meets the demands of the integrated system well.
How to cite this article
Yan Zhang, Qingzhong Li, Rui Zhang and Peiguang Lin, 2013. Novel Method of Web Database Redundancy Computing for Web Data Sources Selection. Information Technology Journal, 12: 5216-5220.
DOI: 10.3923/itj.2013.5216.5220
URL: https://scialert.net/abstract/?doi=itj.2013.5216.5220