Information Technology Journal

Year: 2013 | Volume: 12 | Issue: 19 | Page No.: 5216-5220
DOI: 10.3923/itj.2013.5216.5220
Novel Method of Web Database Redundancy Computing for Web Data Sources Selection
Yan Zhang, Qingzhong Li, Rui Zhang and Peiguang Lin

Abstract: As the number of Web databases (WDBs) grows rapidly, a core issue in Web data integration is selecting the most appropriate combination of databases to query, so that more targeted data is obtained at lower cost. To reduce redundant data retrieved from different sources, this study proposes a novel method for computing redundancy among Web databases and uses it to select suitable Web data sources for given keywords. The method builds on a feature representation model for Web databases and, based on sample data drawn from the sources, computes deep Web redundancy over three attribute types: text, numeric and categorical. Experiments show that the method achieves the desired objectives and meets the demands of the integration system.

How to cite this article
Yan Zhang, Qingzhong Li, Rui Zhang and Peiguang Lin, 2013. Novel Method of Web Database Redundancy Computing for Web Data Sources Selection. Information Technology Journal, 12: 5216-5220.

Keywords: Deep web, redundancy computing, data source selection
