Information Technology Journal

Year: 2013 | Volume: 12 | Issue: 22 | Page No.: 6710-6716
DOI: 10.3923/itj.2013.6710.6716
A Domain Web Data Standardization Organization Method
Cao Rui, Wang Rui, Hao Li-Yun and Wu Ling-Da

Abstract: To enable Web data to be used in non-Internet environments and to overcome the timeliness limitations of Web data, this study proposes a standardized organization method for domain Web data. A standardization organization framework for domain Web data is established, and the acquired data are divided into three categories: structured, unstructured and semi-structured. For semi-structured data with an implicit schema, a rational coding design is used to transform the data into structured data. By combining a file system with a relational database, a standardized organization method is established for all three types of data. Experimental results show that the method is effective and efficient.
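The abstract does not specify the coding design, so the following is only a minimal sketch of the general idea of encoding semi-structured data into relational rows. It assumes the semi-structured input is an XML fragment and uses Dewey-style path labels ("1", "1.2", "1.2.1") as a stand-in encoding; the function names `encode_tree` and `store_rows`, the table name `node`, and the sample data are all hypothetical and not taken from the paper.

```python
import sqlite3
import xml.etree.ElementTree as ET


def encode_tree(xml_text):
    """Flatten an XML fragment into (label, parent_label, tag, text) rows.

    Labels are Dewey-style paths: an illustrative choice, not the
    encoding defined in the paper.
    """
    rows = []

    def walk(node, label, parent_label):
        text = (node.text or "").strip()
        rows.append((label, parent_label, node.tag, text))
        # Children are numbered by their position under the parent.
        for position, child in enumerate(node, start=1):
            walk(child, f"{label}.{position}", label)

    walk(ET.fromstring(xml_text), "1", None)
    return rows


def store_rows(conn, rows):
    """Persist the encoded rows in a relational table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS node ("
        "label TEXT PRIMARY KEY, parent_label TEXT, tag TEXT, text TEXT)"
    )
    conn.executemany("INSERT INTO node VALUES (?, ?, ?, ?)", rows)
    conn.commit()


# Hypothetical sample record; real domain data would come from the crawler.
SAMPLE = "<item><name>router</name><spec><port>8</port></spec></item>"

conn = sqlite3.connect(":memory:")
rows = encode_tree(SAMPLE)
store_rows(conn, rows)

# Structural queries become plain SQL, e.g. the direct children of the root.
children = conn.execute(
    "SELECT tag, text FROM node WHERE parent_label = ? ORDER BY label", ("1",)
).fetchall()
```

Once each node carries a label and a parent label, the tree structure is recoverable from an ordinary table. Unstructured files could plausibly be kept in the file system with only their paths recorded in such a database, although the abstract does not detail that part.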

How to cite this article
Cao Rui, Wang Rui, Hao Li-Yun and Wu Ling-Da, 2013. A Domain Web Data Standardization Organization Method. Information Technology Journal, 12: 6710-6716.

Keywords: Web data, domain, semi-structured data, unstructured data and organization

© Science Alert. All Rights Reserved