Subscribe Now Subscribe Today
Research Article
 

Path Query Processing in Large-Scale XML Databases



Su-Cheng Haw and G.S.V. Radha Krishna Rao
 
Facebook Twitter Digg Reddit Linkedin StumbleUpon E-mail
ABSTRACT

With the ever-increasing popularity of XML (Extensible Markup Language) as data representation and exchange on the Internet, querying XML data has become an important issue to be address. In Native XML Database (NXD), XML documents are usually modeled as trees and XML queries are typically specified in path expression. In path expression, the primitive structural relationships are Parent-Child (P-C) and Ancestor-Descendant (A-D). Thus, finding all occurrences of these relationships is crucial for XML query processing. Current methods for query processing on NXD usually employ either sequential traversing of tree-structured model or a decomposition-matching-merging processes. We adopt the later approach and propose a novel hybrid query optimization technique, INLAB comprising both indexing and labeling technologies. Furthermore, we also propose several algorithms to create INLAB encoding and analyze the path query. We implemented our technique and present performance results over several benchmarking datasets, which prove the viability of our approach.

Services
Related Articles in ASCI
Similar Articles in this Journal
Search in Google Scholar
View Citation
Report Citation

 
  How to cite this article:

Su-Cheng Haw and G.S.V. Radha Krishna Rao, 2007. Path Query Processing in Large-Scale XML Databases. Journal of Applied Sciences, 7: 2736-2743.

DOI: 10.3923/jas.2007.2736.2743

URL: https://scialert.net/abstract/?doi=jas.2007.2736.2743

REFERENCES
Bruno, N., D. Srivastava and N. Koudas, 2002. Holistic twig joins: Optimal XML pattern matching. Proceedings of the International Conference on Management of Data, June 3-6, 2002, Madison, Wisconsin, pp: 310-321.

Chen, Q., A. Lim, K. Ong and J. Tang, 2003. D(k)-index: An adaptive structural summary for graph-structured data. Proceedings of the International Conference on Management of Data, June 9-12, 2003, San Diego, California, pp: 134-144.

Chen, Y., S. Padmanabhan, G.A. Mihaila and S.B. Davidson, 2004. Efficient path query processing on encoded XML. Proceeding of International Workshop on High Performance XML Processing.

Chien, S.Y., Z. Vagena and D. Zhang, 2002. Structural joins: A primitive for efficient XML query pattern matching. Proceedings of the International Conference on Data Engineering, February 26-March 1, 2002, San Jose, USA., pp: 141-152.

Chung, CW., J.K. Min and K. Shim, 2002. APEX: An adaptive path index for XML data. Procedings of the ACM SIGMOD International Conference on Management of Data, June 3-6, 2002, Madison, Wisconsin, pp: 121-132.

Cohen, E., H. Kaplan and T. Milo, 2002. Labeling dynamic XML trees. Proceedings of the 21st ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, June 3-5, 2002, Madison, Wisconsin, pp: 271-281.

Cooper, B.F., N. Sample, M.J. Franklin, G.R. Hjaltason and M. Shadmon, 2001. A fast index for semistructured data. Proceedings of the 27th International Conference on Very Large Data Bases, September 11-14, 2001, IEEE Xplore, London, pp: 341-350.

Dietz, P.F., 1982. Maintaining order in a linked list. Acta Inform., 21: 122-127.
Direct Link  |  

Goldman, R. and J. Widom, 1997. Data guides: Enabling query formulation and optimization in semistructured databases. Proceedings of the International Conference Very Large Data Bases, August 26-29, 1997, Athens, Greece, pp: 436-445.

Green, T.J., G. Miklau, M. Onizuka and D. Suciu, 2003. Processing XML streams with deterministic automata. Lecture Notes Comput. Sci., 2572: 173-189.
Direct Link  |  

Haw, S.C. and G.S.V.R.K. Rao, 2005. Query optimization techniques for XML databases. Int. J. Inform. Technol., 2: 97-104.
Direct Link  |  

Haw, S.C. and G.S.V.R.K. Rao, 2007. An efficient path query processing support for parent-child relationship in native XML databases. J. Digital Inform. Manage., 2: 82-87.

He, H. and J. Yang, 2004. Multiresolution indexing of XMl for frequent queries. Proceedings of the 20th International Conference on Data Engineering, March 30-April 2, 2004, IEEE Xplore, London, pp: 683-694.

Kaushik, R., D. Shenoy, P. Bohannon and E. Gudes, 2002. Exploiting local similarity to efficiently ondex paths in graph-structured data. Proceedings of the 18th International Conference Data Enreering, February 26-March 1, 2002, San Jose, CA, USA., pp: 129-140.

Kim, J., S.H. Lee and H-J. Kim, 2004. Efficient structural joins with clusters extents. Inform. Proc. Lett., 91: 69-75.
Direct Link  |  

Kimber, W.E., 1993. HyTime and SGML: Understanding the HyTime HYQ query language. http://ftp2.de.freebsd.org/pub/sgml/ifi.uio.no/yTime/HyQ-1.

Kiss, A. and V.L. Anh, 2005. Combining tree structure indexes with structural indexes in query evaluation on XML data. Lecture Notes Comput. Sci., 3631: 254-267.
Direct Link  |  

Li, Q. and B. Moon, 2001. Indexing and querying XML data for regular path expressions. Proceedings of the 27th International Conference on Very Large Data Bases, September 11-14, 2001, IEEE Xplore, London, pp: 361-370.

Lian, W., N. Mamoulist, David W.L. Cheung and S.M. Yiu, 2005. Indexing useful structural patterns for XML query processing. IEEE Trans. Knowledge Data Eng., 17: 997-1009.
Direct Link  |  

Lu, J. and T.W. Ling, 2004. Labeling and querying dynamic XML trees. Lecture Notes Comput. Sci., 3007: 180-189.
Direct Link  |  

Lu, J., T. Chen and T.W. Ling, 2004. Efficient processing of XML twig patterns with parent child edges: A look-ahead approach. Proceedings of the 13th ACM International Conference on Information and Knowledge Management, November 8-13, 2004, Washington, DC., USA., pp: 533-542.

Milo, T. and D. Suciu, 1999. Index structures for path expression. Proceedings of the 7th International Conference on Database Theory, January 10-12, 1999, IEEE Xplore, London, pp: 277-295.

O'Neil, P., E. O'Neil, S. Pal, I. Cseri, G. Schaller and N. Westbury, 2004. ORDPATHS: Insert-friendly XML node labels. Proceedings of the 2004 ACM SIGMOD international conference on Management of Data, June 13-18, 2004, Paris, France, pp: 903-908.

Peng, F. and S.S. Chawathe, 2003. XPath queries on streaming data. Proceedings of the International Conference on Management of Data, June 9-12, 2003, San Diego, California, pp: 431-442.

Sigmod, 2002. Sigmod database ACM. http://www.sigmod.org/record/xml.

Tatarinov, I., S. Viglas, K.S. Beyer, J. Shanmugasundaram, E.J. Shekita and C. Zhang, 2002. Storing and querying ordered XML using a relational database system. Proceedings of the International Conference on Management of Data, June 3-6, 2002, Madison, Wisconsin, pp: 204-215.

Thonangi, R., 2006. A concise labeling scheme for XML data. Proceeding of the International Conference on Management of Data COMAD, December 14-16, 2006, Delhi, India, pp: 1-11.

UW, 2002. University of Washington XML repository. http://www.cs.washington.edu/research/ xmldatasets/.

Wu, X., M.L. Lee and W. Hsu, 2004. A prime number labeling scheme for dynamic ordered XML tree. Proceedings of the 20th International Conference on Data Engineering, March 30-April 2, 2004, IEEE Xplore, London, pp: 66-78.

Wu, Y., J.M. Patel and H. V.Jagadish, 2003. Structural join order selection for XML query optimization. Proceedings of the International Conference on Data Engineering, March 5-8, 2003, Michigan University, MI, USA., pp: 443-454.

Yan, L. and Z. Liang, 2005. Multiple schema based XML indexing. Lecture Notes Comput. Sci., 3619: 891-900.
Direct Link  |  

Yao, J.T. and M. Zhang, 2004. A fast tree pattern matching algorithm for XML query. Proceeding of the International Conference Web Intelligence, September 20-24, 2004, IEEE Xplore, London, pp: 235-241.

Zhang, C., J. Naughton, D. DeWitty, Q. Luo and G. Lohman, 2001. On supporting containment queries in relational database management systems. Proc. ACM SIGMOD, 30: 425-436.
Direct Link  |  

Zhang, W., D. Liu and J. Li, 2004. An encoding scheme for indexing XML data. Proceedings of the Internatinal Conference On e-Technology, e-Commernce and e-Service, March 28-31, 2004, IEEE Xplore, London, pp: 525-528.

Zou, Q., S. Liu and W. Wesley Chu, 2004. Ctree: A compact tree for indexing XML data. Proceedings of the 6th Annual ACM International Workshop on Web Information and Data Management, November 12-13, 2004, Washington, DC., USA., pp: 39-46.

©  2020 Science Alert. All Rights Reserved