Please use this identifier to cite or link to this item: http://scholarbank.nus.edu.sg/handle/10635/20927
Title: Labeling dynamic XML documents: An order-centric approach
Authors: XU LIANG
Keywords: Dynamic XML, Labeling scheme, Order, Update, Dewey
Issue Date: 24-Aug-2010
Source: XU LIANG (2010-08-24). Labeling dynamic XML documents: An order-centric approach. ScholarBank@NUS Repository.
Abstract: The rise of xml as a de facto standard for data exchange and representation has generated a lot of interest on querying XML documents that conform to an ordered tree-structured data model. Labeling schemes facilitate XML query processing by assigning each node in the XML tree a unique label[8, 22, 35, 44, 51]. Structural relationships of the tree nodes, such as Parent/Child (PC), Ancestor/Descendant(AD), Sibling and Document order, can be efficiently established by comparing their labels. In this thesis, we explore static and dynamic XML labeling schemes from a novel order-centric perspective: We systematically study the various labeling schemes proposed in the literature with a special focus on their orders of labels. We develop an order-based framework to classify and characterize XML labeling schemes, based on which we show that the order of labels fundamentally impacts the update processing of a labeling scheme[48]. We introduce a novel order concept, vector order[46], which is the foundation of the dynamic labeling schemes we propose. Compared with previous solutions that are based on natural order, lexicographical order or VLEI order[9, 22, 32?35, 38, 44, 51], vector order is a simple, yet most effective solution to process updates in XML DBMS. We illustrate the application of vector order to both range-based and prefix-based labeling schemes, including Pre/post[22], Containment[51] and Dewey labeling schemes[44] to efficiently process updates without re-labeling. Since updates are usually unpredictable, we argue that a single labeling scheme should be used for both static and dynamic XML documents. Previous dynamic XML labeling schemes, however, suffer from the complexity introduced by their insertion techniques even if there is little/no update. To further improve the application of vector order to prefix-based labeling schemes, we extend the concept of vector order and introduce Dynamic DEwey (DDE) labeling scheme[49]. DDE, in the static setting, is the same as Dewey labeling scheme which is designed for static XML documents. In addition, based on an extension of vector order, DDE allows dynamic updates without re-labeling when updates take place. We introduce a variant of DDE, namely CDDE, which is derived from DDE labeling scheme from a one-to-one mapping. Compared with DDE, CDDE labeling scheme shows slower growth in label size for frequent insertions. Both DDE and CDDE have exhibited high resilience to skewed insertions in which case the qualities of existing labeling schemes degrade severely. Qualitative and experimental evaluations confirm the benefits of our approach compared to previous solutions. Lastly, we focus on improving the efficiency of applying vector order to rangebased labeling schemes[47]. We present in this thesis a generally applicable Search Tree-based (ST) encoding technique which can be applied to vector order as well as existing encoding schemes[32?34]. We illustrate the applications of ST encoding technique and show that it can generate dynamic labels of optimal size. In addition, when combining with encoding table compression, we are able to process very large XML documents with limited memory available. Experimental results demonstrate the advantages of our encoding technique over the previous encoding algorithms.
URI: http://scholarbank.nus.edu.sg/handle/10635/20927
Appears in Collections:Ph.D Theses (Open)

Show full item record
Files in This Item:
File Description SizeFormatAccess SettingsVersion 
XuL.pdf653.74 kBAdobe PDF

OPEN

NoneView/Download

Page view(s)

356
checked on Dec 11, 2017

Download(s)

578
checked on Dec 11, 2017

Google ScholarTM

Check


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.