An efficient and compact indexing scheme for large-scale data store

Please use this identifier to cite or link to this item: https://doi.org/10.1109/ICDE.2013.6544836

DC Field	Value
dc.title	An efficient and compact indexing scheme for large-scale data store
dc.contributor.author	Lu, P.
dc.contributor.author	Wu, S.
dc.contributor.author	Shou, L.
dc.contributor.author	Tan, K.-L.
dc.date.accessioned	2014-07-04T03:11:20Z
dc.date.available	2014-07-04T03:11:20Z
dc.date.issued	2013
dc.identifier.citation	Lu, P., Wu, S., Shou, L., Tan, K.-L. (2013). An efficient and compact indexing scheme for large-scale data store. Proceedings - International Conference on Data Engineering: 326-337. ScholarBank@NUS Repository. https://doi.org/10.1109/ICDE.2013.6544836
dc.identifier.isbn	9781467349086
dc.identifier.issn	10844627
dc.identifier.uri	http://scholarbank.nus.edu.sg/handle/10635/78007
dc.description.abstract	The amount of data managed in today's Cloud systems has reached an unprecedented scale. In order to speed up query processing, an effective mechanism is to build indexes on attributes that are used in query predicates. However, conventional indexing schemes fail to provide a scalable service: as the size of these indexes is proportional to the data size, it is not space efficient to build many indexes. As such, it becomes more crucial to develop effective indexes to provide scalable database services in the Cloud. In this paper, we propose a compact bitmap indexing scheme for a large-scale data store. The bitmap indexing scheme combines state-of-the-art bitmap compression techniques, such as WAH encoding and bit-sliced encoding. To further reduce the index cost, a novel and query-efficient partial indexing technique is adopted, which dynamically refreshes the index to handle updates and process queries. The intuition of our indexing approach is to maximize the number of indexed attributes, so that a wider range of queries, including range and join queries, can be efficiently supported. Our indexing scheme is lightweight and its creation can be seamlessly grafted onto the MapReduce processing engine without incurring significant running cost. Moreover, the compactness allows us to maintain the bitmap indexes in memory so that the performance overhead of index access is minimal. We implement our indexing scheme on top of the underlying Distributed File System (DFS) and evaluate its performance on an in-house cluster. We compare our index-based query processing with HadoopDB to show its superior performance. Our experimental results confirm the effectiveness, efficiency and scalability of the indexing scheme. © 2013 IEEE.
dc.description.uri	http://libproxy1.nus.edu.sg/login?url=http://dx.doi.org/10.1109/ICDE.2013.6544836
dc.source	Scopus
dc.type	Conference Paper
dc.contributor.department	COMPUTER SCIENCE
dc.description.doi	10.1109/ICDE.2013.6544836
dc.description.sourcetitle	Proceedings - International Conference on Data Engineering
dc.description.page	326-337
dc.identifier.isiut	NOT_IN_WOS
Appears in Collections:	Staff Publications
