Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/39529
DC FieldValue
dc.titleOptimality and scalability in lattice histogram construction
dc.contributor.authorKarras, P.
dc.date.accessioned2013-07-04T07:43:40Z
dc.date.available2013-07-04T07:43:40Z
dc.date.issued2009
dc.identifier.citationKarras, P. (2009). Optimality and scalability in lattice histogram construction. Proceedings of the VLDB Endowment 2 (1) : 670-681. ScholarBank@NUS Repository.
dc.identifier.issn21508097
dc.identifier.urihttp://scholarbank.nus.edu.sg/handle/10635/39529
dc.description.abstractThe Lattice Histogram is a recently proposed data summarization technique that achieves approximation quality preferable to that of an optimal plain histogram. Like other hierarchical synopsis methods, a lattice histogram (LH) aims to approximate data using a hierarchical structure. Still, this structure is not defined a priori; it consists an unknown, not a given, of the problem. Past work has defined the properties that an LH needs to obey and developed general-purpose approximation algorithms for the construction thereof. Still, two major issues remain unaddressed: First, the construction of an optimal LH for a given error metric is a problem unsolved to date. Second, the proposed algorithms suffer from too high space and time complexities that render their application in real-world settings problematic. In this paper, we address both these questions, focusing on the case that the target error metric is a maximum error metric. Our algorithms treat both the error-bounded LH construction problem, in which the space occupied by an LH is minimized under an error constraint, as well as the classic space-bounded problem. First, we develop a dynamicprogramming scheme that detects an optimal LH under a given maximum-error bound. Second, we propose an efficient, practical, greedy algorithm that solves the same problem with much lower time and space requirements. Then, we show how both our algorithms can be applied to the classic space-bounded problem, aiming at minimizing error under a bound on space. Our experimental study with real-world data sets shows the effectiveness of our methods compared to competing summarization techniques. Moreover, our findings show that our greedy heuristic performs almost as well as the optimal solution in terms of accuracy. © 2009 VLDB Endowment.
dc.sourceScopus
dc.typeArticle
dc.contributor.departmentCOMPUTER SCIENCE
dc.description.sourcetitleProceedings of the VLDB Endowment
dc.description.volume2
dc.description.issue1
dc.description.page670-681
dc.identifier.isiutNOT_IN_WOS
Appears in Collections:Staff Publications

Show simple item record
Files in This Item:
There are no files associated with this item.

Google ScholarTM

Check


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.