Title: Using Map-reduce to scale an empirical database
Keywords: Map-Reduce, UpSizeR, Scale, Dataset, Database, Hadoop
Issue Date: 19-Dec-2011
Citation: SHEN ZHONG (2011-12-19). Using Map-reduce to scale an empirical database. ScholarBank@NUS Repository.
Abstract: Datasets are crucial for testing in both industry and academia. However, obtaining a dataset that has a suitable size and reflects the properties of real data is not easy. Unlike conventional domain-specific benchmarks, UpSizeR is a tool that takes an empirical dataset D and a scale factor s as input and generates a synthetic dataset that preserves the properties of the original dataset but is s times its size. UpSizeR is implemented using Map-Reduce so that it can handle large datasets efficiently. We further optimize the implementation to reduce I/O operations. To evaluate how similar the two datasets are, we run queries on both the synthetic and the original dataset and compare the results.
Appears in Collections: Master's Theses (Open)

Files in This Item:
File: ShenZ.pdf
Size: 1.8 MB
Format: Adobe PDF



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.