Using Map-reduce to scale an empirical database

Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/30705

DC Field	Value
dc.title	Using Map-reduce to scale an empirical database
dc.contributor.author	SHEN ZHONG
dc.date.accessioned	2012-02-29T18:00:52Z
dc.date.available	2012-02-29T18:00:52Z
dc.date.issued	2011-12-19
dc.identifier.citation	SHEN ZHONG (2011-12-19). Using Map-reduce to scale an empirical database. ScholarBank@NUS Repository.
dc.identifier.uri	http://scholarbank.nus.edu.sg/handle/10635/30705
dc.description.abstract	Datasets are crucial for testing in both industrial and academic settings. However, obtaining a dataset of suitable size that reflects the properties of real data is not easy. Unlike conventional domain-specific benchmarks, UpSizeR is a tool that takes an empirical dataset D and a scale factor s as input and generates a synthetic dataset that keeps the properties of the original dataset but is s times its size. UpSizeR is implemented using Map-Reduce so that it can handle large datasets efficiently. We further optimize the implementation to reduce I/O operations. To evaluate the similarity of the two datasets, we run queries on both the synthetic and the original dataset and compare the results.
dc.language.iso	en
dc.subject	Map-Reduce,UpSizeR,Scale,Dataset,Database,Hadoop
dc.type	Thesis
dc.contributor.department	COMPUTER SCIENCE
dc.contributor.supervisor	TAY YONG CHIANG
dc.description.degree	Master's
dc.description.degreeconferred	MASTER OF SCIENCE
dc.identifier.isiut	NOT_IN_WOS
Appears in Collections:	Master's Theses (Open)

File	Description	Size	Format	Access Settings	Version
ShenZ.pdf		1.8 MB	Adobe PDF	OPEN	None
