Please use this identifier to cite or link to this item:
https://scholarbank.nus.edu.sg/handle/10635/145428
Title: | SYNTHETICALLY SCALING AN EMPIRICAL DATASET |
Authors: | ZHANG JIANGWEI |
Keywords: | data synthesis, data scaling, data generation, data similarity, system, database |
Issue Date: | 29-Mar-2018 |
Citation: | ZHANG JIANGWEI (2018-03-29). SYNTHETICALLY SCALING AN EMPIRICAL DATASET. ScholarBank@NUS Repository. |
Abstract: | Large-scale enterprises, like Amazon and Douban, have enormous datasets. For research and development, it is impractical to run experiments with such a large dataset. It is therefore often necessary to obtain a smaller version of the dataset for experiments. We call this the scaling down problem. At the other extreme, a start-up company may have a small dataset but want to test the scalability of its system. It may, therefore, want a larger (and necessarily synthetic) version of its current empirical dataset. We call this the scaling up problem. This motivates the Dataset Scaling Problem (DSP): Given an original dataset D and a scale factor s, generate a scaled dataset D' that is similar to D but s times its size. This thesis studies DSP in the domain of graph and relational databases. |
URI: | http://scholarbank.nus.edu.sg/handle/10635/145428 |
Appears in Collections: | Ph.D Theses (Open) |
Files in This Item:
File | Description | Size | Format | Access Settings | Version |
---|---|---|---|---|---|
ZhangJW.pdf | | 7.82 MB | Adobe PDF | OPEN | None |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.