Please use this identifier to cite or link to this item: https://doi.org/10.1109/ICDE.2011.5767907
DC Field: Value
dc.title: Towards exploratory hypothesis testing and analysis
dc.contributor.author: Liu, G.
dc.contributor.author: Feng, M.
dc.contributor.author: Wang, Y.
dc.contributor.author: Wong, L.
dc.contributor.author: Ng, S.-K.
dc.contributor.author: Mah, T.L.
dc.contributor.author: Lee, E.J.D.
dc.date.accessioned: 2013-07-04T07:57:16Z
dc.date.available: 2013-07-04T07:57:16Z
dc.date.issued: 2011
dc.identifier.citation: Liu, G., Feng, M., Wang, Y., Wong, L., Ng, S.-K., Mah, T.L., Lee, E.J.D. (2011). Towards exploratory hypothesis testing and analysis. Proceedings - International Conference on Data Engineering : 745-756. ScholarBank@NUS Repository. https://doi.org/10.1109/ICDE.2011.5767907
dc.identifier.isbn: 9781424489589
dc.identifier.issn: 10844627
dc.identifier.uri: http://scholarbank.nus.edu.sg/handle/10635/40128
dc.description.abstract: Hypothesis testing is a well-established tool for scientific discovery. Conventional hypothesis testing is carried out in a hypothesis-driven manner. A scientist must first formulate a hypothesis based on his/her knowledge and experience, and then devise a variety of experiments to test it. Given the rapid growth of data, it has become virtually impossible for a person to manually inspect all the data to find all the interesting hypotheses for testing. In this paper, we propose and develop a data-driven system for automatic hypothesis testing and analysis. We define a hypothesis as a comparison between two or more sub-populations. We find sub-populations for comparison using frequent pattern mining techniques and then pair them up for statistical testing. We also generate additional information for further analysis of the hypotheses that are deemed significant. We conducted a set of experiments to show the efficiency of the proposed algorithms, and the usefulness of the generated hypotheses. The results show that our system can help users (1) identify significant hypotheses; (2) isolate the reasons behind significant hypotheses; and (3) find confounding factors that form Simpson's Paradoxes with discovered significant hypotheses. © 2011 IEEE.
dc.description.uri: http://libproxy1.nus.edu.sg/login?url=http://dx.doi.org/10.1109/ICDE.2011.5767907
dc.publisher: IEEE
dc.source: Scopus
dc.type: Conference Paper
dc.contributor.department: COMPUTER SCIENCE
dc.description.doi: 10.1109/ICDE.2011.5767907
dc.description.sourcetitle: Proceedings - International Conference on Data Engineering
dc.description.page: 745-756
dc.identifier.isiut: NOT_IN_WOS
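
The abstract defines a hypothesis as a comparison between sub-populations and mentions confounding factors that form Simpson's Paradoxes. The sketch below illustrates only that idea and is not the authors' algorithm: it swaps in a pooled two-proportion z-test as the statistical test, skips the frequent pattern mining step entirely, and uses made-up toy counts. All function names (`two_proportion_z_test`, `aggregate`, `is_simpsons_paradox`) and the `strata` layout are assumptions introduced for illustration.

```python
from math import sqrt, erfc

def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Return (difference in proportions, two-sided p-value), pooled z-test."""
    p_a = success_a / total_a
    p_b = success_b / total_b
    pooled = (success_a + success_b) / (total_a + total_b)
    se = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_a - p_b) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    return p_a - p_b, erfc(abs(z) / sqrt(2))

# Toy counts (illustrative only): stratum -> group -> (successes, total).
# The stratum plays the role of a candidate confounding factor.
strata = {
    "small": {"A": (81, 87), "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}

def aggregate(strata, group):
    """Sum successes and totals for one group across all strata."""
    successes = sum(cell[group][0] for cell in strata.values())
    total = sum(cell[group][1] for cell in strata.values())
    return successes, total

def is_simpsons_paradox(strata):
    """True if every stratum's difference has the opposite sign to the overall one."""
    overall_diff, _ = two_proportion_z_test(
        *aggregate(strata, "A"), *aggregate(strata, "B")
    )
    stratum_diffs = [
        two_proportion_z_test(*cell["A"], *cell["B"])[0]
        for cell in strata.values()
    ]
    reversed_everywhere = all(d * overall_diff < 0 for d in stratum_diffs)
    return reversed_everywhere, overall_diff, stratum_diffs

paradox, overall_diff, stratum_diffs = is_simpsons_paradox(strata)
overall_p = two_proportion_z_test(*aggregate(strata, "A"), *aggregate(strata, "B"))[1]
print(paradox, round(overall_diff, 4), [round(d, 4) for d in stratum_diffs])
```

In these toy counts group A has the lower success rate overall but the higher rate within each stratum, which is the reversal pattern the abstract refers to. A real system would also need to decide which differences are statistically significant and correct for testing many hypotheses at once; neither is attempted here.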
Appears in Collections: Staff Publications
Files in This Item:
File: 2011-towards_exploratory_hypothesis_testing_analysis-postprint.pdf
Size: 216.7 kB
Format: Adobe PDF
Access Settings: OPEN
Version: Post-print



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.