Towards exploratory hypothesis testing and analysis

Please use this identifier to cite or link to this item: https://doi.org/10.1109/ICDE.2011.5767907

DC Field	Value
dc.title	Towards exploratory hypothesis testing and analysis
dc.contributor.author	Liu, G.
dc.contributor.author	Feng, M.
dc.contributor.author	Wang, Y.
dc.contributor.author	Wong, L.
dc.contributor.author	Ng, S.-K.
dc.contributor.author	Mah, T.L.
dc.contributor.author	Lee, E.J.D.
dc.date.accessioned	2013-07-04T07:57:16Z
dc.date.available	2013-07-04T07:57:16Z
dc.date.issued	2011
dc.identifier.citation	Liu, G., Feng, M., Wang, Y., Wong, L., Ng, S.-K., Mah, T.L., Lee, E.J.D. (2011). Towards exploratory hypothesis testing and analysis. Proceedings - International Conference on Data Engineering : 745-756. ScholarBank@NUS Repository. https://doi.org/10.1109/ICDE.2011.5767907
dc.identifier.isbn	9781424489589
dc.identifier.issn	10844627
dc.identifier.uri	http://scholarbank.nus.edu.sg/handle/10635/40128
dc.description.abstract	Hypothesis testing is a well-established tool for scientific discovery. Conventional hypothesis testing is carried out in a hypothesis-driven manner. A scientist must first formulate a hypothesis based on his/her knowledge and experience, and then devise a variety of experiments to test it. Given the rapid growth of data, it has become virtually impossible for a person to manually inspect all the data to find all the interesting hypotheses for testing. In this paper, we propose and develop a data-driven system for automatic hypothesis testing and analysis. We define a hypothesis as a comparison between two or more sub-populations. We find sub-populations for comparison using frequent pattern mining techniques and then pair them up for statistical testing. We also generate additional information for further analysis of the hypotheses that are deemed significant. We conducted a set of experiments to show the efficiency of the proposed algorithms, and the usefulness of the generated hypotheses. The results show that our system can help users (1) identify significant hypotheses; (2) isolate the reasons behind significant hypotheses; and (3) find confounding factors that form Simpson's Paradoxes with discovered significant hypotheses. © 2011 IEEE.
dc.description.uri	http://libproxy1.nus.edu.sg/login?url=http://dx.doi.org/10.1109/ICDE.2011.5767907
dc.publisher	IEEE
dc.source	Scopus
dc.type	Conference Paper
dc.contributor.department	COMPUTER SCIENCE
dc.description.doi	10.1109/ICDE.2011.5767907
dc.description.sourcetitle	Proceedings - International Conference on Data Engineering
dc.description.page	745-756
dc.identifier.isiut	NOT_IN_WOS
Appears in Collections:	Staff Publications Elements
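
The abstract defines a hypothesis as a comparison between sub-populations and mentions confounding factors that form Simpson's Paradoxes. The sketch below illustrates that idea only; it is not the authors' algorithm. It compares two sub-populations with a two-proportion z-test (a test chosen here as an assumption, since the record does not name one) and checks whether stratifying by a candidate confounder reverses the direction of the overall difference. The function names and the toy counts are hypothetical and do not come from the paper.

```python
from math import sqrt, erfc


def two_proportion_z_test(s1, n1, s2, n2):
    """Two-sided z-test for a difference between two proportions.

    s1/n1 and s2/n2 are success counts and sizes of two sub-populations.
    Returns (difference in proportions, p-value).
    """
    p1, p2 = s1 / n1, s2 / n2
    pooled = (s1 + s2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return p1 - p2, p_value


def is_simpson_reversal(strata):
    """Check whether a stratifying attribute reverses the overall comparison.

    strata is a list of ((s1, n1), (s2, n2)) pairs, one per value of the
    candidate confounder. Returns True when every stratum's difference has
    the opposite sign to the difference computed on the pooled counts.
    """
    s1 = sum(a[0] for a, _ in strata)
    n1 = sum(a[1] for a, _ in strata)
    s2 = sum(b[0] for _, b in strata)
    n2 = sum(b[1] for _, b in strata)
    overall = s1 / n1 - s2 / n2
    if overall == 0:
        return False
    diffs = [a[0] / a[1] - b[0] / b[1] for a, b in strata]
    return all(d * overall < 0 for d in diffs)


# Toy counts for illustration only (not taken from the paper):
# group 1 vs group 2, split by a two-valued confounder.
STRATA = [
    ((81, 87), (234, 270)),   # confounder value A
    ((192, 263), (55, 80)),   # confounder value B
]

if __name__ == "__main__":
    diff, p = two_proportion_z_test(273, 350, 289, 350)
    print(f"overall difference = {diff:.4f}, p-value = {p:.4f}")
    print("Simpson reversal:", is_simpson_reversal(STRATA))
```

In a data-driven setting such as the one the abstract describes, the sub-populations would come from mined frequent patterns rather than being written by hand, and testing many pairs would normally call for a multiple-testing correction, which this sketch omits.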

Files in This Item:

File	Description	Size	Format	Access Settings	Version
2011-towards_exploratory_hypothesis_testing_analysis-postprint.pdf		216.7 kB	Adobe PDF	OPEN	Post-print	View/Download
