Please use this identifier to cite or link to this item: https://doi.org/10.1145/1835449.1835670
dc.title: A Co-learning framework for learning user search intents from rule-generated training data
dc.contributor.author: Yan, J.
dc.contributor.author: Zheng, Z.
dc.contributor.author: Jiang, L.
dc.contributor.author: Li, Y.
dc.contributor.author: Yan, S.
dc.contributor.author: Chen, Z.
dc.date.accessioned: 2014-06-19T02:52:38Z
dc.date.available: 2014-06-19T02:52:38Z
dc.date.issued: 2010
dc.identifier.citation: Yan, J., Zheng, Z., Jiang, L., Li, Y., Yan, S., Chen, Z. (2010). A Co-learning framework for learning user search intents from rule-generated training data. SIGIR 2010 Proceedings - 33rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval: 895-896. ScholarBank@NUS Repository. https://doi.org/10.1145/1835449.1835670
dc.identifier.isbn: 9781605588964
dc.identifier.uri: http://scholarbank.nus.edu.sg/handle/10635/68735
dc.description.abstract: Learning to understand user search intents from their online behaviors is crucial for both Web search and online advertising. However, collecting and labeling a sufficient amount of high-quality training data for the various user intents, such as "compare products" or "plan a trip", is a challenging task. Motivated by this bottleneck, we start from common-sense knowledge about users, i.e. a set of rules, to generate training data for learning to predict user intents. The rule-generated training data are, however, hard to use directly, since they are generally imperfect owing to serious data bias and possible label noise. In this paper, we introduce a Co-learning Framework (CLF) to tackle the problem of learning from biased and noisy rule-generated training data. CLF first generates multiple sets of possibly biased and noisy training data using different rules, and then trains an individual user search intent classifier on each training dataset independently. The intermediate classifiers are then used to categorize both the training data themselves and the unlabeled data. Data confidently classified by one classifier are added to the other training datasets, while incorrectly classified examples are filtered out of the training datasets. The performance of this iterative learning procedure is theoretically guaranteed. © 2010 ACM.
dc.description.uri: http://libproxy1.nus.edu.sg/login?url=http://dx.doi.org/10.1145/1835449.1835670
dc.source: Scopus
dc.subject: Classification
dc.subject: Search engine
dc.subject: User intent
dc.type: Conference Paper
dc.contributor.department: ELECTRICAL & COMPUTER ENGINEERING
dc.description.doi: 10.1145/1835449.1835670
dc.description.sourcetitle: SIGIR 2010 Proceedings - 33rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
dc.description.page: 895-896
dc.identifier.isiut: NOT_IN_WOS
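
The abstract above outlines an iterative co-training-style procedure: train one classifier per rule-generated training set, add each classifier's confidently labeled examples to the other training sets, and filter out training examples that a classifier confidently contradicts. The following Python sketch illustrates that loop under stated assumptions; it is not the authors' implementation. The rule-generated (X, y) datasets, the logistic-regression base learners, and the single fixed confidence threshold are all illustrative choices standing in for the paper's actual rules and filtering criteria.

# CLF-style co-learning sketch (illustrative only, not the authors' code).
# Assumptions: rule_sets is a list of (X, y) NumPy arrays, each produced by
# one hand-written rule; base learners are logistic regressions; one fixed
# confidence threshold replaces the paper's theoretical selection criteria.
import numpy as np
from sklearn.linear_model import LogisticRegression


def co_learn(rule_sets, unlabeled_X, n_rounds=5, conf_thresh=0.9):
    datasets = [(X.copy(), y.copy()) for X, y in rule_sets]
    classifiers = [LogisticRegression(max_iter=1000) for _ in datasets]

    for _ in range(n_rounds):
        # 1. Train each intent classifier on its own rule-generated dataset.
        for clf, (X, y) in zip(classifiers, datasets):
            clf.fit(X, y)

        # 2. Each classifier labels the unlabeled pool; its most confident
        #    predictions are added to the *other* classifiers' training sets.
        for i, clf in enumerate(classifiers):
            proba = clf.predict_proba(unlabeled_X)
            confident = proba.max(axis=1) >= conf_thresh
            if not confident.any():
                continue
            new_X = unlabeled_X[confident]
            new_y = clf.classes_[proba[confident].argmax(axis=1)]
            for j in range(len(datasets)):
                if j != i:
                    X_j, y_j = datasets[j]
                    datasets[j] = (np.vstack([X_j, new_X]),
                                   np.concatenate([y_j, new_y]))

        # 3. Each classifier re-checks its own training set; examples whose
        #    label is confidently contradicted are filtered out (a simplified
        #    stand-in for the paper's noise-removal step).
        for i, clf in enumerate(classifiers):
            X, y = datasets[i]
            proba = clf.predict_proba(X)
            pred = clf.classes_[proba.argmax(axis=1)]
            keep = (pred == y) | (proba.max(axis=1) < conf_thresh)
            datasets[i] = (X[keep], y[keep])

    return classifiers

In practice the rule-generated (X, y) pairs would be built from query and behavior-log features, and the theoretical guarantee mentioned in the abstract depends on conditions this simplified threshold scheme does not capture.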
Appears in Collections: Staff Publications

Files in This Item:
There are no files associated with this item.

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.