Please use this identifier to cite or link to this item: https://doi.org/10.1145/2351676.2351687
DC Field: Value
dc.title: Duplicate bug report detection with a combination of information retrieval and topic modeling
dc.contributor.author: Nguyen, A.T.
dc.contributor.author: Nguyen, T.T.
dc.contributor.author: Nguyen, T.N.
dc.contributor.author: Lo, D.
dc.contributor.author: Sun, C.
dc.date.accessioned: 2016-06-02T09:25:23Z
dc.date.available: 2016-06-02T09:25:23Z
dc.date.issued: 2012
dc.identifier.citation: Nguyen, A.T., Nguyen, T.T., Nguyen, T.N., Lo, D., Sun, C. (2012). Duplicate bug report detection with a combination of information retrieval and topic modeling. 2012 27th IEEE/ACM International Conference on Automated Software Engineering, ASE 2012 - Proceedings: 70-79. ScholarBank@NUS Repository. https://doi.org/10.1145/2351676.2351687
dc.identifier.isbn: 9781450312042
dc.identifier.uri: http://scholarbank.nus.edu.sg/handle/10635/124985
dc.description.abstract: Detecting duplicate bug reports helps reduce triaging efforts and save time for developers in fixing the same issues. Among several automated detection approaches, text-based information retrieval (IR) approaches have been shown to outperform others in terms of both accuracy and time efficiency. However, those IR-based approaches do not detect well the duplicate reports on the same technical issues written in different descriptive terms. This paper introduces DBTM, a duplicate bug report detection approach that takes advantage of both IR-based features and topic-based features. DBTM models a bug report as a textual document describing certain technical issue(s), and models duplicate bug reports as the ones about the same technical issue(s). Trained with historical data including identified duplicate reports, it is able to learn the sets of different terms describing the same technical issues and to detect other not-yet-identified duplicate ones. Our empirical evaluation on real-world systems shows that DBTM improves the state-of-the-art approaches by up to 20% in accuracy. Copyright 2012 ACM.
dc.description.uri: http://libproxy1.nus.edu.sg/login?url=http://dx.doi.org/10.1145/2351676.2351687
dc.source: Scopus
dc.subject: Duplicate bug reports
dc.subject: Information retrieval
dc.subject: Topic model
dc.type: Conference Paper
dc.contributor.department: COMPUTER SCIENCE
dc.description.doi: 10.1145/2351676.2351687
dc.description.sourcetitle: 2012 27th IEEE/ACM International Conference on Automated Software Engineering, ASE 2012 - Proceedings
dc.description.page: 70-79
dc.identifier.isiut: NOT_IN_WOS
Appears in Collections: Staff Publications

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.