Please use this identifier to cite or link to this item: https://doi.org/10.1109/ASE.2011.6100061
Title: Towards more accurate retrieval of duplicate bug reports
Authors: Sun, C.; Lo, D.; Khoo, S.-C.; Jiang, J.
Issue Date: 2011
Citation: Sun, C., Lo, D., Khoo, S.-C., Jiang, J. (2011). Towards more accurate retrieval of duplicate bug reports. 2011 26th IEEE/ACM International Conference on Automated Software Engineering, ASE 2011, Proceedings: 253-262. ScholarBank@NUS Repository. https://doi.org/10.1109/ASE.2011.6100061
Abstract: In a bug tracking system, different testers or users may submit multiple reports on the same bug, referred to as duplicates, which may cost extra maintenance effort in triaging and fixing bugs. To identify such duplicates accurately, in this paper we propose a retrieval function (REP) to measure the similarity between two bug reports. It fully utilizes the information available in a bug report, including not only the similarity of textual content in the summary and description fields, but also the similarity of non-textual fields such as product, component, version, etc. For more accurate measurement of textual similarity, we extend BM25F, an effective similarity formula in the information retrieval community, specifically for duplicate report retrieval. Lastly, we use a two-round stochastic gradient descent to automatically optimize REP for specific bug repositories in a supervised learning manner. We have validated our technique on three large software bug repositories from Mozilla, Eclipse and OpenOffice. The experiments show 10-27% relative improvement in recall rate@k and 17-23% relative improvement in mean average precision over our previous model. We also applied our technique to a very large dataset consisting of 209,058 reports from Eclipse, resulting in a recall rate@k of 37-71% and mean average precision of 47%. © 2011 IEEE.
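Illustrative note: the abstract reports results as recall rate@k and mean average precision. The sketch below shows one common way to compute these two metrics for duplicate retrieval. It is not taken from the paper: the function names and data layout are hypothetical, and it assumes each duplicate report has exactly one master report, so that average precision reduces to the reciprocal of the master's rank.

```python
from typing import Dict, List


def recall_rate_at_k(rankings: Dict[str, List[str]],
                     masters: Dict[str, str],
                     k: int) -> float:
    """Fraction of duplicate reports whose master appears in the top-k candidates.

    rankings: duplicate report id -> candidate report ids, best match first
              (hypothetical layout, assumed for this sketch).
    masters:  duplicate report id -> id of its single master report.
    """
    hits = sum(1 for query, ranked in rankings.items()
               if masters[query] in ranked[:k])
    return hits / len(rankings)


def mean_average_precision(rankings: Dict[str, List[str]],
                           masters: Dict[str, str]) -> float:
    """Mean of 1/rank of the master report over all duplicate reports.

    Assumes one relevant (master) report per query; a query whose master
    is missing from its ranking contributes 0.
    """
    total = 0.0
    for query, ranked in rankings.items():
        if masters[query] in ranked:
            total += 1.0 / (ranked.index(masters[query]) + 1)
    return total / len(rankings)
```

For example, with three duplicate reports whose masters are ranked first, second, and not retrieved at all, recall rate@1 is 1/3, recall rate@2 is 2/3, and the mean average precision is (1 + 0.5 + 0) / 3 = 0.5.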
Source Title: 2011 26th IEEE/ACM International Conference on Automated Software Engineering, ASE 2011, Proceedings
URI: http://scholarbank.nus.edu.sg/handle/10635/40760
ISBN: 9781457716393
DOI: 10.1109/ASE.2011.6100061
Appears in Collections: Staff Publications

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.