Detection of duplicate defect reports using natural language processing

Author

Summary, in English

Defect reports are generated from various testing and development activities in software engineering. Sometimes two reports are submitted that describe the same problem, leading to duplicate reports. These reports are mostly written in structured natural language, and as such, it is hard to compare two reports for similarity with formal methods. In order to identify duplicates, we investigate using Natural Language Processing (NLP) techniques to support the identification. A prototype tool is developed and evaluated in a case study analyzing defect reports at Sony Ericsson Mobile Communications. The evaluation shows that about 2/3 of the duplicates can possibly be found using the NLP techniques. Different variants of the techniques provide only minor result differences, indicating a robust technology. User testing shows that the overall attitude towards the technique is positive and that it has growth potential. © 2007 IEEE.
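
The summary does not say which NLP techniques the prototype uses, so the following is only a minimal sketch of one common way to compare free-text reports: tokenize, drop stop words, and rank existing reports by cosine similarity of their term-frequency vectors. The stop-word list, the function names (`tokenize`, `cosine_similarity`, `rank_candidates`) and the sample report texts are all assumptions made for illustration and are not taken from the paper.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list for illustration only; not taken from the paper.
STOP_WORDS = {"a", "an", "the", "is", "in", "on", "when", "to", "of", "and"}


def tokenize(text):
    """Lowercase the text, split it into alphanumeric tokens, drop stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower())
            if t not in STOP_WORDS]


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the term-frequency vectors of two texts."""
    vec_a, vec_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    norm = norm_a * norm_b
    # An empty report has a zero vector; treat it as similar to nothing.
    return dot / norm if norm else 0.0


def rank_candidates(new_report, existing_reports, top_n=3):
    """Return up to top_n (report_id, score) pairs, most similar first."""
    scored = [(cosine_similarity(new_report, text), report_id)
              for report_id, text in existing_reports.items()]
    scored.sort(reverse=True)
    return [(report_id, score) for score, report_id in scored[:top_n]]


if __name__ == "__main__":
    # Made-up report texts, used only to exercise the functions.
    existing = {
        "D1": "Phone crashes when opening the camera",
        "D2": "Battery drains quickly in standby",
    }
    for report_id, score in rank_candidates(
            "Camera crashes the phone on opening", existing):
        print(report_id, round(score, 2))
```

A tool built along these lines would show the top-ranked reports to the person submitting or triaging a defect and leave the duplicate decision to them, which fits the summary's wording that duplicates "can possibly be found" rather than automatically resolved.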

Publishing year

2007

Language

English

Pages

499-508

Publication/Series

Proceedings - International Conference on Software Engineering

Document type

Conference paper

Publisher

IEEE - Institute of Electrical and Electronics Engineers Inc.

Topic

  • Computer Science

Keywords

  • Sony Ericsson (CO)
  • User testing

Conference name

29th International Conference on Software Engineering, ICSE 2007

Conference date

2007-05-20 - 2007-05-26

Conference place

Minneapolis, MN, United States

Status

Published

ISBN/ISSN/Other

  • ISSN: 0270-5257
  • CODEN: PCSEDE