A Multidocument Summarization Technique for Informative Bug Summaries

Mukhtar, Samal; Lee, Seonah; Heo, Jueun

Cited 0 times in Web of Science

Cited 1 time in Scopus

A Multidocument Summarization Technique for Informative Bug Summaries (open access)

Authors: Mukhtar, Samal; Lee, Seonah; Heo, Jueun

Issue Date: Oct-2024

Publisher: Institute of Electrical and Electronics Engineers Inc.

Keywords: Computer bugs; Support vector machines; Vectors; Semantics; Mathematical models; Transformers; Training data; Tokenization; Source coding; Software development management; Bug report summarization; classification; combination; bug summaries

Citation: IEEE Access, v.12, pp 158908 - 158926

Pages: 19

Indexed: SCIE; SCOPUS

Journal Title: IEEE Access

Volume: 12

Start Page: 158908

End Page: 158926

URI: https://scholarworks.gnu.ac.kr/handle/sw.gnu/74730

DOI: 10.1109/ACCESS.2024.3487443

ISSN: 2169-3536

Abstract: To help developers grasp bug information, a bug summary should contain the bug description together with information on the reproduction steps, environment, and solutions. However, previously established bug report summarization techniques generate summaries mainly by identifying significant sentences, and they often miss these bug report attributes. In this paper, we aim to generate informative summaries that cover these specific attributes in a structured form. There are two challenges. First, the relevant information is sometimes scattered over multiple sources. Second, information on the reproduction steps and environment is often filtered out by previous techniques, which identify significant sentences on the basis of their relationships. Therefore, we propose a bug summarization technique that collects information from multiple sources, including duplicates and pull requests, and a classification technique for identifying sentences that provide relevant information on the reproduction steps and environment. Our proposed technique, ClaSum, consists of four steps: preprocessing, classification, sentence ranking, and summarization. We adopted RoBERTa for the classification step, Opinion and Topic association scores for the sentence ranking step, and BART for the summarization step. Our comparative experiments show that our technique outperforms the state-of-the-art technique BugSum in terms of the F1 score by 14%, 8%, and 35% on the SDS, ADS, and DDS datasets, respectively. Additionally, our qualitative investigation shows that our technique generates a more structured summary than two well-known LLMs, Gemini and Claude.
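
The four steps named in the abstract (preprocessing, classification, sentence ranking, summarization) can be illustrated with a minimal Python sketch. This is not the authors' implementation: a keyword lookup stands in for the RoBERTa classifier, a word-overlap count stands in for the Opinion and Topic association scores, and a top-k extractive selection stands in for BART. The names `CUES`, `Sentence`, `preprocess`, `classify`, `rank`, `summarize`, and `clasum_sketch`, as well as the example bug text, are hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical cue words; the paper uses a trained RoBERTa classifier instead.
CUES = {
    "reproduction": ("reproduce", "steps"),
    "environment": ("version", "windows", "linux", "macos"),
    "solution": ("patch", "fix", "fixes", "fixed", "workaround"),
}
LABELS = ("description", "reproduction", "environment", "solution")


@dataclass
class Sentence:
    text: str
    source: str                 # e.g. "report", "duplicate", "pull_request"
    label: str = "description"  # default when no cue word matches
    score: float = 0.0


def tokens(text):
    """Lowercased alphanumeric tokens of a string, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def preprocess(documents):
    """Step 1: split every source document into whitespace-normalized sentences."""
    sentences = []
    for source, body in documents.items():
        for part in re.split(r"(?<=[.!?])\s+", body.strip()):
            if part:
                sentences.append(Sentence(" ".join(part.split()), source))
    return sentences


def classify(sentences):
    """Step 2: label each sentence by attribute (keyword stand-in, not RoBERTa)."""
    for s in sentences:
        words = tokens(s.text)
        for label, cues in CUES.items():
            if words & set(cues):
                s.label = label
                break
    return sentences


def rank(sentences, title):
    """Step 3: score sentences by word overlap with the title (assumed stand-in)."""
    title_words = tokens(title)
    for s in sentences:
        s.score = float(len(tokens(s.text) & title_words))
    return sentences


def summarize(sentences, top_k=1):
    """Step 4: keep the top_k sentences per attribute (extractive, not BART)."""
    summary = {label: [] for label in LABELS}
    for s in sorted(sentences, key=lambda s: -s.score):
        if len(summary[s.label]) < top_k:
            summary[s.label].append(s.text)
    return summary


def clasum_sketch(title, documents, top_k=1):
    """Chain the four steps over a bug report and its related sources."""
    return summarize(rank(classify(preprocess(documents)), title), top_k)


# Made-up example input: one report, one duplicate, one pull request.
example_docs = {
    "report": "App crashes when saving a file. Steps to reproduce: open a file "
              "and press save. Tested on Windows 11 with version 2.3.",
    "duplicate": "Saving a large file crashes the app.",
    "pull_request": "This patch fixes the crash by checking the buffer size.",
}
summary = clasum_sketch("App crashes when saving a file", example_docs)
```

The result is a dictionary keyed by attribute, which mirrors the structured form the abstract aims for. Because sentences are gathered from all sources before classification, the solution entry can come from the pull request even though the original report never mentions a fix.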

Appears in Collections: College of Engineering > Department of AI Convergence Engineering > Journal Articles

Related Researcher

Lee, Seon Ah: College of IT Engineering (Department of Software Engineering)
