Code authorship identification using convolutional neural networks
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Abuhamad, Mohammed | - |
| dc.contributor.author | Rhim, Ji-su | - |
| dc.contributor.author | AbuHmed, Tamer | - |
| dc.contributor.author | Ullah, Sana | - |
| dc.contributor.author | Kang, Sanggil | - |
| dc.contributor.author | Nyang, DaeHun | - |
| dc.date.accessioned | 2024-12-03T00:00:43Z | - |
| dc.date.available | 2024-12-03T00:00:43Z | - |
| dc.date.issued | 2019-06 | - |
| dc.identifier.issn | 0167-739X | - |
| dc.identifier.issn | 1872-7115 | - |
| dc.identifier.uri | https://scholarworks.gnu.ac.kr/handle/sw.gnu/73235 | - |
| dc.description.abstract | Although source code authorship identification creates a privacy threat for many open source contributors, it is an important topic for the forensics field and enables many successful forensic applications, including ghostwriting detection, copyright dispute settlements, and other code analysis applications. This work proposes a convolutional neural network (CNN) based code authorship identification system. Our proposed system exploits term frequency-inverse document frequency, word embedding modeling, and feature learning techniques for code representation. This representation is then fed into a CNN-based code authorship identification model to identify the code's author. Evaluation results from using our approach on data from Google Code Jam demonstrate an identification accuracy of up to 99.4% with 150 candidate programmers, and 96.2% with 1,600 programmers. The evaluation of our approach also shows high accuracy for programmer identification over real-world code samples from 1,987 public repositories on GitHub, with 95% accuracy for 745 C programmers and 97% for the C++ programmers. These results indicate that the proposed approaches are not language-specific and can identify programmers across different programming languages. © 2018 Elsevier B.V. All rights reserved. | - |
| dc.format.extent | 12 | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.publisher | ELSEVIER | - |
| dc.title | Code authorship identification using convolutional neural networks | - |
| dc.type | Article | - |
| dc.publisher.location | Netherlands | - |
| dc.identifier.doi | 10.1016/j.future.2018.12.038 | - |
| dc.identifier.scopusid | 2-s2.0-85059761266 | - |
| dc.identifier.wosid | 000465509600010 | - |
| dc.identifier.bibliographicCitation | FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, v.95, pp 104 - 115 | - |
| dc.citation.title | FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE | - |
| dc.citation.volume | 95 | - |
| dc.citation.startPage | 104 | - |
| dc.citation.endPage | 115 | - |
| dc.type.docType | Article | - |
| dc.description.isOpenAccess | N | - |
| dc.description.journalRegisteredClass | scie | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.relation.journalResearchArea | Computer Science | - |
| dc.relation.journalWebOfScienceCategory | Computer Science, Theory & Methods | - |
| dc.subject.keywordPlus | ATTRIBUTION | - |
| dc.subject.keywordAuthor | Code authorship identification | - |
| dc.subject.keywordAuthor | Program features privacy | - |
| dc.subject.keywordAuthor | Convolutional neural network | - |
| dc.subject.keywordAuthor | Deep learning identification | - |
| dc.subject.keywordAuthor | Software forensics and security | - |
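
The abstract names term frequency-inverse document frequency (TF-IDF) as one ingredient of the code representation. As a rough illustration only, the sketch below computes TF-IDF weights over source-code tokens using the Python standard library. The tokenizer, the smoothed IDF variant, and the function names `tokenize` and `tfidf` are assumptions made for this example; this record does not describe the authors' preprocessing, and the CNN stage is not shown.

```python
import math
import re
from collections import Counter


def tokenize(source: str) -> list[str]:
    # Hypothetical tokenizer: identifiers/keywords, integer runs,
    # or any single non-whitespace symbol.
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", source)


def tfidf(samples: list[str]) -> list[dict[str, float]]:
    """Return one sparse {token: weight} vector per code sample."""
    docs = [Counter(tokenize(s)) for s in samples]
    n = len(docs)
    # Document frequency: in how many samples each token occurs.
    df = Counter(tok for doc in docs for tok in doc)
    vectors = []
    for doc in docs:
        total = sum(doc.values())
        vectors.append({
            # Term frequency times a smoothed inverse document frequency
            # (an assumed variant, chosen to avoid division by zero).
            tok: (count / total) * (math.log((1 + n) / (1 + df[tok])) + 1.0)
            for tok, count in doc.items()
        })
    return vectors


if __name__ == "__main__":
    samples = [
        "for (int i = 0; i < n; i++) sum += i;",
        "while (k--) { total += k; }",
    ]
    for vec in tfidf(samples):
        # Show the three highest-weighted tokens of each sample.
        print(sorted(vec, key=vec.get, reverse=True)[:3])
```

Tokens that occur in few samples receive a higher weight than tokens shared by every sample, which is the property that makes such features potentially useful for telling authors apart.
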
Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.
Certain data included herein are derived from the © Web of Science of Clarivate Analytics. All rights reserved.
You may not copy or re-distribute this material in whole or in part without the prior written consent of Clarivate Analytics.
