Machine-Learning-Based Gender Distribution Prediction from Anonymous News Comments: The Case of Korean News Portal
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Suh, Jong Hwan | - |
| dc.date.accessioned | 2022-12-26T05:41:37Z | - |
| dc.date.available | 2022-12-26T05:41:37Z | - |
| dc.date.issued | 2022-08 | - |
| dc.identifier.issn | 2071-1050 | - |
| dc.identifier.uri | https://scholarworks.gnu.ac.kr/handle/sw.gnu/1023 | - |
| dc.description.abstract | Anonymous news comment data from a news portal in South Korea, naver.com, can help conduct gender research and resolve related issues for sustainable societies. Nevertheless, only a small portion of gender information (i.e., gender distribution) is open to the public, and therefore, it has rarely been considered for gender research. Hence, this paper aims to resolve the matter of incomplete gender information and make the anonymous news comment data usable for gender research as new social media big data. This paper proposes a machine-learning-based approach for predicting the gender distribution (i.e., male and female rates) of anonymous news commenters for a news article. Initially, the big data of news articles and their anonymous news comments were collected and divided into labeled and unlabeled datasets (i.e., with and without gender information). The word2vec approach was employed to represent a news article by the characteristics of the news comments. Then, using the labeled dataset, various prediction techniques were evaluated for predicting the gender distribution of anonymous news commenters for a labeled news article. As a result, the neural network was selected as the best prediction technique, and it could accurately predict the gender distribution of anonymous news commenters of the labeled news article. Thus, this study showed that a machine-learning-based approach can overcome the incomplete gender information problem of anonymous social media users. Moreover, when the gender distributions of the unlabeled news articles were predicted using the best neural network model, trained with the labeled dataset, their distribution turned out different from the labeled news articles. The result indicates that using only the labeled dataset for gender research can result in misleading findings and distorted conclusions. The predicted gender distributions for the unlabeled news articles can help to better understand anonymous news commenters as humans for sustainable societies. Eventually, this study provides a new way for data-driven computational social science with incomplete and anonymous social media big data. | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.publisher | MDPI Open Access Publishing | - |
| dc.title | Machine-Learning-Based Gender Distribution Prediction from Anonymous News Comments: The Case of Korean News Portal | - |
| dc.type | Article | - |
| dc.publisher.location | Switzerland | - |
| dc.identifier.doi | 10.3390/su14169939 | - |
| dc.identifier.scopusid | 2-s2.0-85137741171 | - |
| dc.identifier.wosid | 000845263400001 | - |
| dc.identifier.bibliographicCitation | Sustainability, v.14, no.16 | - |
| dc.citation.title | Sustainability | - |
| dc.citation.volume | 14 | - |
| dc.citation.number | 16 | - |
| dc.type.docType | Article | - |
| dc.description.isOpenAccess | Y | - |
| dc.description.journalRegisteredClass | scie | - |
| dc.description.journalRegisteredClass | ssci | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.relation.journalResearchArea | Science & Technology - Other Topics | - |
| dc.relation.journalResearchArea | Environmental Sciences & Ecology | - |
| dc.relation.journalWebOfScienceCategory | Green & Sustainable Science & Technology | - |
| dc.relation.journalWebOfScienceCategory | Environmental Sciences | - |
| dc.relation.journalWebOfScienceCategory | Environmental Studies | - |
| dc.subject.keywordPlus | IDENTIFICATION | - |
| dc.subject.keywordAuthor | anonymity | - |
| dc.subject.keywordAuthor | social media | - |
| dc.subject.keywordAuthor | big data | - |
| dc.subject.keywordAuthor | news comments | - |
| dc.subject.keywordAuthor | gender prediction | - |
| dc.subject.keywordAuthor | word embedding | - |
| dc.subject.keywordAuthor | machine learning | - |
