Detailed Information


Does the Quality and Readability of Information Related to Varicocele Obtained from ChatGPT 4.0 Remain Consistent Across Different Models of Inquiry?

Authors
Luo, Zhao; Kam, Sung Chul; Kim, Ji Yong; Hu, Wenhao; Lin, Chuan; Park, Hyun Jun; Shin, Yu Seob
Issue Date
May-2025
Publisher
Korean Society for Sexual Medicine and Andrology (대한남성과학회)
Keywords
Comprehension; Infertility; Large language models; Varicocele
Citation
The World Journal of Men's Health, v.44, no.1, pp. 161-170
Pages
10
Indexed
SCIE
SCOPUS
KCI
Journal Title
The World Journal of Men's Health
Volume
44
Number
1
Start Page
161
End Page
170
URI
https://scholarworks.gnu.ac.kr/handle/sw.gnu/79115
DOI
10.5534/wjmh.240331
ISSN
2287-4208
2287-4690
Abstract
Purpose: There is a growing tendency for individuals to resort to Chat-Generative Pre-trained Transformer (ChatGPT) as a source of medical information on specific ailments. Varicocele is a prevalent condition affecting the male reproductive system. The quality, readability, and consistency of the varicocele-related information that individuals obtain through interactive access to ChatGPT remain uncertain.

Materials and Methods: This study used Google Trends data to extract the 25 trending questions since 2004. Two distinct inquiry methodologies were employed with ChatGPT 4.0: repetition mode (each question repeated three times) and cyclic mode (each question input once in three consecutive cycles). The generated texts were evaluated against several criteria: the Automated Readability Index (ARI), the Flesch Reading Ease Score (FRES), the Gunning Fog Index (GFI), the DISCERN score, and the Ensuring Quality Information for Patients (EQIP) score. Kruskal-Wallis and Mann-Whitney U tests were used to compare text quality, readability, and consistency between the two modes.

Results: The texts generated in repetition and cyclic modes showed no statistically significant differences in ARI (12.06 ± 1.29 vs. 12.27 ± 1.74), FRES (36.08 ± 8.70 vs. 36.87 ± 7.73), GFI (13.14 ± 1.81 vs. 13.25 ± 1.50), DISCERN score (38.08 ± 6.55 vs. 38.35 ± 6.50), or EQIP score (47.92 ± 6.84 vs. 48.35 ± 5.56) (p>0.05). These findings indicate that ChatGPT 4.0 consistently produces information of comparable complexity and quality across different inquiry modes.

Conclusions: ChatGPT-generated medical information on varicocele demonstrates consistent quality and readability across different inquiry modes, highlighting its potential as a stable source of healthcare information. However, the content's complexity poses challenges for general readers, and notable limitations in quality and reliability highlight the need for improved accuracy, credibility, and readability in AI-generated medical content.
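The three readability indices named in the abstract are defined by standard published formulas over sentence, word, character, and syllable counts. The sketch below computes them with a naive regex tokenizer and a vowel-group syllable heuristic; the study itself likely used dedicated readability software, so treat this only as an illustration of what the metrics measure.

```python
import re

def _counts(text):
    # Naive tokenization: sentences split on ., !, ?; words are letter runs.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    chars = sum(len(w) for w in words)
    return sentences, words, chars

def _syllables(word):
    # Rough heuristic: count vowel groups; real tools use pronunciation dictionaries.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def ari(text):
    # Automated Readability Index: characters/word and words/sentence.
    s, w, c = _counts(text)
    return 4.71 * (c / len(w)) + 0.5 * (len(w) / len(s)) - 21.43

def fres(text):
    # Flesch Reading Ease Score: higher = easier; ~36 (as reported) is difficult.
    s, w, _ = _counts(text)
    syl = sum(_syllables(x) for x in w)
    return 206.835 - 1.015 * (len(w) / len(s)) - 84.6 * (syl / len(w))

def gfi(text):
    # Gunning Fog Index: words/sentence plus share of "complex" (3+ syllable) words.
    s, w, _ = _counts(text)
    complex_words = [x for x in w if _syllables(x) >= 3]
    return 0.4 * ((len(w) / len(s)) + 100 * (len(complex_words) / len(w)))
```

For scale, the reported FRES of about 36 corresponds to "difficult" college-level prose, consistent with the conclusion that the content challenges general readers.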
Files in This Item
There are no files associated with this item.
Appears in
Collections
College of Medicine > Department of Medicine > Journal Articles


Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Kam, Sung Chul
College of Medicine (Department of Medicine)
