Detailed Information


Does the Quality and Readability of Information Related to Varicocele Obtained from ChatGPT 4.0 Remain Consistent Across Different Models of Inquiry?

Authors
Luo, Zhao; Kam, Sung Chul; Kim, Ji Yong; Hu, Wenhao; Lin, Chuan; Park, Hyun Jun; Shin, Yu Seob
Issue Date
May-2025
Publisher
Korean Society for Sexual Medicine and Andrology (대한남성과학회)
Keywords
Comprehension; Infertility; Large language models; Varicocele
Citation
The World Journal of Men's Health, v.44, no.1, pp. 161-170
Pages
10
Indexed
SCIE
SCOPUS
KCI
Journal Title
The World Journal of Men's Health
Volume
44
Number
1
Start Page
161
End Page
170
URI
https://scholarworks.gnu.ac.kr/handle/sw.gnu/79115
DOI
10.5534/wjmh.240331
ISSN
2287-4208
2287-4690
Abstract
Purpose: There is a growing tendency for individuals to resort to Chat-Generative Pre-trained Transformer (ChatGPT) as a source of medical information on specific ailments. Varicocele is a prevalent condition affecting the male reproductive system. The quality, readability, and consistency of the varicocele-related information that individuals obtain through interactive access to ChatGPT remain uncertain.

Materials and Methods: This study used Google Trends data to extract the 25 trending questions since 2004. Two distinct inquiry methodologies were employed with ChatGPT 4.0: repetition mode (each question repeated three times) and cyclic mode (each question input once in three consecutive cycles). The generated texts were evaluated against several criteria: the Automated Readability Index (ARI), the Flesch Reading Ease Score (FRES), the Gunning Fog Index (GFI), the DISCERN score, and the Ensuring Quality Information for Patients (EQIP) score. Kruskal-Wallis and Mann-Whitney U tests were used to compare text quality, readability, and consistency between the two modes.

Results: The texts generated in repetition and cyclic modes showed no statistically significant differences in ARI (12.06 ± 1.29 vs. 12.27 ± 1.74), FRES (36.08 ± 8.70 vs. 36.87 ± 7.73), GFI (13.14 ± 1.81 vs. 13.25 ± 1.50), DISCERN score (38.08 ± 6.55 vs. 38.35 ± 6.50), or EQIP score (47.92 ± 6.84 vs. 48.35 ± 5.56) (p>0.05). These findings indicate that ChatGPT 4.0 consistently produces information of comparable complexity and quality across different inquiry modes.

Conclusions: ChatGPT-generated medical information on varicocele demonstrates consistent quality and readability across different inquiry modes, highlighting its potential as a stable source of healthcare information. However, the content's complexity poses challenges for general readers, and notable limitations in quality and reliability highlight the need for improved accuracy, credibility, and readability in AI-generated medical content.
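The three readability indices named in the abstract are defined by standard published formulas over sentence, word, character, and syllable counts. The sketch below computes them with a naive regex tokenizer and a vowel-group syllable heuristic; the study itself likely used dedicated readability software, so treat this only as an illustration of what the metrics measure.

```python
import re

def _counts(text):
    # Naive tokenization: sentences split on ., !, ?; words are letter runs.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    chars = sum(len(w) for w in words)
    return sentences, words, chars

def _syllables(word):
    # Rough heuristic: count vowel groups; real tools use pronunciation dictionaries.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def ari(text):
    # Automated Readability Index: characters/word and words/sentence.
    s, w, c = _counts(text)
    return 4.71 * (c / len(w)) + 0.5 * (len(w) / len(s)) - 21.43

def fres(text):
    # Flesch Reading Ease Score: higher = easier; ~36 (as reported) is difficult.
    s, w, _ = _counts(text)
    syl = sum(_syllables(x) for x in w)
    return 206.835 - 1.015 * (len(w) / len(s)) - 84.6 * (syl / len(w))

def gfi(text):
    # Gunning Fog Index: words/sentence plus share of "complex" (3+ syllable) words.
    s, w, _ = _counts(text)
    complex_words = [x for x in w if _syllables(x) >= 3]
    return 0.4 * ((len(w) / len(s)) + 100 * (len(complex_words) / len(w)))
```

For scale, the reported FRES of about 36 corresponds to "difficult" college-level prose, consistent with the conclusion that the content challenges general readers.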
Files in This Item
There are no files associated with this item.
Appears in
Collections
College of Medicine > Department of Medicine > Journal Articles


Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Kam, Sung Chul
College of Medicine (Department of Medicine)
