Detail view
- 장민규;
- 최상민
Web of Science: 0
Scopus: 0
Abstract
Large language models (LLMs) can exhibit reliability issues, most notably hallucinations, that undermine their suitability for high-stakes applications. We investigate a simple yet effective prompt-engineering strategy to improve response consistency and stability without modifying model internals. Our central technique prompts models to provide the reasoning evidence for their answers, aiming for task-agnostic, broadly applicable gains in consistency. We compare two engineered prompts against one non-engineered baseline across three models (GPT-4o-mini, Llama-3.1-8B, and Gemini-2.0-Flash-Lite) and four datasets (BoolQ, QNLI, MRPC, SST-2). For every model–dataset–prompt combination, we run 10 trials and evaluate Accuracy, Precision, Recall, F1 score, and the standard deviation of F1 across trials. The engineered prompt yields the lowest F1 standard deviation across the full experimental suite, indicating markedly improved response stability; on several datasets, it also achieves substantial F1 gains over non-engineered prompts. These results suggest that explicitly requesting post-answer reasoning is a practical, cost-efficient, and broadly applicable method for reducing output variability and enhancing overall reliability in LLMs. Code: https://anonymous.4open.science/r/LLM_consitency-B0ED/README.md
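The evaluation protocol in the abstract (repeated trials, per-trial metrics, then the spread of F1) can be illustrated with a minimal sketch. This is not the authors' code: the function names `binary_metrics` and `f1_stability` are hypothetical, labels are assumed to be binary (0/1), and the use of the population standard deviation is an assumption, since the abstract does not say which estimator was used.

```python
from statistics import mean, pstdev


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def f1_stability(y_true, trial_predictions):
    """Mean and standard deviation of F1 over repeated trials.

    trial_predictions holds one prediction list per trial (10 per
    model-dataset-prompt combination in the abstract's setup).
    Assumption: population standard deviation (pstdev).
    """
    f1_scores = [binary_metrics(y_true, preds)["f1"] for preds in trial_predictions]
    return mean(f1_scores), pstdev(f1_scores)
```

Under this reading, a lower second return value from `f1_stability` corresponds to the more stable prompt; swapping `pstdev` for `statistics.stdev` gives the sample estimator if that is what a replication needs.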
Keywords
- Title
- 프롬프트 엔지니어링 기법을 활용한 LLM의 응답 안정성 및 일관성 향상에 관한 연구
- Title (other language)
- Research on Improving Response Reliability and Consistency in LLM Using Prompt Engineering Techniques
- Authors
- 장민규; 최상민
- Publication date
- 2025-10
- Type
- Y
- Journal
- 한국산업융합학회논문집
- Volume
- 28
- Issue
- 5
- Pages
- 1517 ~ 1527