Prediction and interpretation of pathogenic bacteria occurrence at a recreational beach using data-driven algorithms
  • Jang, Jiyi
  • Abbas, Ather
  • Kim, Hyein
  • Rhee, Chaeyoung
  • Shin, Seung Gu
  • and 3 others
Citations
Web of Science: 8
Scopus: 8

Abstract

Recreational beaches face a threat from pathogenic bacteria that harbor antibiotic resistance genes (ARGs). To predict bacterial occurrence and comprehend their non-linear relationship with hydrometeorological features, advanced machine- and deep-learning algorithms were employed. These algorithms include regression trees (RT), as well as interpretable deep-learning algorithms such as the ‘Input Attention-Long Short-Term Memory (IA-LSTM)’ and ‘Temporal Fusion Transformer (TFT)’. Our focus was on predicting the occurrence of Prevotella, a prevalent pathogenic bacterium found at the beaches. Utilizing model-dependent and model-agnostic interpretation methods, which encompass sensitivity analysis, permutation, and the SHapley Additive exPlanations (SHAP) importance, we evaluated model behavior. RT-based algorithms exhibited predictive capabilities comparable to those of IA-LSTM and TFT, achieving validation Nash–Sutcliffe efficiencies of 0.93, 0.94, and 0.96, respectively. However, the deep-learning algorithms (IA-LSTM and TFT) are surpassed in terms of interpretability. The model-dependent interpretation method identified heavy precipitation as a pivotal hydrometeorological feature linked to increased Prevotella occurrence. Notably, the IA-LSTM identified Prevotella as a potential host for the sulfonamide resistance gene (sul1), suggesting the potential of Prevotella as an indicator for sul1. This research, leveraging interpretable data-driven models, advances our understanding of the hydrometeorological features influencing the occurrence of pathogenic bacteria and the prevalence of ARGs at the beach, and enhances predictive capabilities for bacterial occurrence. © 2023
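The abstract reports model skill as Nash–Sutcliffe efficiency (NSE) and names permutation importance as one of the model-agnostic interpretation methods. The following is a minimal sketch of both ideas, not the authors' implementation: the function names, the `predict` callable, and the use of NSE as the scoring metric for permutation importance are assumptions made for illustration only.

```python
import numpy as np


def nash_sutcliffe_efficiency(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    1.0 is a perfect fit; 0.0 means the model is no better than
    predicting the mean of the observations.
    """
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    residual = np.sum((observed - simulated) ** 2)
    spread = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - residual / spread


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in NSE when each feature column is shuffled.

    `predict` is any callable mapping a feature matrix to predictions
    (hypothetical stand-in for a trained model).
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    baseline = nash_sutcliffe_efficiency(y, predict(X))
    drops = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling one column breaks its link to the target.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            score = nash_sutcliffe_efficiency(y, predict(X_perm))
            drops[j] += baseline - score
    return drops / n_repeats
```

A larger drop in NSE after shuffling a feature suggests the model relies more heavily on that feature; a drop of zero means the model's predictions do not depend on it.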

Keywords

Cumulative importance features; Deep-learning algorithms; Interpretable models; Simulation; Strategy modeling; GLOBAL SENSITIVITY-ANALYSIS; ESCHERICHIA-COLI; MODELS; INDEXES; WATERS
Title
Prediction and interpretation of pathogenic bacteria occurrence at a recreational beach using data-driven algorithms
Authors
Jang, Jiyi; Abbas, Ather; Kim, Hyein; Rhee, Chaeyoung; Shin, Seung Gu; Chun, Jong Ahn; Baek, Sangsoo; Cho, Kyung Hwa
DOI
10.1016/j.ecoinf.2023.102370
Publication date
2023-12
Type
Article
Journal
Ecological Informatics
Volume 78