Stochastic LASSO for extremely high-dimensional genomic data
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Baek, Beomsu | - |
| dc.contributor.author | Jo, Jongkwon | - |
| dc.contributor.author | Kang, Mingon | - |
| dc.contributor.author | Kim, Youngsoon | - |
| dc.date.accessioned | 2026-02-23T06:00:08Z | - |
| dc.date.available | 2026-02-23T06:00:08Z | - |
| dc.date.issued | 2026-01 | - |
| dc.identifier.issn | 2045-2322 | - |
| dc.identifier.uri | https://scholarworks.gnu.ac.kr/handle/sw.gnu/82450 | - |
| dc.description.abstract | Accurate identification of significant features in high-dimensional data is indispensable in high-throughput genomic analysis and association studies. Least Absolute Shrinkage and Selection Operator (LASSO) and its derivatives have been widely adopted as a feature selection scheme to discover potential biomarkers in various biological systems. Recently, bootstrap-based LASSO models, such as Random LASSO and Hi-LASSO, have been effective solutions for extremely high-dimensional but low sample size (EHDLSS) genomic data. However, bootstrap-based LASSO models still have several drawbacks, such as multicollinearity within bootstrap samples, predictors that are missed when drawing, and randomness in predictor sampling. To address these limitations, we propose a new bootstrap-based LASSO, named Stochastic LASSO, that effectively reduces multicollinearity in bootstrap samples and mitigates randomness in predictor sampling, and thereby outperforms benchmark models in feature selection and coefficient estimation. Furthermore, Stochastic LASSO provides a two-stage t-test strategy for selecting statistically significant features. The performance of Stochastic LASSO was assessed by comparing it with existing benchmark models in extensive simulation experiments. In the simulation experiments, Stochastic LASSO consistently showed significant improvements in performance compared to the state-of-the-art LASSO models for feature selection, coefficient estimation, and robustness. We also applied Stochastic LASSO to the gene expression data of publicly available TCGA cancer datasets and identified statistically significant genes associated with survival month prediction. The source code is publicly available at: https://github.com/datax-lab/StochasticLASSO. | - |
| dc.language | English | - |
| dc.language.iso | ENG | - |
| dc.publisher | Nature Publishing Group | - |
| dc.title | Stochastic LASSO for extremely high-dimensional genomic data | - |
| dc.type | Article | - |
| dc.publisher.location | United Kingdom | - |
| dc.identifier.doi | 10.1038/s41598-026-35273-3 | - |
| dc.identifier.scopusid | 2-s2.0-105029545491 | - |
| dc.identifier.wosid | 001683242100008 | - |
| dc.identifier.bibliographicCitation | Scientific Reports, v.16, no.1 | - |
| dc.citation.title | Scientific Reports | - |
| dc.citation.volume | 16 | - |
| dc.citation.number | 1 | - |
| dc.type.docType | Article | - |
| dc.description.isOpenAccess | N | - |
| dc.description.journalRegisteredClass | scie | - |
| dc.description.journalRegisteredClass | scopus | - |
| dc.relation.journalResearchArea | Science & Technology - Other Topics | - |
| dc.relation.journalWebOfScienceCategory | Multidisciplinary Sciences | - |
| dc.subject.keywordPlus | VARIABLE SELECTION | - |
| dc.subject.keywordPlus | MESSENGER-RNAS | - |
| dc.subject.keywordPlus | EXPRESSION | - |
| dc.subject.keywordPlus | UROCORTIN | - |
| dc.subject.keywordPlus | GENES | - |
| dc.subject.keywordAuthor | Stochastic LASSO | - |
| dc.subject.keywordAuthor | LASSO | - |
| dc.subject.keywordAuthor | High-dimensional data | - |
| dc.subject.keywordAuthor | Variable selection | - |
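
The abstract refers to bootstrap-based LASSO models that draw both samples and predictors at random. As a rough illustration of that general idea only, the following is a minimal sketch in plain Python. It is not the authors' Stochastic LASSO, and it does not reproduce their multicollinearity reduction or their two-stage t-test; the authors' implementation is in the repository linked in the abstract. All names here (`lasso_cd`, `bootstrap_lasso`, `t_statistic`, `make_toy_data`) and the parameters `q`, `n_boot`, and `lam` are hypothetical choices made for this sketch.

```python
import math
import random
import statistics


def soft_threshold(z, t):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    return math.copysign(max(abs(z) - t, 0.0), z)


def lasso_cd(X, y, lam, n_iter=50):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam*||b||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - Xb; equals y while b is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (column j excluded).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def center(X, y):
    """Center columns of X and y so that no intercept is needed."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    return Xc, [v - y_mean for v in y]


def bootstrap_lasso(X, y, q, n_boot, lam, seed=0):
    """Fit LASSO on n_boot bootstrap samples, each using q random predictors.

    Returns, per predictor, the coefficients from the fits that drew it.
    (Assumption of this sketch: fits that did not draw a predictor are skipped.)
    """
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    coefs = [[] for _ in range(p)]
    for _ in range(n_boot):
        rows = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        cols = rng.sample(range(p), q)               # uniform predictor subset
        Xb, yb = center([[X[i][j] for j in cols] for i in rows],
                        [y[i] for i in rows])
        for j, b in zip(cols, lasso_cd(Xb, yb, lam)):
            coefs[j].append(b)
    return coefs


def t_statistic(values):
    """One-sample t statistic for H0: mean == 0.

    Returns 0.0 when undefined (fewer than two values or zero variance).
    """
    if len(values) < 2:
        return 0.0
    sd = statistics.stdev(values)
    if sd == 0.0:
        return 0.0
    return statistics.mean(values) / (sd / math.sqrt(len(values)))


def make_toy_data(n, p, seed=0):
    """Synthetic data: only predictors 0 and 1 carry signal (+3 and -3)."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [3.0 * row[0] - 3.0 * row[1] + rng.gauss(0.0, 0.5) for row in X]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data(n=100, p=20, seed=1)
    coefs = bootstrap_lasso(X, y, q=5, n_boot=60, lam=0.3, seed=2)
    ranked = sorted(range(20), key=lambda j: abs(t_statistic(coefs[j])), reverse=True)
    print("predictors ranked by |t|:", ranked[:5])
```

The sketch draws predictors uniformly at random, which is the kind of randomness in predictor sampling that the abstract lists as a drawback of existing bootstrap-based models. Turning the t statistic into a p-value needs a t-distribution, which the Python standard library does not provide; a function such as `scipy.stats.ttest_1samp` could be used for that step.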
Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.
