Detailed Information

Cited 23 times in Web of Science; cited 29 times in Scopus

Hi-LASSO: High-Dimensional LASSO (open access)

Authors
Kim, Youngsoon; Hao, Jie; Mallavarapu, Tejaswini; Park, Joongyang; Kang, Mingon
Issue Date
Apr-2019
Publisher
Institute of Electrical and Electronics Engineers Inc.
Keywords
Hi-LASSO; LASSO; random LASSO; high-dimensional data; variable selection
Citation
IEEE Access, v.7, pp 44562 - 44573
Pages
12
Indexed
SCIE
SCOPUS
Journal Title
IEEE Access
Volume
7
Start Page
44562
End Page
44573
URI
https://scholarworks.gnu.ac.kr/handle/sw.gnu/73124
DOI
10.1109/ACCESS.2019.2909071
ISSN
2169-3536
Abstract
High-throughput genomic technologies are leading to a paradigm shift in computational biology research. Computational analysis of high-dimensional data and its interpretation are essential for understanding complex biological systems. Most biological data (e.g., gene expression and DNA sequence data) are high-dimensional but consist of far fewer samples than predictors. Such high-dimension, low sample size (HDLSS) data often cause computational challenges in biological data analysis. A number of least absolute shrinkage and selection operator (LASSO) methods have been widely used for identifying biomarkers or prognostic factors in the field of bioinformatics. The LASSO solution has been improved through the development of LASSO derivatives, including elastic-net, adaptive LASSO, relaxed LASSO, VISA, random LASSO, and recursive LASSO. However, the existing LASSO solutions have several known limitations: multicollinearity (particularly with different signs), subset size limitation, and the lack of a statistical test of significance. We propose a high-dimensional LASSO (Hi-LASSO) that theoretically improves a LASSO model, providing better performance in both prediction and feature selection on extremely high-dimensional data. Hi-LASSO alleviates bias introduced by bootstrapping, refines importance scores, improves performance by taking advantage of the global oracle property, provides a statistical strategy to determine the number of bootstrap samples, and allows tests of significance for feature selection with an appropriate distribution. The performance of Hi-LASSO was assessed by comparing it with existing state-of-the-art LASSO methods in extensive simulation experiments with multiple data settings. Hi-LASSO was also applied to survival analysis with GBM gene expression data.
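The abstract refers to bootstrapped LASSO fits and importance scores on HDLSS data without giving the procedure. As a rough illustration only, the sketch below runs a generic bootstrap LASSO with random feature subsets (loosely in the spirit of random LASSO) on simulated data with far fewer samples than predictors. It is not the authors' Hi-LASSO: the refined importance scores, the rule for choosing the number of bootstrap samples, and the significance test are not described in this record and are not reproduced here. All function names, the simulated data, and the parameter values (`alpha`, `n_boot`, `n_features`) are assumptions made for the example.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_cd(X, y, alpha, n_iter=100):
    """Plain coordinate-descent LASSO (no intercept), minimizing
    (1 / (2n)) * ||y - X b||^2 + alpha * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    residual = y.astype(float).copy()
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the partial residual (beta_j removed).
            rho = X[:, j] @ residual / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, alpha) / col_sq[j]
            residual -= X[:, j] * (new_bj - beta[j])
            beta[j] = new_bj
    return beta


def bootstrap_lasso_importance(X, y, alpha, n_boot=40, n_features=None, seed=0):
    """Hypothetical importance score: mean |coefficient| of each feature over
    the bootstrap fits in which that feature was part of the random subset."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    q = n_features or min(p, n)  # assumed subset size, at most n features
    coef_sum = np.zeros(p)
    times_drawn = np.zeros(p)
    for _ in range(n_boot):
        rows = rng.integers(0, n, size=n)            # rows drawn with replacement
        cols = rng.choice(p, size=q, replace=False)  # random feature subset
        Xb = X[np.ix_(rows, cols)]
        yb = y[rows]
        # Center both so that no intercept is needed.
        Xb = Xb - Xb.mean(axis=0)
        yb = yb - yb.mean()
        coef_sum[cols] += np.abs(lasso_cd(Xb, yb, alpha))
        times_drawn[cols] += 1
    return coef_sum / np.maximum(times_drawn, 1)


def simulate_hdlss(n=40, p=200, seed=0):
    """Simulated HDLSS data: n samples, p >> n predictors, 3 true signals."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    true_idx = np.array([0, 1, 2])
    beta = np.zeros(p)
    beta[true_idx] = [4.0, -4.0, 4.0]
    y = X @ beta + 0.5 * rng.standard_normal(n)
    return X, y, true_idx


if __name__ == "__main__":
    X, y, true_idx = simulate_hdlss()
    importance = bootstrap_lasso_importance(X, y, alpha=0.5)
    top = np.argsort(importance)[::-1][:10]
    print("true signal features:", true_idx)
    print("top-ranked features: ", top)
```

Restricting each fit to at most `n` randomly chosen features is one way to work around the subset size limitation mentioned in the abstract, since a single LASSO fit selects at most `n` predictors. The averaging rule used here is a simplification chosen for brevity.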
Files in This Item
There are no files associated with this item.
Appears in Collections
College of Natural Sciences > Dept. of Information and Statistics > Journal Articles


Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Park, Joongyang
College of Natural Sciences (Department of Information and Statistics)