Title | Double Sparsity Kernel Learning with Automatic Variable Selection and Data Extraction. |
Publication Type | Journal Article |
Year of Publication | 2018 |
Authors | Chen, Jingxiang, Chong Zhang, Michael R. Kosorok, and Yufeng Liu |
Journal | Stat Interface |
Volume | 11 |
Issue | 3 |
Pagination | 401-420 |
Date Published | 2018 |
ISSN | 1938-7989 |
Abstract | Learning in the Reproducing Kernel Hilbert Space (RKHS) has been widely used in many scientific disciplines. Because an RKHS can be very flexible, it is common to impose a regularization term in the optimization to prevent overfitting. Standard RKHS learning employs the squared norm penalty of the learning function. Despite its success, many challenges remain. In particular, one cannot directly use the squared norm penalty for variable selection or data extraction. Therefore, when there exist noise predictors, or the underlying function has a sparse representation in the dual space, the performance of standard RKHS learning can be suboptimal. In the literature, work has been proposed on how to perform variable selection in RKHS learning, and a data sparsity constraint was considered for data extraction. However, how to learn in an RKHS with both variable selection and data extraction simultaneously remains unclear. In this paper, we propose a unified RKHS learning method, namely, DOuble Sparsity Kernel (DOSK) learning, to overcome this challenge. An efficient algorithm is provided to solve the corresponding optimization problem. We prove that under certain conditions, our new method can asymptotically achieve variable selection consistency. Simulated and real data results demonstrate that DOSK is highly competitive among existing approaches for RKHS learning. |
DOI | 10.4310/SII.2018.v11.n3.a1 |
Alternate Journal | Stat Interface |
Original Publication | Double sparsity kernel learning with automatic variable selection and data extraction. |
PubMed ID | 30294406 |
PubMed Central ID | PMC6168218 |
Grant List | P01 CA142538 / CA / NCI NIH HHS / United States; R01 GM126550 / GM / NIGMS NIH HHS / United States |