Designing penalty functions in high dimensional problems: The role of tuning parameters.

Title: Designing penalty functions in high dimensional problems: The role of tuning parameters.
Publication Type: Journal Article
Year of Publication: 2016
Authors: Chen, Ting-Huei, Wei Sun, and Jason P. Fine
Journal: Electron J Stat
Date Published: 2016

Various forms of penalty functions have been developed for regularized estimation and variable selection. Screening approaches are often used to reduce the number of covariates before penalized estimation. However, in certain problems, the number of covariates remains large after screening. For example, in genome-wide association (GWA) studies, the purpose is to identify Single Nucleotide Polymorphisms (SNPs) that are associated with certain traits, and typically there are millions of SNPs and thousands of samples. Because of the strong correlation among nearby SNPs, screening can only reduce the number of SNPs from millions to tens of thousands, and the variable selection problem remains very challenging. Several penalty functions have been proposed for such high dimensional data. However, it is unclear which class of penalty functions is the appropriate choice for a particular application. In this paper, we conduct a theoretical analysis to relate the ranges of tuning parameters of various penalty functions to the dimensionality of the problem and the minimum effect size. We illustrate our theoretical results for several penalty functions. The results suggest that a class of penalty functions that bridges L0 and L1 penalties requires less restrictive conditions on dimensionality and minimum effect sizes in order to attain the two fundamental goals of penalized estimation: to penalize all the noise to be zero and to obtain unbiased estimation of the true signals. Penalties such as SICA and Log belong to this class, but they have not been used often in applications. The simulation and real data analysis using GWAS data suggest the promising applicability of this class of penalties.
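
As an illustration of what it means for a penalty to bridge L0 and L1, the sketch below evaluates the SICA penalty in the form commonly given in the literature, lam * (a + 1)|beta| / (a + |beta|) with shape parameter a > 0. The abstract does not state the formula, so this parameterization, the function name `sica_penalty`, and the arguments `lam` and `a` are assumptions made for illustration rather than the authors' notation. Under this form, a small `a` makes the penalty behave like an L0 indicator and a large `a` makes it behave like the L1 norm.

```python
def sica_penalty(beta, lam=1.0, a=1.0):
    """Assumed SICA form: lam * (a + 1) * |beta| / (a + |beta|), with a > 0.

    `lam` scales the penalty; `a` controls the shape between L0 and L1.
    Both names are illustrative and not taken from the article.
    """
    if a <= 0:
        raise ValueError("shape parameter a must be positive")
    t = abs(beta)
    return lam * (a + 1.0) * t / (a + t)


def l0_penalty(beta, lam=1.0):
    """L0 penalty: a constant cost for any nonzero coefficient."""
    return lam * (1.0 if beta != 0 else 0.0)


def l1_penalty(beta, lam=1.0):
    """L1 (lasso) penalty: cost proportional to |beta|."""
    return lam * abs(beta)


if __name__ == "__main__":
    # Compare SICA at a very small and a very large shape parameter
    # against the L0 and L1 penalties on a few coefficient values.
    for beta in (0.0, 0.1, 0.5, 2.0):
        near_l0 = sica_penalty(beta, a=1e-6)
        near_l1 = sica_penalty(beta, a=1e6)
        print(
            f"beta={beta:4.1f}  "
            f"SICA(a=1e-6)={near_l0:.4f}  L0={l0_penalty(beta):.4f}  "
            f"SICA(a=1e6)={near_l1:.4f}  L1={l1_penalty(beta):.4f}"
        )
```

In this sketch the shape parameter plays the role of a second tuning parameter alongside `lam`, which is the kind of quantity whose admissible range the abstract relates to dimensionality and minimum effect size.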

Alternate Journal: Electron J Stat
Original Publication: Designing penalty functions in high dimensional problems: The role of tuning parameters.
PubMed ID: 28989558
PubMed Central ID: PMC5628772
Grant List:
P01 CA142538 / CA / NCI NIH HHS / United States
R01 GM105785 / GM / NIGMS NIH HHS / United States