### High-Dimensional Variable Selection Based on Randomized Adaptive Lasso

• Publication date: 2021-01-25 Online release date: 2021-01-26

### Selection of High Dimensional Variables Based on Randomized Adaptive Lasso

Yan Maobo, Tian Maozai

Abstract: With penalized variable selection methods such as the Lasso, the number of variables that can enter the model is limited by the sample size. Existing methods that test the significance of coefficients discard the information contained in variables that are not selected into the model. In this paper, we use a randomized bootstrap method to obtain variable weights when the number of variables exceeds the sample size (p > n). To obtain the final estimate, we construct the conditional distribution given the selection event and eliminate variables whose coefficients are not significant when computing the adaptive Lasso. The contribution of this paper is twofold: the proposed method overcomes the limit on the number of variables that the adaptive Lasso can select, and, when the observed data contain a large number of noise variables, it effectively distinguishes the true variables from the noise variables. Simulation studies in various scenarios show that the proposed method outperforms existing penalized variable selection methods on both problems. In the empirical study, the NCI-60 cancer cell line data are analyzed, and the results are much better than those reported in the previous literature.
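The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of deriving adaptive Lasso weights from a randomized bootstrap when p > n. It is not the authors' procedure. It assumes a generic scheme, similar in spirit to stability selection: each bootstrap replicate fits a Lasso with randomly rescaled per-variable penalties, and the selection frequency of each variable is turned into an adaptive weight. The conditional-distribution significance test mentioned in the abstract is omitted. All names (`lasso_cd`, `selection_frequency`, `demo`) and all tuning values (`lam`, `n_boot`, the rescaling range) are hypothetical choices made for illustration.

```python
import numpy as np


def lasso_cd(X, y, lam, penalty=None, n_iter=100):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * sum_j penalty[j] * |b_j|."""
    n, p = X.shape
    penalty = np.ones(p) if penalty is None else np.asarray(penalty, dtype=float)
    beta = np.zeros(p)
    resid = np.asarray(y, dtype=float).copy()  # equals y - X @ beta while beta is zero
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0 or np.isinf(penalty[j]):
                continue  # excluded variable: its coefficient stays at zero
            # Correlation of column j with the partial residual (column j added back).
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            # Soft-thresholding update for coordinate j.
            new = np.sign(rho) * max(abs(rho) - lam * penalty[j], 0.0) / col_sq[j]
            if new != beta[j]:
                resid -= X[:, j] * (new - beta[j])
                beta[j] = new
    return beta


def selection_frequency(X, y, lam, n_boot=30, rng=None):
    """Fraction of randomized bootstrap fits in which each variable is selected.

    Assumed scheme (not from the paper): resample rows with replacement and
    inflate each variable's penalty by a random factor 1/u, u ~ Uniform(0.5, 1).
    """
    rng = np.random.default_rng(rng)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)       # bootstrap row indices
        scale = rng.uniform(0.5, 1.0, size=p)  # random penalty rescaling
        beta = lasso_cd(X[idx], y[idx], lam, penalty=1.0 / scale, n_iter=50)
        counts += beta != 0
    return counts / n_boot


def demo(seed=0, n=40, p=100, lam=0.3):
    """Simulated p > n example: three true variables, the rest pure noise."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    beta_true = np.zeros(p)
    beta_true[:3] = [3.0, -2.0, 2.0]
    y = X @ beta_true + 0.5 * rng.standard_normal(n)  # no intercept in this toy model

    freq = selection_frequency(X, y, lam, rng=rng)
    # Adaptive weights: frequently selected variables get a small penalty;
    # variables never selected get an infinite penalty and are dropped.
    weights = np.where(freq > 0, 1.0 / np.maximum(freq, 1e-12), np.inf)
    beta_hat = lasso_cd(X, y, lam, penalty=weights, n_iter=200)
    return freq, beta_hat


freq, beta_hat = demo()
print("selection frequency of the three true variables:", freq[:3])
print("mean selection frequency of the noise variables:", freq[3:].mean())
print("indices with nonzero adaptive Lasso coefficients:", np.flatnonzero(beta_hat))
```

In this sketch the weights come from selection frequencies rather than from an initial coefficient estimate, which is one way to build adaptive weights when ordinary least squares is unavailable because p > n. Whether the paper uses frequencies, averaged coefficients, or another statistic is not stated in the abstract.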