Statistical Modelling 24 (2) (2024), 115–138
On Lasso and adaptive Lasso for non-random sample in credit scoring
Emmanuel Ogundimu,
Department of Mathematical Sciences,
Durham University,
Durham,
UK
e-mail: emmanuel.ogundimu@durham.ac.uk
Abstract:
Prediction models in credit scoring are often formulated using available data on accepted
applicants at the loan application stage. The use of this data to estimate probability of default (PD)
may lead to bias due to non-random selection from the population of applicants. That is, the PD in
the general population of applicants may not be the same as the PD in the subpopulation of
accepted applicants. A prominent model for the reduction of bias in this framework is the sample
selection model, but there is no consensus on its utility yet. It is unclear if the bias-variance tradeoff
of regularization techniques can improve the predictions of PD in a non-random sample selection
setting. To address this, we propose the use of Lasso and adaptive Lasso for variable selection and
optimal predictive accuracy. By appealing to the least squares approximation of the likelihood function
of the sample selection model, we optimize the resulting function subject to L1 and adaptively weighted
L1 penalties using an efficient algorithm. We evaluate the performance of the proposed approach and
competing alternatives in a simulation study and apply it to the well-known American Express credit
card dataset.
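As an illustration of the idea in the abstract, the sketch below minimizes a least squares approximation of a log-likelihood, 0.5 (beta - b)' A (beta - b) + lam * sum_j w_j |beta_j|, where b stands for an unpenalized maximum likelihood estimate and A for the inverse of its estimated covariance matrix. The abstract does not name the algorithm used, so the coordinate descent scheme here is an assumption, and the function names (soft_threshold, lsa_lasso) and arguments are hypothetical. Setting all weights to 1 gives a Lasso-type penalty; weights such as 1/|b_j| give an adaptively weighted penalty.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z towards zero by t."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def lsa_lasso(beta_mle, precision, lam, weights=None, n_iter=200, tol=1e-10):
    """Coordinate descent for
        0.5 * (beta - b)' A (beta - b) + lam * sum_j w_j * |beta_j|.

    beta_mle  : unpenalized estimate b (assumed to come from a fitted model)
    precision : matrix A, assumed symmetric positive definite
    lam       : penalty level (lam = 0 returns b unchanged)
    weights   : penalty weights w_j; None means all ones (plain L1 penalty)
    """
    b = np.asarray(beta_mle, dtype=float)
    A = np.asarray(precision, dtype=float)
    p = b.size
    w = np.ones(p) if weights is None else np.asarray(weights, dtype=float)
    beta = b.copy()
    for _ in range(n_iter):
        beta_old = beta.copy()
        for j in range(p):
            # Contribution of the other coordinates to the gradient in j.
            r = A[j] @ (beta - b) - A[j, j] * (beta[j] - b[j])
            # Unpenalized minimizer in coordinate j, then shrink it.
            z = b[j] - r / A[j, j]
            beta[j] = soft_threshold(z, lam * w[j] / A[j, j])
        if np.max(np.abs(beta - beta_old)) < tol:
            break
    return beta
```

For example, with A equal to the identity matrix, b = (2.0, 0.3, -1.0) and lam = 0.5, the unit-weight penalty shrinks each coefficient by 0.5 and sets the second to zero, while weights 1/|b_j| shrink the large first coefficient less.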
Keywords:
Adaptive Lasso, Heckman model, reject inference, non-random selection, credit risk,
bivariate copula
Downloads:
Supplementary material in PDF.