I'm trying to build a credit risk model (an application scorecard) based on logistic regression for my master's thesis. I have real data from 2006 to 2010, which includes the crisis. The overall model has a power of 60%, which is good. The problem is that the power in individual years falls below the acceptable threshold. The individual years contain approximately 1500, 1500, 1500, 1000 and 500 cases. In the last two years the default rate (the ratio of bad clients to total cases in that year) is lower, because of restrictions in the approval process and a smaller number of accepted clients. It is clear that the power will decrease over time (because of the shrinking amount of data and the falling default rate), but it drops below 40%, which is not acceptable.
So I have to find a way to improve the power in the recent years (the last two years, when the new approval conditions were introduced).
How would you solve this issue? I'm thinking about assigning weights to the individual years (periods) to express the importance of each year with respect to the approval process and the crisis.
(The weight values are only for illustration.)
I hope this could help to improve the power in the recent years and make the model more realistic and suitable for the future approval process.
Any opinions? Is this method correct?
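To make the idea concrete, here is a minimal sketch in Python (not SPSS) of what weighting years in a logistic regression amounts to. Everything in it is an assumption for illustration: the weight values in `YEAR_WEIGHTS`, the single predictor `x`, the synthetic cases, and the helper names `sigmoid` and `fit_logistic`. It fits an intercept and a slope by maximising the weighted log-likelihood, once with equal weights and once with the year weights.

```python
import math

# Hypothetical year weights, for illustration only (not values from this post).
YEAR_WEIGHTS = {2006: 0.5, 2007: 0.5, 2008: 1.0, 2009: 2.0, 2010: 2.0}

# Synthetic cases: (year, x, bad), where x is a single made-up predictor
# and bad = 1 marks a defaulted client.
CASES = [
    (2006, 0.2, 0), (2006, 1.5, 1), (2006, 0.9, 0), (2006, 1.1, 1),
    (2007, 0.4, 0), (2007, 1.8, 1), (2007, 1.0, 1), (2007, 0.7, 0),
    (2008, 0.3, 0), (2008, 1.2, 0), (2008, 1.6, 1), (2008, 0.8, 1),
    (2009, 0.5, 0), (2009, 1.4, 0), (2009, 1.9, 1), (2009, 1.0, 0),
    (2010, 0.6, 0), (2010, 1.3, 0), (2010, 2.0, 1), (2010, 1.7, 0),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(cases, year_weights=None, iterations=25):
    """Maximise the (weighted) log-likelihood with Newton's method.

    Returns (intercept, slope). With year_weights=None every case has weight 1.
    """
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for year, x, bad in cases:
            w = 1.0 if year_weights is None else year_weights[year]
            p = sigmoid(b0 + b1 * x)
            # Weighted score (gradient of the log-likelihood).
            g0 += w * (bad - p)
            g1 += w * (bad - p) * x
            # Weighted information matrix (negative Hessian).
            v = w * p * (1.0 - p)
            h00 += v
            h01 += v * x
            h11 += v * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * delta = g.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


unweighted = fit_logistic(CASES)
weighted = fit_logistic(CASES, YEAR_WEIGHTS)
print("unweighted (intercept, slope):", unweighted)
print("weighted   (intercept, slope):", weighted)
```

In this sketch only the relative sizes of the weights affect the fitted coefficients: multiplying every weight by the same constant scales the score and the information matrix equally, so the estimates stay the same.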
I use SPSS Statistics, which has the option WEIGHT CASES BY ... Would it be correct to weight the cases by the weight variable and then run the logistic regression? Will the weighting have any influence on the logistic regression and its outcome?
Thanks for any advice.
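One way to see what such weighting does, assuming the weights act as frequency (replication) weights, which is how I understand SPSS to treat WEIGHT: a case with weight 3 contributes to the log-likelihood exactly like three identical cases. The Python sketch below checks this on synthetic data; the rows, the trial coefficients `beta` and the helper `log_likelihood` are all hypothetical.

```python
import math


def log_likelihood(beta, rows):
    """Weighted Bernoulli log-likelihood; rows are (x, bad, weight)."""
    b0, b1 = beta
    total = 0.0
    for x, bad, w in rows:
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        total += w * (bad * math.log(p) + (1 - bad) * math.log(1.0 - p))
    return total


# Synthetic cases with integer weights (hypothetical values).
weighted_rows = [(0.3, 0, 1), (1.2, 1, 3), (0.8, 0, 2), (1.9, 1, 1)]

# The same data with each case physically repeated `weight` times.
replicated_rows = [(x, bad, 1) for x, bad, w in weighted_rows for _ in range(w)]

beta = (-1.0, 1.5)  # arbitrary trial coefficients, not fitted values
ll_weighted = log_likelihood(beta, weighted_rows)
ll_replicated = log_likelihood(beta, replicated_rows)
print(ll_weighted, ll_replicated)
```

If the weights are handled this way, they would shift the coefficient estimates toward the heavily weighted years, and the sum of the weights would be treated as the number of cases, so standard errors and significance tests would no longer reflect the real sample size.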