Hello guys, I'm looking for help with a statistics exercise I was given as part of a job application at a bank. The position is in CRM analysis, and here is the problem (the data file is attached):
I was given 10,000 data entries, each with an outcome variable Y that is either 0 or 1, where 1 is the rare outcome the exercise is interested in. Each entry also has 50 demographic factors and 50 behavioral factors that influence Y.
In addition, there are 4,000 data entries to use as the population for this exercise. They have the same demographic and behavioral factors but no value for Y, so we need to estimate it and select the 800 best cases (the most likely "1s").
I was thinking of running a logistic regression in Excel, but somehow it doesn't give me valid results :/ so I'm kind of stuck. I also don't know whether I need to do something to the 100 independent variables first.
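To make the approach concrete, here is a minimal sketch in Python (instead of Excel) of what I have in mind: fit a logistic regression on the labelled entries, score the unlabelled ones, and keep the highest-scoring cases. It does not use the real file. The data is synthetic and scaled down (3 factors, 1,000 labelled rows, 400 rows to score, top 80 selected, which is the same 20% share as 800 of 4,000), and the function names `fit_logistic` and `top_k` are just names I made up for illustration.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, row):
    """Estimated P(Y = 1) for one row; weights[0] is the intercept."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], row))
    return sigmoid(z)


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit a logistic regression by plain batch gradient descent on log-loss."""
    n, d = len(X), len(X[0])
    weights = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for row, target in zip(X, y):
            err = predict(weights, row) - target
            grad[0] += err
            for j, x in enumerate(row):
                grad[j + 1] += err * x
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights


def top_k(weights, X, k):
    """Return the indices of the k rows with the highest score, plus all scores."""
    scores = [predict(weights, row) for row in X]
    ranked = sorted(range(len(X)), key=lambda i: scores[i], reverse=True)
    return ranked[:k], scores


# --- Synthetic stand-in data (assumption: NOT the real exercise file) ---
random.seed(0)


def make_rows(n, d=3):
    # Factors drawn as standard normals, so they are already on one scale.
    return [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]


def make_labels(X):
    # Hypothetical "true" model with a negative intercept so that Y = 1 is rare.
    return [1 if random.random() < sigmoid(-2.0 + 1.5 * r[0] - 1.0 * r[1]) else 0
            for r in X]


X_train = make_rows(1000)      # stands in for the 10,000 labelled entries
y_train = make_labels(X_train)
X_score = make_rows(400)       # stands in for the 4,000 unlabelled entries

weights = fit_logistic(X_train, y_train)
best, scores = top_k(weights, X_score, k=80)  # same 20% share as 800 of 4,000
print(len(best), "cases selected")
```

On the real data the factors are probably not on a common scale like they are here, so they would likely need standardizing first for gradient descent to behave, and with 100 factors a proper statistics package would be a more sensible choice than hand-written code.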