I'm at a loss with this hypothesis testing
Attachment 26966
This is a question posted by a prof of mine. It was posted yesterday and is due tomorrow; he does not answer e-mails, and I don't know where to begin.
Do you think approved loans include those rejected by the applicant?
For the first hypothesis, is H_0 a null hypothesis that the real probability of a black person being approved for a loan is higher than the estimated probability (based on proportions) of a white person being approved, against an alternative hypothesis that it is lower?
How do I set that up? Do I use P-values?
For the second hypothesis, we are asking if the probability that a white person is approved is equal to the estimated probability that a white person is approved, against the alternative hypothesis that the actual probability of a Hispanic person being approved is less than the estimated probability of a white person being approved.
What does "the expected value of a Bernoulli variable is pi" mean? I assume pi is not 3.14, but how does one find it? Is there anything special one has to do to find the E(x) of a Bernoulli variable?
I find the whole thing confusing.
thanks!
Re: I'm at a loss with this hypothesis testing
Hey kingsolomonsgrave.
The expected value of a Bernoulli variable is just the probability of that variable being a 1/Yes/On etc. The pi is another way of writing p, the probability of a yes/1/etc.
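To make that concrete, here is a minimal Python sketch, assuming approvals are coded as 1 and denials as 0. The data below are hypothetical and not taken from the assignment. Since E(X) = pi for a Bernoulli variable, the sample proportion of 1s is the natural estimate of pi, and the variance of a single observation is pi(1 - pi).

```python
def bernoulli_estimate(outcomes):
    """Estimate pi = P(X = 1) as the sample mean of 0/1 outcomes."""
    return sum(outcomes) / len(outcomes)

# Hypothetical data: 1 = approved, 0 = denied (not from the assignment).
outcomes = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]

pi_hat = bernoulli_estimate(outcomes)   # estimate of E(X) = pi
var_hat = pi_hat * (1 - pi_hat)         # Bernoulli variance pi * (1 - pi)

print(pi_hat, var_hat)
```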
Basically you are testing whether one is less than the other or not.
The first thing is to establish your test statistic and the distribution it is associated with before you worry about p-values. (Hint: can you form a 2-sample t-test for your hypotheses?)
Re: I'm at a loss with this hypothesis testing
I don't know how to do a two-sample t-test. So far I know about Type 1 and Type 2 errors and how to set up a z-score and p-value.
If I take the proportions to be the probabilities, I can set up something like
Probability [Z < (probability of a black person being approved - overall approval rate)/(standard error)] to get
a score of some kind. Even then, since standard error = sd/sqrt(n), what sd and n do I use? In the example above, should I use
(black approval rate sd)/[sqrt(number of black applicants)]?
Lastly,
when setting anything up, do I use proportions like
0.6923-0.8557, or should I use the percentages themselves, like
69.23-85.57?
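A rough Python sketch of the z-score setup described in this post, under stated assumptions: the two rates are the ones quoted above, the sample size `n_a` is hypothetical (the real count is in the attachment), and the sd is taken from the first group's own proportion, as the post proposes. Using the hypothesized rate for the sd instead is another common convention. For a 0/1 variable, sd = sqrt(p(1 - p)), so the standard error is sqrt(p(1 - p)/n) with p as a proportion. The sketch also shows that working in percentages gives the same z, provided the standard error is put on the same scale.

```python
import math

def proportion_se(p_hat, n):
    """Standard error of a sample proportion: sqrt(p(1 - p) / n)."""
    return math.sqrt(p_hat * (1.0 - p_hat) / n)

def normal_cdf(z):
    """Standard normal CDF, P(Z < z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

rate_a = 0.6923   # first rate quoted in the post
rate_b = 0.8557   # second rate quoted in the post
n_a = 100         # HYPOTHETICAL sample size, not from the assignment

se = proportion_se(rate_a, n_a)
z = (rate_a - rate_b) / se
p_value = normal_cdf(z)          # lower-tail probability P(Z < z)

# Same calculation in percentage units: numerator and SE both scale by 100.
z_pct = (100 * rate_a - 100 * rate_b) / (100 * se)

print(z, z_pct, p_value)
```

The proportion scale is the safer default, because the formula p(1 - p) only makes sense when p is between 0 and 1.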
Re: I'm at a loss with this hypothesis testing
A two-sample t-test for a difference of means uses the t-statistic:
T = (sample_mean_a - sample_mean_b)/SQRT[SE_a^2 + SE_b^2] ~ t_(n+m-2), where n is the sample size of a, m is the sample size of b, and SE^2 = (sample variance of the observations)/(sample size), computed separately for each group.
We use this because, by the Central Limit Theorem, the sample means are approximately normally distributed.
First you need to calculate the sample means and the squared standard errors for each group to get the t-statistic.
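The steps above can be sketched in Python as follows. This is only an illustration: the 0/1 samples are hypothetical (the real counts are in the attachment), and the function name `two_sample_t` is made up for this example. It computes each group's sample mean and squared standard error, then forms T with n + m - 2 degrees of freedom as described above.

```python
import math
from statistics import mean, variance

def two_sample_t(a, b):
    """Return (T, df) for the difference of means of samples a and b."""
    n, m = len(a), len(b)
    se2_a = variance(a) / n   # squared standard error of group a
    se2_b = variance(b) / m   # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    return t, n + m - 2

# HYPOTHETICAL 0/1 data: 1 = approved, 0 = denied (not from the assignment).
group_a = [1] * 9 + [0] * 4    # 9 of 13 approved
group_b = [1] * 17 + [0] * 3   # 17 of 20 approved

t_stat, df = two_sample_t(group_a, group_b)
print(t_stat, df)
```

To finish the test, T would be compared against a t-distribution with `df` degrees of freedom (for example from a t-table), using the lower tail for a "less than" alternative.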