I have a scenario where:
In a Twitter analysis,
30% of accounts are empty
50% have not tweeted in a week
10% of accounts do most (75%) of the tweeting, and 30% of these are from bots
60% are female (this proportion is the same for all categories above)
For Part 1, you are told that your sample space consists of the accounts that produce 75% of all tweets (the "Top 75%"), which is 0.10 of all accounts. We also know that, of this group, 0.30 are actually bots (there may be bots elsewhere, but we're not concerned with that). So 0.03 of all accounts are both bots and in the Top 75%. Thus the probability of an actual user is the complement of the probability that an account is a bot given that it's in the Top 75%: $P(\text{real} \mid \text{top}) = 1 - P(\text{bot} \mid \text{top}) = 1 - \frac{0.03}{0.10} = 0.70$.
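As a quick check of the Part 1 arithmetic, here is a minimal Python sketch using exact fractions. The variable names are my own labels for illustration, and it assumes "30% of these are from bots" means 30% of the accounts in the top group are bots, as the answer above reads it.

```python
from fractions import Fraction

# Given in the question (reading "30% of these" as 30% of the top accounts).
p_top = Fraction(10, 100)            # P(top): accounts producing 75% of tweets
p_bot_given_top = Fraction(30, 100)  # P(bot | top)

# Joint probability: bot AND in the top group, as a share of all accounts.
p_bot_and_top = p_top * p_bot_given_top

# Complement inside the reduced sample space: real user given top group.
p_real_given_top = 1 - p_bot_and_top / p_top

print(p_bot_and_top, p_real_given_top)  # 3/100 7/10
```

Using `Fraction` rather than floats keeps the 0.03 and 0.70 exact, which makes it easier to compare against a hand calculation.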
For Part 2, an account belonging to a male and an account being empty are not independent events. Are you comfortable with using "if ... given" statements to determine the sample space? That is all a "given" statement is: a reevaluation of the initial sample space, which in this case is all Twitter accounts.
Yea. I suck at LaTeX. The first thing I do with "given" statements is determine what the sample space is. In the case of Part 1, you are given that the sample space is the Top 75% of all Twitter users, which is 0.1 of the entire sample space. So that 0.1 becomes our new sample space. Now we want to find out which accounts within this sample space meet a certain condition. We are told that 0.1 of all Twitter accounts are in the Top 75%, and of that 0.1, 0.3 are bots. That means 0.03 of all Twitter accounts are bots and in the Top 75%, which leads us to the form of a given statement: (0.1)(0.3)/(0.1) = 0.3, or more generally P(A | B) = P(A and B)/P(B).
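The "shrink the sample space first" idea can also be sketched by counting. The snippet below builds a hypothetical population of 1000 accounts sized to match the stated proportions. The population, the `conditional` helper and the field names are all made up for illustration, and the bot flag outside the top group is only a placeholder, since the question says nothing about it.

```python
# Hypothetical population of 1000 accounts matching the stated proportions.
accounts = (
    [{"top": True, "bot": True}] * 30        # 0.03 of all accounts
    + [{"top": True, "bot": False}] * 70     # rest of the top group
    + [{"top": False, "bot": False}] * 900   # bot status here is unknown; placeholder
)

def conditional(population, event, given):
    """Estimate P(event | given) by counting within the reduced sample space."""
    reduced = [a for a in population if given(a)]  # the new sample space
    return sum(1 for a in reduced if event(a)) / len(reduced)

p_bot_given_top = conditional(accounts, lambda a: a["bot"], lambda a: a["top"])
p_real_given_top = conditional(accounts, lambda a: not a["bot"], lambda a: a["top"])

print(p_bot_given_top, p_real_given_top)
```

Only the 100 accounts in the top group are ever counted, so the placeholder values for the other 900 accounts do not affect either result.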