The thing is that this is probably not an unbiased sample. I doubt the OP would be trying to calculate a confidence interval for his all-ins unless he had an exceptionally cold run of cards.
I did not mistype the formula. To get the confidence interval around the total expected win 8500, multiply by sqrt(450), not divide. To get the confidence interval around the mean expected win 8500/450, divide by sqrt(450).
I calculate 1.28*14.2*sqrt(450) ≈ 386, so your 80% confidence interval is 8500 ± 386. Thus it is likely the difference between your actual and expected win is not due to chance.
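As a sanity check, that interval arithmetic can be sketched in a few lines of Python (the per-hand SD of 14.2, N = 450, and z = 1.28 for a two-sided 80% interval are the figures discussed in this thread):

```python
import math

# Figures from the thread: per-hand SD, number of all-in hands,
# and the z-score for a two-sided 80% confidence interval.
sd_per_hand = 14.2
n_hands = 450
z_80 = 1.28
expected_total = 8500.0

# CI around the TOTAL expected win: multiply by sqrt(N), don't divide.
half_width = z_80 * sd_per_hand * math.sqrt(n_hands)
lo, hi = expected_total - half_width, expected_total + half_width
print(f"80% CI for total expected win: [{lo:.1f}, {hi:.1f}]")
```

Dividing by sqrt(N) instead would give the much narrower interval around the per-hand mean, which is the distinction being made above.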
An example:
I've run my program against a 40k-hand database, filtering the all-ins where pot size > 80 BB.
I've got:
Analyzed games: 154
aggregate should_win= 5527.73
aggregate actually_win=4647.30
tau = 1.98 SD = 36.30
Chance of not being rigged: 5.00%
I take z = 1.96, which corresponds to a confidence level of 95%.
This means there is a 95% chance that S satisfies:
5527.73 - 1.96*36.30*sqrt(154) <= S <= 5527.73 + 1.96*36.30*sqrt(154) <=>
5527.73 - 871.38 <= S <= 5527.73 + 871.38 <=>
4656.35 <= S <= 6399.11
But S (actually_win) is 4647.30 in our case.
This means there's at least a 95% chance that it's rigged in big pots.
The interesting part happens when I run the program for small pots ( < 80BB )
Analyzed games: 342
aggregate should_win= 3632.40
aggregate actually_win=3654.20
Average pot:16.80
As you can see the difference 3654.20-3632.40 is very small.
It seems strange things happen when the pot gets big.
How are you calculating the win %? I'm assuming you wrote a program that calculates the pot equity given the board cards and hole cards. There may be an error with your program.
Having run similar experiments in the past, I feel your earlier finding of SD = 14.2 is extremely low. I'd have guessed a range of 8500 ± 3000 for an 80% interval after only 450 hands.
36.30 is the SD from 80+ BB pots.
For pots under 80 BB there is nothing "suspect", because my program doesn't display SD and stats for Z values < 1.29 (80% confidence level).
So if it is not at least 80% "rigged suspect" I ignore it.
I've been a casual poker player for 5 months, usually playing 6 tables of $0.50 NL.
I agree that calculating a confidence interval after you've seen the results (a bad run) violates the rules of hypothesis testing. But it provides a good reference point: just how bad was that run of luck? And it allows the following kind of analysis.
Good analysis. It looks to me like when your opponents are betting big, leading to a big pot, you are not giving them proper credit for having a big hand. Are you adjusting your expected win for the size of the bets?
Good question.
The SD looks reasonable to me. When I saw that SD I did the following thought experiment. Suppose every pot was 20 and the win percentage was .5, so the average win was 10. The SD would be 20*sqrt(.5*.5) = 10, which equals the average win. Since SD 14.2 was not too far from the average win of 16.7, it seemed reasonable.
Whether or not he is making bad calls or bad all-ins doesn't change the analysis. Basically what he's doing is taking a situation like 8c8d vs. 7h6h on a flop like 8h9hAc. Say he's holding the 76 and he moves all-in on that board. In this situation with two cards to come, 88 will win 58% of the time and 76 will win 42%. So if the pot is $100 (excluding the amount of his all-in bet), he expects to win $42. If he's making poor choices in big pots or he bluffs too much, it should result in a higher than average SD.
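That EV arithmetic reduces to a one-liner; a minimal sketch using the example's figures (the 42% equity is the post's number, not recomputed here):

```python
def expected_win(pot: float, equity: float) -> float:
    """Expected share of a pot for a hand that wins with probability `equity`."""
    return pot * equity

# 8c8d vs 7h6h on 8h9hAc, from the post: 76 has 42% equity in a $100 pot
# (pot excludes his own all-in bet).
print(f"${expected_win(100.0, 0.42):.2f}")  # $42.00
```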
In general, NL poker is a very high variance game and I'd expect the SD to be a multiple of mean. I track a different stat, the ratio of SD of hourly win-rate to average win-rate, and it is 24:1.
Thanks for clearing that up. I didn't know exactly what he was doing. But it doesn't change the calculations.
I didn't go into this before, but the overall SD can be calculated exactly and there is no need to estimate it. For your example hand, the variance is .42*.58*100^2 = 2436.
The standard deviation for that hand is then sqrt(2436) ≈ 49.4, still in the ballpark of the expected win of 42. But to calculate the overall standard deviation, we need to use the variances for each hand. The total variance is the sum of the variances for each hand since the hands are independent. The total standard deviation is the square root of the total variance. To respond to your other post, you are right the SD needs to be calculated separately for each sample of hands analyzed.
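Under that model (win the pot with probability equal to your equity, else win nothing), the per-hand variance is p(1-p)*pot^2, and variances of independent hands add. A sketch; the (pot, equity) pairs in the aggregate step are hypothetical:

```python
import math

def hand_variance(pot: float, equity: float) -> float:
    # Bernoulli payoff: win `pot` with probability `equity`, else win 0.
    # Var = p * (1 - p) * pot^2
    return equity * (1.0 - equity) * pot ** 2

# Example hand from the post: $100 pot, 42% equity.
v = hand_variance(100.0, 0.42)
print(f"variance = {v:.0f}, SD = {math.sqrt(v):.1f}")  # variance = 2436, SD = 49.4

# Variances of independent hands add; the overall SD is the root of the sum.
# Hypothetical (pot, equity) pairs, just to show the aggregation:
hands = [(100.0, 0.42), (60.0, 0.55), (150.0, 0.30)]
total_sd = math.sqrt(sum(hand_variance(p, e) for p, e in hands))
print(f"overall SD = {total_sd:.1f}")
```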
If he is doing what you describe, then the size of the pot should not matter. That leads me to think there is something else going on.
That is a totally different stat, but one you need to know to calculate how much bankroll you'll need to withstand the day-to-day fluctuations.
When one of the opponents (me or the other guy) moves all-in, there are no decisions to be made by either of us. It's just computing my chance of winning the resulting pot (I also take care of overbets & stuff).
Sorry. The SD of 14.2 was obtained using the wrong formula.
Right now I'm using
S = sqrt( ( sum over i of pow(R[i] - M, 2) ) / (N - 1) )
and I got SD = 36.30 for big pots.
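That formula can be written as a small helper; the R[i] values below are hypothetical, just to exercise it:

```python
import math

def sample_sd(results, mean=None):
    """Sample standard deviation with the N-1 (Bessel) correction:
    S = sqrt( sum_i (R[i] - M)^2 / (N - 1) )."""
    n = len(results)
    m = sum(results) / n if mean is None else mean
    return math.sqrt(sum((r - m) ** 2 for r in results) / (n - 1))

# Hypothetical per-hand results R[i] (e.g. actual win minus expected win):
print(f"{sample_sd([10.0, -5.0, 3.0, -8.0]):.2f}")  # 8.12
```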
The average pot size for big pots was $57.
As I said in previous posts, it doesn't matter if I make poor choices in big pots. When I'm against one other player and one of us moves all-in and the other guy calls, it's only math. No decisions can affect the outcome after this moment.
I'm trying to analyze bigger databases.