Inference from SD
So far I thought that if you assume an EV of 5bb/100 and an SD of 100bb/100, you'd observe in ~70% of cases a win/loss between -95bb/100 and 105bb/100, i.e. 1 SD in each direction of your EV. However, an SD of 100bb/100 is apparently read differently from how I read it. According to primedope, these inputs yield a 70% confidence interval of -1.32bb/100 to 11.32bb/100.
Can someone help me understand this correctly?
8 Replies
First let us define standard deviation:
Standard deviation is a statistical measure that quantifies the amount of variation or dispersion in a set of data points. Essentially, it tells you how spread out the data is around the mean (average) value.
^ In poker, the convention is to express this value in units of bb/100.
Next let us define confidence interval:
A confidence interval is a statistical range used to estimate the true value of a population parameter (like a mean or proportion) with a specified level of confidence. In simpler terms, it provides a range of plausible values for a quantity based on sample data, along with the likelihood that this range contains the true value.
For example:
So, you input a confidence level (70% and 95% are the defaults for primedope), a sample size n (which is hands/100), a standard deviation (calculated by your database software, or assumed to be ~100 bb/100 for 6-max NLH), and a win rate of whatever. The win rate is your mean: basically your expected return per 100 hands on average.
For a specific confidence level you're going to have a value z, which you can look up in a table. Then, given the sample size (n), standard deviation (sigma), and mean (x-bar), the calculation spits out a margin to add to and subtract from the mean. The claim is that if you repeated that sample infinitely many times, about -insert confidence interval %- of the resulting values (means/winrates in this case) would fall within that range.
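
As a rough sketch of the calculation described above (not primedope's actual code), the z-table lookup can be replaced by the standard-normal quantile from Python's standard library. The function name `winrate_interval` and its parameters are made up for illustration, and the results are assumed to be normally distributed.

```python
from math import sqrt
from statistics import NormalDist


def winrate_interval(mean, sd, n, level):
    """Two-sided interval for the observed win rate.

    mean and sd are in bb/100, n is the sample size in blocks of
    100 hands, level is the confidence level as a fraction (0.95).
    """
    # z replaces the table lookup: the standard-normal quantile that
    # leaves (1 - level) / 2 of the probability in each tail.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # The spread of a mean over n blocks shrinks with sqrt(n).
    margin = z * sd / sqrt(n)
    return mean - margin, mean + margin


# 5bb/100 win rate, 100bb/100 SD, 25,000 hands (n = 250), 95% level
low, high = winrate_interval(mean=5, sd=100, n=250, level=0.95)
print(f"{low:.2f} to {high:.2f} bb/100")
```

Note that an exact 70% level gives a z slightly above 1, so this will not exactly match an interval built from z = 1.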

Let me know if this makes sense... alternatively you can just do some lessons for confidence intervals on khanacademy or watch a couple youtube videos and do a worksheet. I imagine someone can learn this pretty quickly in an afternoon or two.
But usually when they say, for instance, the mean height of males in a certain town is 175cm with an SD of 10cm, you could immediately say that if you pick a random guy from that town, in 95% of the cases he'll be between 155cm and 195cm. No z-table necessary. Or am I wrong?
You'd be wrong, or they'd be using the wrong terminology. Again, just working through a chapter in a statistics book would help you understand this. You can also calculate a lot of these values by hand or in Excel using a data set. It would be a good exercise to do if you truly want to understand it.
edit** For your example they are saying it's +/- 2 standard deviations, and there is no need for a z-table because the general rule of thumb for a 95% interval in a normal distribution is +/- 2 standard deviations around the mean. This statement applies to the distribution, not to individual cases. Saying "95% of the time, a random person will fall in this range" is a practical simplification. In reality, individual data points are subject to randomness, but over many samples, you'd observe about 95% of the population within this range.
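
For what it's worth, the height example can be checked numerically. This is only a sketch and it assumes heights are normally distributed, which the example doesn't actually state:

```python
from statistics import NormalDist

# Assumed model: heights ~ Normal(mean 175cm, SD 10cm)
heights = NormalDist(mu=175, sigma=10)

# Share of the population within 2 SD of the mean (155cm to 195cm)
share = heights.cdf(195) - heights.cdf(155)
print(f"{share:.4f}")
```

Under that assumption the +/- 2 SD range covers a bit more than 95% of individuals, which is where the rule of thumb comes from.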
Okay, reading this again, I understand the confusion. It arises from the sample size n: for a sample of 1 it's going to be ~ +/- 2 std dev.

I'll message statmanhal. He can explain it better than me with a 100% certainty, lol.
I guess you're right, I just looked it up. Thanks for clarifying!
To add one thing: would you at first sight have considered these intervals to be realistic in a real-life context? I'm not debating that they're mathematically correct. They just don't seem anywhere near what I'd have expected to hear if I had asked a 5bb/100 live player in what range his swings fall over the course of roughly three hours. Especially as a 5bb/100 live player is so far from the typical 10bb+/100 crusher.
For instance, a live NL500 player with said expected winrate of 5bb/100 could in 95% of the cases expect his results to fall somewhere between -$38.25 and +$88.25 over ~3 hours of play? Intuitively, that sounds like a ridiculous assumption for a no-limit game with a $500 buy-in.
3 hours? No. 3 hours is going to be less than 100 hands for live play, which is effectively a sample size of 1. Standard deviation for a 9-max live game is also unlikely to be 100bb/100. Again, I think you're confusing the sample size thing.
This is more useful for determining how accurate your winrate (mean) is given a standard deviation, confidence level, and sample size. Say you play 25,000 hands and your winrate is 20bb/100: the odds of you showing a loss over that sample are ~0%. Whereas if we use your example above, it says there is about a 21% chance of losing over that sample size with those constraints. So let's say you are losing, but you think you have a 5bb winrate: it would be like evaluating what the odds of that occurring are.
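
A quick way to reproduce those loss probabilities, as a sketch under the usual normal approximation (the helper name `prob_losing` is made up for illustration):

```python
from math import sqrt
from statistics import NormalDist


def prob_losing(winrate, sd, hands):
    """Chance that the observed result over `hands` hands is below zero.

    winrate and sd are in bb/100; assumes normally distributed results.
    """
    n = hands / 100                 # sample size in blocks of 100 hands
    standard_error = sd / sqrt(n)   # spread of the observed win rate
    return NormalDist(mu=winrate, sigma=standard_error).cdf(0)


p5 = prob_losing(winrate=5, sd=100, hands=25_000)
p20 = prob_losing(winrate=20, sd=100, hands=25_000)
print(f"5bb/100:  {p5:.1%}")
print(f"20bb/100: {p20:.2%}")
```

The 5bb/100 case comes out at roughly 21% and the 20bb/100 case at well under 1%, in line with the figures quoted above.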
Yeah, I guess that's what I got wrong. I'd have to divide the SD by the square root of the sample size (in my example the SD was 100bb/100, and the sample size for 25k hands would therefore be 250). And the z values of 1, 2, and 3 yield the ~68.27%, ~95.45%, and ~99.73% confidence intervals respectively.

With CI = X̄ ± Z × (s/√n), the ~70% CI would then be 5 ± 1 × (100/√250) = 5 ± 6.325, i.e. [-1.325; 11.325]
Edit: And the other thing I got wrong was the "what can you expect to happen in 3 hours of play" type of question. With such a small sample size the ~70% CI gets much wider (for instance, over 100 hands it would be somewhere in the realm of [-95; 105] and not [-1.325; 11.325]).
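
To see how quickly the 1 SD interval narrows as the sample grows, here is a small sketch using the same inputs (5bb/100 win rate, 100bb/100 SD). The function name `one_sd_interval` and the 2,500-hand row are illustrative additions only.

```python
from math import sqrt


def one_sd_interval(winrate, sd, hands):
    """Interval of +/- 1 standard error (z = 1) around the win rate, in bb/100."""
    n = hands / 100              # sample size in blocks of 100 hands
    standard_error = sd / sqrt(n)
    return winrate - standard_error, winrate + standard_error


for hands in (100, 2_500, 25_000):
    low, high = one_sd_interval(winrate=5, sd=100, hands=hands)
    print(f"{hands:>6} hands: [{low:.2f}; {high:.2f}] bb/100")
```

The 100-hand row gives the [-95; 105] range and the 25,000-hand row the roughly [-1.32; 11.32] range discussed in this thread.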