Tennis probability issue
I'm getting different answers for this tennis problem from two AI apps using simulation, ChatGPT and Gemini. Maybe a human can settle it. Given a probability p of winning one game, what is the probability of winning the set? ATP rules. A tiebreak is considered as one game. I used p(win one game) = .5206. Gemini gave me an answer and also gave me a Python simulation program to run myself, below. I haven't run it on my computer since I'm rusty. ChatGPT also said the code was fine. Still, the two AI apps keep giving consistently different answers from each other. It seems like more than just variance, I think: .5876 vs .5687. Any thoughts?
[CODE]import random

def simulate_set(p_win_game):
    """Simulates a tennis set and returns 1 if player wins, 0 if loses."""
    player_games = 0
    opponent_games = 0
    while True:
        # Simulate a game
        if random.random() < p_win_game:
            player_games += 1
        else:
            opponent_games += 1
        # Check for set win conditions
        if (player_games >= 6 and player_games - opponent_games >= 2):
            return 1  # Player wins
        elif (opponent_games >= 6 and opponent_games - player_games >= 2):
            return 0  # Player loses
        elif player_games == 6 and opponent_games == 6:
            # Simulate tie-break (treat as one game)
            if random.random() < p_win_game:
                return 1  # Player wins tie-break
            else:
                return 0  # Player loses tie-break

def simulate_sets(p_win_game, num_simulations):
    """Simulates multiple sets and returns the probability of winning."""
    player_wins = 0
    for _ in range(num_simulations):
        player_wins += simulate_set(p_win_game)
    return player_wins / num_simulations

# Parameters
p_win_game = 0.5206
num_simulations = 100000

# Run the simulation
p_win_set = simulate_sets(p_win_game, num_simulations)

# Output the result
print(f"Probability of winning a set (p_win_game = {p_win_game}): {p_win_set:.4f}")
[/CODE]
12 Replies
After some time I figured out that one AI app was trying to give me the probability of winning the match, not a set. So the issue is now resolved.
More accurate would be to have two different probabilities: that of winning a game with/without serve respectively. Even if they average to .5206, it's wrong to average them because doing so results in a different answer.
But pretending each game is .5206 regardless of who serves, the answer would be .5581164615. The LLMs are wrong as usual, but your sim is correct. I pasted your code into tio.run, added the necessary indents, ran 10 million sims and it got .5584.
The calculation is straightforward: P(win >5 out of 10 games) + P(win exactly 5/10 games)*P(win >1 out of 3 games)
From there it's just Binomial distribution.
Note that P(x successes in at most n trials) = P(at least x successes out of a fixed n trials), so Binomial and Negative Binomial are two sides of the same coin.
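For anyone who wants to reproduce the single-p calculation above without a simulation, here is a minimal sketch in plain Python. The function names are mine, and it assumes exactly what the post assumes: every game (and the tiebreak) is won independently with the same probability p.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success chance p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def p_win_set_single_p(p):
    """Set win probability when every game, tiebreak included, is won with chance p."""
    # Win at least 6 of the first 10 games (covers 6-0 through 6-4).
    ahead_after_10 = sum(binom_pmf(k, 10, p) for k in range(6, 11))
    # Reach 5-5, then win at least 2 of the next 3 "games" (two games plus tiebreak).
    tied_5_5 = binom_pmf(5, 10, p)
    win_2_of_3 = sum(binom_pmf(k, 3, p) for k in range(2, 4))
    return ahead_after_10 + tied_5_5 * win_2_of_3

print(f"P(win set) = {p_win_set_single_p(0.5206):.6f}")
```

With p = .5206 this should land on roughly .5581, in line with the figure quoted above.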
cool, thanks.
Clarify: did you get .55811 from the original 100,000-trial code or from the binomial distribution? 10M trials seems like plenty for precision, btw.
I'm interested in tennis betting. If I differentiated by p to win service games vs. return games, I'd use .80 and .2412, which average to .5206 for the match favorite.
I don't show the details here but it's interesting to me that the small edge in winning one game produces a bigger edge for winning a set and a pretty big one for winning a match.
.55811 was from calculation
Ok so let's recalculate using .8 and .2412. The tie-break is separate; we can calculate it from the point probabilities (which we can reverse engineer from the game probabilities), but for now I'll approximate it with .5206 like before.
Revising my previous math: P(win >5 out of 10 games) + P(win exactly 5/10 games)*[.8*.2412 + (.8*.7588 + .2*.2412)*.5206]
Now instead of Binomial probabilities to fill in, we have "Poisson Binomial" which would be a pain in the ass to compute manually, but code for it is available (mostly in Python) if you search for that or "poibin".
With my own function I get P(win >5/10)≈.408097 and P(win exactly 5/10)≈.302468, implying P(win set) = .5696447862
To check that, I tweaked your sim:
[CODE]import random

def simulate_set(p_s, p_r, p_t):
    player_games = 0
    opponent_games = 0
    while True:
        # Player serves the even-numbered games (0, 2, 4, ...), returns the rest
        if (player_games + opponent_games) % 2 == 0:
            q = p_s
        else:
            q = p_r
        if random.random() < q:
            player_games += 1
        else:
            opponent_games += 1
        if (player_games >= 6 and player_games - opponent_games >= 2):
            return 1  # Player wins
        elif (opponent_games >= 6 and opponent_games - player_games >= 2):
            return 0  # Player loses
        elif player_games == 6 and opponent_games == 6:
            if random.random() < p_t:
                return 1  # Player wins tie-break
            else:
                return 0  # Player loses tie-break

def simulate_sets(p_s, p_r, p_t, num_simulations):
    player_wins = 0
    for _ in range(num_simulations):
        player_wins += simulate_set(p_s, p_r, p_t)
    return player_wins / num_simulations

p_s = 0.8
p_r = .2412
p_t = .5206
num_simulations = 10000000
p_win_set = simulate_sets(p_s, p_r, p_t, num_simulations)
print(f"Probability of winning a set: {p_win_set:.4f}")
[/CODE]
10M trials gave an output of .5696
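The exact (non-simulated) number can also be reproduced without a "poibin" package. Below is a rough sketch that builds the Poisson Binomial PMF with a simple dynamic-programming pass; the function names are my own, and it assumes the player serves 5 of the first 10 games and that the tiebreak is won with a fixed probability p_t, as in the post.

```python
def poisson_binomial_pmf(probs):
    """PMF of the number of successes in independent trials with unequal chances."""
    pmf = [1.0]  # zero trials so far: certainly zero successes
    for q in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1 - q)   # this trial fails
            nxt[k + 1] += mass * q     # this trial succeeds
        pmf = nxt
    return pmf

def p_win_set(p_s, p_r, p_t):
    """Set win probability given hold chance p_s, break chance p_r, tiebreak chance p_t."""
    pmf = poisson_binomial_pmf([p_s] * 5 + [p_r] * 5)
    ahead_after_10 = sum(pmf[6:])   # P(win >5 out of 10 games)
    tied_5_5 = pmf[5]               # P(win exactly 5/10 games)
    # From 5-5: win both games, or split them and then win the tiebreak.
    after_5_5 = p_s * p_r + (p_s * (1 - p_r) + (1 - p_s) * p_r) * p_t
    return ahead_after_10 + tied_5_5 * after_5_5

print(f"P(win set) = {p_win_set(0.8, 0.2412, 0.5206):.6f}")
```

With .8, .2412 and .5206 this should come out near .5696, matching both the calculation and the 10M-trial sim.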
Fabulous. Another non-intuitive result. Two players who have the same known overall chance of winning an unspecified game in a match actually have different chances of winning a set. The one who holds serve more often is the favorite.
Also, humans are still better at figuring this out than AI.
Two players who have the same known overall chance of winning an unspecified game in a match actually have different chances of winning a set. The one who holds serve more often is the favorite.
More precisely, assuming P(win return) … If P(A wins game vs B) = P(C wins game vs D) but A holds serve more often, then P(A wins match vs B) > P(C wins match vs D). But I'm far from having proven it and won't yet say it's true. It's true for the one calculated case.
Even if it's true, if you were talking about when A plays C for the first time and inferring A is the favorite based on A's higher hold % vs other players, we can't draw that conclusion. It would be something to test empirically. A's hold and return % will be different vs C than vs other players, as will C's vs A.
Now let's derive the point %'s from the hold % and return %.
P(win game) = P(win 4 points in <7 serves) + P(deuce)*P(reach +2 before -2) = .8 or .2412
Letting p = P(win point),
P(win 4 in <7) = p⁴[1 + 4(1−p) + 10(1−p)²]
This time it's more convenient to leave it as Negative Binomial, but the curious reader can verify that the above also equals P(win >3/6).
P(deuce) = 20p³(1−p)³
P(reach +2 before -2) = p²/[p² + (1−p)²]
All told:
P(win game) = p⁴[1 + 4(1−p) + 10(1−p)²] + 20p⁵(1−p)³/[p² + (1−p)²]
Set that to .8 and solve to get p_serve = .6329154229
Set to .2412 and solve to get p_return = .38870114565
Now we can calculate P(win tiebreak). To win, the player can either win 7 points in at most 12, or reach a 6-6 tie and then reach +2 before -2.
P(win 7 in at most 12) = Poisson Negative Binomial CDF = .4137478081
P(6-6 tie) = Poisson Binomial PMF = .232624676
P(+2 before -2) = (p_serve)(p_return) / [(p_serve)(p_return) + (1−p_serve)(1−p_return)] = .5229761436
So, P(win tiebreak) = .5373
With that, we can make yesterday's equation for P(win set) more accurate: replace .5206 with .5373 to obtain
P(win set) = .572955
Also, humans are still better at figuring this out than AI.
Bear in mind that LLMs aren't the only form of AI and LLMs weren't designed for math. LLMs aren't figuring anything out, they're regurgitating MathStackExchange posts etc. to provide a plausible-sounding response, aka bullshitting. They have no concept of a right/wrong answer, they're just carrying on a conversation. You might get a correct response if your exact question was asked on the internet before.
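The point-level derivation above can be sketched in a few lines of Python. This is only an illustration under the post's assumptions: points are independent, the first 12 tiebreak points split 6/6 between the two servers, and after 6-6 the points come in serve/return pairs. The function names and the bisection solver are my own choices, not anything from the thread.

```python
def p_game(p):
    """P(win a game) when each point is won independently with chance p."""
    q = 1 - p
    before_deuce = p**4 * (1 + 4 * q + 10 * q**2)   # win 4 points in at most 6
    deuce = 20 * p**3 * q**3                        # reach 3-3
    return before_deuce + deuce * p**2 / (p**2 + q**2)

def solve_point_prob(target, lo=0.0, hi=1.0, iters=100):
    """Bisection for the p with p_game(p) == target (p_game is increasing in p)."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if p_game(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def poisson_binomial_pmf(probs):
    """PMF of the success count over independent trials with unequal chances."""
    pmf = [1.0]
    for q in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1 - q)
            nxt[k + 1] += mass * q
        pmf = nxt
    return pmf

def p_tiebreak(p_serve, p_return):
    """P(win tiebreak): win 7+ of the first 12 points, or reach 6-6 and then go +2."""
    pmf = poisson_binomial_pmf([p_serve] * 6 + [p_return] * 6)
    win_in_12 = sum(pmf[7:])
    tied_6_6 = pmf[6]
    both = p_serve * p_return                   # win both points of a pair
    neither = (1 - p_serve) * (1 - p_return)    # lose both points of a pair
    return win_in_12 + tied_6_6 * both / (both + neither)

p_serve = solve_point_prob(0.8)
p_return = solve_point_prob(0.2412)
print(p_serve, p_return, p_tiebreak(p_serve, p_return))
```

The two point probabilities should agree with the values quoted above. One caveat: adding the three quoted tiebreak terms this way (.4137 + .2326 × .5230) gives roughly .5354 rather than .5373, so it may be worth rechecking the last digits of anything that depends on the tiebreak figure.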
P(win set) = .572955
From that, we have:
P(win best-of-3 sets) = .608656
P(win best-of-5 sets) = .634862
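For reference, the set-to-match step can be sketched as below. It treats sets as independent with the same win probability, which is an assumption of the calculation, and the function name is hypothetical. It uses the "pretend every set gets played" Binomial framing.

```python
from math import comb

def p_win_match(p_set, best_of):
    """P(win a best-of-N match) with independent sets, each won with chance p_set."""
    need = best_of // 2 + 1  # 2 sets for best-of-3, 3 sets for best-of-5
    # Pretend all N sets are played and ask for at least `need` wins.
    return sum(
        comb(best_of, k) * p_set**k * (1 - p_set)**(best_of - k)
        for k in range(need, best_of + 1)
    )

for n in (3, 5):
    print(f"best-of-{n}: {p_win_match(0.572955, n):.6f}")
```

Plugging in .572955 should reproduce the two match probabilities listed above.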
thanks.
AI suggested this formula when assuming the same nominal p for each game or tiebreak:
p(win 3-set match) = 3*p_wset**2 - 2*p_wset**3
Then for p_wset = .572955, p(win 3-set match) = 0.6088,
close to your result but not exactly the same.
I haven't looked into the 5 set match solution yet.
Another issue that may be of interest is why I started using p(win game)=0.5206 in the original post. It was the result I got from trying to estimate which value of P(win game) would lead to the favorite winning exactly one more total game than his opponent. This value helps in understanding how matches are handicapped.
That's correct and it does equal .608656. What software or calculator is giving you .6088?
That formula is the Binomial CDF but using inclusion-exclusion. 3p^2 triple-counts the chance of winning all 3 sets, so p^3 gets subtracted twice so as to be counted once. Of course, a player can't win 3-0 since the match would already be over at 2-0, but if one chooses to reframe the problem from Negative Binomial to plain Binomial, one must pretend 3 sets are played unconditionally.
I haven't looked into the 5 set match solution yet.
I'm not sure if best-of-5's are still a thing, so I included that just in case. It's the same calculation: plug into a Binomial calculator. In a spreadsheet it's called binomdist, on a graphing calculator it's binomcdf, in the SciPy package it's stats.binom.cdf
Another issue that may be of interest is why I started using p(win game)=0.5206 in the original post. It was the result I got from trying to estimate which value of P(win game) would lead to the favorite winning exactly one more total game than his opponent. This value helps in understanding how matches are handicapped.
So the game spread was -1 and you worked backwards from that to derive .5206?
Weird world of AI
When I use a basic web calculator I get .608656 as you did which is the correct value I believe. But Chatgpt does this:
Pmatch=3(0.572955)**2−2(0.572955)**3 =3(0.328278)−2(0.188034)=0.984834−0.376068=0.6088
It cubed to a slightly different value.
So today I asked it why. It says:
"Based on earlier approximated square, not full chained precision
Likely a rounding or pre-rounded value in the pipeline"
Then it re-calculated using higher precision and got the more precise, correct answer as above. Not very confidence building.
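A quick way to check that arithmetic yourself is to let ordinary floating point do the squaring and cubing. A minimal sketch (variable names are mine):

```python
p_set = 0.572955

square = p_set**2   # ChatGPT showed 0.328278
cube = p_set**3     # ChatGPT showed 0.188034
p_match = 3 * square - 2 * cube

print(f"square = {square:.6f}")
print(f"cube   = {cube:.6f}")
print(f"match  = {p_match:.6f}")
```

At full precision the cube is about .188088, not .188034, which is enough to account for .608656 vs .6088.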
Yes, I used a backwards Python simulation. As far as I could tell, the "difference in total games won" doesn't follow a known distribution very well, so that was the quick and dirty alternative. I'm not totally confident in the result, so I plan to take another look at it later.
Here's how I'd go about it.
The spread's odds, after removing the vig, tell us the conditional probability of covering given we don't push, which I'll call y.
That is to say, P(cover)/[1−P(push)] = y
We can deduce a more convenient form: P(cover) + y*P(push) = y
For this post I'll pretend serve doesn't matter and we're solving for a single p. In my next post I'll attempt the full problem.
P(win 6 in <11) + P(win 5/10)*p² + P(win 5/10)*2yp²(1−p) = y
p⁶[1 + 6(1−p) + 21(1−p)² + 56(1−p)³ + 126(1−p)⁴] + 252p⁵(1−p)⁵[(1+2y)p²−2yp³] = y
(The coefficients 21, 56, 126 are 7C2, 8C3 and 9C4.)
If both sides of the spread were paying -110 then y=1/2 and p=.511088
If the line was -1(-120) / +1(100) then y=.5217 and p=.518407
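The single-p equation above has no tidy closed form, but it is easy to solve numerically. Here is a rough sketch: the helper names are mine, the no-vig step assumes the common convention of normalizing the two implied probabilities so they sum to 1, and the solver is plain bisection.

```python
from math import comb

def no_vig_prob(fav_odds, dog_odds):
    """y: favorite's cover probability implied by American odds, vig removed
    by normalizing (an assumed convention, not the only one)."""
    def implied(odds):
        return -odds / (-odds + 100) if odds < 0 else 100 / (odds + 100)
    a, b = implied(fav_odds), implied(dog_odds)
    return a / (a + b)

def cover_plus_y_push(p, y):
    """Left-hand side of the equation: P(cover) + y*P(push) for a -1 game spread."""
    q = 1 - p
    # Win 6 games within the first 10 (coefficients 1, 6, 21, 56, 126).
    win_6_before_11 = p**6 * sum(comb(5 + k, k) * q**k for k in range(5))
    tied_5_5 = comb(10, 5) * p**5 * q**5
    # From 5-5: 7-5 covers (p^2); 7-6 pushes (2*p^2*(1-p)), weighted by y.
    return win_6_before_11 + tied_5_5 * ((1 + 2 * y) * p**2 - 2 * y * p**3)

def solve_p(y, lo=0.0, hi=1.0, iters=100):
    """Bisection for the game probability p satisfying the equation for a given y."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if cover_plus_y_push(mid, y) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for fav, dog in ((-110, -110), (-120, 100)):
    y = no_vig_prob(fav, dog)
    print(f"{fav}/{dog}: y = {y:.4f}, p = {solve_p(y):.6f}")
```

For the two lines mentioned above this should give p close to .5111 and .5184 respectively.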