GTO Characteristics
I am not a GTO expert. I don't have a solver and have never used one. So, with that in mind, I am listing what I believe to be 10 GTO characteristics.
Definition. A GTO strategy in a heads-up game is one where, if both players are playing GTO, neither has any incentive to unilaterally deviate. Such a pair of strategies is known as a Nash equilibrium.
Some Heads-Up GTO Characteristics:
1. GTO is unexploitable. It does not explicitly exploit and cannot be exploited.
2. Aside from rake, two GTO players can expect to break even in the long run.
3. However, not every hand is break-even. The in-position player will generally be +EV.
4. GTO inherently profits from opponent mistakes when the opponent doesn't play GTO.
5. The strategy is fixed. It does not assess villain characteristics to adjust.
6. GTO never chooses a losing strategy to achieve future gain on subsequent hands.
7. GTO often chooses a mixed strategy, each of which has the same EV.
8. GTO always chooses the strategy that maximizes EV. On a closing bet, the GTO action is the same as comparing pot odds to card odds.
9. The GTO strategy results in what is considered balanced play.
10. The GTO strategy is currently unknown, but solvers can approximate it.
Any comments, corrections, or additions?
I never find the semantics of game theory very stable or convincing, sometimes words seem to be used interchangeably to mean different things. Define 'exploit' for example. To my mind GTO is already maximal exploitation, just versus a perfect hypothetical opponent.
Also Nash can be exploited by greater (or in a timed setting, faster) approximations, as more powerful solvers can beat less powerful solvers, so to say it's 'unexploitable' is also a contradiction and misleading. It's unexploitable 'on its own terms' I guess, but that's an important caveat, and it doesn't correspond to what most people think of when they hear that word.
Quote:
I never find the semantics of game theory very stable or convincing, sometimes words seem to be used interchangeably to mean different things. Define 'exploit' for example. To my mind GTO is already maximal exploitation, just versus a perfect hypothetical opponent. Also Nash can be exploited by greater (or in a timed setting, faster) approximations, as more powerful solvers can...
Nash cannot be exploited by definition. A solver may be exploitable, but that's a solver problem, not GTO. Too many people think solvers are GTO when in fact they are just approximations.
Only from pure mistakes.
Nash does not gain EV vs mixing mistakes, for these reasons:
5. The strategy is fixed. It does not assess villain characteristics to adjust.
7. GTO often chooses a mixed strategy, each of which has the same EV.
It only maximizes EV against a perfectly exploitative opponent.
Quote:
Nash cannot be exploited by definition. A solver may be exploitable, but that's a solver problem, not GTO. Too many people think solvers are GTO when in fact they are just approximations.
Ding ding this is exactly what I'm talking about. If you define GTO as this pure, abstract, technically impossible to achieve metric, then literally any approximation or use of Nash (that has not used up all the infinite energy of the universe) is therefore not, by your own definition, GTO.
It thus becomes impossible to talk concretely about something that is ultimately indefinable (see other abstract nouns like: love; truth; vibes, etc).
Not sure what you mean by this one?
8....On a closing bet, the GTO action is the same as comparing pot odds to card odds.
Game Theory Optimisation is a superior phrase imo.
Implying an iterative process > some fictional final state. 'Optimal' implies there already exists a definitive, fixed, realisable endpoint. Whereas in reality that's almost never true outside of abstractions.
To avoid any more confusion I suggest people use the easy to remember initialism-suffix hybrid 'GTOtion' instead. Or GTOn for short. Then everything will start to make more sense.
i.e.
1. GTO is unexploitable.
Is correct. And..
2. Aside from rake, two GTOn players can expect to break even in the long run
Is also now correct.
Definition. A GTOn strategy in a heads-up game, is one where, if both players are playing GTOn, neither has (much) incentive to unilaterally deviate.
ftfy
I've never seen a fully convincing proof of your statement 4. It may be true (and probably is), but in general, it is not true that a Nash Equilibrium strategy will necessarily increase in EV against an opposing non-NE strategy. Consider as a simple counterexample rock, paper, scissors. It's well known that the NE strategy for RPS is mixing rock, paper and scissors with probability 1/3 for each. Suppose you face an opponent who always plays rock. If you follow the NE strategy against this opponent, you do not increase your EV: you still win 1/3, lose 1/3 and tie 1/3. Thus, for game theory in general, it is not necessarily true that an opponent's deviations will be profitable when playing NE. It is possible that an opponent's deviations will have no effect on our EV. (They can't decrease it; that's the very definition of NE. If an opposing counterstrategy would decrease our EV, our strategy is not a NE one.)
An optimal strategy against this opponent (100% paper) wins 100%. This brings up another point about GTO: it's a misnomer. It is NOT optimal. Since nobody plays a pure GTO strategy (we don't know exactly what it is; we only have approximations), it follows that playing a pure GTO strategy is likely not optimal.
So why bother with GTO at all then? Well, it's because we need to know what it is (at least approximately) so we can recognize and exploit opponents who deviate substantially from it. We also need to know a good approximation of it so we can avoid being exploited ourselves when facing good opponents.
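For anyone who wants to check the RPS numbers, here is a minimal sketch in Python. The function and variable names are my own, and it assumes the usual scoring of +1 for a win, -1 for a loss and 0 for a tie. It compares the 1/3-1/3-1/3 mix with 100% paper against an always-rock opponent.

```python
from fractions import Fraction

MOVES = ("rock", "paper", "scissors")
# BEATS[x] is the move that x defeats.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(mine, theirs):
    """+1 for a win, -1 for a loss, 0 for a tie, from our point of view."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


def expected_value(our_mix, their_mix):
    """EV per round when both sides draw moves independently from their mixes."""
    return sum(
        our_mix[m] * their_mix[t] * payoff(m, t)
        for m in MOVES
        for t in MOVES
    )


third = Fraction(1, 3)
equilibrium = {m: third for m in MOVES}
always_rock = {"rock": Fraction(1), "paper": Fraction(0), "scissors": Fraction(0)}
always_paper = {"rock": Fraction(0), "paper": Fraction(1), "scissors": Fraction(0)}

ev_equilibrium = expected_value(equilibrium, always_rock)     # NE mix vs always rock
ev_best_response = expected_value(always_paper, always_rock)  # 100% paper vs always rock
print(ev_equilibrium, ev_best_response)
```

Under those assumptions the equilibrium mix earns nothing extra against always-rock, while 100% paper wins every round.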
^^^^^ Is a very good post.
Quote:
I've never seen a fully convincing proof of your statement 4. It may be true (and probably is), but in general, it is not true that a Nash Equilibrium strategy will necessarily increase in EV against an opposing non-NE strategy. Consider as a simple counter example rock, paper, scissors. It's well known that the NE strategy for RPS is mixing rock, paper and scisso...
Because that's a mixing mistake and mixed actions have the same EV vs a fixed opponent.
See my post above.
Statmanhal, why are you writing a list of GTO Characteristics?
Quote:
Because that's a mixing mistake and mixed actions have the same EV vs a fixed opponent.
See my post above.
Ob
No offense but your point then seems trivial. Non-mixed strategies are ones where the correct alternative is higher EV than any other. So basically, your point is that in a zero sum game, a mistake that yields negative EV for our opponent increases our EV. That certainly is true, but it's no great insight, nor is it a feature unique to playing a GTO strategy. It's simply basic game theory; it's the very definition of "zero sum game".
Quote:
Ob
No offense but your point then seems trivial. Non-mixed strategies are ones where the correct alternative is higher EV than any other. So basically, your point is that in a zero sum game, a mistake that yields negative EV for our opponent increases our EV. That certainly is true, but it's no great insight, nor is it a feature unique to playing a GTO strategy. It's...
Exactly. It's a very basic insight. That was my point. So it's not at all surprising that your example doesn't change EV.

Quote:
8....On a closing bet, the GTO action is the same as comparing pot odds to card odds.
If one knows his equity, the pot, and villain's bet size, the proper +EV play on a closing bet (call or fold) is to compare pot odds to card odds: call if your card equity > 1/(1 + pot odds), where pot odds are expressed as the ratio of the pot to the amount you must call.
I believe this would be the pure GTO strategy as well.
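To make the closing-bet rule concrete, here is a small Python sketch. The function names and the example numbers are mine, not from any solver, and it assumes pot odds means (pot you can win, including villain's bet) divided by (amount to call).

```python
def required_equity(pot_before_call, call_amount):
    """Break-even equity for a closing call: 1 / (1 + pot odds)."""
    pot_odds = pot_before_call / call_amount  # e.g. 100 to win, 50 to call -> 2 (2:1)
    return 1 / (1 + pot_odds)


def closing_action(equity, pot_before_call, call_amount):
    """Call when our equity beats the break-even threshold, otherwise fold."""
    if equity > required_equity(pot_before_call, call_amount):
        return "call"
    return "fold"


# Hypothetical spot: villain bets 50 into 50, so the pot is 100 and we must call 50.
threshold = required_equity(100, 50)
print(round(threshold, 3))
print(closing_action(0.40, 100, 50))
print(closing_action(0.30, 100, 50))
```

With 2:1 pot odds the threshold works out to one third, so 40% equity is a call and 30% is a fold.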
In logic, conditionals can be tricky. Given a statement of the form if A then B, there are three related statements that can be formed: if B then A (the converse), if not B then not A (the contrapositive), and if not A then not B (the inverse, which is the contrapositive of the converse). In general the converse and the original conditional are not equivalent and must be proven separately. The contrapositive of a conditional is logically equivalent to it (and therefore the converse and its contrapositive are also equivalent to each other).
Now let A be "a mistake is mixing" and B be "the opponent's EV does not increase". Then our initial assertion is if A then B: that is, if a mistake is mixing then it does not increase the opponent's EV. That may or may not be true. The insight you offered to justify it was that if a mistake is non-mixing it increases the opponent's EV. Symbolically this is if not-A then not-B, which is equivalent to if B then A, the converse of the original assertion. Yes, this converse is trivially true based on the definition of zero sum, but it does not show anything about the original assertion. It still remains to be proven that a mixing mistake cannot increase opposing EV in general (although that is the case for the RPS game, it may not be for all possible games).
And again, a non mixing mistake increasing our EV is NOT a unique feature of a GTO strategy. That is true of ANY strategy due to the game being zero sum. It certainly is not true that a mixing mistake has no effect on opposing EV for arbitrary opposing strategies. The 100% paper strategy in RPS certainly gains EV against any mixing mistake that involves playing rock at greater probability than 1/3, for instance. The real question is do mixing mistakes increase EV of a player using a GTO strategy. That is an open question AFAIK.
Quote:
I never find the semantics of game theory very stable or convincing, sometimes words seem to be used interchangeably to mean different things. Define 'exploit' for example. To my mind GTO is already maximal exploitation, just versus a perfect hypothetical opponent. Also Nash can be exploited by greater (or in a timed setting, faster) approximations, as more powerful solvers can...
Claiming that Nash can be exploited by greater/faster engines isn't true in a theoretical sense though.
I think MPT also touches on how, at a certain point, the difference in EV between strategies very close to (but not at) Nash equilibrium is effectively negligible.
Your point about GTO being maximally exploitative isn't just "to your mind" - it's correct and the literal definition of the Nash equilibrium strategy pairs. Both players are maximally exploiting one another, which is why unilateral deviation cannot improve your EV.
EDIT: I see a few people talking about solvers being approximations. MPT uses a nice metaphor - imagine Nash equilibrium as the summit of a mountain. However, the top of the mountain is quite rounded/flat. Getting 90% of the distance to the summit puts you within 0.1% of the altitude of the summit. Small Nash distances exhibit an exceptionally small loss of EV, so much so as to be almost negligible.
In a Monte Carlo simulation of billions of hands? Sure, Nash Distance 0 will beat Nash Distance 0.0001. But in practical, human-lifespan and human-gameplay terms, it barely matters.
More importantly, it doesn't matter because, even if we could study the true Nash Equilibrium strategy, we have to simplify it because we're humans, not computers; no human meatperson is capable of playing anywhere near true GTO. Solvers may be approximations; whatever strategy a human uses can only be an approximation of an approximation.
Quote:
Claiming that Nash can be exploited by greater/faster engines isn't true in a theoretical sense though.
I think MPT also touches on how, at a certain point, the difference in EV between strategies very close to (but not at) Nash equilibrium is effectively negligible.
But then how do we explain this?

Not only do better/more refined approximations beat other approximations, they crush them:
Quote:
Your point about GTO being maximally exploitative isn't just "to your mind" - it's correct and the literal definition of the Nash equilibrium strategy pairs. Both players are maximally exploiting one another, which is why unilateral deviation cannot improve your EV.
Agreed. And that's basically my problem with it. We're using the verb 'exploit' in two different (contradictory) ways. One, as you say, to mean GTO v GTO, and one that is exclusively about deviating from a standard GTO solution via nodelocking etc. So that when some poor mug uses it in either fashion without the strictest of clarifications it won't be long until some smart alec pops up to claim unfair use and we're locked in these circular arguments that seem to plague all game theory discussion. Hence:
1. GTO is unexploitable. It does not explicitly exploit and cannot be exploited.
...is an absolute cluster**** of contradiction.
I'm not a maths or theoretical person (somewhat obviously), but even I can tell game theory was invented by linguistic masochists.
Bots=/=GTO(Nash)
Quote:
But then how do we explain this? Not only do better/more refined approximations beat other approximations, they crush them:
Quote:
Agreed. And that's basically my problem with it. We're using the verb 'exploit' in two different (contradictory) ways. One, as you say, to mean GTO v GTO, and one that is exclusively about deviating f...
I think the problem is more with language use in the poker community than with game theory. Game theory does not use the term "exploit" at all. A Nash Equilibrium strategy is defined as a strategy whose EV cannot be reduced by any possible opposing counter strategy. That is a precise definition that does not use potentially imprecise terms like exploit.
The other term you won't see is the one that is in the very name we usually use for the poker NE strategy - optimal. That is why I absolutely hate the term GTO. There is no guarantee that a Nash Equilibrium strategy will always yield maximum EV against any opposing strategy. Against a player using a close GTO approximation, then yes, a close GTO approximation is going to be optimal. But many players don't use GTO approximation strategies.
I always like extreme examples to clarify thinking. Suppose you play heads up against a non-GTO opponent who flat out tells you "It's dumb to play any hand other than AA, and it's dumb to do anything but shove when you get AA." If he's playing that strategy against us, it would be insane to try to play anything close to GTO. We should simply minraise ATC and win the blinds with probability 220/221. If he shoves we fold (assuming we don't also have AA). This is obviously a much better strategy than a balanced GTO one against this specific opponent.
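The 220/221 figure above can be checked with a quick combinatorial sketch in Python. The variable names are mine, and it assumes we ignore card removal from our own hand (holding an ace ourselves would change villain's chances slightly).

```python
from fractions import Fraction
from math import comb

aa_combos = comb(4, 2)     # ways to pick two of the four aces
all_combos = comb(52, 2)   # ways to pick any two cards from a full deck

p_villain_aa = Fraction(aa_combos, all_combos)  # chance villain is dealt AA
p_blinds_won = 1 - p_villain_aa                 # chance villain folds to our minraise

print(p_villain_aa, p_blinds_won)
```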
But what if this opponent isn't a complete moron? What if he realizes his original idea was bad and adapts by playing more hands than just AA? Our optimal strategy for playing against his original strategy is no longer optimal - it loses EV. We should then figure out how to make an adjustment to optimize our strategy against his new one. If he adjusts to our adjustment, we must further adjust. It seems like this cycle of adjustment and counter adjustment could go on indefinitely, but the mathematics of game theory tell us otherwise. It will end with both players reaching a strategy such that any adjustment by our opponent is ineffective. No adjustment will increase his EV (and consequently decrease ours). This is the Nash equilibrium aka GTO.
GTO is fine in the sense that we can follow it without having to worry that our opponent might change his strategy and cost us EV. But if our goal is maximizing EV, as it should be, we should not be playing strictly GTO in all situations. In my example above, had we been playing GTO from the start, we would not have had to recognize our opponent's adjustment and adjust our own strategy once he stopped playing the AA-only strategy. A GTO strategy is completely indifferent to what our opponent does. We would not lose EV regardless of his adjustment. We obviously can do better, though, by playing an actually optimal strategy against non-GTO opponents. The risk is that our opponent might change up his strategy and lower our EV.
Quote:
The real question is do mixing mistakes increase EV of a player using a GTO strategy. That is an open question AFAIK.
I think the consensus is that mixing mistakes vs a fixed opponent do not lose EV.
Changing mixtures between indifferent actions cannot lose value against a fixed strategy. Mixing mistakes can only be exploited if the opposition adapts their strategy.
Expected value is zero-sum. For one player to gain, the other has to lose. So, if you’re mixing incorrectly between two decisions with the same EV, how would one player gain anything without changing their strategy?
One major exception to this rule is when EV is not zero-sum. Rake cuts into your EV, so mixing mistakes can increase or decrease the total sum of rake paid by both players.
Think about it this way. All you’ve done is moved a hand between two actions that have the same EV. You’ve simply moved a hand between two indifferent decisions. Since both those actions have the same value against your opponent’s strategy, they cannot gain unless villain changes their strategy!
GTO is a fixed strategy. Therefore, it doesn’t gain against mixing mistakes. However, it does gain against “pure mistakes”. Any action that loses money against a fixed strategy is a “pure mistake”. There’s a common misconception that GTO gains against “any mistake”. However, this isn’t true, as GTO can only gain against pure mistakes.
Aside from the RPS example, we can do a poker toy game: Nuts or Air vs Bluffcatcher.
10 combos of nuts and 5 combos of air.
1x pot.
MDF = 50%
** GTO
EV(Value): (10/15) * (50%*2 + 50%*1) = 1
EV(Bluff): (5/15) * (50%*1 - 50%*1) = 0
EV(Total): 1+0 = 1
** 100% fold
EV(Value): (10/15) * (100%*1) = 0.67
EV(Bluff): (5/15) * (100%*1) = 0.33
EV(Total): 0.67 + 0.33 = 1
** 100% call
EV(Value): (10/15) * (100%*2) = 1.33
EV(Bluff): (5/15) * (100%*-1) = -0.33
EV(Total): 1.33 - 0.33 = 1
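The three cases above can be reproduced with a short Python sketch. The names are mine, and it assumes a pot of 1, a pot-sized bet of 1, a bettor who always bets all 15 combos, and nuts that always win at showdown. It returns the bettor's EV for any calling frequency by the bluffcatcher.

```python
from fractions import Fraction

NUTS, AIR = 10, 5                    # combos in the betting range
POT, BET = Fraction(1), Fraction(1)  # pot-sized bet


def bettor_ev(call_freq):
    """Return (value EV, bluff EV, total EV) for the bettor at a given call frequency."""
    call_freq = Fraction(call_freq)
    fold_freq = 1 - call_freq
    total_combos = NUTS + AIR
    # Nuts win pot + bet when called and just the pot when villain folds.
    ev_value = Fraction(NUTS, total_combos) * (call_freq * (POT + BET) + fold_freq * POT)
    # Air wins the pot when villain folds and loses the bet when called.
    ev_bluff = Fraction(AIR, total_combos) * (fold_freq * POT - call_freq * BET)
    return ev_value, ev_bluff, ev_value + ev_bluff


for label, freq in (("GTO (50% call)", Fraction(1, 2)), ("100% fold", 0), ("100% call", 1)):
    value, bluff, total = bettor_ev(freq)
    print(f"{label}: value={float(value):.2f} bluff={float(bluff):.2f} total={float(total):.2f}")
```

Under these assumptions the total stays at 1 for every calling frequency, which is the point of the toy game: the bluffcatcher's mixing mistake moves EV between the bettor's value hands and bluffs without changing the sum.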


