A Simple Rule Change To Boggle a Bot?
You guys who have studied typical holdem rules and have learned to efficiently utilize computer simulations and solvers might have a bias against this game I just thought of. But try to put that aside in your replies.
No Limit Holdem. Except that after the flop betting round, remaining players must pick one of their two cards to expose. Either simultaneously or sequentially. Or possibly the exposure comes after the turn card is dealt but before the bets. Or possibly after the turn betting round.
It seems to me that the knowledge of one card for each opponent and the decision as to which card to show makes for a game where the amount of computer time to derive the GTO strategy rises exponentially. And it monumentally affects everything including preflop strategy. And the best players would not usually be the ones who are best now. Do you agree? Do you think people would play this game?
I think the card should be revealed on the turn before the betting, that way players have the best chance to pretend they have a straight or flush or straight draw or flush draw that they can bluff on the turn or river. And it should be sequential so that position is even more important.
It seems like a fun game, but it also has some of the flavor of five card stud, which died out. It makes reading people's hands a lot easier. In normal hold 'em it's harder to read hands than in a lot of other games, and in PLO it's even harder. PLO didn't take over hold 'em, so maybe there's a balance where hold 'em is near the sweet spot. I have a feeling making it this much easier to read hands will make the game less popular than normal hold 'em.
What leads to an exponential increase in compute time? You have to show one of two cards, so it adds one binary decision to a game tree that is already enormous. Computationally it's nothing compared to adding another betting round, more starting cards, more ranks in the deck, etc.
I agree it changes the game considerably and the new best players would be different.
Assuming the card is shown on the turn before the betting, the number of sub-games played from that point forward has changed, but there are two opposing forces: a new sub-game for each possible revealed card, which increases complexity, and a smaller number of possible opponent hands, which decreases complexity.
Consider the heads-up sub-game on the turn where player A is out of position with a specific hand and forced to reveal their card first:
"Boggle" hold 'em:
Player A has a binary decision of which card to reveal and is facing 46 possible revealed cards from player B times 45 possible hole cards from player B times all the remaining gameplay options.
Normal hold 'em:
Player A is facing 46*45 possible hole cards from player B times all the remaining gameplay options.
Is there any appreciable difference in game-tree size or compute time between these sub-games besides the binary decision of which card to reveal?
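One way to sanity-check the comparison above is to count it directly. This is a minimal Python sketch, assuming the same ordered counting the post uses (46 unseen cards on the turn, 46*45 ordered hole-card pairs for player B). The variable names are only for illustration.

```python
from itertools import permutations

# Cards player A cannot see on the turn: 52 - 2 hole cards - 4 board cards.
unknown = range(46)

# Normal hold 'em, counted as in the post: ordered pairs of B's hole cards.
normal = sum(1 for _ in permutations(unknown, 2))

# Reveal variant: the same ordered pairs, grouped by the card B shows.
by_revealed = {}
for shown, hidden in permutations(unknown, 2):
    by_revealed.setdefault(shown, []).append(hidden)

reveal = sum(len(hidden_cards) for hidden_cards in by_revealed.values())

print(normal, len(by_revealed), reveal)
```

Grouping by the shown card gives 46 sub-games of 45 hands each, so the total number of (shown, hidden) pairs is the same as the ordered count for the normal game. Under this counting the only new factor on A's side is A's own choice of which card to show.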
NLHE is already the perfect game. Doesn't need changing
I think it could be better if the turn was two cards, and everything else is the same. Then the flop-turn-river is 3-2-1 cards. It adds some more luck and more interesting boards on the river. There can be two flushes possible, both a high and low straight possible (besides the A straights), or three pair on board.
If solvers were programmed to solve for this game, I don't see why they couldn't solve it just as with regular hold em.
As a novel game variant, I think showing one card after the flop would work better in PLO. In hold em it would give away too much info and turn into a glorified version of "war" where players are just betting on who has the best down card.
In any poker game where a card, previously face down, is exposed, there is a big difference between having it exposed randomly and via the player's choice. The opponent has to try to figure out why you might have chosen that card rather than the other, you have to try to figure out what his thought processes about your exposure will be, he will have to try to deduce what you are thinking about his thinking, and so on ad infinitum. Of course, there is a GTO solution and maybe I am overestimating the difficulty in arriving at it. But it certainly is a lot more difficult than if the card was exposed randomly.
Would you play A2suited under the gun? If yes, how would you change your flop strategy compared to a normal game if the flop was A93 rainbow? What about J22?
Is this more computationally intense than PLO?
I think this variant makes it even worse to play low ranked cards, so I would rank A2 suited even lower than in normal hold 'em and not play it under the gun without very favorable game conditions.
If the flop is A93 rainbow then showing the 2 is dangerous because it caps your hand strength, and no matter what card your opponent shows other than a 2, they will know they can represent a hand that can beat your capped range. There's some clairvoyance game aspects going on. So I would play the flop more passively than in normal hold 'em.
On J22 obviously we will show the A. The opponent may or may not expect us to have A2 some of the time, and could think we have AA, AJ, AK-Q, or AT-3, so we can expect to get action from worse hands and bluffs. I would therefore tend to play aggressively on the flop, but it's very opponent dependent. If the opponent is overaggressive it's a great spot. It's like in normal hold 'em, where the aggro player can put the tight player on AK or some other A-high and call down with any pair or try to bluff you off your overcards, except now they're even more confident you have A-high since they see the A confirming their suspicions. If you had shown, say, a 9 instead, they may put you on 99 and not pay off with a lower pair or back off the bluff, or maybe try to represent the J to bluff you, or wait for more overcards to come and bluff later.
I did an explicit calculation in a simplified game to see the increase in the size of players' strategies with respect to increase in deck size:
Heads-up, each player dealt 2 cards, 1 bet/raise cap per round, 1 preflop round, 1 postflop round, flop is 1 card, players show one card postflop after board dealt before betting.
For a 7 card deck, the number of decision variables for Player A's strategy is:
Non-reveal Hold 'em: 285
Reveal Hold 'em: 1,545
Ratio: 5.42
For a 52 card deck it's:
Non-reveal Hold 'em: 21,930
Reveal Hold 'em: 1,506,030
Ratio: 68.67
Basically, whatever the postflop strategy was before, revealing a card multiplies the size of that strategy by about the number of cards in the deck. This is for each player. A search online says the simplex algorithm is worst case exponential in the number of vertices but if done in an advanced way can be polynomial or even sub-polynomial. I'm guessing it's something like a 2*(size of deck) order increase in compute time.
The player strategy sizes in PLO are about double those in Reveal Hold 'em:
In my earlier calculation I made a mistake by not multiplying by the number of possible boards, this changes the situation to:
For a 7 card deck, the number of decision variables for Player A's strategy is:
Non-reveal Simplified Hold 'em: 645
Reveal Simplified Hold 'em: 4,425
Ratio: 6.86
For a 52 card deck it's:
Non-reveal Simplified Hold 'em: 741,030
Reveal Simplified Hold 'em: 71,977,830
Ratio: 97.13
Note that the number of board cards above 1 is almost irrelevant to these numbers; even 1 board card has the effect of expanding the number of post-flop decision variables enough to dwarf the number of preflop decision variables, but not including even 1 board card led to the initial numbers from my post above. For 3 board cards the 52-card deck ratio is about 98.
In PLO the number of starting hands is 270,725 compared to 1,326 for Hold 'em, for a ratio of 204.17. So the sum of the two players' strategies is an increase of 2*204.17 = 408.34. For Reveal Hold 'em the sum of the two players' strategies is an increase of 2*98 = 196.
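The starting-hand counts quoted above are plain binomial coefficients, so they are easy to reproduce. A short Python check, using only the standard library (the variable names are illustrative):

```python
from math import comb

plo_hands = comb(52, 4)   # four hole cards in PLO
nlhe_hands = comb(52, 2)  # two hole cards in hold 'em

ratio = plo_hands / nlhe_hands
print(plo_hands, nlhe_hands, round(ratio, 2))
```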
Whoops, in going from 1 to 3 flop cards I forgot to subtract the extra 2 cards from Player B's reveal options. The correction is:
Strategy size for 3 flop cards and a 52 card deck:
Non-reveal Simplified Hold 'em: 244,087,782
Reveal Simplified Hold 'em: 22,943,634,918
Ratio: 94.00
And I just realized that the complexity of normal Hold 'em depends on the sum of the sizes of the player strategies, so there's no need to multiply this ratio by 2; it will automatically be distributed over both player strategies.
So the size of player strategies in Reveal Simplified Hold 'em is 94 times larger than in Normal Simplified Hold 'em. Also, this ratio doesn't depend all that much on anything other than the deck size and number of board cards before a card is revealed. If you push the reveal later into the hand it lowers the ratio due to more board cards (fewer available reveal cards), and an additional factor due to there being less game tree remaining to multiply, but the later game tree is so much larger than the earlier game tree that this latter effect is not that important.
This is an interesting twist, Mr. Sklansky.
At first I thought it would just be way easier to solve since you've narrowed the ranges considerably. If the cards were flipped over randomly, solving would be trivial.
But the fact that you choose what card to flip adds a level of nuance that makes this really tricky. For example, if you flop top pair, do you reveal your pair?
- Strategy 1: Always reveal top pair - This is obviously bad because you give away the strength of your hand.
- Strategy 2: Never reveal your pair - This one is bad because if you have top set, you're forced to reveal it, effectively narrowing your range to one hand.
Therefore the optimal strategy would (probably) be to sometimes reveal top pair card, and sometimes show the kicker. A mixed strategy, most likely depending on how the kicker interacts with your opponent's range.
The problem for solvers is that this adds a huge layer of complexity. The brute force approach just doesn't work here, there are way too many revealing strategies. CFR can't solve this. It would require a brand new approach.
Are you actually calculating specific numbers with respect to CFR, or are you just guessing? From a human psychology perspective it seems like it's adding a lot more complexity, but from a game theory standpoint it just makes the game tree larger. All the betting is exactly the same as before. So whatever data structure that was in the computer storing Player A's strategy for the non-reveal game, from the Reveal betting round onward, you take all those options and multiply by 2 for A's decision of which card to reveal, and multiply by N, where N is the number of cards Player B can reveal, since A can have a different response strategy to every possible card B can reveal. So, e.g. if the reveal is on the turn, then we multiply by 2*46 = 92. It's more complicated because you don't multiply the part of their strategy on earlier rounds, and their total strategy size is the sum of the strategies over all the rounds.
Let's take the worst case: the reveal happens preflop before any action. From Player A's perspective, there's 50 cards Player B can show, and two cards Player A can show, so Player A's strategy size increases by 100 times. Everything else is the same as in the non-reveal game, except there's one less board card that can come later, which reduces Player A's strategy size somewhat. Putting the reveal later in the game results in smaller strategy sizes.
Of course, if state-of-the-art methods are already taxed in finding a solution, then maybe 92 times more computing resources overwhelms them; I'm no expert on solver algorithms, but it's not like it's fundamentally changing the algorithm you would use to solve it; CFR, the simplex algorithm, or whatever solves these kinds of games will certainly still work in principle, given enough resources. PLO in comparison has more game states before pruning than this Reveal version of Hold 'em.
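The "multiply by 2, then by N" rule of thumb from this post can be written as a one-line function. This is a rough Python sketch, not a solver calculation: `reveal_multiplier` is a hypothetical helper, and it assumes a heads-up game where N is simply the deck minus Player A's two hole cards and the board cards already dealt. The `reported` values are the 52-card ratios from the simplified-game counts earlier in the thread.

```python
def reveal_multiplier(board_cards_seen, deck_size=52):
    """Rough heads-up multiplier: A's 2 reveal choices times the cards B could show."""
    unseen = deck_size - 2 - board_cards_seen
    return 2 * unseen

# Multiplier if the reveal happens at each point in a normal hold 'em hand.
for street, board in [("preflop", 0), ("flop", 3), ("turn", 4)]:
    print(street, reveal_multiplier(board))

# Ratios reported earlier for the simplified game, keyed by number of board cards.
reported = {1: 97.13, 3: 94.00}
for board, ratio in reported.items():
    print(board, ratio, reveal_multiplier(board))
```

The reported ratios sit at or just under this rough bound, which fits the caveat that the earlier rounds of a player's strategy are not multiplied.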
So far these have just been heads-up calculations. For N players, each player's strategy size is getting larger by, very roughly speaking, something like D^(N-1), where D is the number of cards in the deck, so there is an exponential increase with respect to the number of players compared with the normal game. The normal game is exponential in something like B^N where B is each player's number of betting options, while the reveal game is (B^N)*(D^(N-1)).
For 3-handed or more Reveal Hold 'em overtakes PLO in player strategy size, and quite rapidly with respect to the number of players. 6-handed we're talking on the order of 100^5 = 10 billion times larger strategy size than normal Hold 'em. So, yeah, this game is a monster multiway.
To be more accurate, for 6-handed it's something like:
2*50*49*48*47*46 = 508,502,400. That's a five-hundred million times larger player strategy size than in normal hold 'em. For context, PLO has only a 204 times larger player strategy size than in normal hold 'em.
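The multiway figure above follows the same pattern extended to more opponents. A small Python sketch of that arithmetic, assuming a preflop reveal with 50 unseen cards and one shown card per opponent; `multiway_multiplier` is a hypothetical name used only for illustration:

```python
from math import prod

def multiway_multiplier(players, unseen=50):
    """2 own reveal choices times the ordered cards the other players could show."""
    return 2 * prod(unseen - i for i in range(players - 1))

print(multiway_multiplier(2))  # heads-up
print(multiway_multiplier(6))  # 6-handed
print(100 ** 5)                # the rougher 100^5 estimate
```

Heads-up this reduces to the factor of 100 from the preflop worst case, and 6-handed it gives the 508,502,400 figure, compared with 10 billion from the rougher 100^5 estimate.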
I would love to play this game!
REALphysical I think it's more complicated than this. My math might be off here, but hear me out.
You're calculating a multiplier based on how many ways hero can reveal their current exact hole cards x how many ways the villain can reveal. But that's not how CFR works.
Solvers calculate entire ranges vs entire ranges. If revealed on the turn, there are 47 x 46 ways to reveal one of each player's cards, giving us a multiplier of 2162.
Secondly, you forgot about future streets. CFR doesn't calculate turn strats in a vacuum, it has to go through the entire game tree every iteration. Each one of those 2162 combinations creates a new river branch. So we need to multiply by 45, giving us an upper-bound multiplier of 47 x 46 x 45 = 97290. Granted the revealed ranges are much tighter, but this is still far too big to solve.
in other words, I need not worry that my use of "boggle" and "exponential" is an Alzheimer's warning sign?
I like playing poker with AI bots. Nothing feels better than bluffing or winning with a clever bot. Humiliating bots in any possible way is a human's duty !!
I think it's a great idea and would 100% like to play this game
This is incorrect. The problem is you're not differentiating what actually changes between the normal game and the reveal game, and not differentiating between the size of a player's strategy and the size of the game tree that needs to be walked to compute the EV of the two players' strategies.
The size of a player's strategy increases by about the numbers given in my previous posts, specifically see the worst case scenario in a heads-up game with a preflop reveal is 100 times increase for each player's strategy size.
For calculating the EV of each player's strategy, what you refer to as walking the entire game tree, you have to consider every possible situation that can occur. This is exactly every hand combination for each player, every board combination, every betting line combination, and every reveal decision. You don't have to consider e.g. what Player A COULD have done in response to Player B showing different cards, you only consider what he actually does do. Think of it like actually playing poker except instead of randomly playing you play every single situation once and just weight the probability of that situation by multiplying the probabilities in the player strategies that lead to that game state. You're thinking that the algorithm has to consider every combination of each player showing every possible card against each other, but actually the algorithm already pairs every possible 2 cards against every possible 2 cards in the normal game, so the only extra part for the game tree is which card they show; player A and player B don't suddenly change their hole cards because of the card the other player showed. Another way to think about it, as I mentioned in my second post in this thread, is that while showing a card adds 46 sub-games, the size of those sub-games is reduced by 46 because the size of the player's remaining hole card combinations is only 45 instead of 46*45, and to get to a leaf node you have to consider every combination of hole card he can have.
Yet another way to think about it is, the aspect of poker that causes an exponential increase is the depth of the game, adding more betting rounds, whereas the reveal only changes the breadth of the game on a specific round.
Now, it's my understanding of these GTO algorithms that they are in the worst case exponential in the size of the combination of player strategies and game tree (maybe the product of them since it's repeatedly iterating; call it the total-game-tree), but with intelligent adjustments practically they are typically polynomial or even linear. So a sub-exponential increase in the total-game-tree, which we have here for the heads-up game, results in a sub-exponential increase in running time. And this is still relevant to the multi-player game because what people actually do is run solvers on heads-up postflop situations with pre-determined ranges and try to generalize intuitively to multi-way and other ranges.
But you never said what aspect of the game it's exponential with respect to. In the heads-up game there is no aspect of the game (e.g. # of rounds, # of betting options, # cards in deck) you can change that leads to an exponential increase in the player strategy sizes or the game tree that needs to be walked in the reveal game compared to the normal game. In the multiplayer game there is an exponential increase in the reveal game strategy sizes compared to the normal game strategy sizes with respect to the number of players in the game. Note, however, that how people practically use solvers is to run heads-up post-flop scenarios with pre-determined ranges, and then use intuition and heuristics to generalize to multi-way and different ranges. That process will not suffer an exponential increase.
REALphysical, wouldn't it depend if cards are revealed simultaneously or sequentially?
Furthermore, do you think revealing hole cards randomly vs deciding what to reveal would substantially change the complexity?
Revealing the cards sequentially increases the second player's reveal strategy size by 46, and doesn't change the size of his betting strategy compared to the simultaneous reveal since his betting strategy is based on which cards are actually shown and in both cases there's 46 options for his opponent and 2 options for himself (his strategy knows his own cards).
The sequential reveal doesn't change the size of the game-tree that needs to be walked because the game-tree only cares about the actual cards revealed. It will change the algorithm a little in that there's a new probability from Player 2's strategy to multiply to get through the reveal round.
Are you two discussing techniques that will result in you being more likely to go all in on the flop than in a normal 9 handed game when you have 88 and the flop is Ah 8h 2c?
Or that because you would go all in with that flop you will be more likely to go all in with other hands?
Or that when you don't move in you are more likely to call the flop with 87 suited because you can represent the set?
Or that you sometimes don't move in with the set of eights because the reveal of the 8 might just be 87suited?
And of course "sometimes" and "more likely" means a specific percentage as far as GTO is concerned. Do your techniques give us those percentages?