TwoPlusTwo

Quantal Response Equilibrium

Head coach of GTO Wizard • 448 posts • Joined Sep 2018

1yr ago

For the last decade Nash Equilibrium has been the gold standard for GTO poker solutions.

Today, GTO Wizard challenged that paradigm. We've introduced a new type of algorithm called Quantal Response Equilibrium (QRE).

Introducing Quantal Response Equilibr...

The Problem

Traditional solvers optimize strategies for spots they expect to happen, neglecting spots that “shouldn’t happen”. We call these 0% frequency spots “ghostlines”. Once a solver determines that a node/decision is irrelevant, it stops improving that spot, settling on a response just “good enough” to discourage opponents from entering that ghostline.

But real opponents make mistakes. Real players take non-GTO lines all the time. So what do you do in these spots?

The problem is the lack of a defined range. If a player never takes a betting line, their range doesn’t exist—they’re representing nothing. What’s the optimal response against a non-existent range? How do you fight a ghost?
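To see why the range "doesn't exist", here is a minimal Python sketch of the Bayes' rule update needed to build a player's range after an action. The hand classes, prior weights, and frequencies are made up for illustration, and `range_after_action` is a hypothetical helper, not anything from GTO Wizard.

```python
def range_after_action(prior, action_freq):
    """Bayes' rule: weight each hand by how often it takes the action, then normalise."""
    weights = {hand: prior[hand] * action_freq[hand] for hand in prior}
    total = sum(weights.values())
    if total == 0:
        # The line is never taken, so there is nothing to normalise:
        # the range after this action is undefined (a "ghostline").
        return None
    return {hand: w / total for hand, w in weights.items()}


# Hypothetical BB range before the flop action (weights sum to 1).
prior = {"AK": 0.2, "76s": 0.3, "Q9o": 0.5}

# A strategy that never donks with any hand.
never_donk = {"AK": 0.0, "76s": 0.0, "Q9o": 0.0}

# The same spot with tiny, made-up mistake frequencies added.
rare_donk = {"AK": 0.004, "76s": 0.001, "Q9o": 0.0005}

ghost_range = range_after_action(prior, never_donk)
defined_range = range_after_action(prior, rare_donk)
```

With all-zero frequencies there is nothing to normalise, so no range comes back. With even tiny frequencies a well-defined range appears, and a response against it can be computed.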

The Solution

Quantal Response Equilibrium (QRE) introduces a realistic model where players occasionally make mistakes. The probability of a player making a mistake shrinks as the regret of that mistake grows, so big mistakes are less likely than small ones. By adding mistakes, we can model the optimal counter-strategy against them.
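One standard way to formalise this is the logit quantal response from the game theory literature, where each action is chosen with probability proportional to exp(rationality × EV). The sketch below assumes that form. The post doesn't say which formulation or rationality parameter GTO Wizard uses, and the EVs are invented for illustration.

```python
import math


def logit_response(evs, rationality=1.0):
    """Return action probabilities proportional to exp(rationality * EV)."""
    best = max(evs.values())
    # Subtracting the best EV keeps the exponentials numerically stable
    # and makes each exponent equal to minus the regret of that action.
    weights = {a: math.exp(rationality * (ev - best)) for a, ev in evs.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


# Hypothetical EVs (in big blinds) for one hand facing a bet.
evs = {"call": 1.0, "raise": 0.8, "fold": 0.0}

noisy = logit_response(evs, rationality=5.0)
near_rational = logit_response(evs, rationality=50.0)
```

The small mistake (raising, 0.2bb of regret) gets far more weight than the big one (folding, 1bb of regret), and raising the rationality parameter pushes the strategy towards a pure best response.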

It should be noted that these mistakes are so infrequent that they have a negligible impact on the exploitability of the strategy. (In fact, the new QRE algo is 25% more accurate on early streets.) But by modelling them we solve the problem of ghostlines and get optimal responses against mistakes.

To put it simply, QRE outperforms Nash against opponents who make mistakes.

Strategy Comparison

BTN vs. BB single-raised pot, 35bb deep .... Flop comes AK6, which is fantastic for the preflop aggressor. The BB should check range here, but not everyone understands action flow. Instead, the BB leads out with a pot-sized bet. How would you respond on the BTN?

Nash Equilibrium: Apparently, we should respond by mixing folds with everything, folding 2nd pair sometimes and calling 8-high air sometimes. This is obviously just bad. The donk node was abandoned early in the solving process, thus the response facing a donk has not converged.

QRE: Provides a logical, clear, and converged solution. Call with hands that have strong outs against the top of BB’s range (e.g., King-x, 6-x, gutshots), fold your air, and leverage position effectively by using small raises.

Try It Out

From now on, all custom solutions solved with GTO Wizard AI will be solved using QRE. Pre-solved solutions continue to use traditional NE.

Custom solving requires an elite subscription. However, everyone can test QRE for free by solving this flop: Q♠T♠7♥.

3 likes

16 April 2025 at 05:48 PM

10 Replies

Head coach of GTO Wizard • 448 posts • Joined Sep 2018

1yr ago

Hey guys, I don't usually post GTO Wizard content here but I thought this was a really nice advancement to the frontiers of poker theory. I feel QRE will become the new SoTA.

If you want to learn more about QRE as it applies to poker I'd recommend reading our blog post, "Introducing Quantal Response Equilibrium: The Next Evolution of GTO":

https://blog.gtowizard.com/introducing-quantal-response-equilibrium-the-next-evolution-of-gto

Or check out the wiki page for more details about how QRE works under the hood.

stranger • 13 posts • Joined Nov 2024

1yr ago

How does it change the strategy in lines played like 0.3% of the time?

Pooh-Bah • 1497 posts • Joined Mar 2019

1yr ago

Oh bloody hell.. can people stop inventing more theory pleeeassseee??

(no carry on like, great stuff)

Head coach of GTO Wizard • 448 posts • Joined Sep 2018

1yr ago

by babolino

> "How does it change the strategy in lines played like 0.3% of the time?"

See for yourself.

Here's a spot where BB is donking 0.2% of the time:

Here's how each algorithm responds to the donk:

https://app.gtowizard.com/solutions?soltab=strategy&solution_type=gwiz&custree_id=c9e6ccb5-f939-4b40-8e44-1ab7461fc340&gmfs_solution_tab=gwiz_sols&gametype=Cash6m500zGeneral&depth=100&gmfft_sort_key=0&gmfft_sort_order=desc&gmff_depth=100&gmff_type=general&gmff_rake=NL500&gmff_favorite=false&cussol_id=623f7d0d-d23a-4935-8f94-027e27233479&history_spot=7&preflop_actions=F-F-F-R2.5-F-C&board=QsTs7h&flop_actions=R1.8

https://app.gtowizard.com/solutions?soltab=strategy&solution_type=custom&custree_id=5ed51427-a63b-4c86-bf8b-71540ef23890&gmfs_solution_tab=ai_sols&gametype=Cash6m500zGeneral&depth=100&gmfft_sort_key=0&gmfft_sort_order=desc&gmff_depth=100&gmff_type=general&gmff_rake=NL500&board=QsTs7h&cussol_id=f61b0c95-a485-4634-8985-973d5c3c3c7d&flop_actions=R1.8&history_spot=1

[This flop is free to solve]

QRE is much more converged in these spots.

Pooh-Bah • 80 posts • Joined Mar 2014

1yr ago

Will multi-street solves with max exploits be implemented in the near future? Recently there was a video showing the solver punishing a range bettor who doesn't defend enough vs. a x/r by x/r-ing everything. But from what I understand, the solver does this assuming villain will play "correctly" on the turn/river and respond "correctly" to any imbalances in the x/r, unless we node lock it not to. As mentioned in the video, a true max exploit is not gonna x/r everything (why would you x/r a set if villain's overfolding?).

Carpal \'Tunnel • 831 posts • Joined Sep 2016

1yr ago

Tipton wrote about this in one of his books, although he didn’t call it that. As far as I remember, there are several ways to respond to these non-existent lines, as long as we don’t open ourselves up to being exploited through them.

Head coach of GTO Wizard • 448 posts • Joined Sep 2018

1yr ago

I mean maybe, but it doesn't really have anything to do with Quantal Response Equilibrium. QRE isn't trying to model some donkey strat, it's trying to achieve an extremely accurate strategy on par with Nash, with just enough perturbation baked in to handle ghostlines.

But ok, to answer your question: the hard part is modelling real people's strategies. "Opponent profiling", as we call it, is in its infancy. You have to know how they play every hand in every spot. Typically if you nodelock an entire tree all the way to the river for one player, it leads to fishy, brittle exploits, like extreme deviations that are very face up. Not saying it's a bad way to study, just pointing out my observations from experience.

Yeah there's a class of game theory called "trembling hand equilibria" that deals with this. Sam Ganzfried (a proper game theorist), actually posted about it here a few year.... Hell, you commented on that thread lol.

The difference is that QRE makes "rational" mistakes that scale with the EV loss:

Trembling Hand Equilibrium: Random mistakes.
Quantal Response Equilibrium: Rational mistakes (the greater the EV loss, the less likely the mistake).

Technically trembling hand is considered a refinement of Nash, while QRE is considered a completely separate type of equilibrium in game theory.
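As a rough illustration of the two mistake models, here is a small Python sketch. It simplifies both ideas: the "tremble" spreads a fixed epsilon evenly over all non-best actions (the formal definition of trembling hand perfection involves limits of perturbed games), and the quantal response assumes the standard logit form. The EVs, `epsilon`, and `rationality` values are hypothetical.

```python
import math


def trembling_response(evs, epsilon=0.03):
    """Best response, except a total probability epsilon is split evenly over the other actions."""
    best = max(evs, key=evs.get)
    n_others = len(evs) - 1
    return {a: (1.0 - epsilon) if a == best else epsilon / n_others for a in evs}


def quantal_response(evs, rationality=5.0):
    """Logit response: probability proportional to exp(rationality * EV)."""
    top = max(evs.values())
    weights = {a: math.exp(rationality * (ev - top)) for a, ev in evs.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


# Hypothetical EVs: raising is a small mistake, folding is a big one.
evs = {"call": 1.0, "raise": 0.8, "fold": 0.0}

tremble = trembling_response(evs)
quantal = quantal_response(evs)
```

The tremble makes the small and the big mistake equally likely, while the quantal response makes the costly fold much rarer than the cheap raise.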

2 likes

Head coach of GTO Wizard • 448 posts • Joined Sep 2018

1yr ago

Just dropped a video about QRE!

https://www.youtube.com/watch?v=lsN92LHT...

1 like

enthusiast • 124 posts • Joined Sep 2023

1yr ago

Does QRE change how you should play the game in a meaningful way, or is it just a trick to get lines which occur in practise but not in theory to converge in a simulation? Like if the Nash simulation had infinite time it would just converge to the QRE response anyway. Am I missing some deeper idea that should actually affect your in-game decisions?

Head coach of GTO Wizard • 448 posts • Joined Sep 2018

1yr ago

by SchrodingersBluff

QRE and NE tend to produce very similar strategies in common nodes. Where it differs are the uncommon nodes.

This is a good thing btw. The goal is to produce highly unexploitable strategies with *just enough* perturbation thrown in to handle ghostlines and make it robust. We're not aiming to overthrow Nash strategies, we're aiming to improve the practical weaknesses and blindspots of traditional GTO solutions.

> "Like if the Nash simulation had infinite time it would just converge to the QRE response anyway"

No it would not. Traditional simulations stop updating a node once it goes to 0%. If it doesn't get visited, it doesn't improve.
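A rough illustration of why an unvisited node stops improving, assuming a vanilla CFR-style regret update (an assumption; the thread doesn't say which algorithm the traditional solves use). In CFR, each regret update is weighted by the opponent's probability of reaching the node, so a reach of zero means no update. All names and numbers below are hypothetical.

```python
def cfr_regret_update(regrets, action_values, strategy, opponent_reach):
    """One CFR-style regret update at a single decision node."""
    # Value of the node under the current mixed strategy.
    node_value = sum(strategy[a] * action_values[a] for a in regrets)
    # Each action's regret grows by its advantage over the node value,
    # weighted by how often the opponent actually gets us to this node.
    return {
        a: regrets[a] + opponent_reach * (action_values[a] - node_value)
        for a in regrets
    }


# BTN's response to a donk bet, starting from zero regret and a uniform strategy.
regrets = {"fold": 0.0, "call": 0.0, "raise": 0.0}
strategy = {"fold": 1 / 3, "call": 1 / 3, "raise": 1 / 3}
action_values = {"fold": 0.0, "call": 2.0, "raise": 1.0}  # made-up EVs

# BB never donks: the node is a ghostline and the regrets do not move.
ghost = cfr_regret_update(regrets, action_values, strategy, opponent_reach=0.0)

# BB donks 0.2% of the time: the regrets start pointing towards calling.
visited = cfr_regret_update(regrets, action_values, strategy, opponent_reach=0.002)
```

In this sketch, the response to the donk only starts learning once the donk has a nonzero frequency, which is what adding small mistakes guarantees.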
