CFR // Strategy Explorer
Deep CFR · HU NLHE · 200BB · 75% pot · Neural inference
About this project
Learning by regretting
The algorithm plays against itself millions of times and keeps track of one thing: regret. After every hand it asks, "what would have happened if I had bet instead of checked?" Over time it shifts towards the actions it regrets skipping the most, playing each one in proportion to its accumulated regret. The rules of poker are built in, but no strategy was hard-coded; bluffing and value-betting emerge entirely on their own.
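The regret bookkeeping described above is usually implemented with a rule called regret matching. Here is a minimal sketch in Python; the function name and the example regret values are illustrative assumptions, not taken from this project's code.

```python
def regret_matching(cumulative_regrets):
    """Turn accumulated regrets into a probability distribution over actions."""
    # Only positive regret counts: an action we are glad we skipped gets weight 0.
    positives = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positives)
    if total > 0:
        return [p / total for p in positives]
    # No action has positive regret yet: fall back to playing uniformly.
    n = len(cumulative_regrets)
    return [1.0 / n] * n


# Hypothetical accumulated regrets for the actions [check, bet].
mixed = regret_matching([1.0, 3.0])      # bet is regretted three times as much
no_signal = regret_matching([-1.0, -2.0])  # nothing regretted: uniform play
print(mixed, no_signal)
```

Each action is played in proportion to how much the algorithm wishes it had played it in the past, and actions with negative regret are dropped entirely.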
Why a neural network?
A simple lookup table would need a separate entry for every possible game situation. In real poker that's more entries than atoms in the observable universe. A neural network sidesteps this by learning patterns — it generalises from situations it has seen to ones it hasn't, the same way a human learns to play new hands from experience.
What is a Nash equilibrium?
A pair of strategies where neither player can do better by changing their own approach, even if they know exactly what the opponent is doing. Think of it as a perfectly balanced strategy: unpredictable enough that no one can exploit it, yet rational enough that it doesn't throw away value. In a two-player zero-sum game such as heads-up poker, this is what the algorithm's average strategy converges towards.
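To make "cannot be exploited" concrete, the sketch below uses rock-paper-scissors instead of poker, since it fits in a few lines. It computes the best payoff an opponent can achieve against a fixed strategy; at equilibrium that payoff equals the value of the game (0 here). The payoff table and function name are illustrative assumptions.

```python
# Payoff to the responding player: rows are their action, columns are ours.
# Order of actions: rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],    # rock     vs rock, paper, scissors
    [1, 0, -1],    # paper
    [-1, 1, 0],    # scissors
]


def best_response_value(strategy):
    """Best expected payoff an opponent can get against a fixed mixed strategy."""
    return max(
        sum(PAYOFF[a][b] * strategy[b] for b in range(3))
        for a in range(3)
    )


uniform = [1 / 3, 1 / 3, 1 / 3]   # the equilibrium strategy
rock_heavy = [0.5, 0.25, 0.25]    # an unbalanced strategy

print(best_response_value(uniform))     # nothing to gain against uniform play
print(best_response_value(rock_heavy))  # always playing paper wins on average
```

Against the uniform strategy no response earns more than 0, while the rock-heavy strategy gives up 0.25 per round to an opponent who always plays paper.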
From toy games to real poker
The project starts with Kuhn Poker — a 3-card game solvable in milliseconds — and scales up to full No-Limit Hold'em. Each step adds cards, betting rounds, or players, revealing exactly what makes the problem harder. The same algorithm runs on all of them; only the computational cost changes.
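For a sense of scale, tabular CFR for Kuhn Poker fits in well under a hundred lines. The sketch below is not this project's engine: it is a plain Python illustration that cycles through all six deals each iteration and updates regrets after every deal, with all names chosen for this example. Cards are 1, 2, 3; `p` means check or fold and `b` means bet or call.

```python
from itertools import permutations

ACTIONS = "pb"      # p = pass (check/fold), b = bet (bet/call)
regret_sum = {}     # info set -> cumulative regret per action
strategy_sum = {}   # info set -> reach-weighted sum of strategies


def current_strategy(info_set):
    """Regret matching over the two actions at this information set."""
    regrets = regret_sum.setdefault(info_set, [0.0, 0.0])
    positives = [max(r, 0.0) for r in regrets]
    total = sum(positives)
    return [p / total for p in positives] if total > 0 else [0.5, 0.5]


def terminal_utility(cards, history):
    """Payoff for the player who would act next, or None if play continues."""
    if len(history) < 2:
        return None
    player = len(history) % 2
    higher = cards[player] > cards[1 - player]
    if history[-2:] == "pp":      # check-check: showdown for the 1-chip ante
        return 1.0 if higher else -1.0
    if history[-2:] == "bb":      # bet-call: showdown for 2 chips
        return 2.0 if higher else -2.0
    if history[-2:] == "bp":      # bet-fold: the bettor wins the ante
        return 1.0
    return None                   # "pb": the first player still has to respond


def cfr(cards, history, reach0, reach1):
    """Return the expected utility for the player to act at this history."""
    utility = terminal_utility(cards, history)
    if utility is not None:
        return utility
    player = len(history) % 2
    info_set = str(cards[player]) + history
    strategy = current_strategy(info_set)
    my_reach, opp_reach = (reach0, reach1) if player == 0 else (reach1, reach0)
    totals = strategy_sum.setdefault(info_set, [0.0, 0.0])
    action_utils = [0.0, 0.0]
    for i, action in enumerate(ACTIONS):
        totals[i] += my_reach * strategy[i]
        if player == 0:
            child = cfr(cards, history + action, reach0 * strategy[i], reach1)
        else:
            child = cfr(cards, history + action, reach0, reach1 * strategy[i])
        action_utils[i] = -child  # the child value is from the opponent's view
    node_util = sum(s * u for s, u in zip(strategy, action_utils))
    for i in range(2):
        # Counterfactual regret: weighted by the opponent's reach probability.
        regret_sum[info_set][i] += opp_reach * (action_utils[i] - node_util)
    return node_util


def train(iterations):
    """Run CFR over every deal; return the first player's average payoff."""
    deals = list(permutations([1, 2, 3], 2))
    total = 0.0
    for _ in range(iterations):
        for cards in deals:
            total += cfr(cards, "", 1.0, 1.0)
    return total / (iterations * len(deals))


def average_strategy(info_set):
    """The average strategy is the one that approaches equilibrium."""
    totals = strategy_sum[info_set]
    norm = sum(totals)
    return [t / norm for t in totals]


value = train(10_000)
print(f"average payoff for the first player: {value:.4f}")
print("holding 3, facing a bet [fold, call]:", average_strategy("3b"))
```

The first player's average payoff should approach the known value of Kuhn Poker, -1/18, and the average strategy learns to always call a bet with the highest card and always fold the lowest. Scaling to Hold'em keeps this loop but replaces the dictionaries with a neural network, because the information sets no longer fit in a table.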
Making it fast
Training requires hundreds of millions of simulated hands. The core engine is written in C++ and runs on the GPU, cutting each training round from hours to minutes on a standard laptop. The trained strategy is then exported to an ONNX model that runs inference directly in your browser, with no server involved.
How to use this tool
Select your two hole cards, set the street and pot size, then hit Analyse. The network returns the GTO (game-theory optimal) probability for each action: how often a theoretically optimal player would fold, call, raise, or go all-in in that exact situation. More training iterations produce sharper, more reliable recommendations.
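One common way to turn a network's raw output scores into action probabilities like those shown above is a softmax restricted to the legal actions. The sketch below assumes the model emits one score per action in the order fold, call, raise, all-in; that layout, the function name, and the example scores are assumptions for illustration, not a description of this tool's actual model.

```python
import math

ACTIONS = ["fold", "call", "raise", "all-in"]


def action_probabilities(scores, legal):
    """Softmax over legal actions only; illegal actions get probability 0."""
    # Subtract the largest legal score before exponentiating, for stability.
    peak = max(s for s, ok in zip(scores, legal) if ok)
    exps = [math.exp(s - peak) if ok else 0.0 for s, ok in zip(scores, legal)]
    total = sum(exps)
    return {action: e / total for action, e in zip(ACTIONS, exps)}


# Hypothetical scores for a spot where going all-in is not available.
probs = action_probabilities([0.0, 0.0, 0.0, 0.0], [True, True, True, False])
print(probs)
```

With equal scores and three legal actions, each legal action receives one third of the probability and the illegal one receives none.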