Quantcast
Channel: Hacker News 50
Viewing all articles
Browse latest Browse all 9433

adamsmith/game-theory-poker · GitHub

$
0
0

Comments:"adamsmith/game-theory-poker · GitHub"

URL:https://github.com/adamsmith/game-theory-poker


game-theory-poker

This is a game-theoretic, heads-up limit Texas Hold'em poker player I wrote in 2005, based on the paper Approximating Game-Theoretic Optimal Strategies for Full-scale Poker. It's about 6,000 lines of Java.

Theory of Operation

Playing the optimal strategy for any game guarantees that you'll achieve at least the expected value of that game. In poker, assuming no rake, the expected value is zero, so if you can compute the optimal strategy then you're guaranteed not to lose on average.

Consider rock-paper-scissors. The optimal strategy for that game is to throw rock 1/3'rd of the time, paper 1/3'rd, and scissors 1/3'rd. If you play by this strategy you are guaranteed not to lose on average, regardless of what strategy your opponent plays by. (Note that even if your opponent throws rock every time, you'll still just tie (not lose) on average. But there are other games where the optimal strategy can dominate some opponents' strategies, and poker is one of them.)

It's just too hard to compute the optimal strategy for poker, even two player limit poker. But we can compute the optimal strategy for an abstract version of poker that, for example, during pre-flop betting treats pairs of aces the same way as pairs of kings.

This code computes the optimal strategy for an abstraction of poker that it builds based on the strength of different hands in different situations.

A more detailed description of this idea is given in the paper Approximating Game-Theoretic Optimal Strategies for Full-scale Poker (Billings, 2003).

How Good Is It?

Okay, so how good is the resulting player? Well, you can tweak how abstract the game should be at each stage. The less abstract the game, the better the player will be, but you'll hit computation and memory constraints.

The furthest I got was computing five clusters worth of representation for each turn. This took about a month to compute using three machines in my apartment living room. As a median-skilled poker player I could beat the resulting player, but not that easily.

Opportunities For Improvement

Our abstraction of the game is fairly coarse. We can build a less abstract version of the game by leveraging the dramatic reduction in game size post-flop. The whole game tree has 10^18 nodes, but after you have your two hole cards and after the flop, there are just three rounds of betting and (47 choose 2) more cards that are coming. It's possible to compute the exact optimal strategy for such a narrowed-down game.

So what we should do is: for every (52 choose 2) * (50 choose 3) combination of post-flop states, compute the exact optimal strategy. Each result will also tell you the expected value of that sub-game. Then you compute the starting strategy for each (52 choose 2) initial states substituting the expected value from the sub-games as terminal nodes in the game tree.

This isn't an exact version of the game, but it's far better.

The problem is it'd take about 400,000 CPU-hours to compute.

So I started working on this during MIT's Independent Activities Period in January 2006, but I never finished the code. I stopped because classes started, but also because I didn't have access to 400k cpu-hours. (Coincidentally, I now have access to that kind of compute through Gridspot, but alas working on this is no longer feasible given my opportunity costs.)

Notes On Using The Code

  • All calculations are done up-front. The resulting multi-gigabyte files fully specify the game strategies, so the actual player just has to do a few disk reads to know what to do at each turn.
  • Each step creates new data files that are used in later steps. There are 16 types of files all-in-all.
  • All of this code is written to process data sequentially, so that you can compute a multi-gigabyte strategy with comparatively little memory.
  • If you want to play around with this code you can compute a strategy for reduced version of poker. For example, the code defaults to computing the strategy for 14-card poker (2 suits, 7 card ranks). Use _game.Cards to control the parameters of the poker game you're computing for (_eg_ 14 versus 52 card poker) and how many clusters you want at each turn.

Usage Instructions

Compute showdown rankings You should configure Constants.DATA_FILE_REPOSITORY first and create the subdirectory {{ GAME_NAME }}/5, before running. Run stage1.DoShowdowns Create files sorted on hole cards, and counts of winCounts (+ tiecount / 2) for hole cards during pre-flop, post-flop, post-turn, plus a new data format Run stage1.DoBacktracking Calculate clustering into abstract game Create stage2 directory in {{ GAME_NAME }} subdir Run stage2.DoClusteringStep1 with command line argument 0 Run stage2.DoClusteringStep2 with command line arguments 0 5 Calculate transition probabilities for abstract game Run stage2.DoStartClusterPDT Run stage2.DoTransitionPDTStep1 0 5 Run stage2.DoTransitionPDTStep2 5 Run stage2.DoTerminalClusterValuesStep1 Run stage2.DoTerminalClusterValuesStep2 Construct game tree, convert game tree to sequence form (single pass), solve sequence form LP problem Run stage3.DoGT root Run stage3.DoPreprocessRm mem-heavy root Run stage3.DoWriteLP root from {{ GAME_NAME }}\stage3\root, run bpmpd game.p1.mps and bpmpd game.p2.mps the last few steps overlap with the first few steps in the following script, which executes everything else needed to get to the MeerkatPlayer. (Meerkat is a poker player API.) In the steps below, /net1/poker/poker_data is the Constants.DATA_FILE_REPOSITORY, and 52-5-1.1 is the {{ GAME_NAME }}. bpmpd resides in /net1. note that there are other LP-solvers available. I just happened to use bpmpd. The MPS file format is supported by most. java -Xms400M -Xmx1700M -Xincgc -classpath bin stage3.DoPreprocessRm mem-heavy root java -Xms400M -Xmx1700M -Xincgc -classpath bin stage3.DoWriteLP root cd /net1/bpmpd ln -s /net1/poker/poker_data/52-5-1.1/stage3/root/game.p1.mps rootp1.mps ln -s /net1/poker/poker_data/52-5-1.1/stage3/root/game.p2.mps rootp2.mps rm -f *.out ./bpmpd rootp1 cp rootp1.out /net1/poker/poker_data/52-5-1.1/stage3/root/game.p1.sol ./bpmpd rootp2 cp rootp2.out /net1/poker/poker_data/52-5-1.1/stage3/root/game.p2.sol cd /net1/poker java -Xms400M -Xmx1700M -Xincgc -classpath bin stage3.DoParseClpOutput root java -Xms400M -Xmx1700M -Xincgc -classpath bin stage3.DoSubtreeGames java -Xms400M -Xmx1700M -Xincgc -classpath bin stage3.DoGT all-subtrees java -Xms400M -Xmx1700M -Xincgc -classpath bin stage3.DoPreprocessRm mem-heavy all-subtrees java -Xms400M -Xmx1700M -Xincgc -classpath bin stage3.DoWriteLP all-subtrees cd /net1/bpmpd rm -f ???.mps ln -s /net1/poker/poker_data/52-5-1.1/stage3/a/game.p1.mps ap1.mps ln -s /net1/poker/poker_data/52-5-1.1/stage3/a/game.p2.mps ap2.mps ln -s /net1/poker/poker_data/52-5-1.1/stage3/b/game.p1.mps bp1.mps ln -s /net1/poker/poker_data/52-5-1.1/stage3/b/game.p2.mps bp2.mps ln -s /net1/poker/poker_data/52-5-1.1/stage3/c/game.p1.mps cp1.mps ln -s /net1/poker/poker_data/52-5-1.1/stage3/c/game.p2.mps cp2.mps ln -s /net1/poker/poker_data/52-5-1.1/stage3/d/game.p1.mps dp1.mps ln -s /net1/poker/poker_data/52-5-1.1/stage3/d/game.p2.mps dp2.mps ln -s /net1/poker/poker_data/52-5-1.1/stage3/e/game.p1.mps ep1.mps ln -s /net1/poker/poker_data/52-5-1.1/stage3/e/game.p2.mps ep2.mps ln -s /net1/poker/poker_data/52-5-1.1/stage3/f/game.p1.mps fp1.mps ln -s /net1/poker/poker_data/52-5-1.1/stage3/f/game.p2.mps fp2.mps ln -s /net1/poker/poker_data/52-5-1.1/stage3/g/game.p1.mps gp1.mps ln -s /net1/poker/poker_data/52-5-1.1/stage3/g/game.p2.mps gp2.mps rm -f *.out ./bpmpd ap1 ./bpmpd ap2 ./bpmpd bp1 ./bpmpd bp2 ./bpmpd cp1 ./bpmpd cp2 ./bpmpd dp1 ./bpmpd dp2 ./bpmpd ep1 ./bpmpd ep2 ./bpmpd fp1 ./bpmpd fp2 ./bpmpd gp1 ./bpmpd gp2 cp ap1.out /net1/poker/poker_data/52-5-1.1/stage3/a/game.p1.sol cp ap2.out /net1/poker/poker_data/52-5-1.1/stage3/a/game.p2.sol cp bp1.out /net1/poker/poker_data/52-5-1.1/stage3/b/game.p1.sol cp bp2.out /net1/poker/poker_data/52-5-1.1/stage3/b/game.p2.sol cp cp1.out /net1/poker/poker_data/52-5-1.1/stage3/c/game.p1.sol cp cp2.out /net1/poker/poker_data/52-5-1.1/stage3/c/game.p2.sol cp dp1.out /net1/poker/poker_data/52-5-1.1/stage3/d/game.p1.sol cp dp2.out /net1/poker/poker_data/52-5-1.1/stage3/d/game.p2.sol cp ep1.out /net1/poker/poker_data/52-5-1.1/stage3/e/game.p1.sol cp ep2.out /net1/poker/poker_data/52-5-1.1/stage3/e/game.p2.sol cp fp1.out /net1/poker/poker_data/52-5-1.1/stage3/f/game.p1.sol cp fp2.out /net1/poker/poker_data/52-5-1.1/stage3/f/game.p2.sol cp gp1.out /net1/poker/poker_data/52-5-1.1/stage3/g/game.p1.sol cp gp2.out /net1/poker/poker_data/52-5-1.1/stage3/g/game.p2.sol cd /net1/poker java -Xms400M -Xmx1700M -Xincgc -classpath bin stage3.DoParseClpOutput all-subtrees Build your player data files around the resulting mixed strategy Run stage4.DoConvertSolsForRndAccess root Run stage4.DoConvertSolsForRndAccess all-subtrees Use your player Use the stage4.MeerkatPlayer class I originally used the player in PokerAcademy but it now seems that there are open source platforms that can use Meerkat players.

License

This code is released under the public domain / no license.


Viewing all articles
Browse latest Browse all 9433

Trending Articles