
MilkTeaNoGo: Fall 2022 CMPUT 455 Student Tournament Champion

January 9, 2023

My bot, MilkTeaNoGo, placed first in the Fall 2022 CMPUT 455 Student Tournament.

This was a tournament where students were tasked with creating a bot to play NoGo, a game similar to Go except that any capturing move is illegal, which makes the strategies quite different. The tournament was a round robin where each bot played against every other bot; the top 16 bots then played in a single-elimination bracket. More information about the assignment is available here.

MilkTeaNoGo won 39 out of 40 games, including all 6 of its matches against other student teams. It also won every game against the two test opponents created by the marking team: one an optimized MCTS-based player, the other a mystery opponent. Ranks were seeded based on performance against these test opponents, and MilkTeaNoGo was seeded rank 1 out of 81.

In this blog post, I will briefly outline the techniques behind the success of MilkTeaNoGo. Hopefully, this will inspire the development of stronger and more interesting bots in future tournaments and provide insight into an efficient NoGo board implementation.

CMPUT 455 Student Tournament Bracket

Techniques

Monte Carlo Tree Search (MCTS)

In the context of MCTS, two factors affect playing strength: bias and variance. To reduce bias, we need more accurate simulations; I used the Nakade heuristic for this. To reduce variance, we need more simulations, which is why an optimized NoGo board implementation is important.
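As background, here is a minimal sketch of the UCB1 selection rule commonly used in the MCTS tree policy; the names and structure are illustrative rather than taken from my actual code. It balances exploiting children with high win rates against exploring under-visited ones.

```python
import math

def ucb1(wins, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score of a child node: average win rate (exploitation)
    plus a bonus that shrinks as the child gets visited (exploration)."""
    if visits == 0:
        return float("inf")  # always try unvisited moves first
    return wins / visits + c * math.sqrt(math.log(parent_visits) / visits)

def select_child(children, parent_visits):
    """Pick the (move, wins, visits) tuple with the highest UCB1 score."""
    return max(children, key=lambda ch: ucb1(ch[1], ch[2], parent_visits))

# Example: three candidate moves after 30 parent visits.
children = [("a1", 7, 10), ("b2", 9, 15), ("c3", 2, 5)]
print(select_child(children, parent_visits=30)[0])
```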

Optimized NoGo Board Implementation

My optimized board implementation allowed MilkTeaNoGo to perform around 25,000 simulations on an empty board within the allotted time limit of 30 seconds. In contrast, the second-place team could perform about 7,000 simulations.

Profiling revealed that the two most computationally expensive board operations were the legality check and copying the board.
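A profiling setup along these lines, with a stand-in workload in place of my real rollout loop, is what surfaced those two hotspots:

```python
import cProfile
import pstats

def run_simulations(n):
    # Stand-in workload; in the real bot this ran n MCTS rollouts.
    return sum(sum(range(100)) for _ in range(n))

profiler = cProfile.Profile()
profiler.enable()
run_simulations(1000)
profiler.disable()

# Print the ten functions with the largest cumulative time.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)
```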

The legality check involves checking whether the neighbors of a point have liberties. Instead of running a flood fill (depth-first search) on every move, it was significantly more efficient to use a disjoint-set data structure that keeps track of the liberties of each group.
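Here is a rough sketch of the idea, assuming a `neighbors` function and flat point indexing from the surrounding board code (the names are illustrative): each disjoint-set root stores its group's liberty set, so NoGo legality reduces to a few set lookups instead of a flood fill.

```python
EMPTY = 0  # illustrative color constants
BLACK, WHITE = 1, 2

class LibertyBoard:
    """Disjoint-set over board points; each root stores its group's liberties."""
    def __init__(self, num_points):
        self.parent = list(range(num_points))
        self.liberties = [set() for _ in range(num_points)]  # valid at roots

    def find(self, p):
        # Plain traversal, no path compression (see the undo discussion below).
        while self.parent[p] != p:
            p = self.parent[p]
        return p

    def group_liberties(self, p):
        return self.liberties[self.find(p)]

def is_legal(point, color, board, uf, neighbors):
    """A NoGo move must neither capture nor be suicide."""
    merged_libs = set()
    for n in neighbors(point):
        if board[n] == EMPTY:
            merged_libs.add(n)                    # new liberty for our merged group
        elif board[n] == color:
            merged_libs |= uf.group_liberties(n)  # friendly groups merge with us
        elif uf.group_liberties(n) == {point}:
            return False                          # would capture an opponent group
    merged_libs.discard(point)                    # the move itself fills this liberty
    return bool(merged_libs)                      # empty set means suicide
```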

In the simulation step, the provided code reset the board to its original state by copying it, which was inefficient. I addressed this by implementing an undo operation backed by a stack. This made the disjoint-set data structure slightly less efficient, since path compression had to be dropped (it is expensive to undo). However, overall this combination was orders of magnitude faster than the original approach.
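A sketch of how the undo stack can work with the disjoint set, under the same illustrative naming: without path compression, each union touches exactly one parent pointer and one liberty set, so it can be reversed cheaply.

```python
class UndoableUnionFind:
    def __init__(self, num_points):
        self.parent = list(range(num_points))
        self.liberties = [set() for _ in range(num_points)]
        self.history = []  # one record per union, popped by undo()

    def find(self, p):
        while self.parent[p] != p:  # no path compression: keeps undo exact
            p = self.parent[p]
        return p

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            self.history.append(None)  # nothing changed
            return
        added = self.liberties[rb] - self.liberties[ra]
        self.liberties[ra] |= added
        self.parent[rb] = ra
        self.history.append((ra, rb, added))

    def undo(self):
        record = self.history.pop()
        if record is not None:
            ra, rb, added = record
            self.parent[rb] = rb          # detach the merged subtree
            self.liberties[ra] -= added   # restore the old liberty set
```

In the full board, stone placements and liberty removals would be recorded on the same stack so that a whole move can be unwound in one call.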

In addition, if the opponent plays a move that is in our search tree, we reuse the simulation data between turns, allowing us to take advantage of the computation invested in that subtree on the previous turn.
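In sketch form (with a hypothetical `Node` class), tree reuse just promotes the matching child to be the new root:

```python
class Node:
    """Hypothetical MCTS node: per-move children plus win/visit statistics."""
    def __init__(self):
        self.children = {}  # move -> Node
        self.wins = 0
        self.visits = 0

def advance_root(root, move_played):
    """Return the subtree for move_played, keeping its statistics,
    or a fresh node if that move was never expanded."""
    return root.children.get(move_played) or Node()
```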

Heuristics

At first, I tried implementing the RAVE heuristic; however, I found that it did not improve performance.

Inspired by BobNoGo, I implemented the Nakade rollout heuristic and used it for move ordering: moves that my bot could play but the opponent could not were tried only after all other moves were exhausted in simulation. The heuristic does slow down the simulation rate (down from about 41,000 simulations without it). However, the improved accuracy of the simulations makes up for it: with this heuristic, win rates increased by about 10% against my faster Upper Confidence Bound bot.
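A sketch of this move ordering, assuming a simplified `is_legal(move, color)` predicate like the one above (names illustrative): points where only we can play can never be taken away, so the rollout defers them.

```python
import random

def ordered_rollout_moves(candidate_moves, to_play, opponent, is_legal, rng=random):
    """Nakade-style ordering: shuffle, then push 'safe' moves (legal for
    us, illegal for the opponent) to the back so they are tried last."""
    moves = [m for m in candidate_moves if is_legal(m, to_play)]
    rng.shuffle(moves)
    normal = [m for m in moves if is_legal(m, opponent)]
    safe = [m for m in moves if not is_legal(m, opponent)]
    return normal + safe  # safe points can never be taken away from us
```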

Unit Tests

Unit tests were extremely useful in ensuring that the optimized board was correct. For example, I found a bug where a player was able to place a stone on a point it already occupied. I added this case to the unit tests, and it later helped me catch a regression when I accidentally reintroduced the bug.
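The regression test looked roughly like this, with a tiny stand-in board so the example is self-contained (my real test targeted the optimized board class):

```python
import unittest

BLACK, WHITE = "b", "w"

class Board:
    """Tiny stand-in board, just enough to express the regression test."""
    def __init__(self, size):
        self.size = size
        self.points = {}

    def is_legal(self, point, color):
        return point not in self.points  # occupied points are never legal

    def play(self, point, color):
        assert self.is_legal(point, color)
        self.points[point] = color

class OccupiedPointTest(unittest.TestCase):
    def test_cannot_play_on_occupied_point(self):
        board = Board(size=7)
        board.play((3, 3), BLACK)
        self.assertFalse(board.is_legal((3, 3), WHITE))
        self.assertFalse(board.is_legal((3, 3), BLACK))

if __name__ == "__main__":
    unittest.main()
```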

Performance Tests

Due to stochasticity, it is difficult to know whether one bot is better than another. To validate whether changes actually improved bot performance, I modified the provided play.py script to run games in parallel processes with the multiprocessing package, making use of all my CPU cores.
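The parallel harness was in this spirit; `play_one_game` here is a coin-flip placeholder for the real game loop driven by play.py:

```python
import multiprocessing as mp
import random

def play_one_game(game_index):
    # Placeholder: the real version launched the two bots via the
    # course's play.py machinery and returned 1 if the candidate won.
    return int(random.Random(game_index).random() < 0.5)

if __name__ == "__main__":
    num_games = 200
    with mp.Pool() as pool:  # one worker per CPU core by default
        results = pool.map(play_one_game, range(num_games))
    print(f"win rate: {sum(results) / num_games:.3f}")
```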

I found that my bot's strength improves with more thinking time, so in tests I often used a reduced 3-second timeout, which let me play more games and reach statistically significant results quickly.

To get results even faster, I also ran some games on reduced board sizes such as 5 x 5. I used a 95% confidence interval on the win rate so that design decisions were based on statistical significance rather than random chance.
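The interval itself is the standard normal approximation; a quick sketch:

```python
import math

def win_rate_ci(wins, games, z=1.96):
    """Confidence interval for a win rate (normal approximation;
    z = 1.96 gives 95% coverage)."""
    p = wins / games
    half_width = z * math.sqrt(p * (1 - p) / games)
    return max(0.0, p - half_width), min(1.0, p + half_width)

# 120 wins in 200 games: roughly (0.53, 0.67), safely above 50%.
print(win_rate_ci(120, 200))
```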

Measuring performance is especially difficult when two bots are of similar strength. A good workaround I found was to compare each bot against a common weaker opponent: if Bot A wins 90% of its games against the weaker bot while Bot B wins only 80%, then Bot A is likely the stronger of the two.

Potential Future Improvements

  • Transposition table: NoGo's game tree is a DAG (directed acyclic graph) and could benefit from a transposition table. However, I found my implementation did not work well, as the overhead of handling the DAG decreased performance.
  • Opening book: Accounting for symmetry, there are only ceil(n/2) ** 2 + 1 starting positions on an n x n board. We could precompute a large number of simulations for each position and store the results, giving simulation-based bots an early head start (see the sketch after this list).
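A sketch of the opening-book count, under my reading that the formula enumerates one canonical quadrant of first moves (the rest follow by rotational symmetry) plus the empty board itself:

```python
import math

def book_positions(n):
    """Number of opening-book entries for an n x n board: one canonical
    quadrant of first moves plus the empty starting board."""
    half = math.ceil(n / 2)
    first_moves = [(r, c) for r in range(half) for c in range(half)]
    return len(first_moves) + 1  # +1 for the empty board

print(book_positions(7))  # ceil(7/2)**2 + 1 = 17
```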