Modeling boardgame strategies
If you were to describe a strategy in a board game to a friend, how would you go about doing so?
Typically, I talk about the outcomes we’re looking to maximize and then try to describe the steps I’d take to get there. I have high-level guidelines about what conditions I’m trying to satisfy throughout different parts of the game, but mostly I’m trying to express which actions I think will return the most value to me within the constraints of the game.
If you were to program a board game strategy, how would you do it?
For something like tic-tac-toe, where the set of possible gamestates is relatively small, I could hard code a response to every possible condition. But what if there are too many possible game states to enumerate? Some games can be solved such that optimal play can be described with an algorithm, but this requires reducing a lot of intuition into a mathematical expression, which can be much harder for a complex game with many variables at play. Thankfully, this can be done in a much more generalized way, one that turns the open-ended problem of “How can I represent a game strategy using an algorithm?” into the much better-understood problem of search.
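To make the hard-coding approach concrete, here’s a minimal sketch; the board representation (a tuple of nine cells) and the response table entries are hypothetical:

```python
# Minimal sketch of hard-coding tic-tac-toe responses.
# Hypothetical representation: the board is a tuple of nine cells,
# each "X", "O", or None, and a move is a cell index 0-8.
RESPONSES = {
    (None,) * 9: 4,  # empty board: take the center
    ("X", None, None, None, "O", None, None, None, None): 8,  # take a corner
    # ... one entry for every reachable game state (only a few thousand)
}

def respond(board):
    """Look up the hard-coded move for this exact board."""
    return RESPONSES[board]
```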
Value
Value is the thing we want to maximize. A value function is an equation that tells us the value we have at a single point in the game, or game state. In board games, the rules tell us what we can do on our turn, and each of those options changes the gamestate somewhat differently. This list of possible actions is enumerable, and each action results in a new game state. If we could somehow create a function that tells us how close to our strategy we are given a gamestate, then we could iterate through all the possible actions and choose the one that scored the highest, i.e., landed us closest to our strategy.

Now, this function is obviously a complex one, but assuming we can represent the gamestate mathematically in a reasonably consistent way, we can assert that there is some kind of formula which roughly converts that gamestate information into a strategy score. Thankfully, we can use a neural network to represent this unknown function, assuming the network has enough capacity to accommodate the nuance of the strategy; we just need to find the correct structure for our neural network and the correct weights for all of our nodes. This is a much more approachable problem.
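As a sketch of what that could look like, assuming the gamestate can be encoded as a fixed-length feature vector (the encoding, the network’s shape, and its size here are all placeholder assumptions):

```python
import torch
import torch.nn as nn

STATE_SIZE = 64  # hypothetical: depends entirely on how you encode your game

# A small multilayer perceptron standing in for the unknown formula that
# converts gamestate information into a strategy score. Finding the right
# structure and weights is the real problem.
value_net = nn.Sequential(
    nn.Linear(STATE_SIZE, 128),
    nn.ReLU(),
    nn.Linear(128, 128),
    nn.ReLU(),
    nn.Linear(128, 1),  # a single scalar: how good this state looks to us
)

def value(state_vector):
    """Score a gamestate; higher means closer to what our strategy wants."""
    return value_net(torch.as_tensor(state_vector, dtype=torch.float32)).item()
```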
We have a game which allows a player to perform an action from a set of actions on their turn. This set of actions depends on the current state of the game, so
we will introduce the function A(s).
A(s) takes in a gamestate and returns a set of possible actions you can take
We are representing our strategy, S, as a function which takes in the gamestate and an action and tells us how well that action aligns with our strategy.
S(s, a) = numerical scalar
Thankfully, the game has already defined rules that tell us what each of these possible actions does to change the game state; we’ll call that G(s, a) = s'
taking in a game state and an action and returning a new gamestate.
Because G is largely knowable, aside from aspects of randomness in games such as drawing from a deck or rolling a die, we can reduce our strategy to a new function
S(s, a) = S'(s') where s' = G(s, a)
where our strategy is just represented by a numerical scalar that scores how good the knowable state resulting from an action is with regard to our strategy.
Now the only thing we have to do is find S'(s')
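Putting the pieces together, choosing a move is just one pass over A(s). A sketch, assuming A, G, and a learned S' are provided by the game and our model:

```python
def choose_action(state, A, G, S_prime):
    """Greedy play: score the state each legal action leads to, take the best.

    A(s)       -> iterable of legal actions in state s
    G(s, a)    -> the new gamestate after playing action a in state s
    S_prime(s) -> scalar score of a gamestate under our strategy
    """
    return max(A(state), key=lambda action: S_prime(G(state, action)))
```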
In most games I play, you win by gathering the most points. Usually, there are secondary resources in the game which may or may not have value at the end of the game, but which help to earn those victory points. We could create a simple function that adds up all of the victory points and secondary resources you have at a given game state and use that as our strategy, but this wouldn’t capture much complexity and is likely to be a horrible strategy in the long run.
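A concrete version of that naive baseline might look like this (the victory_points and resources fields on the state are hypothetical):

```python
def naive_score(state):
    # Sum victory points plus every secondary resource at face value.
    # This ignores timing, synergies, and everything that actually makes
    # a strategy interesting, hence "horrible in the long run".
    return state.victory_points + sum(state.resources.values())
```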
By creating a sufficiently large neural network and using enough training data*** we could find a close approximation of our strategy.
Enough training data
Training data is oftentimes expensive due to the time spent hand-labeling information, in the case of supervised learning. What if we were able to have many players employing a variety of random strategies playing my game regularly? This would generate a large amount of data which could be automatically labeled based on the result of the game, and would likely give us a player roughly as skilled as our playerbase. What if we were to expand this further and run simulated tournaments at a rate of 100 games per second, refining our strategies from there? I’d bet we’d be able to come up with a pretty damn good strategy for a game.
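A sketch of that data-generation loop, assuming a hypothetical play_game simulator that pits strategies against each other and reports the winner along with a per-move history:

```python
def generate_training_data(num_games, play_game, make_random_strategy, num_players=4):
    """Label every recorded (state, action) pair by whether its player won.

    Assumed simulator hooks (hypothetical):
      play_game(strategies) -> (winner_index, history), where history is a
      list of (player_index, state_vector, action) tuples.
    """
    examples = []
    for _ in range(num_games):
        strategies = [make_random_strategy() for _ in range(num_players)]
        winner, history = play_game(strategies)
        for player, state_vector, action in history:
            # Automatic labeling: the final result of the game is the label.
            examples.append((state_vector, action, 1.0 if player == winner else 0.0))
    return examples
```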