Iain Duncan
What do the games Chess, Connect-4 and Noughts and Crosses have in common? They all have Cs and double-consonants, but more importantly they are all turn-based, perfect knowledge games. This means they have solutions.
In the context of a game, having a solution means that if all players play optimally, some outcome will be reached. For Noughts and Crosses, you probably already know that most games end in a draw, and a draw can indeed be proven to be the game's solution. In a previous article, we explored why games of this type have solutions and why they are not always so easy to find. However, the mathematical study of games, known as game theory, is not limited to solvable games.
We’ll look first at games which have sequential turns but introduce elements of randomness. Previously, we defined a perfect knowledge game as one where all players know the total present state of the game, what the other players want, and what will happen if they make a certain move. Games which have random elements are no longer perfect knowledge, because players are no longer certain of what will happen if they make a particular move. However, these games can still be written as a (slightly modified) game tree. Here we will consider a game between two players, Ahmed and Betty. A point will be labelled A if Ahmed makes the choice, B if Betty does, and N if the outcome is determined randomly. The numbers on the right-hand side of the diagram denote the scores in the form (A,B), where A is Ahmed’s score and B is Betty’s. Both players are trying to maximise their own score and don’t care about the other person’s.
To try to understand this game, we first need a way of measuring how attractive each of the choices is to the players. Ideally, this attractiveness score would incorporate all the possible outcomes of a random event and their associated probabilities. The simplest such score is the expected value, which you may have encountered before when studying probability. The expected value is found by adding up all of the potential outcomes of a random event multiplied by their probabilities. A helpful way to think about it is as the answer to the question: ‘if this random event were repeated many times, what would the average score be per round?’. If the players use expected value, the analysis of the game proceeds in the same way as it did for the perfect knowledge games, with the players working backwards up the tree and picking their best move, given that the other player will pick the option which is best for them. The only difference is that when they encounter a random event, instead of figuring out which decision the player would have made at that branch, they take the expected value of the random event.
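As a quick illustration (the scores and probabilities below are invented, not taken from the tree), the calculation is just a weighted sum:

```python
# Expected value = sum of (outcome x probability of that outcome).
# These numbers are made up purely to illustrate the calculation.
outcomes      = [2, 4, 10]       # possible scores the random event could give
probabilities = [0.5, 0.3, 0.2]  # their probabilities (must sum to 1)

expected_value = sum(o * p for o, p in zip(outcomes, probabilities))
print(expected_value)  # 0.5*2 + 0.3*4 + 0.2*10 = 4.2
```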
For the game tree above, we begin by finding the expected values for each of the random pairs at the right-hand side of the tree. From top to bottom, they are: (2.5,2.5), (1,1.75), (3,3.33), (4.5,3). We then consider which choice B will make in each of the two situations she may find herself in. If A has picked the top path (meaning the first two pairs are the possible outcomes), B will choose the top path as well, because her expected value there is 2.5, compared to 1.75 on the lower path. If A chooses the bottom path (so the last two pairs are the possible outcomes), B will again choose the top, because she prefers the expected value of 3.33 to 3. Now that A knows what B’s choices will be, he looks at the outcomes of his own decisions. If he chooses top, B will choose top, meaning he will have an expected value of 2.5. If he chooses bottom, B will choose top and he will have an expected value of 3. He therefore chooses bottom, preferring the expected value of 3. The expected game path is shown below.
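Here is a minimal sketch of that backwards reasoning in code, using the four expected-value pairs above as the leaves (the underlying random outcomes are already folded into those numbers):

```python
# Each leaf is the expected (A, B) score of the chance node it replaces.
# Pairs are listed top-to-bottom as in the tree: B's top/bottom option
# under A's top path, then under A's bottom path.
leaves = {
    "top":    {"top": (2.5, 2.5), "bottom": (1.0, 1.75)},
    "bottom": {"top": (3.0, 3.33), "bottom": (4.5, 3.0)},
}

# Betty moves last: at each of her nodes she picks the pair maximising her score.
betty_choice = {
    a_move: max(options.values(), key=lambda pair: pair[1])
    for a_move, options in leaves.items()
}

# Ahmed moves first, anticipating Betty: he maximises his own expected score.
ahmed_move = max(betty_choice, key=lambda a_move: betty_choice[a_move][0])

print(ahmed_move, betty_choice[ahmed_move])  # bottom (3.0, 3.33)
```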
Even though we have used expected value, it is important to realise that we could have evaluated the desirability of a random event in any way we wanted. For example, A may be incredibly pessimistic. If this is the case, he would make decisions based on the lowest value he could receive from a random event, rather than the expected value. Likewise, B might want to bias her decision-making towards bigger rewards and care less about risk. In this case, she could evaluate nodes using the outcomes squared times their associated probabilities. None of these approaches is more ‘right’ than the others. In fact, even though expected value may seem like the obvious choice to a mathematician, real people rarely use it. The lottery has a negative expected value but people still play it, because they have subjectively decided that the possibility of winning matters more to them than the money they spend on tickets. The best way to deal with this variability is to treat the players’ attitudes to risk as part of the game’s set-up, just as the final scores are, which is what we have done here.
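As a rough sketch (again with invented outcomes and probabilities), the same chance node can be scored quite differently depending on the player's attitude:

```python
# An invented chance node: three possible scores and their probabilities.
outcomes      = [1, 3, 6]
probabilities = [0.5, 0.3, 0.2]

# Expected value: the 'average score per round' measure used above.
expected   = sum(p * o for p, o in zip(probabilities, outcomes))
# A's pessimistic measure: judge the node by the worst score it could give.
pessimist  = min(outcomes)
# B's risk-loving measure: weight big rewards more by squaring the outcomes.
risk_lover = sum(p * o**2 for p, o in zip(probabilities, outcomes))

print(expected, pessimist, risk_lover)  # roughly 2.6, 1 and 10.4
```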
But what about games which can’t be written as game trees? This happens when we make the transition from sequential to simultaneous turns. Take the common childhood game rock, paper, scissors. In this game, the two players chant ‘rock, paper, scissors’ and then both present one of the three in the form of a hand signal. Rock beats scissors, scissors beat paper, paper beats rock and all three tie with themselves. The players choose which signal to make during the chant, without knowing what the other will do, and victory is more or less a matter of chance. This means that any simultaneous-turn game is also an imperfect knowledge game, because the players do not know for sure what the others will do.
A game tree will not accurately describe this game, so we will introduce another tool: the game table. We’ll bring back Ahmed and Betty and say that the choices available to Ahmed are the columns and the choices available to Betty are the rows. The numbers in the boxes are the scores the players receive if that situation occurs. The red number is Betty’s score and the blue is Ahmed’s. The game table for their game of rock, paper, scissors is thus:
As you probably already know, if you’ve played the game, there isn’t really any optimal strategy. Each of the three options is just as good as the others: against the opponent’s three possible choices it wins once, loses once and draws once, so players are theoretically just as likely to pick any of them. However, this is not the case for all simultaneous-turn games.
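To make that concrete, here is a small sketch scoring a win as +1, a loss as −1 and a draw as 0 (the exact numbers are an assumption; only their ordering matters): against an opponent who picks each signal with probability 1/3, every choice has the same expected score.

```python
# Ahmed's score when he plays `mine` and Betty plays `theirs`.
# Win = +1, loss = -1, draw = 0 (any scores with this ordering would do).
moves = ["rock", "paper", "scissors"]
beats = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def score(mine, theirs):
    if mine == theirs:
        return 0
    return 1 if beats[mine] == theirs else -1

# Expected score of each move against an opponent choosing uniformly at random.
for move in moves:
    expected = sum(score(move, other) for other in moves) / 3
    print(move, expected)  # each comes out to 0.0
```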
Imagine you were a bank robber. You and your partner have robbed banks all around the world without ever getting caught. Until today. You are now in police custody. Your partner is in another room. The detectives come to you with the following deal. If neither you nor your partner confesses, you will go to jail for the robbery you attempted today, but none of the other robberies, since there is not enough evidence to convict you. However, if you confess to all the robberies, you will get a reduced sentence and your partner will go to jail for longer. You know that the same deal is likely being offered to your partner and that, if you both confess, you will not get as low a sentence as you would if you had both simply kept your mouths shut. The sentence for only today’s robbery is three years. If you confess and your partner does not, you only go to jail for one year and they go to jail for 14. However, if your partner confesses and you do not, they go to jail for one year and you’re in for 14. If you both confess, you’re both in jail for six years.
Through no choice of your own, you and your partner are now playing a simultaneous-turn game, with the choice either to confess or to remain silent. The game table for this game, known as the prisoners’ dilemma, looks like this:
How do you decide what to do? Let’s assume that you and your partner always kept your robberies strictly professional, i.e. you don’t particularly care how long they go to jail for and they don’t care how long you go to jail for. How do you find the best strategy? In the previous imperfect knowledge games, we found the best strategy by considering expected values (or some other measure). Here, though, we don’t know what the probabilities are, so that approach won’t work. Instead, we look at what happens for each of our partner’s possible decisions. If they choose not to confess, we have a choice between confessing or not: confess and we get one year in jail, stay silent and we get three. Because we don’t care about our partner, or abstract concepts like solidarity, three years is worse than one, so we are incentivised to confess if we think our partner will stay silent. On the other hand, if we think our partner will confess, we have the choice between not confessing and getting 14 years, or confessing and only getting six. Six is better than 14, so if we think our partner will confess we should also confess.
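The same argument can be written down in a few lines, here as a sketch using the jail terms above (lower is better, since these are years in prison):

```python
# Years in jail as (you, partner), indexed by (your choice, their choice).
years = {
    ("silent",  "silent"):  (3, 3),
    ("silent",  "confess"): (14, 1),
    ("confess", "silent"):  (1, 14),
    ("confess", "confess"): (6, 6),
}

labels = {"silent": "stays silent", "confess": "confesses"}

# For each thing our partner might do, find the choice minimising our own sentence.
for partner in ("silent", "confess"):
    best = min(("silent", "confess"), key=lambda you: years[(you, partner)][0])
    print(f"If our partner {labels[partner]}, our best choice is to {best}.")
# Both lines say 'confess': confessing is better whatever the partner does.
```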
Our hearts begin to sink as we realise we are spending six years in jail. What we have discovered is that, regardless of what our partner does, we are better off confessing. Since the game looks exactly the same from our partner’s perspective, and we know they’re smart enough to realise the same thing, we can be pretty sure that they will choose to confess. Since we were better off confessing anyway, we can be pretty sure we’re both getting six years. Crime doesn’t pay, kids…
This game is famous for two reasons. Firstly, it shows that pure self-interest does not always lead to the best result overall (both of us getting only three years if we both stay silent). Secondly, it is a good example of something called a dominant strategy. A dominant strategy is one which serves you better than any other strategy, regardless of your opponent’s actions. The existence of such strategies allows us to better predict the outcome of games: if one strategy is simply worse than another, the worse strategy will never be picked and we do not need to worry about it. This is why, in the absence of external factors like caring about your partner or fear of what confessing will do to your reputation, we can be fairly certain that both prisoners are going to jail for six years.
Game theory is a fascinating area of maths and, I think, that is in some ways due to how ingrained in our lives it is. We have an incredible intuition for games. Take, for example, the qualifiers I used when talking about the prisoners’ dilemma: ‘we don’t care about solidarity’, ‘we don’t care about our partner’, ‘we don’t have to worry about our reputation’. I included those disclaimers because, more likely than not, you have been exposed to prisoners’ dilemma-like situations in your own life and observed a more complicated effect: when the prisoners’ dilemma is repeated, the game changes to favour people working together. Why? If you betray someone, people will stop cooperating with you and you will fare worse than people who can be trusted not to tell, something I imagine you will have experienced on some level in your daily life. Amazing!
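Here is a very rough sketch of that effect, reusing the jail terms above and two invented strategies: always confess, versus ‘tit for tat’ (stay silent until betrayed, then copy whatever the opponent did last). Two tit-for-tat players accumulate far less jail time than two habitual confessors, and a betrayer only gains an edge in the very first round before cooperation collapses.

```python
# Two simple strategies for the repeated dilemma, using the jail terms above.
# `history` is the list of the *opponent's* previous moves.
def always_confess(history):
    return "confess"

def tit_for_tat(history):
    # Stay silent at first, then copy whatever the opponent did last round.
    return "silent" if not history else history[-1]

years = {("silent", "silent"): 3, ("silent", "confess"): 14,
         ("confess", "silent"): 1, ("confess", "confess"): 6}

def play(strategy_a, strategy_b, rounds=10):
    """Total jail years (a, b) over repeated rounds -- lower is better."""
    history_a, history_b, total_a, total_b = [], [], 0, 0
    for _ in range(rounds):
        move_a, move_b = strategy_a(history_b), strategy_b(history_a)
        total_a += years[(move_a, move_b)]
        total_b += years[(move_b, move_a)]
        history_a.append(move_a)
        history_b.append(move_b)
    return total_a, total_b

print(play(tit_for_tat, tit_for_tat))        # (30, 30): cooperation pays
print(play(always_confess, always_confess))  # (60, 60)
print(play(tit_for_tat, always_confess))     # (68, 55): betrayal only works once
```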
We’ve now looked at solvable games, games with random elements, and simultaneous-turn games. There are many, many more topics which can be explored in the realm of ‘games’: from the use of supercomputers to solve near-infinitely complex game trees, to predicting the behaviour of two children who have eaten all the cookies in the house. The maths of games is incredibly broad and these two articles have barely even scratched the surface. Keep playing.
