The Legend of AlphaGo Part I: The Rules of the Game

Clare Teng

In 2016, AlphaGo took the world by storm by becoming one of the select few professional Go players to receive the highest ranking in the game. The catch? AlphaGo is not a person, but a program designed by researchers at Google DeepMind. The match with Lee Sedol was watched by more than 200 million people. To put that into context, that’s approximately 3 times the entire UK’s population. Subsequently, the researchers involved in the project have been awarded the Marvin Minsky Medal and earned a prestigious spot on the front cover of Nature magazine (to name but just a few of their achievements). Since the original, several iterations of AlphaGo have been made, namely, Master, AlphaGo Zero, AlphaZero and MuZero.

But what is this all about and why is it important? In this 3-part series, we’ll be covering the rules of the game Go including why it’s challenging, what AlphaGo actually is and how it works, and what’s coming next…

Part IA: What is ‘Go’?

Go is one of the oldest board games still played today. There are 2 players, and each player uses a set of white or black stones. A standard Go board has 19×19 grid lines, with 361 intersecting points. The objective of the game is to surround more territory than your opponent, where territory is defined to be the number of empty points surrounded by a stones of a single colour. Each player takes turns to place a stone onto an intersection.

There are 4 main rules.
1) Once a player places a stone on the board, it cannot move from its position unless captured.
2) A player’s stone is captured if it’s surrounded by the opposing player’s stones.
3) Players cannot repeat the previous play to avoid a back-and-forth scenario when capturing an opponents’ stone.
4) A player is allowed to pass if they do not wish to make a move, and the game ends if both players pass consecutively. The final score is tallied as the count of a player’s territory minus their captured stones.

A handicap is used when the players are of different ranks, where the lower ranked player is given a form of concession. This could be by way of extra points at the end of the game or having more available stones to play.

Part IB: What’s hard about ‘Go’?

It is difficult to program a computer to play Go because a player’s skill is based on positional judgement and human pattern recognition. A human player recognises patterns (black vs white stones) early on and decides whether their stones will be captured based on board positions. As current programs are currently not complex enough to model a player’s visual awareness, the algorithmic solution is to perform an exhaustive search of possible moves based on previous and legal plays. To put this into context, there are ~10¹⁷⁰ distinct game positions, and many more distinct game states. Each state contains the player’s unique move history.

Conversely, games such as chess are combinatorial in nature and therefore it is easier for a computer to evaluate possible moves and combinations as there are numerous algorithmic techniques that can speed up the search process. Human players are less able to juggle all possible combinations in a game of chess. You can find out more about solvable games here.

Part IC: What is AlphaGo?

AlphaGo is a computer program which uses several concepts in artificial intelligence, deep learning, reinforcement learning and supervised learning to play the game of Go. So, instead of playing against a person online, you’re playing against a computer program that has learned how to play the optimal moves to win. The stats? AlphaGo lost just 1 out of 495 games against other Go programs and defeated 2 world champions.

One of the key concepts that makes AlphaGo successful is the use of deep learning (DL). As mentioned above, one of main difficulties in computer Go is pattern recognition. DL is notoriously good at pattern recognition in the visual domain, resulting in highly successful image classifiers. For example, distinguishing between handwritten digits in a range of 1-10, or identifying the type of animal in a photograph (cat, dog, frog, squirrel etc). The algorithmic and expressive power of DL is utilised to model a human Go player, where each game play and position is represented by 2-dimensional images.

Alright, so now we know the rules and the terminology, the next article will explain how AlphaGo actually works. See you there!

2 comments

The Legend of AlphaGo Part II: How does it work? – TOM ROCKS MATHS says:

April 16, 2024 at 1:21 pm

[…] bolts’ of the inner workings of AlphaGo, let’s first define some terminology mentioned in part I, namely deep learning (DL), reinforcement learning (RL) and supervised learning […]

LikeLike

The Legend of AlphaGo Part III: Epic Battles and the Future of AI – TOM ROCKS MATHS says:

April 29, 2024 at 2:19 pm

[…] the first two articles we’ve learned what AlphaGo is – an AI trained to play Go – and how it works – deep learning – so what’s next? […]

LikeLike

TOM ROCKS MATHS

Maths, but not as you know it…

The Legend of AlphaGo Part I: The Rules of the Game

Part IA: What is ‘Go’?

Part IB: What’s hard about ‘Go’?

Part IC: What is AlphaGo?

2 comments

Leave a comment Cancel reply

The Legend of AlphaGo Part I: The Rules of the Game

Part IA: What is ‘Go’?

Part IB: What’s hard about ‘Go’?

Part IC: What is AlphaGo?

Share this:

Related

2 comments

Leave a comment Cancel reply