AlphaGo – How Google’s AI Defeated the European Go Champion

I felt compelled to return from my over-long hiatus to share some exciting news: Google DeepMind recently announced that their artificial intelligence program AlphaGo has defeated Fan Hui 2p, three-time European Go Champion, in five games out of five. This is the first time a computer has won against a professional Go player in an even game (without a handicap).

If someone had told me five years ago, or even yesterday, that a computer program would be capable of winning an even game against a professional Go player in 2016, I would have said it was impossible, and that is not a word I use often! Go is remarkably more complicated than chess, and developing a competitive Go program has been a major challenge in 21st-century computer science and artificial intelligence research. The number of possible Go games far exceeds the number of atoms in the observable universe, so a brute-force approach to solving Go is infeasible. Rather than narrowly focusing on local regions of the board, good gameplay requires accurate assessment of the entire game situation and intuitive judgement, which humans have excelled at — until now.
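
To get a feel for that claim, here is a quick back-of-the-envelope check in Python. It counts only a rough upper bound on board configurations, not games, which are vastly more numerous still:

```python
# Each of the 361 points on a 19x19 board is empty, black, or white.
# Most of these configurations are not even legal positions, and the
# number of possible *games* is vastly larger still -- this is just
# a rough bound to set the scale.
configurations = 3 ** 361
atoms = 10 ** 80  # a commonly cited estimate for the observable universe

print(len(str(configurations)) - 1)  # 172, i.e. about 10^172
print(configurations > atoms)        # True, by roughly 92 orders of magnitude
```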

So, how did Google do it?

Contemporary Go programs typically utilize Monte-Carlo tree search (MCTS) algorithms. These systems select promising candidate moves, simulate up to hundreds of thousands of variations, and then propagate the results back up the tree before selecting the best option. Although such programs have shown promising results and made significant improvements over the last decade or so, they can still take a long time to identify sensible options and might miss good moves under time constraints. Google DeepMind utilized its access to enormous computing power to develop sophisticated neural networks which can take a less mechanical, more ‘instinctual’ approach.
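
For the curious, here is a minimal sketch of that generic MCTS loop (select, expand, simulate, backpropagate). To stay short and self-contained it plays a toy game (Nim: take 1–3 stones, whoever takes the last stone wins) rather than Go, and all names are illustrative, not from AlphaGo’s actual code:

```python
import math
import random

class Node:
    def __init__(self, stones, player, parent=None, move=None):
        self.stones, self.player = stones, player
        self.parent, self.move = parent, move
        self.children, self.wins, self.visits = [], 0, 0

    def untried_moves(self):
        tried = {child.move for child in self.children}
        return [m for m in (1, 2, 3) if m <= self.stones and m not in tried]

    def ucb_child(self, c=1.4):
        # Selection rule: balance each child's win rate (exploitation)
        # against how rarely it has been visited (exploration).
        return max(self.children,
                   key=lambda ch: ch.wins / ch.visits
                   + c * math.sqrt(math.log(self.visits) / ch.visits))

def mcts_best_move(stones, iterations=20000):
    root = Node(stones, player=1)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while not node.untried_moves() and node.children:
            node = node.ucb_child()
        # 2. Expansion: add one untried child, if any remain.
        moves = node.untried_moves()
        if moves:
            m = random.choice(moves)
            child = Node(node.stones - m, -node.player, parent=node, move=m)
            node.children.append(child)
            node = child
        # 3. Simulation: play random moves to the end of the game.
        stones_left, player = node.stones, node.player
        while stones_left > 0:
            stones_left -= random.randint(1, min(3, stones_left))
            player = -player
        winner = -player  # whoever just took the last stone wins
        # 4. Backpropagation: update statistics on the path to the root.
        while node is not None:
            node.visits += 1
            if node.parent is not None and node.parent.player == winner:
                node.wins += 1  # a win for the player who chose this node
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move

print(mcts_best_move(10))  # should print 2: leaving a multiple of 4 loses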

Neural Networks

A neural network is essentially a class of mathematical functions with adjustable parameters, tuned using a large amount of data. In the case of AlphaGo, the data consists of 30 million game positions. With feedback from the researchers, the system was taught to identify expert moves. Here’s the important part: the system was then able to play games against itself, learning which moves were more likely to succeed and continuously improving itself in a process called reinforcement learning. The result is a policy network that chooses moves most similar to those it has learned produce good results. The system essentially develops intuition about good moves.
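
As a toy illustration of the idea (not AlphaGo’s actual architecture, which is a deep convolutional network), here is a single-layer “policy network” with a simple policy-gradient update of the kind used in the self-play step; all sizes and names are made up for the sketch:

```python
import numpy as np

# A toy, single-layer "policy network": board features in, one
# probability per move out. Sizes and weights are illustrative only.
rng = np.random.default_rng(0)
N_FEATURES = N_MOVES = 361  # one feature and one move per board point
W = rng.normal(scale=0.01, size=(N_MOVES, N_FEATURES))

def policy(board_features):
    """Return a probability distribution over all moves (a softmax)."""
    logits = W @ board_features
    exp = np.exp(logits - logits.max())  # shift for numerical stability
    return exp / exp.sum()

def reinforce_update(board_features, move, won, lr=0.01):
    """Nudge the policy toward moves from won games, away from lost ones.
    This is the policy-gradient idea behind the self-play training step."""
    global W
    probs = policy(board_features)
    # Gradient of log p(move) with respect to W:
    grad = -probs[:, None] * board_features[None, :]
    grad[move] += board_features
    W += lr * (1.0 if won else -1.0) * grad

# After a self-play game, apply the update to every position/move pair,
# using the final result as the reward signal:
features = rng.random(N_FEATURES)                    # stand-in board features
move = int(rng.choice(N_MOVES, p=policy(features)))  # sample a move
reinforce_update(features, move, won=True)
```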

But it is not enough to know “expert-like” moves if you do not consider the context of the entire board. Go involves a large amount of give and take that human players find quite natural: when you are losing you might play more aggressively to catch up, and when you are winning you might play more defensively to avoid losing your foothold. Some areas may be sacrificed to make gains elsewhere. So AlphaGo utilizes a second neural network, called the value network, that predicts the outcome of the game (that is, it predicts whether the computer will win or lose) if it plays the move chosen by the policy network.
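
In the same toy spirit as above, a “value network” can be pictured as a function from board features to a single win-probability estimate (the real one is again a deep convolutional network; this sketch is purely illustrative):

```python
import numpy as np

# A toy "value network": board features in, one number out -- the
# estimated probability that the side to move goes on to win.
rng = np.random.default_rng(1)
w_value = rng.normal(scale=0.01, size=361)  # untrained, illustrative weights

def value(board_features):
    """Squash a weighted sum through a sigmoid to get a win probability."""
    return 1.0 / (1.0 + np.exp(-w_value @ board_features))

print(value(rng.random(361)))  # ~0.5 with untrained weights: a toss-up
```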

Finally, by combining the two neural networks, AlphaGo implements a Monte-Carlo tree search algorithm that is far more efficient than traditional MCTS models. By starting off with expert moves from the policy network and evaluating game outcomes with the value network, far less time is wasted on unpromising variations.
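
The research paper gives concrete formulas for how the two networks plug into the search, and they are simple enough to sketch. During selection, each node picks the move maximizing Q(s,a) + u(s,a), where the bonus u(s,a) is proportional to the policy network’s prior probability and shrinks as a move is revisited; leaf positions are scored by mixing the value network’s estimate with the result of a fast rollout (the paper found an even mix worked best). Here is a rough rendering with made-up numbers:

```python
import math

def select_move(Q, P, N, c_puct=5.0):
    """Pick the move maximizing Q(s,a) + u(s,a).
    Q: average action values, P: policy-network priors, N: visit counts."""
    sqrt_total = math.sqrt(sum(N.values()))
    def score(a):
        u = c_puct * P[a] * sqrt_total / (1 + N[a])  # exploration bonus
        return Q[a] + u
    return max(Q, key=score)

def leaf_value(value_net_estimate, rollout_result, lam=0.5):
    """Mix the value network's estimate with a fast-rollout outcome."""
    return (1 - lam) * value_net_estimate + lam * rollout_result

# Made-up statistics for three candidate moves:
Q = {"a": 0.52, "b": 0.48, "c": 0.50}  # average values so far
P = {"a": 0.60, "b": 0.30, "c": 0.10}  # policy-network priors
N = {"a": 120, "b": 40, "c": 10}       # visit counts
print(select_move(Q, P, N))  # "c": the barely explored move gets a turn
```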

Computing Power

Of course, such intensive computation requires a lot of power. To quote the research paper in which AlphaGo was described,

The final version of AlphaGo used 40 search threads, 48 CPUs, and 8 GPUs. We also implemented a distributed version of AlphaGo that exploited multiple machines, 40 search threads, 1202 CPUs and 176 GPUs.

It will still be some time before a Go program like this is commercially available for home use!

What Next?

To truly prove itself, AlphaGo will face off against the world-renowned Lee Sedol in Seoul this March. According to goratings.org, Fan Hui is ranked as the 633rd-best Go player in the world; Lee Sedol is ranked 5th. A victory against Lee Sedol would be a truly historic, game-changing achievement.

It will be interesting to see what DeepMind does with this software in the future. Many have already commented on the eventual possibility of a tool for professional-level analysis of one’s own games.

The development of Deep Blue, the chess computer that defeated world chess champion Garry Kasparov in 1997, eventually led to the creation of IBM’s Deep Computing Institute for tackling large-scale computing and data analysis problems. What about practical applications of AlphaGo? Similar systems may eventually outperform human judgement in a wide range of fields, from stock trading to medical diagnosis.

Many did not believe that computers would reach a professional level for decades, if ever. The rapid advancement of this technology and the innovation demonstrated by DeepMind are incredibly impressive! These are exciting times, and one gets the feeling that this is just the beginning.

Further Reading

Nature Announcement

Research Paper (a little technical!)

Computer Go page on Wikipedia

The Singularity Is Near: When Humans Transcend Biology by Ray Kurzweil. A thought-provoking book exploring how advances in artificial intelligence and other technologies might influence our future.
