Learning how far computers Go in defeating humans

AlphaGo, the artificial intelligence system built by Google subsidiary DeepMind, has just defeated the human champion, Mr Lee Se Dol, four games to one in a five-game match of the strategy game Go. After all, computers surpassed humans in chess in 1997, when IBM's Deep Blue beat Garry Kasparov. So why is AlphaGo's victory significant?

Like chess, Go is a hugely complex strategy game in which chance and luck play no role. Two players take turns placing white or black stones on a grid; when a stone, or a connected group of stones, is completely surrounded by stones of the other colour, it is removed from the board, and the player who controls more of the board at the game's end wins.
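The capture rule can be made concrete with a short sketch. It assumes a board stored as a dictionary of occupied points, and the function names (neighbours, group_and_liberties, remove_if_captured) are hypothetical, chosen for illustration; nothing here is taken from AlphaGo itself. A group of connected stones is removed only when it has no empty neighbouring point left.

```python
# Illustrative sketch only: the board is a dict mapping (row, col) -> "B" or "W".
# Empty points are simply absent from the dict.

def neighbours(point, size):
    """Return the on-board points directly adjacent to `point`."""
    row, col = point
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < size and 0 <= c < size]


def group_and_liberties(board, start, size):
    """Collect the connected same-colour group at `start` and its empty neighbours."""
    colour = board[start]
    group, liberties, stack = {start}, set(), [start]
    while stack:
        for nxt in neighbours(stack.pop(), size):
            if nxt not in board:
                liberties.add(nxt)          # an empty adjacent point keeps the group alive
            elif board[nxt] == colour and nxt not in group:
                group.add(nxt)              # same colour: part of the same group
                stack.append(nxt)
    return group, liberties


def remove_if_captured(board, start, size):
    """Remove the group at `start` if it has no empty neighbours; return removed points."""
    group, liberties = group_and_liberties(board, start, size)
    if liberties:
        return set()
    for point in group:
        del board[point]
    return group


# A lone white stone with black stones on all four sides is captured.
board = {(1, 1): "W", (0, 1): "B", (2, 1): "B", (1, 0): "B", (1, 2): "B"}
captured = remove_if_captured(board, (1, 1), size=5)
```

Checking a stone means walking its whole group first, which is one small reason why even the basic bookkeeping of Go is more involved than a one-line rule suggests.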

Unlike the case with chess, however, no human can explain how to play Go at the highest levels. The top players, it turns out, cannot fully access their own knowledge about how they are able to perform so well. This self-ignorance is common to many human abilities, from driving a car in traffic to recognising a face. This strange state of affairs was beautifully summarised by philosopher and scientist Michael Polanyi, who said: "We know more than we can tell." It's a phenomenon that has come to be known as Polanyi's Paradox.

This paradox has not prevented us from using computers to accomplish complicated tasks, such as processing payrolls, optimising flight schedules, routing telephone calls and calculating taxes. But as anyone who's written a traditional computer program can tell you, automating these activities has required painstaking precision to explain exactly what the computer is supposed to do.

This approach to programming computers is severely limited: it cannot be used in the many domains, like Go, where we know more than we can tell. Recognising common objects in photos, translating between human languages and diagnosing diseases are all tasks where the rules-based approach to programming has failed badly.

Deep Blue achieved its superhuman performance largely through sheer computing power: it was fed millions of examples of chess games so it could sift among the possibilities to determine the optimal move. The problem is that there are many more possible Go games than there are atoms in the universe, so even the fastest computers cannot simulate a meaningful fraction of them. To make matters worse, it's usually far from clear which possible moves to even start exploring.
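A rough back-of-the-envelope check shows the scale involved. The sketch below counts only the ways to fill a standard 19-by-19 board with empty, black or white points, which is a crude upper bound on board positions (it includes illegal ones) and is far smaller than the number of possible games. The figure of 10 to the power of 80 for atoms in the observable universe is a commonly cited order-of-magnitude estimate, not a number from this article.

```python
BOARD_POINTS = 19 * 19   # 361 intersections on a standard board
STATES_PER_POINT = 3     # each point is empty, black or white

# Crude upper bound on board configurations (includes illegal positions).
configurations = STATES_PER_POINT ** BOARD_POINTS

# Assumption: commonly cited order-of-magnitude estimate for atoms in the
# observable universe.
ATOMS_IN_UNIVERSE = 10 ** 80

digits = len(str(configurations))
exceeds = configurations > ATOMS_IN_UNIVERSE
print(digits, exceeds)
```

Even this simple count runs to 173 digits, against 81 digits for the atom estimate, and the number of distinct games, which are sequences of positions, is larger still.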

Mr Lee, the world's top Go player, taking on the AlphaGo program as Google DeepMind's lead programmer Aja Huang (far left) looked on during the Google DeepMind Challenge Match in Seoul, South Korea, earlier this month. PHOTO: REUTERS

What changed? The AlphaGo victories vividly illustrate the power of a new approach in which, instead of trying to program smart strategies into a computer, we build systems that can learn winning strategies almost entirely on their own, by seeing examples of successes and failures.

Since these systems do not rely on human knowledge about the task at hand, they are not limited by the fact that we know more than we can tell.

AlphaGo does use simulations and traditional search algorithms to help it decide on some moves, but its real breakthrough is its ability to overcome Polanyi's Paradox. It did this by figuring out winning strategies for itself, both by example and from experience. The examples came from huge libraries of Go matches between top players amassed over the game's 2,500-year history. To understand the strategies that led to victory in these games, the system made use of an approach known as deep learning, which has demonstrated remarkable abilities to tease out patterns and understand what is important in large pools of information.

Learning in our brains is a process of forming and strengthening connections among neurons. Deep learning systems take an analogous approach, so much so that they used to be called "neural nets". They set up billions of nodes and connections in software, use "training sets" of examples to strengthen connections among stimuli (a Go game in progress) and responses (the next move), then expose the system to a new stimulus and see what its response is. AlphaGo also played millions of games against itself, using another technique called reinforcement learning to remember the moves and strategies that worked well.
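The idea of strengthening connections from examples can be sketched with the simplest possible "neural net": a single artificial neuron with two inputs. This is a toy under stated assumptions, not AlphaGo's method; the functions train and respond, the learning rate and the tiny training set are all hypothetical choices made for illustration. The neuron is shown three stimulus-response pairs, then asked about a stimulus it has never seen.

```python
# Toy illustration only: one artificial neuron, not a deep network.

def respond(weights, bias, inputs):
    """Fire (1) if the weighted sum of the inputs is positive, else stay quiet (0)."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


def train(examples, epochs=20, rate=0.1):
    """Nudge each connection weight whenever the response is wrong."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            error = target - respond(weights, bias, inputs)
            # Strengthen (or weaken) only the connections that were active.
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias


# Training set: (stimulus, desired response). The pair (1, 1) is held back.
training_set = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1)]
weights, bias = train(training_set)

# Expose the trained neuron to a stimulus it was never shown.
new_response = respond(weights, bias, (1, 1))
```

Nobody tells the neuron the rule it ends up following; the rule emerges from the adjustments made after each mistake, which is the sense in which such systems sidestep the need for an explicit explanation.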

Deep learning and reinforcement learning have both been around for a while, but until recently, it was not at all clear how powerful they were, and how far they could be extended. In fact, it's still not, but applications are improving at a gallop, with no end in sight. And the applications are broad, including speech recognition, credit-card fraud detection, radiology and pathology. Machines can now recognise faces and drive cars, two of the examples that Polanyi himself noted as areas where we know more than we can tell.

We still have a long way to go, but the implications are profound. As when James Watt introduced his steam engine 240 years ago, technology-fuelled changes will ripple throughout our economy in the years ahead, but there is no guarantee that everyone will benefit equally. Understanding and addressing the social challenges brought on by rapid technological progress remain tasks no machine can do for us.


• Andrew McAfee is a principal research scientist at the Massachusetts Institute of Technology, where Erik Brynjolfsson is a professor of management. They are the co-founders of the MIT Initiative on the Digital Economy and the authors of The Second Machine Age: Work, Progress And Prosperity In A Time Of Brilliant Technologies.

A version of this article appeared in the print edition of The Straits Times on March 17, 2016, with the headline 'Learning how far computers Go in defeating humans'.