AlphaGo's algorithm combines machine learning and tree search techniques with extensive training on both human and computer play. It uses Monte Carlo tree search, guided by a "value network" and a "policy network", both implemented as deep neural networks.[1][5] A limited amount of game-specific feature detection pre-processing is used to generate the inputs to the neural networks.[5]
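As an illustration of how a policy network and a value network can guide a tree search, the following minimal sketch shows one commonly used selection rule: each candidate move is scored by its mean backed-up value plus an exploration bonus weighted by the policy prior. The names `Node`, `select_move` and `c_puct`, and the exact form of the bonus, are assumptions made for this example and are not taken from the text above; the network outputs are stand-in numbers rather than real network evaluations.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One position in the search tree (hypothetical structure for illustration)."""
    prior: float                  # stand-in for the policy network's probability of the move leading here
    visits: int = 0               # number of simulations that passed through this node
    value_sum: float = 0.0        # sum of stand-in value-network estimates backed up through this node
    children: dict = field(default_factory=dict)  # move -> Node

    def mean_value(self) -> float:
        # Unvisited nodes default to a neutral estimate of 0.
        return self.value_sum / self.visits if self.visits else 0.0


def select_move(node: Node, c_puct: float = 1.0):
    """Return the move whose child maximises mean value plus a prior-weighted bonus."""
    total_visits = sum(child.visits for child in node.children.values())

    def score(child: Node) -> float:
        # The bonus is large for high-prior, rarely visited moves and shrinks with visits.
        bonus = c_puct * child.prior * math.sqrt(total_visits + 1) / (1 + child.visits)
        return child.mean_value() + bonus

    return max(node.children, key=lambda move: score(node.children[move]))


def record_evaluation(child: Node, value: float) -> None:
    """Back up one value estimate into a child node's statistics."""
    child.visits += 1
    child.value_sum += value


root = Node(prior=1.0)
root.children = {"a": Node(prior=0.9), "b": Node(prior=0.1)}
first_choice = select_move(root)      # with no visits, the higher prior wins

for _ in range(10):                   # pretend move "a" keeps evaluating poorly
    record_evaluation(root.children["a"], -0.5)
later_choice = select_move(root)      # the search now prefers the unexplored move
```

In this sketch the prior steers early simulations towards moves the policy considers plausible, while the accumulated value estimates gradually take over as visit counts grow.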
The system's neural networks were bootstrapped from human game-play expertise. AlphaGo was first trained to mimic human play by attempting to match the moves of expert players in recorded historical games, using a database of around 30 million moves.[11] Once it had reached a certain degree of proficiency, it was trained further by playing large numbers of games against other instances of itself, using reinforcement learning to improve its play.[1]
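The two training stages described above can be sketched with a toy softmax policy over three moves. This is a simplified illustration under stated assumptions, not AlphaGo's actual training code: the policy here ignores the board position entirely, the function names and learning rate are hypothetical, and the reinforcement stage is shown as a plain policy-gradient update scaled by the game outcome, which is one standard way to implement such learning.

```python
import math


def softmax(logits):
    """Convert raw scores into move probabilities."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_gradient_step(logits, move, weight, lr=0.1):
    """Shift probability toward `move` (weight > 0) or away from it (weight < 0).

    The gradient of log softmax(logits)[move] with respect to the logits
    is one_hot(move) - probs.
    """
    probs = softmax(logits)
    return [
        x + lr * weight * ((1.0 if i == move else 0.0) - p)
        for i, (x, p) in enumerate(zip(logits, probs))
    ]


# Stage 1 (supervised, assumed form): imitate recorded expert moves with weight = 1.
logits = [0.0, 0.0, 0.0]
expert_moves = [2, 2, 1, 2]           # made-up data for illustration only
for move in expert_moves:
    logits = policy_gradient_step(logits, move, weight=1.0)
after_supervised = softmax(logits)

# Stage 2 (self-play, assumed form): weight each played move by the game outcome,
# +1 for a win and -1 for a loss.
self_play_games = [(2, +1), (1, -1)]  # (move played, outcome), also made up
for move, outcome in self_play_games:
    logits = policy_gradient_step(logits, move, weight=float(outcome))
after_self_play = softmax(logits)
```

The same update serves both stages: imitation treats every expert move as a positive example, whereas self-play reinforces moves from won games and discourages moves from lost ones.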
Facebook has also been working on its own Go-playing system, darkforest, which likewise combines machine learning and tree search.[22][26] Although it was a strong player against other computer Go programs, as of early 2016 it had not yet defeated a professional human player.[27]