Negamax in Clojure (Tic Tac Toe) — Work-Through

In general terms, when a board is not in a terminal state, the negamax function must be applied again from the start to find the optimum next move for the new marker. In this simple example, the only terminal board state is reached on the very next move: a winning score is recorded against index 7, so 7 is returned as the optimum position, along with a board state as if the next player, X, had placed their mark there. However, if any of the boards produced from the new set of indices were not in a terminal state at this point, the negamax function would keep recurring until every board state had been populated with scores at every level of the rabbit hole.

Once such a non-terminal board has been found, the score-move function recurs with the new board, with the marker and perspective inverted and the depth incremented. This means that evaluate-result works in the same way for the opponent: it returns a positive score if the opponent wins and a negative score if they lose. The difference is that this time the algorithm remembers that these scores are being viewed from the perspective of the higher-level player, who always seeks to pick the maximum score. For that to work, the scores must be inverted, so that what is best for the opponent translates to worst for the current negamaxer.
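The recursion described above can be sketched roughly as follows. This is a minimal illustration, not the article's own code. Only score-move and evaluate-result are names taken from the text, and their argument lists here are guesses. The board representation (a vector of nine cells holding :x, :o or nil), the helpers winner, free-indices, terminal?, opponent and best-move, and the 10-minus-depth scoring are all assumptions. Instead of passing an explicit perspective, the sketch simply negates the score that comes back from the opponent's level.

```clojure
;; Assumed representation: a vector of nine cells, each :x, :o or nil.
(def winning-lines
  [[0 1 2] [3 4 5] [6 7 8]
   [0 3 6] [1 4 7] [2 5 8]
   [0 4 8] [2 4 6]])

(defn opponent [marker] (if (= marker :x) :o :x))

(defn winner
  "Returns :x or :o if that marker holds a full line, otherwise nil."
  [board]
  (some (fn [[a b c]]
          (let [m (board a)]
            (when (and m (= m (board b) (board c))) m)))
        winning-lines))

(defn free-indices [board]
  (keep-indexed (fn [i cell] (when (nil? cell) i)) board))

(defn terminal? [board]
  (boolean (or (winner board) (empty? (free-indices board)))))

(defn evaluate-result
  "Scores a terminal board for `marker`, the player who just moved:
  positive for a win, negative for a loss, 0 for a draw.
  Subtracting the depth makes quicker wins score higher (an assumption)."
  [board marker depth]
  (let [w (winner board)]
    (cond
      (nil? w)     0
      (= w marker) (- 10 depth)
      :else        (- depth 10))))

(declare best-move)

(defn score-move
  "Score of `marker` playing at `index`, seen from `marker`'s side."
  [board marker index depth]
  (let [new-board (assoc board index marker)]
    (if (terminal? new-board)
      (evaluate-result new-board marker depth)
      ;; Recur with the marker inverted and the depth incremented, then
      ;; negate: the opponent's best outcome is this player's worst.
      (- (:score (best-move new-board (opponent marker) (inc depth)))))))

(defn best-move
  "Highest-scoring free index for `marker`. Expects a non-terminal board."
  [board marker depth]
  (apply max-key :score
         (map (fn [i] {:index i :score (score-move board marker i depth)})
              (free-indices board))))
```

As a hypothetical position in the spirit of the example, calling (best-move [:o :x :o :x :x :o nil nil nil] :x 0) should return {:index 7, :score 10}: index 7 completes the middle column for X, index 6 scores -9 because O then wins at index 8, and index 8 leads to a draw.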

This is the fundamental difference between the negamax and the minimax algorithms. In negamax, the terminal board is analysed from the point of view of the marking player: a win if the marker wins, a loss if they lose. The score is then inverted according to whether it is the negamaxer or the opponent reviewing the speculative board states. In minimax, by contrast, the terminal board state is always analysed as a win if the minimaxer wins and a loss if they lose; afterwards, the minimaxer picks the maximum of the scores and the opponent picks the minimum.
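To make the contrast concrete, here is a small side-by-side sketch of the two scoring styles. All names (result-for, negamax-score, minimax-score and the board helpers) are hypothetical and chosen for illustration. Scores are simplified to 1, 0 and -1 with no depth term, and the board is again assumed to be a vector of nine cells holding :x, :o or nil.

```clojure
(def winning-lines
  [[0 1 2] [3 4 5] [6 7 8]
   [0 3 6] [1 4 7] [2 5 8]
   [0 4 8] [2 4 6]])

(defn opponent [marker] (if (= marker :x) :o :x))

(defn winner [board]
  (some (fn [[a b c]]
          (let [m (board a)]
            (when (and m (= m (board b) (board c))) m)))
        winning-lines))

(defn free-indices [board]
  (keep-indexed (fn [i cell] (when (nil? cell) i)) board))

(defn terminal? [board]
  (boolean (or (winner board) (empty? (free-indices board)))))

(defn result-for
  "1 if `player` has won the board, -1 if they have lost, 0 for a draw."
  [board player]
  (let [w (winner board)]
    (cond (nil? w) 0, (= w player) 1, :else -1)))

(defn negamax-score
  "Negamax style: the terminal board is judged for whoever just marked it,
  and every level negates the best score of the level below."
  [board marker index]
  (let [new-board (assoc board index marker)]
    (if (terminal? new-board)
      (result-for new-board marker)
      (- (apply max (map #(negamax-score new-board (opponent marker) %)
                         (free-indices new-board)))))))

(defn minimax-score
  "Minimax style: the terminal board is always judged for `maximizer`;
  levels alternate between taking the max and the min instead of negating."
  [board marker index maximizer]
  (let [new-board   (assoc board index marker)
        next-marker (opponent marker)]
    (if (terminal? new-board)
      (result-for new-board maximizer)
      (let [scores (map #(minimax-score new-board next-marker % maximizer)
                        (free-indices new-board))]
        (if (= next-marker maximizer)
          (apply max scores)
          (apply min scores))))))
```

When the maximizer is also the player making the top-level move, both functions should give the same score for each candidate index; only the bookkeeping differs.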