
Understanding why AIs get stumped by some games

by admin

In Nim, only a few moves are optimal for any given board state. If you fail to choose one of them, you effectively hand control to your opponent, who can then win by playing only optimal moves. Those optimal choices can, once again, be determined by computing a parity function.
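For readers who want to see it concretely, the parity function can be sketched in a few lines of Python. This version assumes the standard "normal play" rule, in which whoever takes the last object wins; the names `nim_sum` and `optimal_moves` are illustrative and are not taken from the researchers' code.

```python
from functools import reduce
from operator import xor


def nim_sum(rows):
    """XOR of all row sizes; zero means the player to move is losing."""
    return reduce(xor, rows, 0)


def optimal_moves(rows):
    """Return (row_index, new_size) pairs that leave the opponent a zero nim-sum."""
    total = nim_sum(rows)
    moves = []
    for i, size in enumerate(rows):
        target = size ^ total
        if target < size:  # a legal move must shrink the row
            moves.append((i, target))
    return moves


print(optimal_moves([3, 4, 5]))  # [(0, 1)]
```

With rows of 3, 4, and 5 objects, the only optimal move is to cut the first row down to a single object. Every other move leaves a nonzero XOR, which the opponent can exploit.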

That suggests the training regimen that succeeds for chess might be ill-suited to Nim. What’s surprising is how poorly it fared in practice. Zhou and Riis observed that on a five-row Nim board the AI learned quickly and was still improving after 500 training iterations. Adding a single extra row, though, drastically slowed the improvement. For a seven-row board, progress had essentially stalled by the time the system had played itself 500 times.

To make the issue clearer, the researchers replaced the module that recommended candidate moves with one that picked moves at random. On the seven-row board, the trained system and the randomized version performed indistinguishably over 500 training iterations. In other words, once the board was large enough, the system could not learn from game outcomes. The initial seven-row position contains three moves that all lead to a win, yet when the trained move evaluator was asked to assess every option, it rated them all as roughly equivalent.
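As a rough check on the "three winning moves" figure, the sketch below applies the XOR rule to a seven-row opening position. It assumes the rows hold 1, 3, 5, and so on up to 13 objects, a layout the text above does not specify, and it assumes normal play (taking the last object wins).

```python
from functools import reduce
from operator import xor

# Assumed layout: row k holds 2k + 1 objects (not stated in the article).
rows = [2 * k + 1 for k in range(7)]  # [1, 3, 5, 7, 9, 11, 13]
total = reduce(xor, rows, 0)

# A move wins if it shrinks one row so that the XOR of all rows becomes zero.
winning = [
    (i, size ^ total)
    for i, size in enumerate(rows)
    if (size ^ total) < size
]
print(total, winning)  # 15 [(4, 6), (5, 4), (6, 2)]
```

Under those assumptions, exactly three opening moves win, one in each of the three largest rows. A few integer operations are enough to separate them from every other option, which is the distinction the trained evaluator failed to make.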

The researchers conclude that Nim demands that players internalize the parity function to play well. And the training approach that is so effective for chess and Go fails to achieve that.

Not just Nim

One interpretation is that Nim (and, by extension, impartial games in general) is simply peculiar. But Zhou and Riis also found indications that similar issues can arise in chess AIs trained this way. They identified several “wrong” chess moves—moves that missed a mating sequence or ruined an endgame—that the AI’s board evaluator initially ranked highly. Only by exploring additional branches several moves ahead did the software avoid those mistakes.
