Volume 23, 2019
|Page(s)||176 - 216|
|Published online||01 May 2019|
On Monte-Carlo tree search for deterministic games with alternate moves and complete information★
Laboratoire de Probabilités et Modèles Aléatoires, UMR 7599, Université Paris Diderot, Case courrier 7012,
Paris Cedex 13, France.
2 Laboratoire de Probabilités et Modèles Aléatoires, UMR 7599, Université Pierre-et-Marie Curie, Case 188, 4 place Jussieu, 75252 Paris Cedex 5, France.
* Corresponding author: email@example.com
Accepted: 15 January 2018
We consider a deterministic game with alternate moves and complete information, of which the issue is always the victory of one of the two opponents. We assume that this game is the realization of a random model enjoying some independence properties. We consider algorithms in the spirit of Monte-Carlo Tree Search, to estimate at best the minimax value of a given position: it consists in simulating, successively, n well-chosen matches, starting from this position. We build an algorithm, which is optimal, step by step, in some sense: once the n first matches are simulated, the algorithm decides from the statistics furnished by the n first matches (and the a priori we have on the game) how to simulate the (n + 1)th match in such a way that the increase of information concerning the minimax value of the position under study is maximal. This algorithm is remarkably quick. We prove that our step by step optimal algorithm is not globally optimal and that it always converges in a finite number of steps, even if the a priori we have on the game is completely irrelevant. We finally test our algorithm, against MCTS, on Pearl’s game [Pearl, Artif. Intell. 14 (1980) 113–138] and, with a very simple and universal a priori, on the game Connect Four and some variants. The numerical results are rather disappointing. We however exhibit some situations in which our algorithm seems efficient.
Mathematics Subject Classification: 91A05 / 68T20 / 60J80
Key words: Monte Carlo tree search / deterministic games with alternate moves and complete information / minimax values / finite random trees / branching property
© The authors. Published by EDP Sciences, SMAI 2019
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.
Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.
Initial download of the metrics may take a while.