Exploration vs Exploitation of Chess Openings Learner with Metropolis-Hastings Algorithm
Kuupäev
2022
Autorid
Ajakirja pealkiri
Ajakirja ISSN
Köite pealkiri
Kirjastaja
Tartu Ülikool
Abstrakt
Opening lines are an integral part of ensuring a win, or at least a draw in chess. However,
there are no well known openings learners that are available for the public that respond
dynamically, thus allowing players to learn multiple openings at the same time. The
problem with implementing such an algorithm is the fact that most chess databases use a
notation that only expresses the movement of each step, not the board state as a whole.
It has been reported that the frequencies of the opening moves follow a power-law
with an exponent increase linearly with game depth. This thesis presents an innovative
approach that can compare the states in different opening lines and dynamically choose
which one to respond with. To be able to implement such a response, it is necessary
to determine how to pick which state to transition to. We have proposed a weighted
Metropolis Hastings algorithm to decide which move should be played by the computer.
The move is jointly decided as a weighted sum of the Metropolis Hastings output and
parameters set by the learner. The algorithm allows for the modification of the parameters
used, as well as the weight (importance) of them. Our method gives the flexibility of
exploration (from Metropolis-Hastings) vs exploitation (from the player parameters) in
the chess openings and adapts the game accordingly.
We have converted the opening lines represented in the string form to a list of
matrices. This matrix encodes the board state of the game, with each entry of the matrix
encoding the presence or the absence of a piece, the type of the piece, and the color. This
transformation allows us to find matching states in the opening lines, even at a different
move order. We have reported the distribution of chess board states under different
number of iterations, parameters and weights, depending on the preferred playing style.
We believe this is the first kind of study of chess openings from weighted stochastic
approach and we are not aware of similar study to the best of our knowledge. Our method
has the applicability for all chess learners and players, both new and experienced. In
future we want to extend our method to include more complex variations, such as playing
in positions with colors reversed, with one piece missing/displaced.
Kirjeldus
Märksõnad
Acceptance/Rejection Sampling, Monte-Carlo Methods, Markov Chain, Metropolis- Hastings, Chess, Chess Opening Moves