Komodo 12 and MCTS

Milos · Post by **Milos** » Mon May 14, 2018 3:20 pm

mjlef wrote: ↑Mon May 14, 2018 1:46 pmI think some people are confusing a neural network with Monte Carlo Tree Search. The neural network used in programs like AlphaZero Chess are used to estimate winning chances for a move, and for helping select which moves to expand first. MCTS does not depend on having a neural network, and the initial MCTS schemes had no such network at all. Komodo already knows a lot about chess. So we have come up with other ways to estimate initial winning chances and select moves to search/expand. MCTS is simply another way to search a game tree. The formula we use is not the standard UCT, but it shares characteristics, such as being more likely to expand lower scoring moves the higher the number of parent node visits is.

Larry and I have been thinking about MCTS for several years. And the basic scheme was one we had discussed and proposed years ago. We are finding out new things every day.

Nice effort.
Care to comment which backup operator you use?
Do you use multiPV search instead of policy network? What is the depth of basic alpha-beta you use instead of rollouts (instead of value network)? Is it adjustable as parameter and do you plan to offer it to users?

mjlef · Post by **mjlef** » Mon May 14, 2018 3:46 pm

Milos wrote: ↑Mon May 14, 2018 3:20 pm
mjlef wrote: ↑Mon May 14, 2018 1:46 pmI think some people are confusing a neural network with Monte Carlo Tree Search. The neural network used in programs like AlphaZero Chess are used to estimate winning chances for a move, and for helping select which moves to expand first. MCTS does not depend on having a neural network, and the initial MCTS schemes had no such network at all. Komodo already knows a lot about chess. So we have come up with other ways to estimate initial winning chances and select moves to search/expand. MCTS is simply another way to search a game tree. The formula we use is not the standard UCT, but it shares characteristics, such as being more likely to expand lower scoring moves the higher the number of parent node visits is.

Larry and I have been thinking about MCTS for several years. And the basic scheme was one we had discussed and proposed years ago. We are finding out new things every day.
Nice effort.
Care to comment which backup operator you use?
Do you use multiPV search instead of policy network? What is the depth of basic alpha-beta you use instead of rollouts (instead of value network)? Is it adjustable as parameter and do you plan to offer it to users?

We include a UCI parameter "MCTS Exploit" which allows users to control how much exploiting versus exploring the MCTS node selection process uses. If by backup operator, MCTS normally backups up new information (node expansion, win probability) in a sort of averaging scheme. Komodo does that with one exception. I have seen posts here suggesting what is effectively a min-max backup scheme. This often ignores the new information found and just backs up the best (or worst depending on which side you mean) up the tree. To me this effectively duplicates regular min-max/alpha-beta search, so is less likely to select different best moves. We do it the normal MCTS way (non-min-max) except for one small case in the tree. As for depths, and what we changed in the evaluation, I will have to be a bit nebulous, well, until someone figures out what we did. But it is neither a single depth, and it is not multi-PV. We continue to learn more each day, and even if I gave you specifics, they are likely to change more in the next month or two. I think the power of MCTS is in how it expands the tree, which is quite different from traditional schemes. So it is important to us to keep that.

CMCanavessi · Post by **CMCanavessi** » Mon May 14, 2018 4:17 pm

Nice release just in time for CCLS Elite League, and maybe some future Leela Gauntlets?

Milos · Post by **Milos** » Mon May 14, 2018 4:47 pm

mjlef wrote: ↑Mon May 14, 2018 3:46 pm
Milos wrote: ↑Mon May 14, 2018 3:20 pm Nice effort.
Care to comment which backup operator you use?
Do you use multiPV search instead of policy network? What is the depth of basic alpha-beta you use instead of rollouts (instead of value network)? Is it adjustable as parameter and do you plan to offer it to users?
We include a UCI parameter "MCTS Exploit" which allows users to control how much exploiting versus exploring the MCTS node selection process uses. If by backup operator, MCTS normally backups up new information (node expansion, win probability) in a sort of averaging scheme. Komodo does that with one exception. I have seen posts here suggesting what is effectively a min-max backup scheme. This often ignores the new information found and just backs up the best (or worst depending on which side you mean) up the tree. To me this effectively duplicates regular min-max/alpha-beta search, so is less likely to select different best moves. We do it the normal MCTS way (non-min-max) except for one small case in the tree. As for depths, and what we changed in the evaluation, I will have to be a bit nebulous, well, until someone figures out what we did. But it is neither a single depth, and it is not multi-PV. We continue to learn more each day, and even if I gave you specifics, they are likely to change more in the next month or two. I think the power of MCTS is in how it expands the tree, which is quite different from traditional schemes. So it is important to us to keep that.

Thanks Mark for sharing some of the info. I completely understand that some things are still new and best to be kept closed for the moment.

Daniel Shawul · Post by **Daniel Shawul** » Mon May 14, 2018 4:48 pm

We include a UCI parameter "MCTS Exploit" which allows users to control how much exploiting versus exploring the MCTS node selection process uses. If by backup operator, MCTS normally backups up new information (node expansion, win probability) in a sort of averaging scheme. Komodo does that with one exception. I have seen posts here suggesting what is effectively a min-max backup scheme. This often ignores the new information found and just backs up the best (or worst depending on which side you mean) up the tree. To me this effectively duplicates regular min-max/alpha-beta search, so is less likely to select different best moves. We do it the normal MCTS way (non-min-max) except for one small case in the tree. As for depths, and what we changed in the evaluation, I will have to be a bit nebulous, well, until someone figures out what we did. But it is neither a single depth, and it is not multi-PV. We continue to learn more each day, and even if I gave you specifics, they are likely to change more in the next month or two. I think the power of MCTS is in how it expands the tree, which is quite different from traditional schemes. So it is important to us to keep that.

Ok so you use regular MCTS with a different formula. To mix that with standard alpha-beta searcher is very hard because you have no alpha-beta bounds to feed it. For that reason scorpio MCTS uses depth=0 (a qsearch()) at the leaves. I tried other depth 2-3-4 etc and they all failed for that reason. Stockfish's MCTS also failed for similar reason when it tried to use depth=8 alpha-beta at the leaves. To effectively mix alpha-beta and MCTS, I had to use alpha-beta rollouts (with LMR+null-move). This rollout version of alpha-beta gives you bounds and you can mix it with standard alpha-beta very well. I highly doubt your mcts search will give you an engine anywhere close to 3000 elo. Your evaluation is definately better than scorpio's though, so that may help.

Larry and I have been thinking about MCTS for several years. And the basic scheme was one we had discussed and proposed years ago. We are finding out new things every day.

I highly doubt this and it sounds to me very much like a commercial stunt. I honestly do not remeber a single dicussion with you or larry on MCTS and yet you make it sound like it is a fruit of years of your research ...

mjlef · Post by **mjlef** » Mon May 14, 2018 5:13 pm

Daniel Shawul wrote: ↑Mon May 14, 2018 4:48 pm
We include a UCI parameter "MCTS Exploit" which allows users to control how much exploiting versus exploring the MCTS node selection process uses. If by backup operator, MCTS normally backups up new information (node expansion, win probability) in a sort of averaging scheme. Komodo does that with one exception. I have seen posts here suggesting what is effectively a min-max backup scheme. This often ignores the new information found and just backs up the best (or worst depending on which side you mean) up the tree. To me this effectively duplicates regular min-max/alpha-beta search, so is less likely to select different best moves. We do it the normal MCTS way (non-min-max) except for one small case in the tree. As for depths, and what we changed in the evaluation, I will have to be a bit nebulous, well, until someone figures out what we did. But it is neither a single depth, and it is not multi-PV. We continue to learn more each day, and even if I gave you specifics, they are likely to change more in the next month or two. I think the power of MCTS is in how it expands the tree, which is quite different from traditional schemes. So it is important to us to keep that.
Ok so you use regular MCTS with a different formula. To mix that with standard alpha-beta searcher is very hard because you have no alpha-beta bounds to feed it. For that reason scorpio MCTS uses depth=0 (a qsearch()) at the leaves. I tried other depth 2-3-4 etc and they all failed for that reason. Stockfish's MCTS also failed for similar reason when it tried to use depth=8 alpha-beta at the leaves. To effectively mix alpha-beta and MCTS, I had to use alpha-beta rollouts (with LMR+null-move). This rollout version of alpha-beta gives you bounds and you can mix it with standard alpha-beta very well. I highly doubt your mcts search will give you an engine anywhere close to 3000 elo. Your evaluation is definately better than scorpio's though, so that may help.
Larry and I have been thinking about MCTS for several years. And the basic scheme was one we had discussed and proposed years ago. We are finding out new things every day.
I highly doubt this and it sounds to me very much like a commercial stunt. I honestly do not remeber a single dicussion with you or larry on MCTS and yet you make it sound like it is a fruit of years of your research ...

I and Larry have read about MCTS for years, following progress of the Go programs (Don Dailey had a Go program using Monte Carlo). But I am not claiming anything like "years of research". Although the the basic scheme we are using has been discussed between us for several years, we have only actively worked on it for the last month or two. We tried several variants and tuned the initial method, which was only giving us elos in the mid 2000s. But we found ways to improve it, with some changes giving us 100+ elo gains. The Exploit/Explore ratio is particularly important.

As for search, you can still use aspiration on a search even if you do not have especially useful bounds. We also found tuning search parameters to be a big help. As for elo, although we have followed your recent postings, what we are doing is similar, but with a lot of differences from what you have posted, so it is not surprising we get different results. We found sticking with our initial idea to be pretty good. But we have more to learn.

All MCTS schemes seems to expand the tree a lot slower than the nps a typical chess engine gets. The neural network engines use an evaluation capable of predicting things like piece swapoffs. But there are other ways of getting that.

As for "commercial stunt", that is simply untrue. Before passing judgement, how about taking a look at how the program behaves? Releasing an MCTS mode that is hundreds of elo weaker than what a program gets with standard search is not exactly a headline grabber. But we find its moves/search interesting and useful.

Jesse Gersenson · Post by **Jesse Gersenson** » Mon May 14, 2018 5:29 pm

Daniel Shawul wrote: ↑Mon May 14, 2018 4:48 pm
Larry and I have been thinking about MCTS for several years. And the basic scheme was one we had discussed and proposed years ago. We are finding out new things every day.
I highly doubt this and it sounds to me very much like a commercial stunt. I honestly do not remeber a single dicussion with you or larry on MCTS and yet you make it sound like it is a fruit of years of your research ...

Here is a thread from 10 years ago. Note Larry's posts in this thread:
http://www.rybkaforum.net/cgi-bin/rybka ... l?tid=5030

peter · Post by **peter** » Mon May 14, 2018 5:36 pm

mjlef wrote: ↑Mon May 14, 2018 1:28 pm When a second UCI engine is running, it is a different process and so they are not sharing any memory. It could be possible to share hash table memory, but it is possible the two searching methods would disagree in the information stored which would "corrupt" the other search. MCTS uses win probabilities and grows its search tree in a vary different way from traditional alpha-beta search. It is an interesting idea. I will have to think about it.

Tried sharing the hash by storing it with one and reloading it with other one, seems to work well.

lkaufman · Post by **lkaufman** » Mon May 14, 2018 5:51 pm

Werewolf wrote: ↑Mon May 14, 2018 8:52 am So Larry, is this new version using the old Komodo hand-tuned evaluation function, but with MCTS "search"?

There's no NN in the pipeline..?

The eval for Komodo MCTS is different than the eval for normal Komodo, but they are related. No NN planned as of now, but that doesn't mean we won't try it. As I wrote in my article for New In Chess (about AlphaZero), I was not convinced that their success was due to NN, more likely due to MCTS and hardware. I'm doubtful that NN can do as well as five centuries of accumulated human knowledge about chess.
GPU is not useful for MCTS, only for NN. If we find a way to make good use of GPU, we will do so.

lkaufman · Post by **lkaufman** » Mon May 14, 2018 5:54 pm

Ozymandias wrote: ↑Mon May 14, 2018 11:10 am
lkaufman wrote: ↑Mon May 14, 2018 8:11 amwe have spent the last month creating a Komodo MCTS (Monte-Carlo Tree Search) option for it. It is really a second engine, and I understand that ChessBase will treat it as an independent engine when they release their version of Komodo 12 (soon)
I guess they plan to offer free upgrades, at least for this new engine? Otherwise they'll be selling an early prototype that will soon become outdated.

We plan to give ChessBase at least one free upgrade for Komodo MCTS. I expect they would offer it to their customers free, but it is their decision, not ours.

Komodo 12 and MCTS

Re: Komodo 12 and MCTS

Re: Komodo 12 and MCTS

Re: Komodo 12 and MCTS

Re: Komodo 12 and MCTS

Re: Komodo 12 and MCTS

Re: Komodo 12 and MCTS

Re: Komodo 12 and MCTS

Re: Komodo 12 and MCTS

Re: Komodo 12 and MCTS

Re: Komodo 12 and MCTS