Komodo 12 and MCTS

cma6 · Post by **cma6** » Wed May 16, 2018 11:02 pm

"when the memory for MCTS nodes fills up, it simply stops searching and returns the best move."

MJL:
Since RAM is so important for MCTS mode, would you recommend drastically increasing hash size (assuming extra RAM is available)? I use 4096 MB hash for normal Komodo. Since I have 32 GB of RAM, what about increasing hash dramatically for MCTS mode?

cma6 · Post by **cma6** » Wed May 16, 2018 11:07 pm

" Leela and AlphaZero Chess would also not "learn" when using it in game mode. The learning process is done separately as training session to train the neural network which they use. In any case, Komodo also would not "learn" in this way."

MJL:
Are you saying that when we play against LC0 in infinite mode for long periods of time, that that use of LC0 on the network has no effect on LC0's "stored knowledge"; so that in entering the same position the next day to play LC0, our previous session did not teach LC0 anything at all about the position?
Or to put it another way, LC0--other than in training mode run by the operators--does not "learn" from the games that LC0 clients play?

cma6 · Post by **cma6** » Wed May 16, 2018 11:13 pm

"We include a UCI parameter "MCTS Exploit" which allows users to control how much exploiting versus exploring the MCTS node selection process uses."

MJL:
What do these terms mean: "exploiting" & "exploring"?
In very long infinite analysis, how should one set "MCTS Exploit" (default 25)?

Same question for "Skill" (default 20)

What is Smart Syzygy and when to check that parameter?

You and LK have done a fantastic job to get the ball rolling with MCTS for CPUs.

Thanks, CMA

yanquis1972 · Post by **yanquis1972** » Wed May 16, 2018 11:40 pm

cma6 wrote: ↑Wed May 16, 2018 11:07 pm " Leela and AlphaZero Chess would also not "learn" when using it in game mode. The learning process is done separately as training session to train the neural network which they use. In any case, Komodo also would not "learn" in this way."

MJL:
Are you saying that when we play against LC0 in infinite mode for long periods of time, that that use of LC0 on the network has no effect on LC0's "stored knowledge"; so that in entering the same position the next day to play LC0, our previous session did not teach LC0 anything at all about the position?
Or to put it another way, LC0--other than in training mode run by the operators--does not "learn" from the games that LC0 clients play?

yes. the learning has to be done by running the client.exe so the games are recorded & there's oversight

mjlef · Post by **mjlef** » Thu May 17, 2018 12:38 am

cma6 wrote: ↑Wed May 16, 2018 11:02 pm "when the memory for MCTS nodes fills up, it simply stops searching and returns the best move."

MJL:
Since RAM is so important for MCTS mode, would you recommend drastically increasing hash size (assuming extra RAM is available)? I use 4096 MB hash for normal Komodo. Since I have 32 GB of RAM, what about increasing hash dramatically for MCTS mode?

In the present Komodo MCTS mode, a fixed amount of memory is used for the MCTS nodes. Hash is used for the regular mode, and for the short searches used in MCTS mode. I am likely to add another parameter, like Hash or Table Memory, which Komodo will use for the MCTS nodes/tree. The present allocation is fine for a couple of hours with one Thread, but I am sure people whose machines have enough memory would like to continue MCTS analysis for longer times.

Mark

mjlef · Post by **mjlef** » Thu May 17, 2018 1:25 am

cma6 wrote: ↑Wed May 16, 2018 11:13 pm "We include a UCI parameter "MCTS Exploit" which allows users to control how much exploiting versus exploring the MCTS node selection process uses."

MJL:
What do these terms mean: "exploiting" & "exploring"?
In very long infinite analysis, how should one set "MCTS Exploit" (default 25)?

Same question for "Skill" (default 20)

What is Smart Syzygy and when to check that parameter?

You and LK have done a fantastic job to get the ball rolling with MCTS for CPUs.

Thanks, CMA

We are still working on finding the best settings. At the time Komodo 12 was released, MCTS Exploit = 25 was the best value we had. Future updates will likely modify the formula, at which point we will set the default to the best value we have found.

Monte Carlo Tree Search is a scheme where the search tree is expanded by Exploiting what you now know, and also Exploring new parts of the tree. There is a formula with two parts, like this:

node score = win percentage of node + exploration term

You score all the child nodes from where you are in the tree, and select the highest scoring node to descend to the next level in the tree. When you hit a node with no children, you generate a child node(s) to expand the tree.
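The descend-then-expand step described above can be sketched roughly as follows. This is only an illustration and not Komodo's code: the `Node` class, the `score` callback, and the `legal_moves` callback are all hypothetical names made up for the example, and the scoring formula is deliberately left pluggable.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    """Hypothetical tree node; not Komodo's actual data structure."""
    move: Optional[str] = None   # move that led here (None at the root)
    visits: int = 0
    wins: float = 0.0
    children: List["Node"] = field(default_factory=list)


def descend_and_expand(root: Node,
                       score: Callable[[Node, Node], float],
                       legal_moves: Callable[[Node], List[str]]) -> List[Node]:
    """Walk from the root to a childless node, then give that node children.

    Returns the path that was followed, so a caller could update the
    visit and win statistics along it afterwards.
    """
    path = [root]
    node = root
    while node.children:
        # Score every child of the current node and follow the best one.
        node = max(node.children, key=lambda child: score(node, child))
        path.append(node)
    # Reached a node with no children: expand the tree here.
    node.children = [Node(move=m) for m in legal_moves(node)]
    return path
```

Passing the score function in as a parameter keeps the tree walk separate from the choice of formula, which is the part that is still being tuned.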

The exploration term above could be something like log(parent visits) / child visits. This would encourage child nodes with few visits to be selected more often. Lots of formulas are possible. The classic one is UCT, which you can read about online.

The bigger the exploration part of the equation, the more likely it is that a node with a lower win percentage will be selected. Exploitation will make the best win percentage lines longer. Exploration will make the tree wider, with more visits to parts of the tree with a lower win percentage. Both are needed to play well, but the balance between how much you explore lines less likely to be good versus lines that so far appear good is important for playing strength. Of course, sometimes a low scoring move that looks like a piece sacrifice really is good. We are still learning, but we spend a lot of time trying to come up with the best balance.
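To make the exploit/explore balance concrete, here is a minimal sketch using the classic UCT form of the score (win rate plus a bonus that shrinks as a child gets more visits). It is an assumption-laden toy, not Komodo's formula: the `exploration` weight is a hypothetical parameter and is not the same thing as the "MCTS Exploit" UCI option, whose scale and exact meaning are not described in this thread. The visit counts are made up.

```python
import math


def uct_score(wins, visits, parent_visits, exploration=1.4):
    """Win percentage plus an exploration bonus (classic UCT form)."""
    if visits == 0:
        return math.inf  # unvisited children are tried first
    win_rate = wins / visits
    bonus = exploration * math.sqrt(math.log(parent_visits) / visits)
    return win_rate + bonus


def pick_child(children, exploration):
    """children maps move name -> (wins, visits); returns the chosen move."""
    parent_visits = sum(visits for _, visits in children.values())
    return max(
        children,
        key=lambda m: uct_score(*children[m], parent_visits, exploration),
    )


# Made-up statistics: a well-visited move with a good win rate, and a
# rarely visited move that currently looks worse.
children = {"solid": (60, 100), "sacrifice": (2, 5)}

print(pick_child(children, exploration=0.1))  # small bonus: exploits "solid"
print(pick_child(children, exploration=1.4))  # large bonus: explores "sacrifice"
```

With the small weight the search keeps deepening the line that already scores well; with the large weight the rarely visited move gets another look even though its win percentage is lower.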

For Skill, always set it to 20 for best play. Lower Skill levels are meant to weaken Komodo to better match a user's current playing strength. A Skill level of 0 has quite a low elo, something around 200. Skill 19 should be roughly 2800. These estimates are for fast time controls. At longer time controls, you might need to raise the Skill level to get the same match with your strength that you had at faster time controls.

Smart Syzygy is meant for people using 6-piece Syzygy tablebases on machines with older hard disk (not SSD) drives. When on, it only probes the 6-piece endgame tables at whatever Syzygy Probe Depth you set. Something like Depth 10 works well. But 5-piece Syzygy will be probed at all positive depths. The 5-piece WDL tables are only about 500 MB, which is small enough for modern operating systems to cache them in system RAM, so access to them is very fast. The 6-piece tables are MUCH bigger and will not fit into RAM for most people. If you have an SSD, there is no need for Smart Syzygy, and you can probably safely probe even 6-piece Syzygy at all depths.
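As a rough illustration of the advice above, the settings could be sent to the engine as standard UCI `setoption` lines. The sketch below only builds the command strings. The option names "Smart Syzygy" and "Syzygy Probe Depth" are copied from this post, so check the Readme for the exact spelling the engine expects. The helper functions and the `has_ssd` flag are hypothetical.

```python
def setoption(name, value):
    """Format one UCI 'setoption' command line."""
    if isinstance(value, bool):
        # UCI check options take the literal words true / false.
        value = "true" if value else "false"
    return f"setoption name {name} value {value}"


def syzygy_commands(has_ssd, probe_depth=10):
    """Commands for 6-piece tablebase users, following the advice above.

    has_ssd is a hypothetical flag saying where the tablebases are stored.
    """
    if has_ssd:
        # Fast storage: no need to restrict 6-piece probing.
        return [setoption("Smart Syzygy", False)]
    # Slow hard disk: turn Smart Syzygy on and set the probe depth.
    return [
        setoption("Smart Syzygy", True),
        setoption("Syzygy Probe Depth", probe_depth),
    ]


for line in syzygy_commands(has_ssd=False):
    print(line)
```

Most GUIs expose these options in an engine settings dialog, so typing the commands by hand is rarely needed; the strings just show what the GUI sends on your behalf.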

The Readme file contains more information on all these topics. And you can email me with any other questions.

Mark

mjlef · Post by **mjlef** » Thu May 17, 2018 1:38 am

cma6 wrote: ↑Wed May 16, 2018 11:07 pm " Leela and AlphaZero Chess would also not "learn" when using it in game mode. The learning process is done separately as training session to train the neural network which they use. In any case, Komodo also would not "learn" in this way."

MJL:
Are you saying that when we play against LC0 in infinite mode for long periods of time, that that use of LC0 on the network has no effect on LC0's "stored knowledge"; so that in entering the same position the next day to play LC0, our previous session did not teach LC0 anything at all about the position?
Or to put it another way, LC0--other than in training mode run by the operators--does not "learn" from the games that LC0 clients play?

Yes, that is how I think the neural network was implemented. You can think of the neural network in play mode as a black box. You feed it the same information and it outputs the same thing as last time. Note I think Leela has some type of randomness, which I assume is used to explore more of the tree. But that is not learning.

Of course, they have new networks all the time with Leela. So over time, the networks should improve play. I have no way of knowing how much they can improve play. Logically, a neural network of a limited size should approach some maximum value, so an elo limit. But networks can be enlarged and retrained. I am anxious to see how much they are able to improve play over time.

lkaufman · Post by **lkaufman** » Thu May 17, 2018 5:39 am

cma6 wrote: ↑Wed May 16, 2018 11:13 pm "We include a UCI parameter "MCTS Exploit" which allows users to control how much exploiting versus exploring the MCTS node selection process uses."

MJL:
What do these terms mean: "exploiting" & "exploring"?
In very long infinite analysis, how should one set "MCTS Exploit" (default 25)?

Same question for "Skill" (default 20)

What is Smart Syzygy and when to check that parameter?

You and LK have done a fantastic job to get the ball rolling with MCTS for CPUs.

Thanks, CMA

Very long infinite analysis won't even work with MCTS on right now; it will just stop analyzing when it reaches the memory limits set by the program (not by the user). We don't have any basis for recommending a value other than the default for longer time controls or analysis; the MCTS mode is just too new. Don't bother setting Skill to anything other than the default 20 with MCTS on right now, as it will only have small and somewhat unpredictable effects. Most of the UCI options just don't work yet with MCTS on; again, it's just too new. Syzygy also doesn't work yet with MCTS on. Hash size, table memory, Ponder, and Threads (up to 3) should work ok with MCTS. We'll gradually make more UCI options work properly with MCTS over time. Support for more than 3 threads is the top priority.

Kaj Soderberg · Post by **Kaj Soderberg** » Thu May 17, 2018 7:13 am

mjlef wrote: ↑Wed May 16, 2018 10:13 pm
Kaj Soderberg wrote: ↑Wed May 16, 2018 9:08 pm
Jesse Gersenson wrote: ↑Wed May 16, 2018 8:45 pm
Isn't a bug?! Is it expected behavior?

Komodo's causing his computer to freeze, that's a serious bug.
Just a bit of clarification. The engine freezes and causes the GUI to stop giving information. The PC is still responding.
Cheers, Kaj
It seems odd to me that the GUI would stop working. Nothing we do in the engine should make the GUI stop. The GUI has its own clock. In MCTS mode, when the memory for MCTS nodes fills up, it simply stops searching and returns the best move. If you find a position where the GUI freezing is repeatable, please send me the position, the GUI you are using and your settings and I can try to reproduce on my machines.

Well, what happens is that the GUI (Shredder Mac) stops giving information on the analysis and the time used on it. Judging from the (later) discussion, that is probably just because the engine stopped. I can still give commands to the GUI, and also to the PC. So it seems the engine simply stops, and therefore stops giving information to the GUI.

main line · Post by **main line** » Thu May 17, 2018 8:15 pm

lkaufman wrote: ↑Mon May 14, 2018 8:11 am We have just released Komodo 12 this morning at komodochess.com. While the strength of the normal version is only up about five elo from Komodo 11.3.1 (roughly forty elo at blitz levels from Komodo 11), that is because we have spent the last month creating a Komodo MCTS (Monte-Carlo Tree Search) option for it. It is really a second engine, and I understand that ChessBase will treat it as an independent engine when they release their version of Komodo 12 (soon), but for this release you just select it by UCI option. It is by no means as strong as normal Komodo, but we are reasonably certain that it is the strongest MCTS engine available at this time for the pc. We estimate that its rating on the CCRL 40/40 scale will be about 3000 on one thread, 3050 on two, and 3070 on three or more. Currently it is not able to benefit from running on more than three threads, but we expect to raise or eliminate this limit in the next month or so and if so we will offer one update (together with other improvements to the MCTS version) free to all purchasers of Komodo 12. Of course subscribers get free updates for a full year. We intend to focus on the MCTS version for at least the next month or two, because it is still improving rapidly while progress has slowed with normal Komodo. Potential subscribers can expect more frequent updates than in the past year, but the updates may be primarily of benefit to MCTS users. Unlike "Leela", it does not need or utilize a GPU. Note also that for people with quadcore computers, the three thread limit is not much of a disadvantage, as many users prefer to keep one thread free for the GUI, the operating system, the internet etc.
There are several advantages to the MCTS option.
1. Multi-PV display does not weaken Komodo MCTS in contrast to normal Komodo. In fact, tests indicate that with all legal moves displayed by both versions, the elo gap disappears (within margin of error).
2. Although tactical strength is lower with MCTS, positional play and judgment may well be better in many positions. Judging between good and bad gambits seems to be better with MCTS, although that is just my subjective opinion.
3. Evals don't fluctuate so suddenly or wildly.
4. Eventually it should be able to benefit from many cores more than normal Komodo, although that is obviously not true yet.
5. It will be much easier to improve. It is still in its infancy, and practically everything it does has yet to be tuned. It is also possible that fairly radical changes will prove to be very beneficial, as we have only tried a couple approaches so far.

I have no idea whether Komodo MCTS can catch or surpass normal Komodo, but we intend to try!

In the next released version, I need Komodo to support at least 96 threads. Thanks
