I just make each legal move on the board from startpos.
chrisw wrote: ↑Tue May 21, 2019 2:26 pm
That sounds about right.
Ferdy wrote: ↑Tue May 21, 2019 2:02 pm
Without revising the code, I think your idea of playing the candidate move on the board, then searching to a fixed time, node count, or depth, getting the score, and inverting its sign, is a good one.
Laskos wrote: ↑Tue May 21, 2019 10:13 am
Thanks, I think I have to content myself with running the positions serially, maybe using Python. I had bad experiences spreading the GPU across several parallel instances of Lc0; they are GPU hungry and seem unstable or slow down too much when run in parallel. Guenther's UCI option would be useful if it becomes available.
chrisw wrote: ↑Tue May 21, 2019 9:13 am
Assuming you don’t want to actually attack the source code and recompile ...
Laskos wrote: ↑Tue May 21, 2019 8:39 am
I already want to rewrite the basis of opening theory using Lc0, being a 1700 Elo player.
Now, a bit more seriously, I am confident that late nets of the T30 and T40 runs are VERY strong in the openings (provably MUCH stronger than any standard AB engine). So I let Lc0 21.2rc1 with a good late net, ID42361, analyze the starting position for as long as my RAM allows (in fact, in the first run my machine crashed). My OC-ed RTX 2070 fills all available RAM on my 16GB machine quickly (~15 min). I set MultiPV = 20 for Lc0. Then I compared the output (after some 30 million nodes searched) to the statistics for human players rated above FIDE Elo 2200 from the "Chess Tempo" site. The comparison is here:
Many curious and interesting things can be said about these numbers, but for now I want to focus on, say, the last 12 of the possible opening moves. The statistics for strong human players become thin there... but so does the MCTS tree search, even in MultiPV mode, with both Lc0 and Komodo MCTS. The search tree is shallow and narrow on these moves and they have few visits. With AB engines and the usual MultiPV, I don't have such worries about the weaker moves: more or less, all are explored to a similar degree. So, with MCTS, if I really want to see how, say, the a4 move performs in the opening, I do have to play it, right? Is there a way to have a "regular" MultiPV (similar to that of AB engines) with MCTS engines?
Modify the PUCT to give more width, less depth, but that will basically destroy the effectiveness of the search.
Play each move and then search (you already suggested that).
Write a little program in Python to launch several instances of LC0, spreading the available RAM and GPU between them, running in parallel, each one handling one root move.
Write a little Python program that serially runs each root move.
Probably what you would really want, is a recompiled LC0 source that allowed you to remove moves from the root move list. Then the search could concentrate on what remained. That would be relatively easy for Crem or someone to do, you would just pass in a command with a list of skip moves (or a list of do moves).
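A rough sketch of the "serially run each root move" option: it only builds the UCI command lines that such a Python script would send to the engine, one root move at a time. The names `ROOT_MOVES`, `commands_for_move` and `serial_plan` are made up for illustration, and the node budget is an arbitrary placeholder; actually talking to the engine process is left out.

```python
# The 20 legal first moves from the starting position, in UCI notation.
ROOT_MOVES = [
    "a2a3", "a2a4", "b2b3", "b2b4", "c2c3", "c2c4", "d2d3", "d2d4",
    "e2e3", "e2e4", "f2f3", "f2f4", "g2g3", "g2g4", "h2h3", "h2h4",
    "b1a3", "b1c3", "g1f3", "g1h3",
]


def commands_for_move(move, nodes):
    """UCI commands that play `move` from startpos and search the resulting position."""
    return [
        f"position startpos moves {move}",
        f"go nodes {nodes}",
    ]


def serial_plan(moves=ROOT_MOVES, nodes=100000):
    """One (move, commands) pair per root move, to be sent one after another."""
    return [(move, commands_for_move(move, nodes)) for move in moves]


if __name__ == "__main__":
    for move, commands in serial_plan(nodes=1000):
        print(move, commands)
```

Each search then gets the whole GPU and RAM to itself, which avoids the instability seen with parallel instances at the cost of wall-clock time.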
What PUCT and/or other parameters would give equal distribution of visited moves (equal distribution policy head, right?)? I am not interested in the quality of the chosen best move, but in the quality of several or all possible moves.
An automated extension of that idea would be, via a little bit of Python:
Carry out a normal search for N nodes.
Get the searched moves and each visit count and each win rate.
Then, for each move except the first (most visited), play it on the board and search for X nodes, where X is the difference between visits[0] and visits[i]. Get the score and invert it as above.
Then score(i) = mean score of the two scores weighted by their visits.
Does that sound right?
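The bookkeeping for the steps above could be sketched roughly like this. It assumes scores are win percentages on a 0..100 scale and that the follow-up score is reported from the opponent's side, so it is inverted before averaging; `extra_nodes` and `combined_score` are hypothetical helper names, not anything from Lc0.

```python
def extra_nodes(visits):
    """Nodes to add per move so that each catches up with the most visited move."""
    top = max(visits)
    return [top - v for v in visits]


def combined_score(root_score, root_visits, followup_score, followup_nodes):
    """Visit-weighted mean of the root estimate and the inverted follow-up estimate.

    root_score:     win percent (0..100) for the move, from the normal root search.
    followup_score: win percent (0..100) from the opponent's point of view,
                    obtained after playing the move on the board.
    """
    total = root_visits + followup_nodes
    if total == 0:
        raise ValueError("no visits to weight by")
    inverted = 100.0 - followup_score  # back to the mover's point of view
    return (root_score * root_visits + inverted * followup_nodes) / total


if __name__ == "__main__":
    visits = [1000, 400, 100]           # made-up visit counts, most visited first
    print(extra_nodes(visits))          # nodes to spend on each follow-up search
    print(combined_score(50.0, 400, 52.0, 600))
```

For the most visited move the follow-up budget is zero, so its combined score is just the root score.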
Set Lc0 to return win_percent as this is the appropriate target output instead of cp.
setoption name scoretype value win_percentage
Move e2e4 on the board.
(Thread-7 ) <UciProtocol (pid=3140)>: << position startpos moves e2e4
(Thread-7 ) <UciProtocol (pid=3140)>: << go movetime 1000
(Thread-7 ) <UciProtocol (pid=3140)>: >> info depth 5 seldepth 10 time 599 nodes 170 score cp 4877 hashfull 30 nps 283 tbhits 0 pv c7c5
(Thread-7 ) <UciProtocol (pid=3140)>: >> bestmove c7c5 ponder b1c3
win_percentage of e2e4 is (10000 - 4877)/100 = 51.23
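A small sketch of that last step in Python: pull the score out of an `info` line like the one logged above and invert it to get the win percentage of the move that was played. The function names are invented here, and the 0..10000 scale is taken from the log, where the win_percentage score type reports 4877 for 48.77%.

```python
import re

# Matches the integer that follows "score cp" in a UCI info line.
_SCORE_RE = re.compile(r"\bscore cp (-?\d+)\b")


def reply_score(info_line):
    """Return the integer after 'score cp' in a UCI info line, or None if absent."""
    match = _SCORE_RE.search(info_line)
    return int(match.group(1)) if match else None


def win_percent_for_mover(score):
    """Invert a 0..10000 score of the side to move into the mover's win percent."""
    return (10000 - score) / 100


if __name__ == "__main__":
    line = ("info depth 5 seldepth 10 time 599 nodes 170 score cp 4877 "
            "hashfull 30 nps 283 tbhits 0 pv c7c5")
    print(win_percent_for_mover(reply_score(line)))
```

In practice one would keep only the last `info` line seen before `bestmove`, since earlier lines come from shallower searches.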
Result would look like this.
Engine: Lc0 v0.21.2-rc1 w11258-80x7-se blas
Time(s)/move: 1.0

No.  Move  WinPercent
01.  c2c4  51.62
02.  d2d4  51.59
03.  e2e4  51.22
04.  g2g3  51.17
05.  g1f3  51.10
06.  e2e3  51.03
07.  b1c3  50.16
08.  c2c3  50.11
09.  a2a3  48.99
10.  d2d3  48.71
11.  b2b3  48.66
12.  h2h3  48.58
13.  a2a4  48.40
14.  f2f4  46.92
15.  b2b4  45.86
16.  h2h4  45.69
17.  b1a3  44.24
18.  g1h3  42.88
19.  f2f3  42.76
20.  g2g4  38.82
Another approach is to get the Q value via:
setoption name scoretype value Q
Then get the cp value and win_percent from Q via:
if (score_type == "centipawn") uci_info.score = 111.714640912 * tan(1.5620688421 * edge.GetQ(default_q));
if (score_type == "win_percentage") uci_info.score = edge.GetQ(default_q) * 5000 + 5000;
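For post-processing in a script, the two conversions above can be mirrored in Python. This is just a transcription of the formulas with the same constants; the function names are mine, and Q is assumed to lie in the range -1..1. Note that the win_percentage value is on a 0..10000 scale, matching the `score cp 4877` output shown earlier.

```python
import math


def q_to_centipawn(q):
    """Centipawn score from Q, using the same constants as the formula above."""
    return 111.714640912 * math.tan(1.5620688421 * q)


def q_to_win_percentage(q):
    """Win percentage on a 0..10000 scale from Q in -1..1."""
    return q * 5000 + 5000


if __name__ == "__main__":
    for q in (-0.5, 0.0, 0.0246, 0.5):
        print(q, round(q_to_centipawn(q), 1), round(q_to_win_percentage(q), 1))
```

Dividing the win_percentage value by 100 gives the percent figures used in the table.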