Lc0 v0.17.0-rc1 egt test

Ferdy · Post by **Ferdy** » Tue Aug 21, 2018 1:14 pm

Ran Lc0 on endgame test positions, 4 and 5 men. There is improvement when using syzygy, but overall the score still seems too low, considering that the engine is already using egt and each test position has only 1 best move: if it is a draw, all other moves lose, and if it is a win, all other moves would just draw or lose. But perhaps it needs more time, as I am not using a GPU and am using only 5s per position.

Conditions:
CPU Intel i7-2600K 3.4GHz, 12GB RAM
Engine Hash: 256MB, Threads: 1

The test set is from Sergei.S.Markoff:Tablebase.Test, but I removed the 6-men positions and those whose result is a cursed win. What remains are positions whose best move leads to either a win or a draw.

Sample test pos.

1B1b4/7K/1p6/1k6/8/8/8/8 w - - bm Ba7; id "Sergei.S.Markoff:Tablebase.Test.001"; c0 "draw"; c1 "5-men";
1b6/7k/8/8/6K1/8/8/R7 w - - bm Kf5; id "Sergei.S.Markoff:Tablebase.Test.002"; c0 "win"; c1 "4-men";
1k2q3/8/4N3/8/2Q5/3K4/8/8 w - - bm Qc7+; id "Sergei.S.Markoff:Tablebase.Test.003"; c0 "win"; c1 "5-men";
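For anyone who wants to script the test, here is a minimal sketch of reading such EPD records in Python. The helper names `parse_epd` and `count_men` are made up for illustration and are not part of any engine or test tool; the sketch only assumes the layout shown in the sample lines above (four position fields followed by `opcode operand;` pairs).

```python
import re

def parse_epd(line):
    """Split one EPD record into its four position fields and its operations."""
    fields = line.strip().split(None, 4)
    if len(fields) < 4:
        raise ValueError("an EPD record needs 4 position fields")
    ops_text = fields[4] if len(fields) > 4 else ""
    ops = {}
    # Each operation looks like: opcode operand;  (the operand may be quoted)
    for m in re.finditer(r'(\w+)\s+("[^"]*"|[^;]*);', ops_text):
        ops[m.group(1)] = m.group(2).strip().strip('"')
    return {"fen4": " ".join(fields[:4]), "ops": ops}

def count_men(fen4):
    """Count the pieces on the board: every letter in the first field is a piece."""
    board = fen4.split()[0]
    return sum(ch.isalpha() for ch in board)

sample = ('1b6/7k/8/8/6K1/8/8/R7 w - - bm Kf5; '
          'id "Sergei.S.Markoff:Tablebase.Test.002"; c0 "win"; c1 "4-men";')
rec = parse_epd(sample)
print(rec["ops"]["bm"], rec["ops"]["c0"], count_men(rec["fen4"]))
```

Counting the letters in the board field is a quick way to confirm that a position really is within 4 or 5 men before running it.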

Results

You may try it on your system and let's see how it fares.
Download the test file at
https://drive.google.com/file/d/13TQ0qY ... sp=sharing

Joerg Oster · Post by **Joerg Oster** » Tue Aug 21, 2018 1:52 pm

Ferdy wrote: ↑Tue Aug 21, 2018 1:14 pm Ran Lc0 on endgame test positions, 4 and 5 men. There is improvement when using syzygy, but overall the score still seems too low, considering that the engine is already using egt and each test position has only 1 best move: if it is a draw, all other moves lose, and if it is a win, all other moves would just draw or lose. But perhaps it needs more time, as I am not using a GPU and am using only 5s per position.

Conditions:
CPU Intel i7-2600K 3.4GHz, 12GB RAM
Engine Hash: 256MB, Threads: 1

The test set is from Sergei.S.Markoff:Tablebase.Test, but I removed the 6-men positions and those whose result is a cursed win. What remains are positions whose best move leads to either a win or a draw.

Sample test pos.
1B1b4/7K/1p6/1k6/8/8/8/8 w - - bm Ba7; id "Sergei.S.Markoff:Tablebase.Test.001"; c0 "draw"; c1 "5-men";
1b6/7k/8/8/6K1/8/8/R7 w - - bm Kf5; id "Sergei.S.Markoff:Tablebase.Test.002"; c0 "win"; c1 "4-men";
1k2q3/8/4N3/8/2Q5/3K4/8/8 w - - bm Qc7+; id "Sergei.S.Markoff:Tablebase.Test.003"; c0 "win"; c1 "5-men";
Results

You may try it on your system and let's see how it fares.
Download the test file at
https://drive.google.com/file/d/13TQ0qY ... sp=sharing

Afaik, Lc0 doesn't yet probe DTZ tables to sort or rank the root moves.
This might still lead to suboptimal play in many TB positions.
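To illustrate what ranking root moves with DTZ information could look like, here is a rough sketch. It is not Lc0's code and does not use the Syzygy probing API; `rank_root_moves`, the move labels and the numbers are all invented, and it assumes a simplified convention (WDL from the mover's point of view, DTZ as a non-negative ply count) that ignores cursed wins and the 50-move counter.

```python
def rank_root_moves(probes):
    """Order root moves using hypothetical tablebase probe results.

    probes maps move -> (wdl, dtz), where wdl is 2 (win), 0 (draw) or
    -2 (loss) for the side making the move, and dtz is the number of
    plies until the next zeroing move (capture or pawn move).
    """
    def key(item):
        _move, (wdl, dtz) = item
        # Better WDL first. Among wins prefer the fastest progress (small DTZ),
        # among losses prefer the slowest (large DTZ).
        tiebreak = dtz if wdl > 0 else -dtz
        return (-wdl, tiebreak)

    return [move for move, _ in sorted(probes.items(), key=key)]

# Made-up probe values for five abstract root moves.
probes = {"a": (2, 21), "b": (2, 9), "c": (0, 0), "d": (-2, 4), "e": (-2, 10)}
print(rank_root_moves(probes))
```

Without such a ranking, a search that only sees "win" at several root moves has no tablebase-based reason to prefer the one that actually makes progress.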

zenpawn · Post by **zenpawn** » Tue Aug 21, 2018 10:13 pm

Ferdy wrote: ↑Tue Aug 21, 2018 1:14 pm The test set is from Sergei.S.Markoff:Tablebase.Test

Could you please post a link to the original test suite? Thanks.

Ferdy · Post by **Ferdy** » Tue Aug 21, 2018 11:34 pm

zenpawn wrote: ↑Tue Aug 21, 2018 10:13 pm
Ferdy wrote: ↑Tue Aug 21, 2018 1:14 pm The test set is from Sergei.S.Markoff:Tablebase.Test
Could you please post a link to the original test suite? Thanks.

Try the link posted by Dann.
http://talkchess.com/forum3/viewtopic.p ... ilit=Ssmtt

George Tsavdaris · Post by **George Tsavdaris** » Wed Aug 22, 2018 1:01 pm

Ferdy wrote: ↑Tue Aug 21, 2018 1:14 pm Ran Lc0 on endgame test positions, 4 and 5 men. There is improvement when using syzygy, but overall the score still seems too low, considering that the engine is already using egt and each test position has only 1 best move: if it is a draw, all other moves lose, and if it is a win, all other moves would just draw or lose. But perhaps it needs more time, as I am not using a GPU and am using only 5s per position.

Conditions:
CPU Intel i7-2600K 3.4GHz, 12GB RAM
Engine Hash: 256MB, Threads: 1

The test set is from Sergei.S.Markoff:Tablebase.Test, but I removed the 6-men positions and those whose result is a cursed win. What remains are positions whose best move leads to either a win or a draw.

Results

You may try it on your system and let's see how it fares.

Hi, how many nodes per second were you getting? Min, max and on average?

I ran a TB test similar to yours, but with a GPU instead of a CPU and with a slightly modified set of test positions, since you had 2-3 with multiple solutions and some dubious ones that I didn't like and removed, and I also added 10 more to have 400 TB positions, mostly 5-men.
All 400 positions have a single solution/move only!

v17 Lc0 10815 with 3,4,5,6-men TBs on a 1070 Ti for 3 seconds per position got 350/400, i.e. 87.5%.
v17 Lc0 10815 WITHOUT TBs on a 1070 Ti for 3 seconds per position got 291/400, i.e. 72.8%.

While you, on a slightly different set of 394 positions, had only:
185/394 for Leela with TBs with the 10780 net and
145/394 for Leela with NO TBs with the 10780 net
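As a quick check of the arithmetic, the four scores quoted above can be turned into percentages with a few lines of Python. The labels and the helper `pct` are just for illustration; the counts are the ones given in this post.

```python
def pct(solved, total):
    """Solved positions as a percentage of the test set."""
    return 100.0 * solved / total

# (solved, total) pairs as quoted in the post.
runs = {
    "GPU, with TBs": (350, 400),
    "GPU, no TBs": (291, 400),
    "CPU, with TBs": (185, 394),
    "CPU, no TBs": (145, 394),
}

for label, (solved, total) in runs.items():
    print(f"{label}: {solved}/{total} = {pct(solved, total):.1f}%")

# Gain from tablebases in percentage points, per setup.
gpu_gain = pct(350, 400) - pct(291, 400)
cpu_gain = pct(185, 394) - pct(145, 394)
print(f"TB gain: GPU {gpu_gain:.1f} points, CPU {cpu_gain:.1f} points")
```

The two setups used different nets, position sets and time limits, so the percentages are only roughly comparable.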

Obviously the huge difference from my result was not that you used a different net, nor that you used a slightly different set (around 15 positions difference), but the fact that you used a CPU instead of the GPU I used.

I got from 4000 N/s to 90000 N/s, with an average of 10000 to 25000 N/s, and TB hits from 30 to 8000 per position, with an average of something like 1000-2000.

jorose · Post by **jorose** » Wed Aug 22, 2018 1:36 pm

I seem to be misunderstanding something.

How come all the engines with egt aren't getting all the problems right? These are all tablebase positions so they should essentially just be returning endgame table-base entries directly?

jkiliani · Post by **jkiliani** » Wed Aug 22, 2018 1:54 pm

jorose wrote: ↑Wed Aug 22, 2018 1:36 pm I seem to be misunderstanding something.

How come all the engines with egt aren't getting all the problems right? These are all tablebase positions so they should essentially just be returning endgame table-base entries directly?

The tested positions themselves aren't tablebase positions, i.e. they include more than 6 pieces. They just simplify to tablebase positions in the search tree, but the engines still have to do the search leading to those tablebase hits.

George Tsavdaris · Post by **George Tsavdaris** » Wed Aug 22, 2018 2:03 pm

jkiliani wrote: ↑Wed Aug 22, 2018 1:54 pm
jorose wrote: ↑Wed Aug 22, 2018 1:36 pm I seem to be misunderstanding something.

How come all the engines with egt aren't getting all the problems right? These are all tablebase positions so they should essentially just be returning endgame table-base entries directly?
The tested positions themselves aren't tablebase positions, i.e. they include more than 6 pieces. They just simplify to tablebase positions in the search tree, but the engines still have to do the search leading to those tablebase hits.

For the current test positions you are wrong, since they are mainly 5-men TB positions (plus 1-2 4-men ones), so Ferdy's result for SF is indeed strange.

Guenther · Post by **Guenther** » Wed Aug 22, 2018 2:05 pm

jkiliani wrote: ↑Wed Aug 22, 2018 1:54 pm
jorose wrote: ↑Wed Aug 22, 2018 1:36 pm I seem to be misunderstanding something.

How come all the engines with egt aren't getting all the problems right? These are all tablebase positions so they should essentially just be returning endgame table-base entries directly?
The tested positions themselves aren't tablebase positions, i.e. they include more than 6 pieces. They just simplify to tablebase positions in the search tree, but the engines still have to do the search leading to those tablebase hits.

According to the epd file and Ferdy's post, he only used 5- and 4-men positions.

Ferdy · Post by **Ferdy** » Thu Aug 23, 2018 7:34 am

George Tsavdaris wrote: ↑Wed Aug 22, 2018 1:01 pm
Ferdy wrote: ↑Tue Aug 21, 2018 1:14 pm Ran Lc0 on endgame test positions, 4 and 5 men. There is improvement when using syzygy, but overall the score still seems too low, considering that the engine is already using egt and each test position has only 1 best move: if it is a draw, all other moves lose, and if it is a win, all other moves would just draw or lose. But perhaps it needs more time, as I am not using a GPU and am using only 5s per position.

Conditions:
CPU Intel i7-2600K 3.4GHz, 12GB RAM
Engine Hash: 256MB, Threads: 1

The test set is from Sergei.S.Markoff:Tablebase.Test, but I removed the 6-men positions and those whose result is a cursed win. What remains are positions whose best move leads to either a win or a draw.

Results

You may try it on your system and let's see how it fares.
Hi, how many nodes per second were you getting? Min, max and on average?

       _
|   _ | |
|_ |_ |_| v0.17.0-rc2 built Aug 21 2018
Found configuration file: C:\chess\engines\nobook\LCZERO\lc0_0.17\rc2/lc0.config
>> uci
id name The Lc0 chess engine. v0.17.0-rc2
[...]
uciok
>> ucinewgame
Loading weights file from: weights_10968.txt.gz
Creating backend [blas]...
BLAS, maximum batch size set to 256.
BLAS vendor: OpenBlas.
OpenBlas [DYNAMIC_ARCH NO_AFFINITY Sandybridge].
OpenBlas found 8 Sandybridge core(s).
OpenBLAS using 1 core(s) for this backend.
BLAS max batch size is 256.
>> isready
readyok
>> position startpos
>> go nodes 10000
info depth 1 seldepth 2 time 893 nodes 5 score cp 21 hashfull 0 nps 5 tbhits 0 pv d2d4 g8f6
[...]
info depth 6 seldepth 19 time 315282 nodes 10070 score cp 24 hashfull 39 nps 31 tbhits 0 pv d2d4 g8f6 c2c4 [...]
bestmove d2d4 ponder g8f6

So that is 31 nps using 1 thread, without syzygy, from the start position with the go nodes 10000 command.
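The 31 nps figure can be cross-checked from the nodes and time fields of the last info line. Below is a small sketch that does this; `parse_info` is a hypothetical helper, the sample line is the one from the log above with the pv part left out, and the assumption that the engine truncates rather than rounds the value is mine.

```python
def parse_info(line):
    """Extract the integer fields of a UCI 'info' line (everything before 'pv')."""
    tokens = line.split()
    if "pv" in tokens:
        tokens = tokens[:tokens.index("pv")]
    fields = {}
    for key in ("depth", "seldepth", "time", "nodes", "hashfull", "nps", "tbhits"):
        if key in tokens:
            fields[key] = int(tokens[tokens.index(key) + 1])
    return fields

line = ("info depth 6 seldepth 19 time 315282 nodes 10070 "
        "score cp 24 hashfull 39 nps 31 tbhits 0")
info = parse_info(line)

# UCI reports time in milliseconds, so nps = nodes * 1000 / time.
nps = info["nodes"] * 1000 // info["time"]
print(nps, info["nps"])
```

At about 31 nps, a 5 s per position limit allows only on the order of 150 nodes per position, which fits the suggestion that the CPU run simply searches too few nodes.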

I ran a TB test similar to yours, but with a GPU instead of a CPU and with a slightly modified set of test positions, since you had 2-3 with multiple solutions

You are right, I re-checked the test set and there are 3 positions that have more than 1 winning move. I have now removed them and uploaded a new test file, ssm_egt_4_5_men_1_bm.epd

and some dubious ones that I didn't like and removed

Could you specify which positions these are?

and I also added 10 more to have 400 TB positions, mostly 5-men.
All 400 positions have a single solution/move only!

v17 Lc0 10815 with 3,4,5,6-men TBs on a 1070 Ti for 3 seconds per position got 350/400, i.e. 87.5%.
v17 Lc0 10815 WITHOUT TBs on a 1070 Ti for 3 seconds per position got 291/400, i.e. 72.8%.

Is this Lc0 v0.17.0 rc1 or rc2?

While you, on a slightly different set of 394 positions, had only:
185/394 for Leela with TBs with the 10780 net and
145/394 for Leela with NO TBs with the 10780 net

Obviously the huge difference from my result was not that you used a different net, nor that you used a slightly different set (around 15 positions difference), but the fact that you used a CPU instead of the GPU I used.

I got from 4000 N/s to 90000 N/s, with an average of 10000 to 25000 N/s, and TB hits from 30 to 8000 per position, with an average of something like 1000-2000.

So it needs more nodes to find more best moves. In your egt test of 350/400, 50 were still missed, so perhaps there is still something to be improved in this area.

I ran another test with that new test file, using 1 thread, Lc0 rc2 and a new net id.
