After so many games in tournaments, I think that test solving is the more accurate and impartial method to know the real strength of an engine.
Vinvin wrote: ↑Wed Dec 15, 2021 1:49 am
Added Crystal 3.1
+ one run with Crystal 3.2
+ some corrections in the sheet. Some formulas at the far right were wrong.
Results with the new computer.
Conditions :
- CPU : 5950X running around 3.9 GHz
- 16 threads used (to avoid hyper-threading)
- 5 minutes per position
- 6 man Syzygy + KRPPKRP
- 64 GB HashTable (still more than 50 GB for Syzygy cache)
- Several runs are needed because multithreaded search is not deterministic. I fill in the sheet and use formulas to get average numbers.
- AVX2 version used everywhere when possible
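The averaging over several runs mentioned in the conditions above could be sketched as follows. This is only an illustration in Python, not the formulas from the sheet: the function names are hypothetical, and charging an unsolved position the full 300-second limit is an assumption about what "time with penalty" means.

```python
from statistics import mean

TIME_LIMIT_S = 300  # 5 minutes per position, from the conditions above


def summarize_run(solve_times, time_limit=TIME_LIMIT_S):
    """Return (positions found, average time) for one run.

    solve_times holds one entry per position: seconds needed to find the
    solution, or None if it was not found in time.
    Assumption: an unsolved position is charged the full time limit.
    """
    found = sum(t is not None for t in solve_times)
    avg_time = mean(time_limit if t is None else t for t in solve_times)
    return found, avg_time


def average_over_runs(runs):
    """Average the per-run summaries of several runs of the same engine."""
    summaries = [summarize_run(run) for run in runs]
    return mean(f for f, _ in summaries), mean(t for _, t in summaries)


# Made-up timings for two runs over four positions (not real results).
example_runs = [
    [10, None, 50, 20],
    [12, 250, None, 18],
]
found_avg, time_avg = average_over_runs(example_runs)
print(f"{found_avg:.1f} found on average, {time_avg:.0f} s average time")
```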
List sorted from best to worst :
Name / #found-average / (Time with penalty in seconds) / number of runs
Blue Marlin 14.4a : 101,2 (84) on 11 runs
SugaR AI ICCF 1.90 : 101,0 (82) on 6 runs
ShashChess18.2 : 100,5 (92) on 11 runs
ShashChess20.1 : 98,7 (99) on 10 runs
ShashChess20 : 98,4 (101) on 10 runs
Crystal 3.2 : 98,0 (99) on 6 runs
Blue Marlin 14.5 : 98,0 (102) on 13 runs
ShashChess17.1 : 98,0 (111) on 6 runs
SugaR AI ICCF 2.40 : 96,8 (103) on 14 runs
Crystal 3.1 : 96,4 (108) on 8 runs
Honey-v14 : 96,3 (116) on 8 runs
SugaR AI ICCF 2.50 : 93,9 (117) on 10 runs
Stockfish_21.11.23.21 : 92,4 (141) on 5 runs
Stockfish_21.08.05.16 : 91,2 (149) on 5 runs
Sheet with all results here (you can download the file if it doesn't display very well in the browser) : https://www.dropbox.com/s/ckc1fb1gfjm1h ... 5.ods?dl=0
Still to do : remove some doubtful positions posted in this thread.
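For anyone who wants to re-sort or extend the list above without the spreadsheet, here is a small Python sketch that parses lines in the "Name : found (time) on N runs" layout. The regular expression and function names are hypothetical. Breaking ties by the lower time is an assumption, although it matches the order of the three entries tied at 98,0.

```python
import re

LINE_RE = re.compile(
    r"^(?P<name>.+?) : (?P<found>\d+,\d+) \((?P<time>\d+)\) on (?P<runs>\d+) runs$"
)


def parse_result(line):
    """Parse one 'Name : found (time) on N runs' line into a dict."""
    m = LINE_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized result line: {line!r}")
    return {
        "name": m["name"],
        # The list uses a decimal comma, so convert it before calling float().
        "found": float(m["found"].replace(",", ".")),
        "time": int(m["time"]),
        "runs": int(m["runs"]),
    }


def sort_results(lines):
    """Best first: more positions found, then less time."""
    results = [parse_result(line) for line in lines]
    return sorted(results, key=lambda r: (-r["found"], r["time"]))


# Three lines copied from the list above, deliberately out of order.
sample = [
    "ShashChess17.1 : 98,0 (111) on 6 runs",
    "Blue Marlin 14.4a : 101,2 (84) on 11 runs",
    "Crystal 3.2 : 98,0 (99) on 6 runs",
]
for r in sort_results(sample):
    print(r["name"], r["found"], r["time"])
```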
Hard-Talkchess-2020 set, final release
Moderators: hgm, Rebel, chrisw
-
- Posts: 1536
- Joined: Sat Feb 06, 2021 8:06 am
- Full name: Alex Morales
Re: Hard-Talkchess-2020 set, final release
Chess engines and dedicated chess computers fan since 1981 Mac mini M1 8GB-256GB, Windows 11 & Ubuntu ARM64.
ProteusSF Dev Forum TROLLS KINDERGARTEN
-
- Posts: 5236
- Joined: Thu Mar 09, 2006 9:40 am
- Full name: Vincent Lejeune
Re: Hard-Talkchess-2020 set, final release
As requested in this thread, here is the sheet with the positions (FEN) in the last column : https://www.dropbox.com/s/ee8snilqtuu2b ... 9.ods?dl=0
-
- Posts: 5236
- Joined: Thu Mar 09, 2006 9:40 am
- Full name: Vincent Lejeune
Re: Hard-Talkchess-2020 set, final release
I removed 6 positions considered incorrect or not good enough :
Position #038 not good : several moves are winning
Position #069 not good : several moves are winning
Position #144 is not good enough because 3 moves are clearly winning.
Position #169 is not good enough because 2 moves are clearly winning : Ng5 and Bxh7+
Position #177 is not good enough because 2 moves are clearly winning : N3h4 and Qxe1
Position #187 is not good enough because 2 moves are clearly winning : Ng4+ and Kg7
So 108 positions remain.
Sheet with the 108 positions and timings : https://www.dropbox.com/s/8dqjc2t5pvfrw ... 9.ods?dl=0
Only 6 positions were removed, but that is enough to change the ranking a bit.
List sorted from best to worst :
Name / #found-average / (Avg Time with penalty in seconds) / number of runs
ShashChess18.2 : 98,1 (109) on 11 runs
SugaR AI ICCF 1.90 : 98,0 (103) on 6 runs
Blue Marlin 14.4a : 97,5 (109) on 11 runs
ShashChess20.1 : 96,4 (116) on 10 runs
ShashChess20 : 95,9 (119) on 10 runs
ShashChess17.1 : 95,2 (131) on 6 runs
Blue Marlin 14.5 : 94,9 (125) on 13 runs
Crystal 3.2 : 94,0 (127) on 6 runs
SugaR AI ICCF 2.40 : 93,6 (126) on 14 runs
Honey-v14 : 93,4 (137) on 8 runs
Crystal 3.1 : 93,3 (131) on 8 runs
Stockfish_21.11.23.21 : 91,0 (156) on 5 runs
SugaR AI ICCF 2.50 : 90,3 (142) on 10 runs
Stockfish_21.08.05.16 : 88,8 (169) on 5 runs
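To see how much the ranking moved after the 6 positions were removed, the two lists posted in this thread can be compared directly. The Python sketch below copies the engine order from both lists; the helper name `rank_shifts` is hypothetical.

```python
# Engine order copied from the two lists in this thread.
before = [
    "Blue Marlin 14.4a", "SugaR AI ICCF 1.90", "ShashChess18.2",
    "ShashChess20.1", "ShashChess20", "Crystal 3.2", "Blue Marlin 14.5",
    "ShashChess17.1", "SugaR AI ICCF 2.40", "Crystal 3.1", "Honey-v14",
    "SugaR AI ICCF 2.50", "Stockfish_21.11.23.21", "Stockfish_21.08.05.16",
]
after = [
    "ShashChess18.2", "SugaR AI ICCF 1.90", "Blue Marlin 14.4a",
    "ShashChess20.1", "ShashChess20", "ShashChess17.1", "Blue Marlin 14.5",
    "Crystal 3.2", "SugaR AI ICCF 2.40", "Honey-v14", "Crystal 3.1",
    "Stockfish_21.11.23.21", "SugaR AI ICCF 2.50", "Stockfish_21.08.05.16",
]


def rank_shifts(before, after):
    """Map each engine to (old rank - new rank); positive means it moved up."""
    new_rank = {name: i for i, name in enumerate(after, start=1)}
    return {name: i - new_rank[name] for i, name in enumerate(before, start=1)}


shifts = rank_shifts(before, after)
for name, shift in shifts.items():
    if shift:  # only print engines whose rank changed
        print(f"{name}: {shift:+d}")
```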
-
- Posts: 1169
- Joined: Thu Dec 25, 2008 9:07 pm
- Full name: Herbert L
Re: Hard-Talkchess-2020 set, final release
Vinvin wrote: ↑Sun Dec 19, 2021 10:50 pm
Position #038 not good : several moves are winning
Position #069 not good : several moves are winning
Position #144 is not good enough because 3 moves are clearly winning.
Position #169 is not good enough because 2 moves are clearly winning : Ng5 and Bxh7+
Position #177 is not good enough because 2 moves are clearly winning : N3h4 and Qxe1
Position #187 is not good enough because 2 moves are clearly winning : Ng4+ and Kg7
Thank you Vincent, excellent find with your new PC.
Little mistake: it is not Pos 187 but Pos 185 (1...Kg7 and 1...Ng4+)
-
- Posts: 5236
- Joined: Thu Mar 09, 2006 9:40 am
- Full name: Vincent Lejeune
Re: Hard-Talkchess-2020 set, final release
Paloma wrote: ↑Sun Dec 19, 2021 11:23 pm
Thank you Vincent, excellent find with your new PC.
Little mistake: it is not Pos 187 but Pos 185 (1...Kg7 and 1...Ng4+)
Yes! Thanks for the correction.
It was a typo; the files are OK.
-
- Posts: 3400
- Joined: Wed Mar 08, 2006 8:15 pm
Re: Hard-Talkchess-2020 set, final release
An Lc0 result with a fast GPU would be interesting. Even with a GTX card it is better than SF14 on my PC. One result from the CSS forum: Lc0 0.28.2 with net 610889 got 80/114 with an RTX 3060 and a 30 s limit!
Jouni
-
- Posts: 1536
- Joined: Sat Feb 06, 2021 8:06 am
- Full name: Alex Morales
Re: Hard-Talkchess-2020 set, final release
Hi to all!
I have downloaded the EPD 2021 test with 65 positions. Is it the right one? If not, do you have an updated download link?
How many seconds are allowed for each position?
How do you balance hardware speed differences? (With my hardware, using 4 CPUs, I get ~2 Mn/s with Stockfish NNUE on the starting position.)
What is the meaning of "runs"?
Thanks in advance,
Alex
-
- Posts: 3236
- Joined: Sat Feb 16, 2008 7:38 am
- Full name: Peter Martan
Re: Hard-Talkchess-2020 set, final release
AlexChess wrote: ↑Fri Mar 04, 2022 2:21 pm
I have downloaded the EPD 2021 test with 65 positions. Is it the right one? If not, do you have an updated download link?
How many seconds are allowed for each position?
How do you balance hardware speed differences? (With my hardware, using 4 CPUs, I get ~2 Mn/s with Stockfish NNUE on the starting position.)
What is the meaning of "runs"?
Here:
forum3/viewtopic.php?p=884039&sid=631ec ... bf#p884039
Vincent posted the 114-position subset of the earlier, bigger collections.
Then he removed 6 more of those:
forum3/viewtopic.php?p=915515#p915515
so 108 are left.
His regular TC is 30 minutes/position single-threaded to avoid SMP spreading.
To get even more statistical reliability, you can do more than one run of each position and engine and average the results.
Regards,
Peter.
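As a rough illustration of why averaging several runs helps, the spread between runs can be turned into a standard error of the mean, which shrinks as more runs are added. The Python sketch below uses made-up solved counts, and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mean_with_stderr(values):
    """Return (mean, standard error of the mean) for per-run results."""
    if len(values) < 2:
        raise ValueError("need at least two runs to estimate the spread")
    return mean(values), stdev(values) / sqrt(len(values))


# Made-up solved counts from three runs of one engine (not real results).
solved_per_run = [98, 100, 102]
avg, err = mean_with_stderr(solved_per_run)
print(f"{avg:.1f} +/- {err:.1f} positions solved")
```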
-
- Posts: 1536
- Joined: Sat Feb 06, 2021 8:06 am
- Full name: Alex Morales
Re: Hard-Talkchess-2020 set, final release
peter wrote: ↑Sun Mar 06, 2022 12:12 am
Here:
forum3/viewtopic.php?p=884039&sid=631ec ... bf#p884039
Vincent posted the 114-position subset of the earlier, bigger collections.
Then he removed 6 more of those:
forum3/viewtopic.php?p=915515#p915515
so 108 are left.
His regular TC is 30 minutes/position single-threaded to avoid SMP spreading.
To get even more statistical reliability, you can do more than one run of each position and engine and average the results.
Thank you, Peter, for your answer. On the ERET 111 positions I luckily used the right TC, considering that my hardware is very slow compared to top Ryzen computers. I'll also try this subset.
forum3/viewtopic.php?f=6&t=79238&sid=22 ... 80#p922202
Kind regards, Alex
-
- Posts: 1536
- Joined: Sat Feb 06, 2021 8:06 am
- Full name: Alex Morales
Re: Hard-Talkchess-2020 set, final release
AlexChess wrote: ↑Sun Mar 06, 2022 7:54 am
Thank you, Peter, for your answer. On the ERET 111 positions I luckily used the right TC, considering that my hardware is very slow compared to top Ryzen computers. I'll also try this subset.
forum3/viewtopic.php?f=6&t=79238&sid=22 ... 80#p922202
Kind regards, Alex
Hi!
I would like to know the results for my ProteusSF RBE 008a, a Stockfish 15-dev derivative, but I have issues with my poor hardware. Could somebody with a powerful PC kindly test it for me? https://banksiagui.com/forums/viewtopic.php?p=132#p132
On my Windows 11 ARM64 running on a Mac M1 using 1 CPU I can reach only 500 kN/s (with all 4 CPUs it becomes too hot).
How many seconds are allowed in this case for solving each position? A good Ryzen 9 PC calculates 60 Mn/s, so I think in my case 1 hour per position would be fair (now limiting time to 6 minutes: Analyzing... 26 of 44 matching moves Rated time: 2:14:29).
Thanks in advance!
Alex
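One simple way to reason about the time question is to give the slower machine the same node budget per position as the reference machine. The Python sketch below does that arithmetic with the figures quoted in this thread. The function name is hypothetical, linear scaling with nodes per second is an assumption (it ignores SMP efficiency, hash size and tablebase access), and pairing the 5-minute limit with the 60 Mn/s figure is also an assumption.

```python
def scaled_time_minutes(ref_minutes, ref_nps, my_nps):
    """Time per position giving the same node budget as the reference machine.

    Assumption: solving power scales linearly with nodes per second, which
    ignores SMP efficiency, hash size and tablebase access.
    """
    if my_nps <= 0:
        raise ValueError("my_nps must be positive")
    return ref_minutes * ref_nps / my_nps


# Figures quoted in this thread: 5 minutes per position on the reference
# machine, about 60 Mn/s there, about 500 kN/s here.
minutes = scaled_time_minutes(ref_minutes=5, ref_nps=60_000_000, my_nps=500_000)
print(f"{minutes:.0f} minutes per position ({minutes / 60:.0f} hours)")
```

The same function can be called with the 30 minutes/position single-threaded TC mentioned earlier if single-threaded speeds are compared instead.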