SPCC: Testruns of Koivisto 9 finished

pohl4711 · Post by **pohl4711** » Mon Jan 16, 2023 8:56 am

Ratinglist-testrun of Koivisto 9 finished.
(I did the testrun with the Ipman compile of Koivisto 8.17, but Koivisto 9 is identical to 8.17: after "go depth 22", both give the same PV line and node count. The only difference is that the official Koivisto 9 binary is around 2% faster on my machines. Such a small speed increase is meaningless for the Elo performance in my ratinglist.)

https://www.sp-cc.de

Also take a look at the EAS-Ratinglist, the world's first engine ratinglist that measures not the strength of engines but their style of play:
https://www.sp-cc.de/eas-ratinglist.htm

(You may have to clear your browser cache (press CTRL+SHIFT+DEL) or reload the website.)

Jouni · Post by **Jouni** » Mon Jan 16, 2023 5:37 pm

Nice they are back! But I can't detect any improvement here:

Score of Koivisto9 vs Koivisto8.16: 46 - 58 - 396 [0.488]
...      Koivisto9 playing White: 35 - 10 - 205  [0.550] 250
...      Koivisto9 playing Black: 11 - 48 - 191  [0.426] 250
...      White vs Black: 83 - 21 - 396  [0.562] 500
Elo difference: -8.3 +/- 13.9, LOS: 12.0 %, DrawRatio: 79.2 %
500 of 500 games finished.

I used the HERT book and 60+0.6 games, with a compile from GitHub. It's faster than the Ipman compile: GitHub 2.19 Mnps vs. Ipman 2.14 Mnps.
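
For readers who want to check the numbers in the match output above, here is a minimal sketch that recomputes the Elo difference, the error margin and the LOS from the win/loss/draw counts. It uses the common logistic Elo model, a normal approximation for the 95% interval, and the usual LOS formula based on wins and losses only. That the tool which printed the output uses exactly these formulas is an assumption; the function names are made up for illustration.

```python
import math

def elo_from_score(score):
    """Logistic Elo difference implied by an average score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def match_stats(wins, losses, draws, z=1.959964):
    """Return (elo, margin, los) for a match; z=1.959964 gives a 95% interval."""
    n = wins + losses + draws
    w, l, d = wins / n, losses / n, draws / n
    score = w + 0.5 * d
    # Per-game variance of the result (1, 0.5 or 0 points).
    var = w * (1 - score) ** 2 + d * (0.5 - score) ** 2 + l * (0 - score) ** 2
    se = math.sqrt(var / n)  # standard error of the average score
    lo, hi = score - z * se, score + z * se
    elo = elo_from_score(score)
    # Half the width of the score interval, mapped to Elo.
    margin = (elo_from_score(hi) - elo_from_score(lo)) / 2
    # Likelihood of superiority: draws carry no information here.
    los = 0.5 * (1 + math.erf((wins - losses) / math.sqrt(2 * (wins + losses))))
    return elo, margin, los

elo, margin, los = match_stats(46, 58, 396)
print(f"Elo difference: {elo:+.1f} +/- {margin:.1f}, LOS: {los * 100:.1f} %")
```

With the 46 - 58 - 396 result this should land on the reported figures up to rounding, which also shows that zero lies well inside the error margin.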

Jouni · Post by **Jouni** » Mon Jan 16, 2023 7:57 pm

Also weaker in test suites: in the Arasan suite, Koivisto8.16 scored 158/200 but Koivisto9 150/200.

pohl4711 · Post by **pohl4711** » Tue Jan 17, 2023 7:35 am

Jouni wrote: ↑Mon Jan 16, 2023 5:37 pm
I used the HERT book and 60+0.6 games, with a compile from GitHub. It's faster than the Ipman compile: GitHub 2.19 Mnps vs. Ipman 2.14 Mnps.

As I said: the official avx2 binary is around 2% faster. No big deal. 1 Elo or less...

pohl4711 · Post by **pohl4711** » Tue Jan 17, 2023 7:39 am

Jouni wrote: ↑Mon Jan 16, 2023 5:37 pm Nice they are back! But I can't detect any improvement here:
Score of Koivisto9 vs Koivisto8.16: 46 - 58 - 396 [0.488]
...      Koivisto9 playing White: 35 - 10 - 205  [0.550] 250
...      Koivisto9 playing Black: 11 - 48 - 191  [0.426] 250
...      White vs Black: 83 - 21 - 396  [0.562] 500
Elo difference: -8.3 +/- 13.9, LOS: 12.0 %, DrawRatio: 79.2 %
500 of 500 games finished.
I used the HERT book and 60+0.6 games, with a compile from GitHub. It's faster than the Ipman compile: GitHub 2.19 Mnps vs. Ipman 2.14 Mnps.

The last dev version tested in my list was 8.13, and from 8.13 to 8.17 (=9.0) the progress was +10 Elo. So I cannot say anything about the progress from 8.16 to 8.17 (=9.0).
The only change from 8.16 to 8.17 (=9.0) was the new NNUE net. In the selftest on OpenBench, the new net showed clear progress:
ELO | 19.41 +- 6.65 (95%)
SPRT | 40.0+0.40s Threads=1 Hash=64MB
LLR | 2.95 (-2.94, 2.94) [0.00, 2.50]
GAMES | N: 5072 W: 1369 L: 1086 D: 2617

Of course, this test was done using my UHO openings, which spread the Elo by at least a factor of 2, and the thinking time was short. But there is definitely progress. In your 500-game test, the regression is clearly inside the error bar.
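
As a side note on the SPRT line quoted above: the stopping bounds (-2.94, 2.94) match Wald's classic bounds for error rates alpha = beta = 0.05. Those error rates are not stated in the post, so treat them as an assumption inferred from the bounds. A minimal sketch (hypothetical function names) of how the bounds and the stop decision follow:

```python
import math

def sprt_bounds(alpha=0.05, beta=0.05):
    """Wald's SPRT bounds on the log-likelihood ratio (LLR).

    alpha and beta are assumed values, not taken from the post.
    """
    lower = math.log(beta / (1 - alpha))   # at or below: accept H0
    upper = math.log((1 - beta) / alpha)   # at or above: accept H1
    return lower, upper

def sprt_decision(llr, alpha=0.05, beta=0.05):
    """Decide whether a running test can stop at the given LLR."""
    lower, upper = sprt_bounds(alpha, beta)
    if llr >= upper:
        return "accept H1"
    if llr <= lower:
        return "accept H0"
    return "continue"

lower, upper = sprt_bounds()
print(f"bounds: ({lower:.2f}, {upper:.2f})")
print(sprt_decision(2.95))
```

An LLR of 2.95 is just past the upper bound, which is why the test stopped after 5072 games in favour of the new net.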

Jouni · Post by **Jouni** » Tue Jan 17, 2023 2:53 pm

After more games I got:

1   Koivisto9       +5  +176/=845/-159 50.72%  598.5/1180
2   Koivisto8.16    -5  +159/=845/-176 49.28%  581.5/1180

A reminder that even 500 games are not enough between engines of nearly equal strength!
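
To put a rough number on "not enough", here is a back-of-the-envelope sketch of how the 95% error margin shrinks with the number of games. It assumes two engines of practically equal strength (score near 50%, wins equal to losses) and linearizes the Elo curve around 50%, so it is only an approximation. The function names and the ±5 Elo target are chosen for illustration.

```python
import math

# Slope of the logistic Elo curve at a 50% score, in Elo per unit of score.
ELO_PER_SCORE = 400.0 / (math.log(10) * 0.25)

def margin_elo(games, draw_ratio, z=1.959964):
    """Approximate 95% error margin in Elo for equal-strength engines."""
    # With wins == losses, the per-game variance is (1 - draw_ratio) / 4.
    var = (1.0 - draw_ratio) * 0.25
    return z * math.sqrt(var / games) * ELO_PER_SCORE

def games_needed(target_elo, draw_ratio, z=1.959964):
    """Games required to shrink the error margin to target_elo."""
    var = (1.0 - draw_ratio) * 0.25
    return math.ceil(var * (z * ELO_PER_SCORE / target_elo) ** 2)

print(f"500 games, 79.2% draws: +/- {margin_elo(500, 0.792):.1f} Elo")
print(f"1180 games, 845 draws:  +/- {margin_elo(1180, 845 / 1180):.1f} Elo")
print(f"games for +/- 5 Elo at 79.2% draws: {games_needed(5, 0.792)}")
```

Because the margin only shrinks with the square root of the game count, resolving a difference of a few Elo takes several thousand games at these draw ratios.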
