Live Retest request Komodo 13.3 vs Stockfish 150220

mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Live Retest request Komodo 13.3 vs Stockfish 150220

Post by mwyoung »

Retest request Komodo 13.3 vs Stockfish 150220

Hardware: 2950X, RTX 2080 Ti

DESKTOP-CORSAIR, Blitz 4m+1s Ponder On. Book Perfect 2019 to 8 moves.

Komodo 13.3
4 GB HT
32 threads
Default

Stockfish 150220
4 GB HT
32 threads
Default

Live Stream: https://www.youtube.com/watch?v=EaizPIJ-uRA
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
lkaufman
Posts: 5942
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Live Retest request Komodo 13.3 vs Stockfish 150220

Post by lkaufman »

I ran my own test of pure increment vs. SF, and in my test we did about ten Elo worse than with normal increment play. So maybe we won't lose quite as badly this time, but Komodo 13.3 is still slightly behind Stockfish 9 in direct play, and I know that Stockfish has improved a lot between 9 and 11. On another topic related to your testing, I now have a new 32-core Threadripper and I'm running some tests to determine how many threads are best to use for both Stockfish and Komodo. So far, it seems that 48 is better than either 32 or 63. If true, this might mean you would be better off using just 24 threads, but of course I don't know this since MP makes better use of 32 threads than of 64. But it might be worth testing to find out.
Komodo rules!
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Live Retest request Komodo 13.3 vs Stockfish 150220

Post by mwyoung »

lkaufman wrote: Thu Feb 20, 2020 6:25 pm I ran my own test of pure increment vs. SF, and in my test we did about ten Elo worse than with normal increment play. So maybe we won't lose quite as badly this time, but Komodo 13.3 is still slightly behind Stockfish 9 in direct play, and I know that Stockfish has improved a lot between 9 and 11. On another topic related to your testing, I now have a new 32-core Threadripper and I'm running some tests to determine how many threads are best to use for both Stockfish and Komodo. So far, it seems that 48 is better than either 32 or 63. If true, this might mean you would be better off using just 24 threads, but of course I don't know this since MP makes better use of 32 threads than of 64. But it might be worth testing to find out.
If something is not right with K13 playing with no base time, let's give K13 a retest.
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Live Retest request Komodo 13.3 vs Stockfish 150220

Post by mwyoung »

lkaufman wrote: Thu Feb 20, 2020 6:25 pm I ran my own test of pure increment vs. SF, and in my test we did about ten Elo worse than with normal increment play. So maybe we won't lose quite as badly this time, but Komodo 13.3 is still slightly behind Stockfish 9 in direct play, and I know that Stockfish has improved a lot between 9 and 11. On another topic related to your testing, I now have a new 32-core Threadripper and I'm running some tests to determine how many threads are best to use for both Stockfish and Komodo. So far, it seems that 48 is better than either 32 or 63. If true, this might mean you would be better off using just 24 threads, but of course I don't know this since MP makes better use of 32 threads than of 64. But it might be worth testing to find out.
I tested this in the past on my 16-core TR with SF, and at the time SF liked 32 threads best. I did not test Komodo from 1 to 32 threads as I did SF, other than to check that Komodo did not play worse when given 32 threads. But something could have changed.
Or it could be that you just have more real cores, and with more real cores you gain less from using logical cores.

The more threads you use, the more diminishing returns become a factor.
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Live Retest request Komodo 13.3 vs Stockfish 150220

Post by mwyoung »

Sorry. YouTube killed the old live stream studio during the stream. I had to set up a new broadcast with the new studio. The match was not disrupted, only the live stream. Below is the new link.

Live Stream: https://www.youtube.com/watch?v=uRRY3gF-2xY
lkaufman
Posts: 5942
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Live Retest request Komodo 13.3 vs Stockfish 150220

Post by lkaufman »

mwyoung wrote: Thu Feb 20, 2020 11:19 pm Sorry. YouTube killed the old live stream studio during the stream. I had to set up a new broadcast with the new studio. The match was not disrupted, only the live stream. Below is the new link.

Live Stream: https://www.youtube.com/watch?v=uRRY3gF-2xY
The result after 64 games of -83 Elo is reasonable. I'm not sure why your pure increment test was so bad; when I ran it on our own tester I had to set a base time equal to the increment just to give it time for the first move, otherwise it was forfeiting 100% of the games without play. That was probably a tester anomaly, since you didn't have that experience. Maybe your GUI replaces a zero base with some very tiny fraction of a second; just a guess. Anyway, problem solved.
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Live Retest request Komodo 13.3 vs Stockfish 150220

Post by mwyoung »

lkaufman wrote: Fri Feb 21, 2020 3:45 am
mwyoung wrote: Thu Feb 20, 2020 11:19 pm Sorry. YouTube killed the old live stream studio during the stream. I had to set up a new broadcast with the new studio. The match was not disrupted, only the live stream. Below is the new link.

Live Stream: https://www.youtube.com/watch?v=uRRY3gF-2xY
The result after 64 games of -83 Elo is reasonable. I'm not sure why your pure increment test was so bad; when I ran it on our own tester I had to set a base time equal to the increment just to give it time for the first move, otherwise it was forfeiting 100% of the games without play. That was probably a tester anomaly, since you didn't have that experience. Maybe your GUI replaces a zero base with some very tiny fraction of a second; just a guess. Anyway, problem solved.
You have to look at the error bars. That result is not inconsistent with the other test. And looking at the video, the GUI seems to be doing exactly what it is supposed to do: adding 5 seconds per move and showing it on the clock. The first 8 moves (book) add 40 seconds to the clock at the start of the game. If there is an issue with this TC, it seems to be a K issue.

Right now Komodo is at -91 Elo after 70 games, with a 95% interval of -162 to -60 and a 97.7% interval of -203 to -43.
lkaufman
Posts: 5942
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Live Retest request Komodo 13.3 vs Stockfish 150220

Post by lkaufman »

mwyoung wrote: Fri Feb 21, 2020 4:39 am
lkaufman wrote: Fri Feb 21, 2020 3:45 am
mwyoung wrote: Thu Feb 20, 2020 11:19 pm Sorry. YouTube killed the old live stream studio during the stream. I had to set up a new broadcast with the new studio. The match was not disrupted, only the live stream. Below is the new link.

Live Stream: https://www.youtube.com/watch?v=uRRY3gF-2xY
The result after 64 games of -83 Elo is reasonable. I'm not sure why your pure increment test was so bad; when I ran it on our own tester I had to set a base time equal to the increment just to give it time for the first move, otherwise it was forfeiting 100% of the games without play. That was probably a tester anomaly, since you didn't have that experience. Maybe your GUI replaces a zero base with some very tiny fraction of a second; just a guess. Anyway, problem solved.
You have to look at the error bars. That result is not inconsistent with the other test. And looking at the video, the GUI seems to be doing exactly what it is supposed to do: adding 5 seconds per move and showing it on the clock. The first 8 moves (book) add 40 seconds to the clock at the start of the game. If there is an issue with this TC, it seems to be a K issue.

Right now Komodo is at -91 Elo after 70 games, with a 95% interval of -162 to -60 and a 97.7% interval of -203 to -43.
Probably it is some combination of Komodo not being tuned or tested with pure increment plus error bars.
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Live Retest request Komodo 13.3 vs Stockfish 150220

Post by mwyoung »

End of Match.

DESKTOP-CORSAIR, Blitz 4min+1sec 0


#  Engine                      Elo  +W/=D/-L    Score   Points
1  Stockfish 150220 64 POPCNT  +98  +24/=54/-2  63.75%  51.0/80
2  Komodo 13.3 64-bit          -98  +2/=54/-24  36.25%  29.0/80
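For anyone who wants to check the Elo figure and the error bars from the final score above, here is a minimal Python sketch. It assumes the usual logistic Elo model and a normal approximation for the interval; the function names are made up for illustration, and this is not necessarily how the GUI computes its own numbers.

```python
import math

def elo_from_score(score):
    """Elo difference implied by a score fraction (logistic model)."""
    return 400.0 * math.log10(score / (1.0 - score))

def elo_with_interval(wins, draws, losses, z=1.96):
    """Return (elo, low, high) using a normal approximation on the mean score.

    Assumes the interval on the score stays strictly between 0 and 1.
    """
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / games
    margin = z * math.sqrt(var / games)
    return (elo_from_score(score),
            elo_from_score(score - margin),
            elo_from_score(score + margin))

# Komodo's final result from the match: +2 =54 -24
elo, low, high = elo_with_interval(2, 54, 24)
print(f"Elo {elo:+.0f}, 95% interval {low:+.0f} to {high:+.0f}")
```

The high draw rate is what keeps the interval from being even wider: draws contribute much less variance than decisive games.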
lkaufman
Posts: 5942
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Live Retest request Komodo 13.3 vs Stockfish 150220

Post by lkaufman »

mwyoung wrote: Fri Feb 21, 2020 5:51 am End of Match.

DESKTOP-CORSAIR, Blitz 4min+1sec 0


#  Engine                      Elo  +W/=D/-L    Score   Points
1  Stockfish 150220 64 POPCNT  +98  +24/=54/-2  63.75%  51.0/80
2  Komodo 13.3 64-bit          -98  +2/=54/-24  36.25%  29.0/80
I'm running both K13.3 and SF11 against the supposedly best Lc0 network with the supposedly best settings on my new 32-core Threadripper, using 48 threads for K and SF as that seems best for both on my hardware, and one 2080 Ti for Lc0, at 2' + 1". SF beat Lc0 by 19 Elo after nearly 200 games, perhaps not surprising given hardware that favors the CPU engine, and so far Komodo is down 34 Elo after 113 games. If that holds, Komodo would be just 53 Elo worse than SF11 when measured against Lc0. Perhaps the reason is that Komodo and Stockfish are more similar to each other than either is to Lc0, and in general playing against dissimilar engines tends to show smaller rating gaps.
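To make the comparison between the indirect gap (via Lc0) and the direct head-to-head gap concrete, here is a small Python sketch. It assumes the standard logistic Elo model, under which rating differences against a common opponent simply subtract; `expected_score` is an illustrative helper, and both inputs are small-sample figures from this thread, so the output is only a rough indication.

```python
def expected_score(elo_diff):
    """Expected score for the side that is elo_diff points stronger (logistic model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

sf_vs_lc0 = 19        # SF result against Lc0, as reported in the post
komodo_vs_lc0 = -34   # Komodo result against Lc0 so far, as reported in the post

indirect_gap = sf_vs_lc0 - komodo_vs_lc0  # gap implied by the common opponent
direct_gap = 98                           # from the 80-game head-to-head match

for label, gap in (("via Lc0", indirect_gap), ("head-to-head", direct_gap)):
    print(f"{label}: {gap} Elo, expected SF score {expected_score(gap):.1%}")
```

Given the wide error bars on all three matches, the difference between the two gaps may or may not be real.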