Komodo 8 vs Stockfish dev’s (sept14)

Ron Langeveld
Posts: 140
Joined: Tue Jan 05, 2010 8:02 pm

Re: Komodo 8 vs Stockfish dev’s (sept14)

Post by Ron Langeveld »

My LTC tests actually show a better result than 28 Elo for SF, but I don't have enough finished games yet given the error margin.

Moreover, I was addressing the more-cores aspect, not the LTC one.
Ron Langeveld
Posts: 140
Joined: Tue Jan 05, 2010 8:02 pm

Re: Komodo 8 vs Stockfish dev’s (sept14)

Post by Ron Langeveld »

Hugo wrote:
beram wrote:
lkaufman wrote:In general it seems that Stockfish scores better against Komodo than would be expected based on their results against other engines. Contempt may contribute to this, but our tests indicate that it's a pretty minor factor. Perhaps it's just something about their styles. My theory is that because Stockfish is so different from all the other top engines in terms of trading accuracy for depth, it tends to lose or draw a similar number of games to all strong opponents due to the occasional pruned good move, while Komodo just tends to see more than all other engines besides SF most of the time. So Komodo might win a three way event including Houdini or a 4 way event including Gull while still losing the direct match to SF. Of course this all depends on the details like time limit, hardware, etc. Whether strength is best measured by a match of the top two or by a RR of the top three or four is a matter of opinion. Both methods have been used to determine the human World Champion.

I prefer a final head-to-head match to decide who is strongest if both come out on top in a big tournament (such as TCEC).
At this moment I think the latest SFdev would score about 54-55% against Komodo 8 in such a match at tournament TC.
Hi Gentlemen

let me whisper some of my LTC match results:
Image

regards Clemens
I predict a big difference when you rerun this test with SF091114
yanquis1972
Posts: 1766
Joined: Wed Jun 03, 2009 12:14 am

Re: Komodo 8 vs Stockfish dev’s (sept14)

Post by yanquis1972 »

possibly, but i think TCEC will do a good job of showcasing the present state of affairs. komodo has improved as well of course, it's just not public yet.
beram
Posts: 1187
Joined: Wed Jan 06, 2010 3:11 pm

Re: ..Komodo 8 vs Stockfish dev’s (nov14)

Post by beram »

lkaufman wrote:It is clear from the blitz (40/4) lists of CCRL and CEGT that SF5 and K8 are very close, with a slight edge to SF, so it's no surprise that newest SF, which is supposedly +23 over SF5 according to SF website, should beat K8 at blitz levels. But at the main levels of these two lists (40/40 and 40/20 respectively) Komodo 8 has a substantial lead over SF5 on both one core and 4 cores; the average lead for Komodo8 of these four ratings is 25 elo. So even if SF has gained 23 elo, K8 would have a 2 elo lead based on those numbers. It would be interesting to see some matches run at intermediate levels like maybe 20' plus 10" increment for example, slow enough to measure strength at a level typical of how they are used, but fast enough to get big samples on good hardware.

Difference between SFdev 211114 and K8 at blitz and rapid TC, Nov 2014
I have played four 100-game matches of SF211114 against Komodo 8 on two PCs with the same opening book (25 lines, colours reversed).
On each PC, one 100-game match at TC 3m2s and another at TC 15m10s (5 times longer TC).
Average game length was approximately 8-9 minutes at TC 3m2s and 45-47 minutes at TC 15m10s.
One system was an AMD 1090T @3200Mhz, 4 cores, PB off; the other an Intel i5 4200M @2500Mhz, 2 cores, PB off.
All four matches ended in an SF victory, overall 56,5% for SF (about 45 Elo difference).
On the i5 4200M, at TC 3m2s the outcome was 57% for SF and at TC 15m10s it was 60% (!)
On the AMD 1090T, at TC 3m2s the outcome was 53% for SF and at TC 15m10s it was 56%.
The draw ratio ranged from 54% (TC 3m2s on the i5 4200M) to 66% (TC 3m2s on the AMD 1090T).
Aggregated: 200 games blitz 3m2s, 55% for SF; 200 games rapid 15m10s, 58% for SF.

Conclusions: The latest SF211114 is clearly winning against Komodo 8 from blitz to rapid, with 3% better results at the (5x longer) rapid TC.

Code:

SF 211114 – Komodo 8, Blitz 3m+2s       i5 4200M @2500Mhz, 2 cpu Fritzmark 5,1                           
1   Stockfish 211114 64 BMI2   +49  +30/=54/-16 57.00%   57.0/100
2   Komodo 8 64-bit            -49  +16/=54/-30 43.00%   43.0/100

SF 211114 – Komodo 8, Rapid 15m+10s     i5 4200M @2500Mhz, 2 cpu Fritzmark 5,1
1   Stockfish 211114 64 BMI2   +70  +30/=60/-10 60.00%   60.0/100
2   Komodo 8 64-bit            -70  +10/=60/-30 40.00%   40.0/100

SF211114 - K8, Blitz 3m+2s             AMD 1090T @3200Mhz , 4cpu Fritzmark 15,6                         
1   Stockfish 211114 64  +20/=66/-14 53.00%   53.0/100
2   Komodo 8 64-bit      +14/=66/-20 47.00%   47.0/100

SF211114 – Komodo 8, Rapid 15m+10s     AMD 1090T @3200Mhz , 4cpu Fritzmark 15,6                
1   Stockfish 211114 64  +27/=58/-15 56.00%   56.0/100
2   Komodo 8 64-bit      +15/=58/-27 44.00%   44.0/100
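For anyone who wants to check how the Elo column relates to the percentages: a minimal sketch, assuming the usual logistic Elo model (the function name `elo_from_score` is only illustrative, and the GUI that produced the tables may round differently).

```python
import math

def elo_from_score(score: float) -> float:
    """Elo difference implied by a score fraction in (0, 1), logistic model."""
    return 400.0 * math.log10(score / (1.0 - score))

# Score fractions taken from the results quoted in this post.
for label, score in [("i5 blitz", 0.57), ("i5 rapid", 0.60), ("overall", 0.565)]:
    print(f"{label}: {score:.1%} -> {elo_from_score(score):+.0f} Elo")
```

This gives +49 and +70 for the two i5 matches and about +45 for the overall 56,5%, in line with the figures quoted above.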
beram
Posts: 1187
Joined: Wed Jan 06, 2010 3:11 pm

Re: ..Komodo 8 vs Stockfish dev’s (nov14)

Post by beram »

And now the result for the first cycle of 50 games at TC 30m20s on the i5 4200M @2500Mhz, 2 cpu. It took about 4 days for the 50 games.
Only three losses for Stockfish; the draw ratio has increased to 76% (!)
9 wins, 3 losses and 38 draws with the same opening book as used in the other matches at faster TC.
That is the same 56% score as in the 100-game match on the AMD 1090T at TC 15m10s.
Leaving the draws aside, though, a 9-3 win is more convincing than the 27-15 win in the 100-game match on the AMD.

Code:

SF 211114 - K8, Rapid 30m+20s
1   Stockfish 211114 64 BMI2   +42  +9/=38/-3 56.00%   28.0/50
2   Komodo 8 64-bit            -42  +3/=38/-9 44.00%   22.0/50
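The 9-3 versus 27-15 comparison can be looked at in two ways: the score among decisive games only, and how likely such a split would be if both engines were equally likely to win each decisive game. A rough sketch, assuming a plain one-sided sign test (the helper names are illustrative, and this ignores colours and opening pairs):

```python
from math import comb

def decisive_score(wins: int, losses: int) -> float:
    """Score of the winner counting decisive games only."""
    return wins / (wins + losses)

def sign_test_p(wins: int, losses: int) -> float:
    """One-sided chance of at least `wins` wins out of wins+losses
    decisive games if each decisive game were a 50/50 coin flip."""
    n = wins + losses
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n

for wins, losses in [(9, 3), (27, 15)]:
    print(wins, losses,
          round(decisive_score(wins, losses), 3),
          round(sign_test_p(wins, losses), 3))
```

Under these assumptions 9-3 has the higher decisive score (75% against roughly 64%), but 27-15 is the less likely to arise by chance (about 0.044 against about 0.073), since it rests on many more decisive games.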
beram
Posts: 1187
Joined: Wed Jan 06, 2010 3:11 pm

Re: ..Komodo 8 vs Stockfish dev’s (nov14)

Post by beram »

Some highlights from the last fifty-game match at 30m20s

First position: K8 evaluates this position at just +0,12 for black (depth 25), while SF already has +1,3 (depth 33).
It seems that K8's generally known good evaluation of positions with rooks against a queen sometimes backfires when the queen has more attacking power than the rooks.
[D] 3r2k1/4qpp1/4b2p/8/p2Q3P/2P1PB2/1N3nP1/R3R1K1 b - - 0 31

Second position from the match. The position seems blocked, but SF unlocks it, wins the exchange and wins the endgame convincingly.
K8 as black gives +0,22 having played Ne8 at depth 25, while SF as white plays 23 Bc2 and gives +0,85 at depth 27. Later on SF opens up with g3 and f4 and exchanges the pawn on e5; black takes back and white advances its d-pawn to d6, threatening a fork of the rooks on c7, which indeed happens, and wins the game technically.
[D] r3n1k1/2qbbp2/rp1p2pp/2pPp3/P1P1Pn2/1BN1NP2/3B2PP/R3QRK1 w - - 0 22

Third position is a nice king-attacking knight sac with Nh5! in a Spanish game.
K8 has just played 25 ..Na6 with +0,16 at depth 24. SF plays 26 Nh5!, a knight sac, with +1,5 at depth 30 because of an unstoppable attack on the king: 26 Nh5 gxh5 27 e5+ Kg8 28 e6!
[D] 2b1r3/3n1pbk/nq1p2pp/1ppP4/4P3/R4NNP/1P1B1PP1/1BQ3K1 w - - 0 26

The three Komodo wins out of the 50-game match: the first was a black win in a King's Gambit after 1 e4 e5 2 f4 exf4 3 Nf3. K8 played Be7 (+0,25) and won in a nice manner.
The second win came right out of the opening, +0,57 in a sharp Sicilian line.
The last win was with white from an almost equal middlegame position, where it showed SF how to elegantly outplay an opponent with the advantage of an advanced centre pawn.
K8 played 32 Nd4 (+0,31, depth 22). SF doesn't acknowledge the danger, playing Rh5 (+0.00 at depth 30). K8 exchanges knights on e6, exchanges the bishops with Bd4 and (although one pawn down) manoeuvres its heavy pieces around the d6 pawn, which more or less paralyses black's play, and finally breaks through using its kingside pawns, all because of one important advantage: the passed pawn on d6.
[D] 2r3k1/1p1q1pbp/3Pn1p1/1p3r2/8/Q3BN1P/3R1PP1/3R2K1 w - - 0 32
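The "one pawn down" remark for this last position can be checked straight from the FEN. A small sketch that only counts the pieces in the placement field (the `material` helper is illustrative, not part of any chess library):

```python
from collections import Counter

def material(fen: str):
    """Count pieces per side from the piece-placement field of a FEN string."""
    placement = fen.split()[0]
    white, black = Counter(), Counter()
    for ch in placement:
        if ch.isalpha():
            # Uppercase letters are white pieces, lowercase are black.
            (white if ch.isupper() else black)[ch.upper()] += 1
    return white, black

fen = "2r3k1/1p1q1pbp/3Pn1p1/1p3r2/8/Q3BN1P/3R1PP1/3R2K1 w - - 0 32"
white, black = material(fen)
print("white pawns:", white["P"], "black pawns:", black["P"])
```

White has four pawns against black's five, with the other material equal, so the count agrees with the description above.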
beram
Posts: 1187
Joined: Wed Jan 06, 2010 3:11 pm

Re: ..Komodo 8 vs Stockfish dev’s (nov14)

Post by beram »

The 50-game match at TC 30m20s on the AMD 1090T has ended in a 54% win for SF.
Stockfish 211114 64 +11/=32/-7 54.00% 27.0/50

The score of 54% lies between the previous 100-game matches at TC 3m2s (53%) and TC 15m10s (56%).
So it confirms the trend of steady performance for SF against K8 at longer TC.

Code:

SF211114 - K8, Blitz 3m+2s         AMD 1090T @3200Mhz , 4cpu Fritzmark 15,6                                     
1   Stockfish 211114 64  +20/=66/-14 53.00%   53.0/100
2   Komodo 8 64-bit      +14/=66/-20 47.00%   47.0/100

SF211114 - K8, Rapid 15m+10s          AMD 1090T @3200Mhz , 4cpu Fritzmark 15,6                         (5X longer TC)
1   Stockfish 211114 64  +27/=58/-15 56.00%   56.0/100
2   Komodo 8 64-bit      +15/=58/-27 44.00%   44.0/100

SF211114 - K8, LTC 30m+20s         AMD 1090T @3200Mhz , 4cpu Fritzmark 15,6                          (10x longer TC)
1   Stockfish 211114 64  +11/=32/-7 54.00%   27.0/50
2   Komodo 8 64-bit      +7/=32/-11 46.00%   23.0/50
beram
Posts: 1187
Joined: Wed Jan 06, 2010 3:11 pm

Re: ..Komodo 8 vs Stockfish dev’s (nov14)

Post by beram »

The second 50-game match on the i5 4200M has ended, so now 100 games have been played at every TC.
The trend of steady performance is confirmed.
SF 211114 scores about the same against Komodo 8 from blitz to LTC: 57%, 60% and 59,5% respectively for TC 3m2s, 15m10s and 30m20s.
It is a pity that this version doesn't play in TCEC ;-)

Code:

SF 211114 – Komodo 8, Blitz 3m+2s       i5 4200M @2500Mhz, 2 cpu Fritzmark 5,1                           
1   Stockfish 211114 64 BMI2   +49  +30/=54/-16 57.00%   57.0/100
2   Komodo 8 64-bit            -49  +16/=54/-30 43.00%   43.0/100

SF 211114 – Komodo 8, Rapid 15m+10s     i5 4200M @2500Mhz, 2 cpu Fritzmark 5,1 (5x longer TC)
1   Stockfish 211114 64 BMI2   +70  +30/=60/-10 60.00%   60.0/100
2   Komodo 8 64-bit            -70  +10/=60/-30 40.00%   40.0/100

SF 211114 – Komodo 8, Rapid 30m+20s     i5 4200M @2500Mhz, 2 cpu Fritzmark 5,1 (10x longer TC)
1   Stockfish 211114 64 BMI2   +67  +24/=71/-5 59.50%   59.5/100
2   Komodo 8 64-bit            -67  +5/=71/-24 40.50%   40.5/100
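On the error margin mentioned earlier in the thread: a rough sketch of a 95% range for each of the three i5 results, assuming independent games and a normal approximation (function names are illustrative; rating tools such as those used by CCRL or CEGT may compute margins differently).

```python
import math

def elo(score: float) -> float:
    """Elo difference implied by a score fraction, logistic model."""
    return 400.0 * math.log10(score / (1.0 - score))

def score_interval(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (score, low, high) using a normal approximation."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * score ** 2) / n
    margin = z * math.sqrt(var / n)
    return score, score - margin, score + margin

# W/D/L for SF 211114 on the i5 4200M, taken from the tables above.
results = {"3m+2s": (30, 54, 16), "15m+10s": (30, 60, 10), "30m+20s": (24, 71, 5)}
for tc, wdl in results.items():
    s, lo, hi = score_interval(*wdl)
    print(f"{tc}: {s:.1%}  Elo {elo(s):+.0f}  "
          f"(95% range {elo(lo):+.0f} to {elo(hi):+.0f})")
```

With 100 games per time control the three ranges overlap widely, which fits the "about the same" reading: under these assumptions the differences between 57%, 60% and 59,5% are well inside the noise.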
whereagles
Posts: 565
Joined: Thu Nov 13, 2014 12:03 pm

Re: Komodo 8 vs Stockfish dev’s (sept14)

Post by whereagles »

lkaufman wrote:It is clear from the blitz (40/4) lists of CCRL and CEGT that SF5 and K8 are very close, with a slight edge to SF, so it's no surprise that newest SF, which is supposedly +23 over SF5 according to SF website, should beat K8 at blitz levels. But at the main levels of these two lists (40/40 and 40/20 respectively) Komodo 8 has a substantial lead over SF5 on both one core and 4 cores; the average lead for Komodo8 of these four ratings is 25 elo. So even if SF has gained 23 elo, K8 would have a 2 elo lead based on those numbers. It would be interesting to see some matches run at intermediate levels like maybe 20' plus 10" increment for example, slow enough to measure strength at a level typical of how they are used, but fast enough to get big samples on good hardware.
Can one really speak of ELO in an absolute way? I mean.. Nakamura is like +200 elo at rapid and blitz :)
beram
Posts: 1187
Joined: Wed Jan 06, 2010 3:11 pm

Re: ..Komodo 8 vs Stockfish dev’s (nov14)

Post by beram »

Latest SF101214 compared with SF 211114 over 100 games under the same conditions (same book, same TC) on the i5 4200M @2500Mhz, 2 cpu:
one more win and one fewer loss, 58%.

SF 211114 – Komodo 8, Blitz 3m+2s       i5 4200M @2500Mhz, 2 cpu Fritzmark 5,1
1   Stockfish 211114 64 BMI2   +49  +30/=54/-16 57.00%   57.0/100
2   Komodo 8 64-bit            -49  +16/=54/-30 43.00%   43.0/100

SF 101214 - Komodo 8, Blitz 3m+2s       i5 4200M @2500Mhz, 2 cpu Fritzmark 5,1
1   Stockfish 101214 64 BMI2   +56  +31/=54/-15 58.00%   58.0/100
2   Komodo 8 64-bit            -56  +15/=54/-31 42.00%   42.0/100