Stockfish 3 running for the IPON

IWB
Posts: 1539
Joined: Thu Mar 09, 2006 2:02 pm

Stockfish 3 running for the IPON

Post by IWB »

here: http://www.inwoba.de

Have fun
Ingo
gladius
Posts: 568
Joined: Tue Dec 12, 2006 10:10 am
Full name: Gary Linscott

Re: Stockfish 3 running for the IPON

Post by gladius »

IWB wrote:here: http://www.inwoba.de

Have fun
Ingo
Great, thanks Ingo! Always fun to watch the results :).
Vinvin
Posts: 5312
Joined: Thu Mar 09, 2006 9:40 am
Full name: Vincent Lejeune

Re: Stockfish 3 running for the IPON

Post by Vinvin »

IWB wrote:here: http://www.inwoba.de

Have fun
Ingo
Not online ? :(
Eelco de Groot
Posts: 4694
Joined: Sun Mar 12, 2006 2:40 am
Full name:   Eelco de Groot

Re: Stockfish 3 running for the IPON

Post by Eelco de Groot »

Code:

Stock3

Stockfish 3 JA SSE4.2 - Houdini 3 STD (3082)  7.0 - 7.0  50.00%  Perf=3082 
Stockfish 3 JA SSE4.2 - Komodo 5 (3001)  5.5 - 7.5  42.31%  Perf=2948 
Stockfish 3 JA SSE4.2 - Critter 1.4a (2978)  7.5 - 6.5  53.57%  Perf=3002 
Stockfish 3 JA SSE4.2 - Deep Rybka 4.1 (2958)  9.0 - 5.0  64.29%  Perf=3060 
Stockfish 3 JA SSE4.2 - Gull II (2928)  9.0 - 4.0  69.23%  Perf=3068 
Stockfish 3 JA SSE4.2 - Chiron 1.5 (2843)  10.5 - 2.5  80.77%  Perf=3092 
Stockfish 3 JA SSE4.2 - Protector 1.5.0 (2839)  9.0 - 4.0  69.23%  Perf=2979 
Stockfish 3 JA SSE4.2 - Naum 4.2 (2831)  10.5 - 2.5  80.77%  Perf=3080 
Stockfish 3 JA SSE4.2 - Hannibal 1.3 (2827)  9.5 - 4.5  67.86%  Perf=2956 
Stockfish 3 JA SSE4.2 - HIARCS 14 WCSC 32b (2817)  8.5 - 4.5  65.38%  Perf=2927 
Stockfish 3 JA SSE4.2 - Deep Shredder 12 (2800)  8.5 - 4.5  65.38%  Perf=2910 
Stockfish 3 JA SSE4.2 - Deep Sjeng c't 2010 32b (2783)  8.5 - 3.5  70.83%  Perf=2937 
Stockfish 3 JA SSE4.2 - Spike 1.4 32b (2772)  9.5 - 3.5  73.08%  Perf=2945 
Stockfish 3 JA SSE4.2 - spark-1.0 (2761)  10.0 - 3.0  76.92%  Perf=2970 
Stockfish 3 JA SSE4.2 - Deep Junior 13.3 (2748)  11.0 - 2.0  84.62%  Perf=3044 
Stockfish 3 JA SSE4.2 - Quazar 0.4 (2730)  12.5 - 1.5  89.29%  Perf=3098 
Stockfish 3 JA SSE4.2 - Toga II 3.0 32b (2715)  11.5 - 1.5  88.46%  Perf=3068 
Stockfish 3 JA SSE4.2 - Zappa Mexico II (2704)  9.0 - 3.0  75.00%  Perf=2894 
  166.5 - 70.5  70.25%  Perf=2991 




237 out of 2700 games played 
Level: 5 Minutes/Game + 3 sec/Move 

I can see the list here. Currently Stockfish 3 is at 2991 after less than ten percent of the games played, but since the old Stockfish 2.3.1 is at 2960 Elo, I think Ingo's test might finally show some improvement over 2.3.1 :)
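
As a side note, the Perf column above looks consistent with the usual logistic Elo formula, Perf = opponent rating + 400 * log10(p / (1 - p)), where p is the score fraction. That is an assumption about how the site computes it, not something stated in the thread, and the function name below is made up for illustration. A minimal Python sketch:

```python
import math

def performance_rating(opponent_rating: float, points: float, games: float) -> float:
    """Logistic-Elo performance estimate against a single opponent.

    Assumption: the online list uses the standard logistic formula.
    """
    p = points / games
    if not 0.0 < p < 1.0:
        raise ValueError("performance is undefined for 0% or 100% scores")
    return opponent_rating + 400.0 * math.log10(p / (1.0 - p))

# Rows copied from the table above: (opponent rating, points scored, games played).
print(round(performance_rating(3082, 7.0, 14)))    # Houdini 3 STD
print(round(performance_rating(2958, 9.0, 14)))    # Deep Rybka 4.1
print(round(performance_rating(2843, 10.5, 13)))   # Chiron 1.5
```

These three rows come out the same as the listed Perf values. Other rows can differ by a point, presumably because the site rounds differently.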

Eelco
Debugging is twice as hard as writing the code in the first
place. Therefore, if you write the code as cleverly as possible, you
are, by definition, not smart enough to debug it.
-- Brian W. Kernighan
IWB
Posts: 1539
Joined: Thu Mar 09, 2006 2:02 pm

Re: Stockfish 3 running for the IPON

Post by IWB »

Eelco de Groot wrote: I can see the list here. Currently Stockfish 3 is at 2991 after less than ten percent of the games played, but since the old Stockfish 2.3.1 is at 2960 Elo, I think Ingo's test might finally show some improvement over 2.3.1 :)
And do not forget that the online calculation is "preliminary".
The real calculation with Bayeselo might go up or down a few Elo. This one is just to show a trend. If two versions are not too far apart, it might be misleading.

If someone can't see it, please press F5 a few times.

Bye
Ingo
Modern Times
Posts: 3799
Joined: Thu Jun 07, 2012 11:02 pm

Re: Stockfish 3 running for the IPON

Post by Modern Times »

It is not going well for me at all in my chess960 matches. After 54 games against Houdini it is scoring 20%, whereas Stockfish 2.3.1 scored 30% against Houdini. Maybe it will improve, fingers crossed.
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: Stockfish 3 running for the IPON

Post by mcostalba »

Thanks Ingo for running this !

It is too early to draw conclusions, but at the moment I am quite satisfied. If we look at the past release's performance against the same opponents, we have:

Code:

From Stockfish 2.3.1 JA -> Stockfish 3 JA (1552 out of 2700 games played)

Houdini 3 STD                 : 28.7 %  -> 34.48
Komodo 5                      : 48.0 %  -> 50.00
Critter 1.4a                  : 46.3 %  -> 51.72
Deep Rybka 4.1                : 47.0 %  -> 56.25
Gull II                       : 56.3 %  -> 55.23
Chiron 1.5                    : 66.7 %  -> 66.86
Hannibal 1.3                  : 71.3 %  -> 71.84
HIARCS 14 WCSC 32b            : 74.3 %  -> 69.41
Deep Junior 13.3              : 74.0 %  -> 80.59
Deep Shredder 12              : 69.7 %  -> 67.82
Deep Sjeng c't 2010 32b       : 74.0 %  -> 74.42
spark-1.0                     : 78.0 %  -> 78.74
Zappa Mexico II               : 81.0 %  -> 82.53
Spike 1.4 32b                 : 73.7 %  -> 82.56
Quazar 0.4                    : 81.3 %  -> 84.12
Protector 1.5.0               : 70.3 %  -> 67.24
Toga II 3.0 32b               : 78.3 %  -> 86.21
Naum 4.2                      : 73.0 %  -> 71.26
So more or less an increase, and against strong engines SF 3 seems to perform better than 2.3.1.
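
For anyone who wants to check the "more or less an increase" reading, here is a small Python sketch that takes the percentages from the table above and lists the change per opponent. The variable names are illustrative, and the plain (unweighted) mean is a simplification, since the number of games per opponent differs at this stage of the run.

```python
# Percentages copied from the table above:
# (Stockfish 2.3.1 JA, Stockfish 3 JA after 1552 of 2700 games).
results = {
    "Houdini 3 STD": (28.7, 34.48),
    "Komodo 5": (48.0, 50.00),
    "Critter 1.4a": (46.3, 51.72),
    "Deep Rybka 4.1": (47.0, 56.25),
    "Gull II": (56.3, 55.23),
    "Chiron 1.5": (66.7, 66.86),
    "Hannibal 1.3": (71.3, 71.84),
    "HIARCS 14 WCSC 32b": (74.3, 69.41),
    "Deep Junior 13.3": (74.0, 80.59),
    "Deep Shredder 12": (69.7, 67.82),
    "Deep Sjeng c't 2010 32b": (74.0, 74.42),
    "spark-1.0": (78.0, 78.74),
    "Zappa Mexico II": (81.0, 82.53),
    "Spike 1.4 32b": (73.7, 82.56),
    "Quazar 0.4": (81.3, 84.12),
    "Protector 1.5.0": (70.3, 67.24),
    "Toga II 3.0 32b": (78.3, 86.21),
    "Naum 4.2": (73.0, 71.26),
}

# Change in score, in percentage points, per opponent.
changes = {name: after - before for name, (before, after) in results.items()}
improved = sum(1 for delta in changes.values() if delta > 0)
mean_change = sum(changes.values()) / len(changes)

for name, delta in sorted(changes.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<26} {delta:+6.2f}")
print(f"improved against {improved} of {len(changes)} opponents, "
      f"mean change {mean_change:+.2f} points")
```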
IWB
Posts: 1539
Joined: Thu Mar 09, 2006 2:02 pm

Re: Stockfish 3 running for the IPON

Post by IWB »

mcostalba wrote:Thanks Ingo for running this !

It is too early to draw conclusions, but at the moment I am quite satisfied. If we look at the past release's performance against the same opponents, we have:

Code:

From Stockfish 2.3.1 JA -> Stockfish 3 JA (1552 out of 2700 games played)

Houdini 3 STD                 : 28.7 %  -> 34.48
Komodo 5                      : 48.0 %  -> 50.00
Critter 1.4a                  : 46.3 %  -> 51.72
Deep Rybka 4.1                : 47.0 %  -> 56.25
Gull II                       : 56.3 %  -> 55.23
Chiron 1.5                    : 66.7 %  -> 66.86
Hannibal 1.3                  : 71.3 %  -> 71.84
HIARCS 14 WCSC 32b            : 74.3 %  -> 69.41
Deep Junior 13.3              : 74.0 %  -> 80.59
Deep Shredder 12              : 69.7 %  -> 67.82
Deep Sjeng c't 2010 32b       : 74.0 %  -> 74.42
spark-1.0                     : 78.0 %  -> 78.74
Zappa Mexico II               : 81.0 %  -> 82.53
Spike 1.4 32b                 : 73.7 %  -> 82.56
Quazar 0.4                    : 81.3 %  -> 84.12
Protector 1.5.0               : 70.3 %  -> 67.24
Toga II 3.0 32b               : 78.3 %  -> 86.21
Naum 4.2                      : 73.0 %  -> 71.26
So more or less an increase, and against strong engines SF 3 seems to perform better than 2.3.1.
You are welcome, but ...

Looking at the individual results is misleading for two reasons.

1. It is just 150 games per opponent overall. A statistic out of that is a bit "weak" ...
2. If you have to compare percentages, you should do it when the run is completed. It might be that the difficult games are yet to come and the percentage goes down ... or up.
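
To put a rough number on point 1, a normal-approximation error margin for a single match score can be sketched as below. The win/draw/loss split is hypothetical (the thread only gives points), and the function names are illustrative, not from any tool used for the list.

```python
import math

def score_margin(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (score fraction, approximate 95% half-width) for one match.

    Uses a normal approximation with the per-game variance of a result
    that takes the values 1, 0.5 and 0.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * score ** 2) / n
    return score, z * math.sqrt(var / n)

def elo_from_score(p: float) -> float:
    """Elo difference implied by a score fraction (logistic model)."""
    return 400.0 * math.log10(p / (1.0 - p))

# Hypothetical 150-game result: 60 wins, 50 draws, 40 losses.
score, margin = score_margin(60, 50, 40)
low = elo_from_score(score - margin)
high = elo_from_score(score + margin)
print(f"score {score:.1%} +/- {margin:.1%}, "
      f"Elo difference between {low:+.0f} and {high:+.0f}")
```

For this made-up 150-game result the margin comes out at roughly 6 percentage points either way, which is larger than most of the per-opponent changes in the table quoted above.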

With a proper Bayeselo calculation the top engines usually go up a few Elo. Therefore I assume that the final result might be 15 to 20 Elo better than 2.3.1 (which was basically at the same level as 2.2.2). Usually my results are quite in line with the other major lists, so I would guess +20 Elo at best (probably lower)!

Time will tell :-)

Bye
Ingo
Ajedrecista
Posts: 2176
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Re: Stockfish 3 running for the IPON.

Post by Ajedrecista »

Hello Ingo:

After 2700 games:

Code:

Stock3

Stockfish 3 JA SSE4.2 - Houdini 3 STD (3082)             56.0 - 94.0   37.33%  Perf=2993 
Stockfish 3 JA SSE4.2 - Komodo 5 (3001)                  70.0 - 80.0   46.67%  Perf=2978 
Stockfish 3 JA SSE4.2 - Critter 1.4a (2978)              77.5 - 72.5   51.67%  Perf=2989 
Stockfish 3 JA SSE4.2 - Deep Rybka 4.1 (2958)            82.5 - 67.5   55.00%  Perf=2992 
Stockfish 3 JA SSE4.2 - Gull II (2928)                   85.5 - 64.5   57.00%  Perf=2976 
Stockfish 3 JA SSE4.2 - Chiron 1.5 (2843)               102.0 - 48.0   68.00%  Perf=2973 
Stockfish 3 JA SSE4.2 - Protector 1.5.0 (2839)          107.5 - 42.5   71.67%  Perf=3000 
Stockfish 3 JA SSE4.2 - Naum 4.2 (2831)                 107.0 - 43.0   71.33%  Perf=2989 
Stockfish 3 JA SSE4.2 - Hannibal 1.3 (2827)             109.0 - 41.0   72.67%  Perf=2996 
Stockfish 3 JA SSE4.2 - HIARCS 14 WCSC 32b (2817)       107.0 - 43.0   71.33%  Perf=2975 
Stockfish 3 JA SSE4.2 - Deep Shredder 12 (2800)         105.0 - 45.0   70.00%  Perf=2947 
Stockfish 3 JA SSE4.2 - Deep Sjeng c't 2010 32b (2783)  113.5 - 36.5   75.67%  Perf=2980 
Stockfish 3 JA SSE4.2 - Spike 1.4 32b (2772)            123.0 - 27.0   82.00%  Perf=3035 
Stockfish 3 JA SSE4.2 - spark-1.0 (2761)                122.5 - 27.5   81.67%  Perf=3020 
Stockfish 3 JA SSE4.2 - Deep Junior 13.3 (2748)         121.0 - 29.0   80.67%  Perf=2996 
Stockfish 3 JA SSE4.2 - Quazar 0.4 (2730)               127.0 - 23.0   84.67%  Perf=3026 
Stockfish 3 JA SSE4.2 - Toga II 3.0 32b (2715)          124.0 - 26.0   82.67%  Perf=2986 
Stockfish 3 JA SSE4.2 - Zappa Mexico II (2704)          121.5 - 28.5   81.00%  Perf=2955 
                                                       1861.5 - 838.5  68.94%  Perf=2977 

2700 out of 2700 games played
I know that this performance of 2977 is a mere estimate that usually does not match the definitive rating given by BayesElo. Doing some strange interpolations, I get around 2979 for the definitive list (I know that the games against Fritz are still missing). Could you provide a provisional BayesElo rating after those 2700 games, please? Thanks in advance.

Congratulations to SF team! ;)

Regards from Spain.

Ajedrecista.
IWB
Posts: 1539
Joined: Thu Mar 09, 2006 2:02 pm

Re: Stockfish 3 running for the IPON.

Post by IWB »

Ajedrecista wrote:Hello Ingo:

I know that this performance of 2977 is a mere estimate that usually does not match the definitive rating given by BayesElo. Doing some strange interpolations, I get around 2979 for the definitive list (I know that the games against Fritz are still missing). Could you provide a provisional BayesElo rating after those 2700 games, please? Thanks in advance.

Congratulations to SF team! ;)

Regards from Spain.

Ajedrecista.
The games vs Fritz are already finished (I can see that via a remote connection), and I will give a final result in a few hours, as soon as I am home. Your guess is not too bad; I am sure it will be in that region.

Thx for your interest
Ingo