Stockfish 2.1 running for the IPON

Discussion of computer chess matches and engine tournaments.

Moderator: Ras

Graham Banks
Posts: 45235
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Re: ELO-LaOla

Post by Graham Banks »

Frank Quisinsky wrote:Hi Graham,

could you send me all the CCRL Stockfish games (older versions 1.7, 1.8, 1.9 and 2.0 too, x86 and x64 with 1 core), or a URL? :-) I will search a little bit at the start of next week.

Have a nice weekend!

Best
Frank
Hi Frank,

the downloads by engine only get updated once each month. Here is the latest:

http://computerchess.org.uk/ccrl/4040/games.html

Hope you have a great weekend too. :)

Cheers,
Graham.
gbanksnz at gmail.com
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: Stockfish 2.1 running for the IPON

Post by mcostalba »

IWB wrote:as usual:

http://www.inwoba.de

Have fun
Ingo
Hi Ingo,

We had some reports of an unusually high number of time losses. Did you experience any losses on time?

Also, do you completely restart the engines between one game and the next, or do you leave the process (the executable) running and just send a "new game" command?

I have to admit the results are a bit disappointing, because our internal tests gave around +30 ELO. But I think your results are more accurate, because they are run at a longer TC and against many more opponents, so perhaps we have to rethink our testing strategy :-(

Thanks a lot for your tests! And very sorry for wasting your time: if we had known we were just at +10, we very probably would not have released.

Marco
ernest
Posts: 2053
Joined: Wed Mar 08, 2006 8:30 pm

Re: Stockfish 2.1 running for the IPON

Post by ernest »

mcostalba wrote:I have to admit results are a bit disappointing because our internal tests gave around +30 ELO
In direct matches against DeepRybka 4.1 I find that Stockfish 2.1 has reached equality, which is quite an improvement!
Here is a 24-move execution by Stockfish with opposite-side castling... :D

[Event "DR4.1 x64 - SF21 x64 (PB5m), Blitz:2'+1"]
[Site "EB-PC"]
[Date "2011.05.05"]
[Round "30"]
[White "Stockfish 2.1 JA 64bit"]
[Black "Deep Rybka 4.1 x64"]
[Result "1-0"]
[ECO "B01"]
[Annotator "0.84;0.60"]
[PlyCount "47"]
[TimeControl "120+1"]

{Intel(R) Core(TM)2 CPU          6600  @ 2.40GHz 3000 MHz  W=17.7 plies; 1
 883kN/s; PB5moves.ctg  B=12.3 plies; 158kN/s; PB5moves.ctg} 1. e4 {[%eval 0,0]
[%emt 0:00:00]} d5 {[%eval 0,0] [%emt 0:00:00]} 2. exd5 {[%eval 0,0] [%emt 0:
00:00]} Qxd5 {[%eval 0,0] [%emt 0:00:00]} 3. Nc3 {[%eval 0,0] [%emt 0:00:00]}
Qa5 {[%eval 0,0] [%emt 0:00:00]} 4. d4 {[%eval 0,0] [%emt 0:00:00]} Nf6 {
[%eval 0,0] [%emt 0:00:00]} 5. Nf3 {[%eval 0,0] [%emt 0:00:00]} Bf5 {[%eval 0,
0] [%emt 0:00:00] Both last book move} 6. Ne5 {[%eval 84,16] [%emt 0:00:03]}
Be6 {[%eval 60,13] [%emt 0:00:04]} 7. Rb1 {[%eval 84,17] [%emt 0:00:04]} Nd5 {
[%eval 69,13] [%emt 0:00:06] (Nbd7)} 8. Bd2 {[%eval 101,18] [%emt 0:00:06] (b4)
} Qb6 {[%eval 48,14] [%emt 0:00:08] (Nd7)} 9. Nf3 {[%eval 96,19] [%emt 0:00:05]
} Nd7 {[%eval 48,14] [%emt 0:00:03]} 10. Be2 {[%eval 96,19] [%emt 0:00:10] 
(Bd3)} O-O-O {[%eval 58,13] [%emt 0:00:05] (c6)} 11. Na4 {[%eval 133,17] [%emt
0:00:03]} Qd6 {[%eval 76,13] [%emt 0:00:04]} 12. c4 {[%eval 169,18] [%emt 0:00:
08]} N5f6 {[%eval 76,14] [%emt 0:00:06]} 13. O-O {[%eval 185,18] [%emt 0:00:06]
} Ne4 {[%eval 92,12] [%emt 0:00:03] (Bf5)} 14. Be3 {[%eval 197,17] [%emt 0:00:
03]} Bf5 {[%eval 104,13] [%emt 0:00:03]} 15. c5 {[%eval 282,17] [%emt 0:00:04]
(Rc1)} Qf6 {[%eval 108,13] [%emt 0:00:10] (Qd5)} 16. Rc1 {[%eval 298,19] [%emt
0:00:04] (Bd3)} c6 {[%eval 186,11] [%emt 0:00:06] (Nb8)} 17. Qe1 {[%eval 476,
18] [%emt 0:00:05]} e5 {[%eval 194,11] [%emt 0:00:04] (Re8)} 18. Qa5 {[%eval
525,17] [%emt 0:00:03]} Kb8 {[%eval 194,11] [%emt 0:00:05]} 19. dxe5 {[%eval
573,17] [%emt 0:00:03]} Qe7 {[%eval 294,11] [%emt 0:00:04]} 20. Nd4 {[%eval
557,18] [%emt 0:00:04] (Bd3)} g6 {[%eval 315,12] [%emt 0:00:05] (Re8)} 21. Nb5
{[%eval 840,17] [%emt 0:00:06] (f3)} cxb5 {[%eval 792,10] [%emt 0:00:06] (a6)}
22. c6 {[%eval 1333,18] [%emt 0:00:03]} Nb6 {[%eval 994,11] [%emt 0:00:05]} 23.
c7+ {[%eval 1410,18] [%emt 0:00:02]} Qxc7 {[%eval 1025,12] [%emt 0:00:02]} 24.
Rxc7 {[%eval 1426,18] [%emt 0:00:01]} 1-0
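
The engine comments in the game above appear to use a `[%eval score,depth]` layout, with the score in centipawns and the second number being the search depth. That reading is an assumption, but it matches the header: the first evaluations after the book (84 and 60) agree with the annotator tag "0.84;0.60", and the second numbers sit close to the reported average depths of 17.7 and 12.3 plies. A minimal sketch for pulling those pairs out of the game text, allowing for tags that are wrapped across lines (the function name is made up for illustration):

```python
import re

# Assumed layout: [%eval <centipawns>,<depth>]; whitespace or a line break
# may appear after "%eval" or after the comma because the PGN is wrapped.
EVAL_RE = re.compile(r"\[%eval\s+(-?\d+)\s*,\s*(\d+)\]")


def extract_evals(pgn_text):
    """Return a list of (centipawns, depth) tuples in move order."""
    return [(int(cp), int(depth)) for cp, depth in EVAL_RE.findall(pgn_text)]


# Three annotations copied from the game above, one of them line-wrapped.
sample = (
    "6. Ne5 {[%eval 84,16] [%emt 0:00:03]}\n"
    "Be6 {[%eval 60,13] [%emt 0:00:04]} 17. Qe1 {[%eval 476,\n"
    "18] [%emt 0:00:05]}"
)

for centipawns, depth in extract_evals(sample):
    print(f"{centipawns / 100:+.2f} at depth {depth}")
```

Since White and Black alternate, even-indexed entries of the result belong to White and odd-indexed entries to Black when the whole game is passed in.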
IWB
Posts: 1539
Joined: Thu Mar 09, 2006 2:02 pm

Re: Stockfish 2.1 running for the IPON

Post by IWB »

Stockfish 2.1 Test finished.

http://www.inwoba.de

Initial increase of 7 Elo. I consider this release mainly an engine code cleanup/bug fix.

Bye
Ingo
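
To put a +7 Elo result over 2000 games in perspective, here is a rough sketch of the statistical error margin for a sample of that size. It is not how the IPON list is calculated: it treats all games as one match, uses the standard logistic Elo formula, and the 50% score and 40% draw rate are hypothetical inputs chosen only for illustration.

```python
import math


def elo_from_score(score):
    """Elo difference implied by a score fraction (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_margin_95(score, draw_rate, games):
    """Approximate 95% error margin in Elo for a single match result."""
    win = score - draw_rate / 2.0
    loss = 1.0 - win - draw_rate
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (win * (1.0 - score) ** 2
                + draw_rate * (0.5 - score) ** 2
                + loss * score ** 2)
    std_error = math.sqrt(variance / games)
    low = elo_from_score(score - 1.96 * std_error)
    high = elo_from_score(score + 1.96 * std_error)
    return (high - low) / 2.0


# Hypothetical inputs: 50% score, 40% draws, 2000 games.
print(round(elo_margin_95(0.5, 0.4, 2000), 1))
```

Under these assumptions the margin comes out at roughly ±12 Elo, so a measured gain of 7 Elo and a true gain of 10 or more are hard to tell apart with 2000 games.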
IWB
Posts: 1539
Joined: Thu Mar 09, 2006 2:02 pm

Re: Stockfish 2.1 running for the IPON

Post by IWB »

Hello Marco,
mcostalba wrote:
We had some reports of an unusually high number of time losses. Did you experience any losses on time?
First I have to mention that I do not correct losses on time for my list, as I consider this a feature of the engine, or at least something that would score no point during a WC. So I do not see any reason to correct this. (Especially with an increment of +3, I do not see any reason to help an engine get points!)

In the 2000 games I have 26 losses on time: 14 in favor of Stockfish, 12 against it. So Stockfish is overall the worst engine in that 2000-game set ...! (I did not check whether they were draws in reality, just the pure fact.) (There is another engine which has 4 losses and has played just 100 games ...)
mcostalba wrote: Also, do you completely restart the engines between one game and the next, or do you leave the process (the executable) running and just send a "new game" command?
I play with the Shredder Classic GUI, which restarts the engines after every single game! (Unloaded from memory and reloaded!)
mcostalba wrote: I have to admit the results are a bit disappointing, because our internal tests gave around +30 ELO. But I think your results are more accurate, because they are run at a longer TC and against many more opponents, so perhaps we have to rethink our testing strategy :-(
I remember as well that someone on your team was surprised to see an increase in 1.6 or 1.7 while your testing showed nothing. I guess an increase is accepted, while no increase causes trouble :-)
Personally, I have some doubts about this ultra-short time control testing. I have the feeling that there is a lower limit (which is much lower than my 5 + 3) for detecting a real increase in Elo.
mcostalba wrote: Thanks a lot for your tests! And very sorry for wasting your time: if we had known we were just at +10, we very probably would not have released.
Thanks for the support. I do not have a problem, but I have to check which Stockfish version to throw out, as I want to avoid too much of one flavor. That would definitely spoil the soup :-)


Anyhow - keep on working, I like Stockfish very much!

Have a nice weekend
Ingo
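
For reference, the time-loss figures quoted in this post work out as below. The split assumes that "in favor of Stockfish" means games in which the opponent lost on time and "against it" means games in which Stockfish itself lost on time; the variable names are only illustrative.

```python
GAMES = 2000
TIME_LOSSES_TOTAL = 26
TIME_LOSSES_BY_OPPONENTS = 14  # assumed reading of "in favor of Stockfish"
TIME_LOSSES_BY_STOCKFISH = 12  # assumed reading of "against it"


def rate(count, games=GAMES):
    """Fraction of games that ended in a loss on time."""
    return count / games


for label, count in [
    ("all time losses", TIME_LOSSES_TOTAL),
    ("opponents lost on time", TIME_LOSSES_BY_OPPONENTS),
    ("Stockfish lost on time", TIME_LOSSES_BY_STOCKFISH),
]:
    print(f"{label}: {count}/{GAMES} = {rate(count):.1%}")

# all time losses: 26/2000 = 1.3%
# opponents lost on time: 14/2000 = 0.7%
# Stockfish lost on time: 12/2000 = 0.6%
```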
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Stockfish 2.1 running for the IPON

Post by Laskos »

IWB wrote:Hello Marco,

In the 2000 games I have 26 losses on time: 14 in favor of Stockfish, 12 against it. So Stockfish is overall the worst engine in that 2000-game set ...! (I did not check whether they were draws in reality, just the pure fact.) (There is another engine which has 4 losses and has played just 100 games ...)


I play with the Shredder Classic GUI, which restarts the engines after every single game! (Unloaded from memory and reloaded!)

Ingo
Do you generally have this rate of time losses at 5m + 3s? I tested SF 2.1 at 1s + 0.1s in 6600 games, not a single time loss, not a single illegal move loss. Did you check for illegal move losses? Usually, when I have this rate of time losses, I stop testing at that TC or in that GUI.

Kai
Albert Silver
Posts: 3026
Joined: Wed Mar 08, 2006 9:57 pm
Location: Rio de Janeiro, Brazil

Re: Stockfish 2.1 running for the IPON

Post by Albert Silver »

Laskos wrote:
IWB wrote:Hello Marco,

In the 2000 games I have 26 losses on time: 14 in favor of Stockfish, 12 against it. So Stockfish is overall the worst engine in that 2000-game set ...! (I did not check whether they were draws in reality, just the pure fact.) (There is another engine which has 4 losses and has played just 100 games ...)


I play with the Shredder Classic GUI, which restarts the engines after every single game! (Unloaded from memory and reloaded!)

Ingo
Do you generally have this rate of time losses at 5m + 3s? I tested SF 2.1 at 1s + 0.1s in 6600 games, not a single time loss, not a single illegal move loss. Did you check for illegal move losses? Usually, when I have this rate of time losses, I stop testing at that TC or in that GUI.

Kai
I agree that the rate of time losses is huge, and it is made even more incomprehensible by the fact that a generous increment was used too. For me, a time loss in this case means there is something seriously wrong.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
IWB
Posts: 1539
Joined: Thu Mar 09, 2006 2:02 pm

Re: Stockfish 2.1 running for the IPON

Post by IWB »

Hi

Yes, I have that rate of time losses, and 1% is actually nothing I worry about.

In general, ALL engines have time losses (or at least the majority of them). Some engines more, some less. I play so many games that I have gotten used to it. :-)

There are different kinds of time losses. One kind, like here for Stockfish, is something I do not care about at all, because it is 0.6% AND the engine continues to play, which is very important in my automatic setup. So I consider this a feature of an engine which is part of its playing strength. Another kind is a loss on time with a crash which prevents the GUI from continuing, so that I have to stop and restart some parts of the tourney again and again.
Since I accept time losses as something normal (I am not a developer, so why should I worry? It happens to humans at 5 + 3 too), I only start to worry at a certain level, which is different for a 2600 or a 3000 Elo engine.
In short, I don't see why a tester should worry about time losses as long as the game series continues.

Things are different if I am beta testing or if I were developing an engine, of course ...

Bye
Ingo
Graham Banks
Posts: 45235
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Re: Stockfish 2.1 running for the IPON

Post by Graham Banks »

IWB wrote:Hi

Yes, I have that rate of time losses, and 1% is actually nothing I worry about.

In general, ALL engines have time losses (or at least the majority of them). Some engines more, some less. I play so many games that I have gotten used to it. :-)

There are different kinds of time losses. One kind, like here for Stockfish, is something I do not care about at all, because it is 0.6% AND the engine continues to play, which is very important in my automatic setup. So I consider this a feature of an engine which is part of its playing strength. Another kind is a loss on time with a crash which prevents the GUI from continuing, so that I have to stop and restart some parts of the tourney again and again.
Since I accept time losses as something normal (I am not a developer, so why should I worry? It happens to humans at 5 + 3 too), I only start to worry at a certain level, which is different for a 2600 or a 3000 Elo engine.
In short, I don't see why a tester should worry about time losses as long as the game series continues.

Things are different if I am beta testing or if I were developing an engine, of course ...

Bye
Ingo
Hi Ingo,

are the games that are lost on time included in your ratings calculations?

Cheers,
Graham.
gbanksnz at gmail.com
IWB
Posts: 1539
Joined: Thu Mar 09, 2006 2:02 pm

Re: Stockfish 2.1 running for the IPON

Post by IWB »

Hi Graham,
Graham Banks wrote: are the games that are lost on time included in your ratings calculations?
I think this is clear from what I have already written, but to make it even clearer: of course! And I consider not including these games a mistake, as the loss is something the engine produced and it belongs to its playing strength. Correcting that manually means giving engines points which they did not manage to get on the first try, which distorts a rating. In short: correcting losses on time would be simply wrong!
Show me a tourney where someone loses on time and gets a second chance?

But I keep an eye on that. If there is an engine which is too bad because of an obvious bug, I do not include it (this just happened a few weeks ago). None of the engines in my list would gain 10 Elo if all losses on time were calculated correctly. Which means that this engine feature is below the average resolution of my list.

Bye
Ingo
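
The claim that no engine would gain 10 Elo from corrected time losses can be checked with a back-of-the-envelope calculation. The sketch below uses the standard logistic Elo formula and the 12 Stockfish time losses in 2000 games mentioned earlier in the thread. The baseline scores (50% and 70%) are hypothetical, and the calculation assumes the most generous case in which every lost game is turned into a win.

```python
import math


def elo_from_score(score):
    """Elo difference implied by a score fraction (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_gain_if_corrected(base_score, time_losses, games, points_per_game=1.0):
    """Elo gained if each time loss were rescored with points_per_game."""
    corrected = base_score + time_losses * points_per_game / games
    return elo_from_score(corrected) - elo_from_score(base_score)


# Hypothetical baseline scores; 12 time losses in 2000 games as in the thread.
for base in (0.5, 0.7):
    as_wins = elo_gain_if_corrected(base, 12, 2000, points_per_game=1.0)
    as_draws = elo_gain_if_corrected(base, 12, 2000, points_per_game=0.5)
    print(f"base {base:.0%}: +{as_wins:.1f} Elo as wins, +{as_draws:.1f} Elo as draws")
```

Even with every time loss rescored as a win, the gain under these assumptions is around 4 to 5 Elo, which is in line with the statement that the effect stays below 10 Elo.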