see http://cegt.foren-city.de/topic,236,-co ... h-1-3.html
Uri
first CEGT result of stockfish is not good
Moderator: Ras
-
- Posts: 10900
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
-
- Posts: 2994
- Joined: Wed Mar 08, 2006 10:09 pm
- Location: Germany
- Full name: Werner Schüle
Re: first CEGT result of stockfish is not good
Hi,
I hope these are only some statistical errors! To be sure I will change the GUI to Arena 2.01 for the next matches.
Here is a short update:
Stockfish 1.3 x64 1CPU - Fruit 2.3.5m w32 1CPU = 15-15 (2897)
Stockfish 1.3 x64 2CPU - Fruit 2.3.5m x64 2CPU = 23.5-21.5 (2922)
Stockfish 1.3 x64 4CPU - Deep Sjeng WC 2008 x64 4CPU = 11-7 (2874)

Werner
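To see how much room for statistical error match lengths like these leave, one can convert each score into an Elo difference with a rough error margin. This is only a sketch: it uses the usual logistic Elo formula and a normal approximation, and because the post does not give the win/draw/loss split it assumes no draws, which gives the widest interval. The function names are made up for illustration.

```python
import math

def elo_diff(score_fraction):
    """Elo difference implied by a score fraction (logistic Elo model)."""
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))

def elo_with_margin(points, games, z=1.96):
    """Return (elo, low, high) from a normal approximation of the score.

    Assumption: no draws, so the per-game variance is s * (1 - s).
    Real matches with draws would give a somewhat narrower interval.
    """
    s = points / games
    se = math.sqrt(s * (1.0 - s) / games)
    low = max(s - z * se, 1e-6)          # clamp to keep the logarithm defined
    high = min(s + z * se, 1.0 - 1e-6)
    return elo_diff(s), elo_diff(low), elo_diff(high)

# Scores from the post above: (points for Stockfish, games played)
matches = {
    "1CPU vs Fruit 2.3.5m": (15.0, 30),
    "2CPU vs Fruit 2.3.5m": (23.5, 45),
    "4CPU vs Deep Sjeng WC 2008": (11.0, 18),
}

for name, (points, games) in matches.items():
    elo, low, high = elo_with_margin(points, games)
    print(f"{name}: {elo:+.0f} Elo (approx. 95% interval {low:+.0f} to {high:+.0f})")
```

Under these assumptions the interval for the 30-game 15-15 result reaches roughly 130 Elo in either direction, so a handful of short matches cannot say much yet.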
-
- Posts: 2684
- Joined: Sat Jun 14, 2008 9:17 pm
Re: first CEGT result of stockfish is not good
Werner wrote:
Hi,
I hope these are only some statistical errors! To be sure I will change the GUI to Arena 2.01 for the next matches.
Here is a short update:
Stockfish 1.3 x64 1CPU - Fruit 2.3.5m w32 1CPU = 15-15 (2897)
Stockfish 1.3 x64 2CPU - Fruit 2.3.5m x64 2CPU = 23.5-21.5 (2922)
Stockfish 1.3 x64 4CPU - Deep Sjeng WC 2008 x64 4CPU = 11-7 (2874)

Stockfish is too slow for Arena... it often loses on time.
Werner, as a testing expert, I would like to ask you a question about games lost on time.
In my internal tests I came up with a trick to avoid losing on time, or losing or winning by accident.
I call it "Last Seconds Noise filtering". It works like this: when only a few seconds remain before the time limit and one engine's score is very high (and the other's very low), the game is adjudicated to the engine with the advantage. This avoids situations where, in the final rush, the winning engine blunders and loses or draws a practically won game even though it played better.
And now the question.
In your tests, do you take into account the percentage of wins and losses on time out of the total number of wins and losses?
Do you have some kind of trigger or warning to avoid this kind of (seemingly) artifact?
Thanks
Marco
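The "Last Seconds Noise filtering" idea could look roughly like the sketch below. This is not Marco's actual code: the thresholds (3 seconds, 500 centipawns), the `EngineState` type and the function name are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class EngineState:
    seconds_left: float  # remaining clock time
    score_cp: int        # last reported evaluation in centipawns, own point of view

def adjudicate_last_seconds(white: EngineState, black: EngineState,
                            time_threshold: float = 3.0,
                            score_threshold: int = 500) -> Optional[str]:
    """Return "1-0", "0-1", or None when the game should simply continue.

    Both thresholds are hypothetical values, not taken from the thread.
    """
    # Only act when at least one clock is about to run out.
    if min(white.seconds_left, black.seconds_left) > time_threshold:
        return None
    # Both engines must agree: one sees a big plus, the other a big minus.
    if white.score_cp >= score_threshold and black.score_cp <= -score_threshold:
        return "1-0"
    if black.score_cp >= score_threshold and white.score_cp <= -score_threshold:
        return "0-1"
    return None
```

Requiring both evaluations to agree keeps the filter from awarding a game on the strength of one engine's optimistic score alone.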
-
- Posts: 2994
- Joined: Wed Mar 08, 2006 10:09 pm
- Location: Germany
- Full name: Werner Schüle
Re: first CEGT result of stockfish is not good
mcostalba wrote:
In your tests, do you take into account the percentage of wins and losses on time out of the total number of wins and losses?
Do you have some kind of trigger or warning to avoid this kind of (seemingly) artifact?
Thanks
Marco

Hi Marco,
a) I do not count games lost on time. Normally I delete these games, and if there are too many I stop the test and contact the author. I try to avoid it, of course.
b) The games sent to the CEGT admin for the list are always checked for
- duplicates ("doublettes"), - losses on time, - line results; and sometimes the admin runs these checks on the whole database too. I think Leo publishes such reports in his forum.
At the moment I am running the Stockfish matches inside the Chessbase GUI, with no losses on time so far. So if there are problems with Arena 2.01, I will not switch. Perhaps I will run a test with the Shredder GUI?
Werner
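A screening pass along the lines of a) and b) might be sketched as follows. The `Game` record, the termination labels (modelled on the PGN Termination tag values) and the 2% stop threshold are assumptions for illustration, not CEGT's actual tooling; the "line results" check is not covered because the post does not describe it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Game:
    white: str
    black: str
    moves: str        # move text, e.g. "e4 e5 Nf3"
    result: str       # "1-0", "0-1" or "1/2-1/2"
    termination: str  # "normal" or "time forfeit" (assumed labels)

def screen(games, max_time_loss_rate=0.02):
    """Split games into (kept, rejected) and say whether to stop the test.

    max_time_loss_rate is a hypothetical threshold for "too many"
    losses on time; the thread does not give a number.
    """
    seen = set()
    kept, rejected = [], []
    for game in games:
        key = (game.white, game.black, game.moves)
        # Reject exact repeats of an earlier game and games decided on time.
        if key in seen or game.termination == "time forfeit":
            rejected.append(game)
        else:
            kept.append(game)
        seen.add(key)
    time_losses = sum(g.termination == "time forfeit" for g in games)
    stop = bool(games) and time_losses / len(games) > max_time_loss_rate
    return kept, rejected, stop
```

When `stop` comes back true, the tester would pause the match and contact the engine author instead of continuing to discard games.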
-
- Posts: 719
- Joined: Thu Mar 09, 2006 1:21 am
- Location: Portland Oregon
Re: first CEGT result of stockfish is not good
mcostalba wrote:
Stockfish is too slow for Arena... it often loses on time.
Werner, as a testing expert, I would like to ask you a question about games lost on time.
In my internal tests I came up with a trick to avoid losing on time, or losing or winning by accident.
I call it "Last Seconds Noise filtering". It works like this: when only a few seconds remain before the time limit and one engine's score is very high (and the other's very low), the game is adjudicated to the engine with the advantage. This avoids situations where, in the final rush, the winning engine blunders and loses or draws a practically won game even though it played better.
And now the question.
In your tests, do you take into account the percentage of wins and losses on time out of the total number of wins and losses?
Do you have some kind of trigger or warning to avoid this kind of (seemingly) artifact?
Thanks
Marco

Losses on time, in my experience, are rarely the fault of the engine. To compensate for Windows, try adding a small time buffer so that the engine believes it has slightly less time. Best would be to use Linux and Cutechess-cli, though.
-
- Posts: 9773
- Joined: Wed Mar 08, 2006 8:44 pm
- Location: Amman,Jordan
Re: first CEGT result of stockfish is not good
Werner wrote:
Hi Marco,
a) I do not count games lost on time. Normally I delete these games, and if there are too many I stop the test and contact the author. I try to avoid it, of course.
b) The games sent to the CEGT admin for the list are always checked for
- duplicates ("doublettes"), - losses on time, - line results; and sometimes the admin runs these checks on the whole database too. I think Leo publishes such reports in his forum.
At the moment I am running the Stockfish matches inside the Chessbase GUI, with no losses on time so far. So if there are problems with Arena 2.01, I will not switch. Perhaps I will run a test with the Shredder GUI?

This is what I have been saying for years... Games lost on time must be deleted and not included in the rating list... A chess engine must not lose on time; this is typical for humans...
Dr.D
No one can hit as hard as life. But it ain’t about how hard you can hit. It’s about how hard you can get hit and keep moving forward. How much you can take and keep moving forward….
-
- Posts: 9773
- Joined: Wed Mar 08, 2006 8:44 pm
- Location: Amman,Jordan
Re: first CEGT result of stockfish is not good
Ryan Benitez wrote:
Losses on time, in my experience, are rarely the fault of the engine. To compensate for Windows, try adding a small time buffer so that the engine believes it has slightly less time. Best would be to use Linux and Cutechess-cli, though.

Exactly my point

Dr.D
-
- Posts: 10900
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: first CEGT result of stockfish is not good
Dr.Wael Deeb wrote:
This is what I have been saying for years... Games lost on time must be deleted and not included in the rating list... A chess engine must not lose on time; this is typical for humans...
Dr.D

You can take engines that lose on time (more often than 1 game out of 1000) out of the rating list, but deleting losses on time is not fair, because then authors can tell their engine to lose on time instead of resigning in order to earn rating points.
Uri
-
- Posts: 10900
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: first CEGT result of stockfish is not good
Ryan Benitez wrote:
Losses on time, in my experience, are rarely the fault of the engine. To compensate for Windows, try adding a small time buffer so that the engine believes it has slightly less time. Best would be to use Linux and Cutechess-cli, though.

I think that cases where Windows is guilty of a loss on time rarely happen, except at very fast time controls that CEGT does not use.
I also think that it is the responsibility of the authors to make sure that the engine believes it has slightly less time, where "slightly" means at least 1% of the remaining time (you are not going to lose even 1 Elo by playing 1% faster in the first moves, especially since playing 1% faster in the first moves means that you have more time later).
Uri
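The time buffer discussed here can be sketched as a small change to the engine's time allocation. This is a deliberately simplified illustration, not how Stockfish or any other engine actually manages time: the even split over `moves_to_go` and the 50 ms floor are assumptions, and only the 1% figure comes from the post above.

```python
def time_for_move(remaining_ms, moves_to_go=30,
                  safety_fraction=0.01, min_buffer_ms=50):
    """Time budget for the next move after reserving a safety buffer.

    safety_fraction=0.01 follows the "at least 1% of the remaining time"
    suggestion; moves_to_go and min_buffer_ms are hypothetical parameters.
    """
    # Pretend the clock shows slightly less time than it really does.
    buffer_ms = max(remaining_ms * safety_fraction, min_buffer_ms)
    usable_ms = max(remaining_ms - buffer_ms, 0)
    # Spread the usable time evenly over the moves still expected.
    return usable_ms / moves_to_go

# With one minute left, about 600 ms is held back for GUI or OS delays.
print(time_for_move(60_000))
```

Because the buffer is a fraction of the remaining time, it shrinks as the clock runs down, which is why the fixed floor is added for the last few seconds.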
-
- Posts: 2994
- Joined: Wed Mar 08, 2006 10:09 pm
- Location: Germany
- Full name: Werner Schüle
Re: first CEGT result of stockfish is not good
Uri Blass wrote:
You can take engines that lose on time (more often than 1 game out of 1000) out of the rating list, but deleting losses on time is not fair, because then authors can tell their engine to lose on time instead of resigning in order to earn rating points.
Uri

Hi Uri,
good idea, but I think it would not work, because the test will be stopped, and of course I look at all games lost on time.
And I forgot to mention: if an engine has a won position and the other engine loses on time, I do not delete this game; it counts.
Werner
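The exception in the last point can be written as a one-line rule. The sketch below assumes the tester has the winner's last evaluation available in centipawns; the 500 centipawn threshold for a "won position" and the function name are hypothetical, since in practice Werner inspects these games by hand.

```python
def keep_time_loss(winner_score_cp, won_threshold_cp=500):
    """Return True if a game decided on time should still count.

    winner_score_cp: last evaluation reported by the engine that won on
    time, from its own point of view (assumed to be available).
    """
    # The game counts only when the winner was already clearly winning.
    return winner_score_cp >= won_threshold_cp
```

A rule like this also answers the objection about engines losing on time on purpose: flagging in a lost position still costs the point.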