Kempelen wrote: Strange things happen on my test computer. For a few months now, almost all my test tournaments have ended with a score of 46%/47% for my engine. Then at some point I started to repeat those tournaments (because I suspected wrong behaviour), and all of them now show a score of 44%-45%. So, having changed nothing, my engine is scoring 15 or 20 Elo points less.
The fact is I have done nothing to the computer. It is an isolated machine: no antivirus, no background services, not used for anything else, nothing unusual in the process list, etc.
This happened to me once in the past, and I had to repeat my last tournament and take those results as the baseline for the next one. But my question is: how is this possible? And why are other engines scoring better while mine is not? Have you experienced this type of behaviour, and how do you deal with it?
Thanks.
P.S.: Tournaments are 15,000 games.
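The "15 or 20 Elo" figure can be sanity-checked with the standard logistic score-to-Elo conversion. A minimal sketch (the formula is the usual Elo expectation inverted; the specific numbers are my own illustration, not from the thread):

```python
import math

def score_to_elo(score: float) -> float:
    """Convert an expected score (0..1) into an Elo difference
    using the standard logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

# A drop from 46% to 44% corresponds to roughly 14 Elo;
# from 47% to 44% it is roughly 21 Elo -- consistent with the
# "15 or 20 Elo points" estimate above.
drop_low = score_to_elo(0.46) - score_to_elo(0.44)
drop_high = score_to_elo(0.47) - score_to_elo(0.44)
print(round(drop_low, 1), round(drop_high, 1))  # 14.0 21.0
```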
Can you describe the actual conditions? How many games? How many opponents? How many starting positions? etc. If you are talking 30K games, that is strange. If you are talking 300, it is not nearly so strange.
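To put a number on why 300 games and 30K games are such different worlds, here is a rough sketch (my own addition, using a simple binomial error model that treats each game as independent and ignores draws and position pairing, so real-world variance will differ somewhat):

```python
import math

def score_stderr(score: float, games: int) -> float:
    """Approximate 1-sigma error of a match score fraction,
    modelling each game as an independent Bernoulli trial."""
    return math.sqrt(score * (1.0 - score) / games)

# With 15,000 games, 1 sigma is about 0.4% of score, so a 2% swing
# (46% -> 44%) would be roughly a 5-sigma event -- genuinely strange.
# With 300 games, 1 sigma is about 2.9%, so the same swing is noise.
print(round(score_stderr(0.46, 15000) * 100, 2))  # 0.41
print(round(score_stderr(0.46, 300) * 100, 2))    # 2.88
```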
The number of games is indicated in the quote you mention, namely 15k
What about the rest?
Always use the same starting positions against the same opponents? Or are these really tournaments, where a bit of randomness can shuffle pairings?
I don't do a "tournament format" for testing. I use the same set of positions, same set of opponents, and see reasonable stability in the Elo numbers. Otherwise I could not trust anything at all.
One thing I do have to do on occasion. If I upgrade to a new compiler, I re-run all the old version matches since a compiler change can influence games in significant ways.
In my testing, which is highly reproducible within the expected Elo variance, everything stays the same except for the new version of my code I am testing. Same compiler, same hardware, same time controls, nothing running whatsoever except for my code, etc...
Thanks Robert for these insights. I also use the same set of positions (400) x 25 different opponents (x 2 colors) = 20,000 games, but I usually stop the test at about 15,000 games because that is usually enough, and since it takes about 24 hours it is good to start thinking about the next improvements. Time control is 30 seconds per game, all on a four-core machine with 6GB RAM. As with you, this is usually highly reproducible within the expected Elo variance.
Like Marcel, I think I am also going to back up my compiler settings.