As I've stated before, I get much more repeatable results from
my tests than Bob or some of the rest of you.
At least I used to.
Then I added two features: easy move and failing low timer extension.
I think the timer extension is the issue.
Bob,
You might try turning that off in Crafty and testing the statistical
significance again to see if that is indeed the culprit.
Statistical Significance
-
- Posts: 2091
- Joined: Mon Mar 13, 2006 2:31 am
- Location: North Carolina, USA
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Statistical Significance
CRoberson wrote:
As I've stated before, I get much more repeatable results from
my tests than Bob or some of the rest of you.
At least I used to.
Then I added two features: easy move and failing low timer extension.
I think the timer extension is the issue.
Bob,
You might try turning that off in Crafty and testing the statistical
significance again to see if that is indeed the culprit.
It isn't.
In fact, I have run thousands of games where the search is limited by number of nodes. For example, limit the game to 3,000,000 nodes per search for Crafty vs. Crafty. Then re-run the same 160 games with 3,001,000 (1,000 nodes more) and the results vary significantly game to game...
The issue is timing. If your program searches 1M nodes per second, the operating system can't provide anywhere near 1 ms timing accuracy, so your search will vary by well over 1,000 nodes per search. That leads to different results for one or more moves in the game, and that is all it takes to change the result.
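To put numbers on that, here is a rough sketch of the arithmetic in Python. The 1M nodes per second figure is the one from the post; the timer error values and the function name are only illustrative assumptions, not measurements from any particular operating system or engine.

```python
def extra_nodes(nodes_per_second, timer_error_ms):
    """Nodes searched during a timing error of timer_error_ms milliseconds.

    Integer arithmetic is used so the result is exact.
    """
    return nodes_per_second * timer_error_ms // 1000


NPS = 1_000_000  # search speed quoted in the post

# Hypothetical timer errors in milliseconds (assumed values for illustration).
for error_ms in (1, 5, 10):
    print(f"{error_ms} ms late -> {extra_nodes(NPS, error_ms)} extra nodes")
```

Even the 1 ms case already matches the 1,000-node change that was enough to alter results in the node-limited runs, so any larger timing error would be expected to do at least as much.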
-
- Posts: 28353
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Statistical Significance
It depends on the engine. I tested this for uMax 1.6 vs Eden 0.11, and on average the first 40 moves of each game were the same. As many games (counting from the Silver positions) lasted fewer than 40 moves, it means many games were identically repeated.
Can't test it with Joker, as Joker randomizes its moves even when you play with the same node limit.
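For anyone who wants to repeat this kind of check, one possible way to measure it is to count how many moves two replays of the same starting position share before they diverge. This is only a sketch: the function names are made up here, and it assumes each game is available as a plain list of move strings (one entry per ply), which is not necessarily how any particular test setup stores them.

```python
def shared_prefix_length(moves_a, moves_b):
    """Number of leading entries that are identical in both move lists."""
    count = 0
    for a, b in zip(moves_a, moves_b):
        if a != b:
            break
        count += 1
    return count


def is_identical_repeat(moves_a, moves_b):
    """True when the two games contain exactly the same moves throughout."""
    return (len(moves_a) == len(moves_b)
            and shared_prefix_length(moves_a, moves_b) == len(moves_a))


# Made-up move lists, purely to show the calls.
first_run = ["e2e4", "e7e5", "g1f3", "b8c6"]
second_run = ["e2e4", "e7e5", "g1f3", "g8f6"]
print(shared_prefix_length(first_run, second_run))
print(is_identical_repeat(first_run, first_run))
```

Averaging `shared_prefix_length` over all pairs of replayed games gives a figure comparable to the one quoted above, keeping in mind that this sketch counts plies rather than full moves.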