I ran Prophet vs. GNU Chess 5.05 at 10 5. The results were far from deterministic. I'm repeating the experiment at 15 10. Even if 15 10 does prove a bit more deterministic, running matches at 15 10 against 5 or 6 opponents (which seems to be the recommended approach) will take about a week of CPU time.
What time controls do others use? Is there a "shortcut" I'm missing for testing eval changes? Seems to be as much art as science. :-/
Oh - before anyone says anything - three of the five matches don't show a full 40 games. I'm still investigating why. On the last match, which shows 38 games, the PGN actually had 39 games. One of those was adjourned. It was GNU's turn in what looked like a threefold repetition. I'm not sure yet why the game was adjourned, or what happened to the 40th game. Figuring that out would obviously affect the score for that particular match (somewhat), but the fact that its result differs from the other matches still shows nondeterministic behaviour.
--
James
james@smeagol ~/prophet/scripts/10_5 $ ../pgn prophet-gnu505-200708092115.pgn
==========================================================
Total played: 40 (unique games: 40)
Note, only unique games are scored.
==========================================================
Player            Wins  Losses  Draws  Score    Percent
==========================================================
GNU Chess 5.05      16      16      8  20.0/40  (50.00%)
prophet             16      16      8  20.0/40  (50.00%)
ELO Diff: 0.00
james@smeagol ~/prophet/scripts/10_5 $ ../pgn prophet-gnu505-200708112116.pgn
==========================================================
Total played: 40 (unique games: 40)
Note, only unique games are scored.
==========================================================
Player            Wins  Losses  Draws  Score    Percent
==========================================================
GNU Chess 5.05      18      13      9  22.5/40  (56.25%)
prophet             13      18      9  17.5/40  (43.75%)
ELO Diff: 43.66
james@smeagol ~/prophet/scripts/10_5 $ ../pgn prophet-gnu505-200708122159.pgn
==========================================================
Total played: 38 (unique games: 38)
Note, only unique games are scored.
==========================================================
Player            Wins  Losses  Draws  Score    Percent
==========================================================
GNU Chess 5.05      19      10      9  23.5/38  (61.84%)
prophet             10      19      9  14.5/38  (38.16%)
ELO Diff: 83.88
james@smeagol ~/prophet/scripts/10_5 $ ../pgn prophet-gnu505-200708131502.pgn
==========================================================
Total played: 37 (unique games: 37)
Note, only unique games are scored.
==========================================================
Player            Wins  Losses  Draws  Score    Percent
==========================================================
GNU Chess 5.05      15      15      7  18.5/37  (50.00%)
prophet             15      15      7  18.5/37  (50.00%)
ELO Diff: 0.00
james@smeagol ~/prophet/scripts/10_5 $ ../pgn prophet-gnu505-200708140915.pgn
==========================================================
Total played: 38 (unique games: 38)
Note, only unique games are scored.
==========================================================
Player            Wins  Losses  Draws  Score    Percent
==========================================================
GNU Chess 5.05      20      14      4  22.0/38  (57.89%)
prophet             14      20      4  16.0/38  (42.11%)
ELO Diff: 55.32
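
For anyone wanting to sanity-check the numbers: the "ELO Diff" values above appear consistent with the usual logistic Elo formula, 400 * log10(p / (1 - p)), where p is the winner's score fraction. The sketch below is not the `pgn` script itself. The function names are made up for illustration, and the confidence interval uses a plain normal approximation on the per-game score, which is an assumption rather than anything the script reports.

```python
import math


def elo_diff(score, games):
    """Elo difference implied by a match score under the logistic model."""
    p = score / games
    if p <= 0.0 or p >= 1.0:
        raise ValueError("Elo difference is unbounded for a 0% or 100% score")
    return 400.0 * math.log10(p / (1.0 - p))


def score_std_error(wins, losses, draws):
    """Standard error of the mean per-game score (win=1, draw=0.5, loss=0)."""
    n = wins + losses + draws
    mean = (wins + 0.5 * draws) / n
    variance = (wins * (1.0 - mean) ** 2
                + draws * (0.5 - mean) ** 2
                + losses * (0.0 - mean) ** 2) / n
    return math.sqrt(variance / n)


def elo_interval(wins, losses, draws, z=1.96):
    """Approximate Elo interval (normal approximation, z=1.96 for ~95%)."""
    n = wins + losses + draws
    mean = (wins + 0.5 * draws) / n
    margin = z * score_std_error(wins, losses, draws)
    # Clamp so the logistic formula stays defined at the extremes.
    low = min(max(mean - margin, 1e-9), 1.0 - 1e-9)
    high = min(max(mean + margin, 1e-9), 1.0 - 1e-9)
    return elo_diff(low * n, n), elo_diff(high * n, n)


if __name__ == "__main__":
    # GNU Chess 5.05's results (wins, losses, draws) from the five matches.
    matches = [(16, 16, 8), (18, 13, 9), (19, 10, 9), (15, 15, 7), (20, 14, 4)]
    for w, l, d in matches:
        n = w + l + d
        point = elo_diff(w + 0.5 * d, n)
        low, high = elo_interval(w, l, d)
        print(f"+{w} -{l} ={d}: Elo {point:7.2f}  (approx. 95%: {low:7.2f} .. {high:7.2f})")
```

Under that approximation, even the most lopsided match here (19 wins, 10 losses, 9 draws) gives an interval that still includes zero, so spreads like 0 to 84 Elo between runs of about 40 games look compatible with ordinary sampling noise rather than with anything wrong in the test setup.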