Robert Hyatt
Joined: 27 Feb 2006 Posts: 15819 Location: Birmingham, AL
Post subject: Re: What your opinion about this testing methodology? Posted: Wed Apr 18, 2012 7:27 pm
Kempelen wrote:
Should it work?
I have been thinking about ways to improve testing time results. People usually run tournaments from the starting position, or tournaments with a very limited set of positions (e.g. 32), or tournaments with a lot of random positions. I assume everyone is doing this with a minimum of 1000 to 4000 games.
.... but ....
What about repeating the same tournament, with the same opponents and the same positions per opponent? Assuming the set of positions is very large....
example:
Game 1, against Crafty, black, position from FEN file 'myfenpositions.epd', position number 540
Game 2, against Critter, white, position from FEN file 'myfenpositions.epd', position number 3251
....
etc
The idea is that the position numbers would always be the same and not chosen randomly, without repeating any FEN, but varied enough.
The tournament file from the tournament manager would always be the same, without the need to recreate the tournament. The test would always repeat in exactly the same way.
Would results between tests be more accurate than with randomly chosen starting positions?
This is what many of us have been doing for years. I even posted a large set of starting positions on my ftp box, the very positions I use in my cluster testing. For each new version of Crafty, I play against the same opponents, using the same set of positions, playing each position with black and white to make sure there is no biased position that has an unwanted influence.
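As a rough illustration of the fixed schedule described above, here is a minimal Python sketch. The function name `build_schedule`, the opponent names, and the position numbers are placeholders borrowed from the quoted example, not an actual cluster setup; the only point is that the list of games is fully determined by the inputs, with each position played once with each color.

```python
from itertools import product

def build_schedule(opponents, position_ids):
    """Return a fixed, ordered list of (opponent, position_id, our_color).

    Every position is played twice against each opponent, once with
    white and once with black, so a lopsided position cannot favor
    only one side. No randomness is involved, so the same inputs
    always produce the same schedule.
    """
    schedule = []
    for opponent, pos_id in product(opponents, position_ids):
        for our_color in ("white", "black"):
            schedule.append((opponent, pos_id, our_color))
    return schedule

# Hypothetical inputs, taken from the example in the quoted post.
opponents = ["Crafty", "Critter"]
position_ids = [540, 3251]  # indices into an EPD file such as 'myfenpositions.epd'

schedule = build_schedule(opponents, position_ids)
for game_number, (opponent, pos_id, color) in enumerate(schedule, start=1):
    print(f"Game {game_number}: vs {opponent}, {color}, position {pos_id}")
```

Because the schedule never changes, two runs with different versions of your program differ only in the program itself.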
If you repeat this for each version, the only thing that changes between tests is your program, which makes comparison pretty easy. You do need enough games. I get an error bar of +/- 4 Elo playing 30,000-game matches...
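To see why so many games are needed, the size of the error bar can be estimated with a simple normal approximation. This is only a sketch under stated assumptions (independent games, a 95% interval, and the standard logistic Elo curve); it is not necessarily how any particular rating tool computes its bounds, and the function name `elo_error_margin` and the win/draw/loss counts are made up for illustration.

```python
import math

def elo_error_margin(wins, draws, losses, z=1.96):
    """Approximate +/- Elo margin for a match result.

    Assumes games are independent and that the score is strictly
    between 0 and 1. z=1.96 corresponds to a 95% interval.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / n
    standard_error = math.sqrt(variance / n)
    # Slope of elo(s) = -400 * log10(1/s - 1) with respect to the score s.
    slope = 400.0 / (math.log(10) * score * (1.0 - score))
    return z * standard_error * slope

# Hypothetical 30,000-game results between evenly matched programs.
with_draws = elo_error_margin(10000, 10000, 10000)
no_draws = elo_error_margin(15000, 0, 15000)
print(f"one third draws: +/- {with_draws:.1f} Elo")
print(f"no draws:        +/- {no_draws:.1f} Elo")
```

Under these assumptions a 30,000-game match between roughly equal programs comes out at about 3 to 4 Elo, depending on the draw rate. The margin shrinks only with the square root of the number of games, so halving it takes four times as many games.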