Hi Adam,

Adam Hair wrote:
I screwed up. Sigma is a function of the inverse square root of the number of games. What HGM is pointing out is that you have to play twice the number of games per version to make the error equal to that when you self-test.
If the error is y Elo after x games, then the error when comparing gauntlet results is y*√2. To reduce that to y, you have to play 2x games against the gauntlet for each version. So, when you are starting from scratch, you have to play 4x games in total to equal the error bars of x games of self-testing. For each new version, you will have to play 2x games against the gauntlet.
The error after x games self-testing is y
The error when comparing two versions via a gauntlet after x games each is √(y²+y²) = y*√2
To reduce this to y, the individual errors must be y/√2.
Since y is proportional to √(1/x), y/√2 is proportional to √(1/(2x))
So each version has to play 2x games against the gauntlet to make the error when comparing be y Elo.
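
The arithmetic in the quoted derivation can be checked with a short Python sketch. It only restates the quote's own assumptions: the error scales as 1/√(games) with the same constant in both setups, and the two gauntlet results are independent so their errors add in quadrature. The function names and the scale constant `k` are hypothetical, chosen for illustration; this does not settle whether those assumptions are right.

```python
import math

def error_after(games, k=1.0):
    """Error bar assumed proportional to 1/sqrt(games); k is a hypothetical scale constant."""
    return k / math.sqrt(games)

def gauntlet_comparison_error(games_per_version, k=1.0):
    """Error of the difference between two versions, each measured
    independently against the gauntlet with the same number of games."""
    single = error_after(games_per_version, k)
    # Independent errors add in quadrature: sqrt(y^2 + y^2) = y * sqrt(2)
    return math.sqrt(single**2 + single**2)

x = 1000                      # arbitrary number of self-test games
y = error_after(x)            # error after x games of self-testing

ratio_same_games = gauntlet_comparison_error(x) / y        # close to sqrt(2)
ratio_double_games = gauntlet_comparison_error(2 * x) / y  # close to 1

print(ratio_same_games, ratio_double_games)
```

With x games per version the comparison error comes out √2 times larger than y, and with 2x games per version it comes back down to y, which is the claim made in the quote.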
can you explain to me what "self-testing" means to you, other than playing a "trivial gauntlet" against only one opponent, which is another (previous) version of the same engine? I don't understand that different error bar calculation at all (gauntlet with N=1 opponents vs. gauntlet with N>1 opponents, why should it make a difference?), so there must be some basic detail that I missed.
Sven

