Laskos wrote: Thu Jul 05, 2018 12:16 pm
Their self-plays by now are almost completely useless. If I am not doing something very wrong, or my time control is too short, I am getting for ID10030 almost identical result to ID10028 in 200 games gauntlets against AB engine. Still only some 50-60 Elo points improvement compared to ID10017, while their self-play ratings indicate some 330 Elo points progress. And still, in 40/4' CCRL conditions, about a 2600+ engine on GTX 1060 GPU. I am a bit worried, something odd is going on, which apparently did not happen with earlier fast runs using 6x64 nets.

The main point is that the self-play Elo is not intended for comparing different networks of the same training run. Even less is it an absolute measure of strength across different training runs. It is just a general indicator of whether the training run is progressing or is stuck because of net capacity or some bug.
The errors of the individual tests compound, so after hundreds of nets the accumulated self-play rating can be off by hundreds of Elo, up or down (look up "random walk").
And a single lucky win or draw in the early phase of the training run (when gains are of the order of 500-800 Elo) can shift the whole run by 50-100 "self-Elo" points with respect to another run.
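
To make the random-walk point concrete, here is a rough sketch in Python. The assumptions are mine, not measured from the actual run: each net-vs-previous-net test has an independent, zero-mean Gaussian error, and the per-test error of 15 Elo and the count of 400 nets are hypothetical numbers picked only for illustration.

```python
import math
import random


def simulate_drift(num_nets, per_test_sigma, seed=0):
    """Accumulated rating error after chaining num_nets tests.

    Assumption: every test contributes an independent zero-mean
    Gaussian error with standard deviation per_test_sigma (in Elo).
    """
    rng = random.Random(seed)
    drift = 0.0
    for _ in range(num_nets):
        drift += rng.gauss(0.0, per_test_sigma)
    return drift


def expected_drift(num_nets, per_test_sigma):
    """Standard deviation of the accumulated error: sigma * sqrt(n)."""
    return per_test_sigma * math.sqrt(num_nets)


if __name__ == "__main__":
    n, sigma = 400, 15.0  # hypothetical: 400 nets, 15 Elo error per test
    print("typical drift (1 sigma):", expected_drift(n, sigma))
    # A few independent "training runs" under the same assumptions.
    print([round(simulate_drift(n, sigma, seed=s)) for s in range(5)])
```

Under these assumptions the typical drift grows with the square root of the number of nets, so two runs with identical real strength can end up with quite different self-play ratings.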
The only way to compare networks is with tests like the ones Aloril, you, and others are running.
To answer another question: currently LcZero does not use gating, while LeelaZero does.
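
For such tests it helps to look at the error bar as well as the Elo number. Below is a minimal sketch of how an Elo difference and an approximate 95% interval can be computed from a gauntlet result. It uses the standard logistic Elo formula and a normal approximation for the score; the function names and the 80/60/60 result are hypothetical and not taken from anyone's actual test.

```python
import math


def elo_from_score(score):
    """Elo difference implied by a score fraction in (0, 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def gauntlet_elo(wins, draws, losses, z=1.96):
    """Return (elo, low, high) for a match result.

    The interval is a normal approximation on the score fraction,
    mapped through the Elo formula; z=1.96 gives roughly 95%.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (win=1, draw=0.5, loss=0).
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)
    # Clamp so the logarithm stays defined for lopsided results.
    eps = 1e-6
    low = elo_from_score(min(max(score - z * se, eps), 1.0 - eps))
    high = elo_from_score(min(max(score + z * se, eps), 1.0 - eps))
    return elo_from_score(score), low, high


if __name__ == "__main__":
    # Hypothetical 200-game gauntlet: 80 wins, 60 draws, 60 losses.
    elo, low, high = gauntlet_elo(80, 60, 60)
    print(f"Elo {elo:+.0f} (95% interval {low:+.0f} to {high:+.0f})")
```

With only a few hundred games the interval is wide, so small differences between two nets in such a gauntlet can easily be noise.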