I'm a strong advocate of including more weaker opponents in test-runs for this reason.
Frank Quisinsky wrote (Wed Nov 17, 2021 7:36 am):
And many clones seem to be stronger than Stockfish is.
The same problem applies to the current dev version in the still-running tourney.
A good chance for Komodo to be the new number 1 with longer time controls and contempt!!
Some years ago I tested this twice (SF with and without contempt) against much weaker engines.
- 19 Elo in test-1 after 1,500 games
- 17 Elo in test-2 after 1,500 games
Stockfish lost around 18 Elo without contempt in test-runs with more than 30 opponents.
Maybe the same today?!!
The fact is that engines like Stockfish and Komodo lose a lot of Elo if we put more weaker opponents
in a test-run.
This can be seen if we compare CEGT / CCRL Elo with FCP Tourney-2021, FCP Tourney-2020 and of course
with FCP Tourney-2022. With the Excel sheet from Klaus we can delete engines from the field of opponents.
With that feature it should become clearer.
It will also alleviate the draw death that plagues tourneys and matches with only the strongest engines
and also reward those getting more wins with more Elo.
The counter-argument is that including much weaker engines will cause rating distortions, but I ask: wouldn't avoiding these engines also distort the rating of a top engine like SF, by over-inflating it?
The fact that SF draws too much against weaker opposition should be reflected in its rating, or else that rating is inaccurate. An engine with smart contempt like Komodo should definitely benefit, as long as its contempt is set at a reasonable level.
You could test that hypothesis in your tournaments, Frank!
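To put rough numbers on the draw argument, here is a minimal sketch using the standard logistic Elo formula. The function names and the win/draw/loss counts are made up for illustration; they are not results from any of the tourneys mentioned above.

```python
import math

def expected_score(elo_diff):
    """Expected score (0..1) for a player rated elo_diff points above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def elo_diff_from_score(score):
    """Inverse of expected_score: the Elo gap implied by a match score."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)

def performance_gap(wins, draws, losses):
    """Elo gap implied by a win/draw/loss record against a single opponent."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    return elo_diff_from_score(score)

# Hypothetical 100-game matches against the same much weaker engine:
# one engine concedes 30 draws, the other (say, with contempt) only 20.
many_draws = performance_gap(wins=70, draws=30, losses=0)
fewer_draws = performance_gap(wins=80, draws=20, losses=0)
print(round(many_draws, 1), round(fewer_draws, 1))
```

With these invented numbers, turning 10 draws into wins moves the implied gap over that opponent from roughly 301 to roughly 382 Elo. That is the kind of difference that only shows up in the rating list if such opponents are in the field at all.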