Repeating games with switched colors reduces Elo error. All matches should be done like this

corres
Posts: 3657
Joined: Wed Nov 18, 2015 10:41 am
Location: hungary

Re: Repeating games with switched colors reduces Elo error. All matches should be done like this

Post by corres » Mon Mar 02, 2020 11:07 am

+1.

corres
Posts: 3657
Joined: Wed Nov 18, 2015 10:41 am
Location: hungary

Re: Repeating games with switched colors reduces Elo error. All matches should be done like this

Post by corres » Mon Mar 02, 2020 11:17 am

Ovyron wrote:
Tue Feb 25, 2020 10:44 pm
...
I disagree. Why isn't the performance of 1.g4 or 1.f3 tested for the engine? It's a subset where most engines underperform so we prune those variations as bad. If an engine really underperforms in some variation of the Semi-Slav, then forcing this engine to play it makes as much sense as making it play from 1.f3.
It's like Magnus Carlsen the world champion and his bad performance on Chess960. Imagine he was only as good as his rating indicates for the openings he wants to play. Would you force him to play Chess960 from positions he's not good at? No, we let him maximize his performance with opening selection.
Why don't we allow engines to do the same? We have the technology. An engine has no reason to play from a position it doesn't like (just like they have no reason to play 1.f3), so their ratings are distorted when we make them play openings they wouldn't play if they had a choice.
Please explain to me what the connection is between Carlsen and any chess engine?
Chess engines are used by a lot of chess players, not only Carlsen, and those players have different favorite openings,
so we want to know how our engine behaves across a wide range of openings.
I run tests for this, but you cannot expect everybody to run tests.

Ovyron
Posts: 4410
Joined: Tue Jul 03, 2007 2:30 am

Re: Repeating games with switched colors reduces Elo error. All matches should be done like this

Post by Ovyron » Mon Mar 02, 2020 5:02 pm

corres wrote:
Mon Mar 02, 2020 11:17 am
we want to know how our engine behaves across a wide range of openings.
That's what you want to know. 99 of those openings are useless to me so how the engine performs is uninteresting. I'd rather know what are the openings the engine best performs at, as the engine doesn't need to deviate from those.

corres
Posts: 3657
Joined: Wed Nov 18, 2015 10:41 am
Location: hungary

Re: Repeating games with switched colors reduces Elo error. All matches should be done like this

Post by corres » Mon Mar 02, 2020 5:46 pm

Ovyron wrote:
Mon Mar 02, 2020 5:02 pm
corres wrote:
Mon Mar 02, 2020 11:17 am
we want to know how our engine behaves across a wide range of openings.
That's what you want to know. 99 of those openings are useless to me so how the engine performs is uninteresting. I'd rather know what are the openings the engine best performs at, as the engine doesn't need to deviate from those.
If you look through a test with a well-selected series of openings, you can get that info.
If you want more detailed info, run your own tests to get it.

Ovyron
Posts: 4410
Joined: Tue Jul 03, 2007 2:30 am

Re: Repeating games with switched colors reduces Elo error. All matches should be done like this

Post by Ovyron » Mon Mar 02, 2020 9:57 pm

corres wrote:
Mon Mar 02, 2020 5:46 pm
If you look through a test with a well-selected series of openings, you can get that info.
Yes, and the best selected test is avoiding those openings the engine would never play by itself.
corres wrote:
Mon Mar 02, 2020 5:46 pm
If you want more detailed info, run your own tests to get it.
Yeah, but people wouldn't need to do their own tests if the people doing the largest tests did it correctly.

It's like you and me have identical hardware, and we each run our test. You come up with some elo prediction for your engines, I come up with another. Then we match them against each other to see who's right. If you come up with generic openings and I come up with the strongest ones I found for engines, guess who's going to under-perform in their predictions?

corres
Posts: 3657
Joined: Wed Nov 18, 2015 10:41 am
Location: hungary

Re: Repeating games with switched colors reduces Elo error. All matches should be done like this

Post by corres » Mon Mar 02, 2020 10:07 pm

Ovyron wrote:
Mon Mar 02, 2020 9:57 pm
corres wrote:
Mon Mar 02, 2020 5:46 pm
If you look through a test with a well-selected series of openings, you can get that info.
Yes, and the best selected test is avoiding those openings the engine would never play by itself.
corres wrote:
Mon Mar 02, 2020 5:46 pm
If you want more detailed info, run your own tests to get it.
Yeah, but people wouldn't need to do their own tests if the people doing the largest tests did it correctly.
It's like you and me have identical hardware, and we each run our test. You come up with some elo prediction for your engines, I come up with another. Then we match them against each other to see who's right. If you come up with generic openings and I come up with the strongest ones I found for engines, guess who's going to under-perform in their predictions?
This is not a forum for playing CC games.
If you do not like my tests, please show your own tests. It is a pity, but I cannot find any tests from you.

Ovyron
Posts: 4410
Joined: Tue Jul 03, 2007 2:30 am

Re: Repeating games with switched colors reduces Elo error. All matches should be done like this

Post by Ovyron » Tue Mar 03, 2020 12:10 am

corres wrote:
Mon Mar 02, 2020 10:07 pm
This is not a forum for playing CC games.
It's a forum for discussing chess engines and analyzing chess positions with them. You can either find the best engine for the position or look at the rating lists and just stick to Stockfish and Leela. If there's more than the best then it's worthwhile to test at what openings non-top engines excel, because it's possible the top ones are only so on average but not on all, and it'd be interesting to know who are the best ones on the exceptions, which wouldn't be discovered with generic openings.
If you do not like my tests, please show your own tests. It is a pity, but I cannot find any tests from you.
You haven't looked hard enough. Here's one of my tests from 2008. Nowadays I run calibrated tests, where I intentionally WEAKEN the stronger engine so it plays at the same elo as another one, and take a LOOK at the games played, instead of just extracting some elo from them and locking them down in some game collection.

corres
Posts: 3657
Joined: Wed Nov 18, 2015 10:41 am
Location: hungary

Re: Repeating games with switched colors reduces Elo error. All matches should be done like this

Post by corres » Tue Mar 03, 2020 6:42 am

Ovyron wrote:
Tue Mar 03, 2020 12:10 am
corres wrote:
Mon Mar 02, 2020 10:07 pm
If you do not like my tests, please show your own tests. It is a pity, but I cannot find any tests from you.
You haven't looked hard enough. Here's one of my tests from 2008. Nowadays I run calibrated tests, where I intentionally WEAKEN the stronger engine so it plays at the same elo as another one, and take a LOOK at the games played, instead of just extracting some elo from them and locking them down in some game collection.
Is this a joke? What good is a test from 2008?
I think you use too much electricity on analyzing games. Games played by a weakened engine cannot be used against a full-strength Stockfish, even if Stockfish is operated by a "parrot".
If you look at my tests, I do not determine any Elo, only the qualitative difference between engines and nets, based on the number of games won. A relatively small number of games is good only for this, but in practice it is enough.
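To illustrate why a small number of games supports only a qualitative verdict, here is a rough sketch that turns a win/draw/loss count into an Elo difference with an approximate 95% interval. It assumes the usual logistic Elo model and treats the games as independent, which is a simplification (games repeated from the same opening with switched colors are not fully independent). The function names and the sample results are made up for the illustration and are not taken from any test in this thread.

```python
import math


def elo_from_score(score):
    """Elo difference implied by an average score in (0, 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def match_summary(wins, draws, losses, z=1.96):
    """Return (elo, elo_low, elo_high) for a match result.

    Assumption: games are independent, so the standard error of the mean
    score is sqrt(per-game variance / number of games).
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, where a result is 1, 0.5 or 0.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)
    # Clamp so the logarithm stays defined for very lopsided results.
    low = min(max(score - z * se, 1e-6), 1.0 - 1e-6)
    high = min(max(score + z * se, 1e-6), 1.0 - 1e-6)
    return elo_from_score(score), elo_from_score(low), elo_from_score(high)


if __name__ == "__main__":
    # Same proportions, ten times as many games: the interval shrinks.
    for result in [(10, 80, 10), (100, 800, 100)]:
        elo, lo, hi = match_summary(*result)
        print(result, f"Elo {elo:+.1f}  [{lo:+.1f}, {hi:+.1f}]")
```

With only 100 games the interval spans tens of Elo points in each direction, so the result can show which engine is ahead when the gap is large, but it cannot pin down a rating.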

Ovyron
Posts: 4410
Joined: Tue Jul 03, 2007 2:30 am

Re: Repeating games with switched colors reduces Elo error. All matches should be done like this

Post by Ovyron » Tue Mar 03, 2020 8:22 am

corres wrote:
Tue Mar 03, 2020 6:42 am
I think you use too much electricity on analyzing games. Games played by a weakened engine cannot be used against a full-strength Stockfish, even if Stockfish is operated by a "parrot".
In some cases they are. There's positions where Houdini 6 or even Fritz 15 with Mindbreaker settings still suggest something better than the elo giants, usually in quiet positions where there's nothing to do and they require strategy, and the fastest way to find them is to use a different engine (of course Stockfish can find them if you use exclude moves or MultiPV or other techniques, but if you smell another engine could be useful, it's fastest to switch.)

As Stockfish versions keep appearing these positions have become extremely rare, so maybe by Stockfish 12 you'll be right (0% positions where weaker engine can still be used.)

corres
Posts: 3657
Joined: Wed Nov 18, 2015 10:41 am
Location: hungary

Re: Repeating games with switched colors reduces Elo error. All matches should be done like this

Post by corres » Wed Mar 04, 2020 11:04 am

Ovyron wrote:
Tue Mar 03, 2020 8:22 am
There's positions where Houdini 6 or even Fritz 15 with Mindbreaker settings still suggest something better than the elo giants, usually in quiet positions where there's nothing to do and they require strategy, and the fastest way to find them is to use a different engine (of course Stockfish can find them if you use exclude moves or MultiPV or other techniques, but if you smell another engine could be useful, it's fastest to switch.)
As Stockfish versions keep appearing these positions have become extremely rare, so maybe by Stockfish 12 you'll be right (0% positions where weaker engine can still be used.)
If you use another engine instead of Stockfish to find rare moves, the question is whether Stockfish can continue well from there or not. The same issue arises when you use a good opening book: engines often cannot find the appropriate continuation.
