Proposal for an extended Elo and tournament rating system.

chrisw
Posts: 4555
Joined: Tue Apr 03, 2012 4:28 pm
Location: Midi-Pyrénées
Full name: Christopher Whittington

Post by chrisw »

I borrow this idea from the days of playing blitz chess in cafes for money. Plausibly, if used in tournaments, it would widen the spread of the resulting Elos and force engines to use more of their intelligence. It may also, if used in training, result in more aggressive engines.

Contra and re-contra, or double and re-double if you prefer. Contra = a win is now worth 2 points, but the side that gives contra loses if the game is drawn. Re-contra is likewise worth 4 or 0. The player receiving the contra may resign instead of accepting and playing on.

UCI options required: “contra”, “recontra” and “resigns”.
Addition to engine smartness, to “correctly” assess positions and so better employ the art of doubling or resigning.
Adjustments to the Elo calculation to account for the possible results: 0, 0.5, 1, 2, 4.
Adjustment to cutechess, possibly.
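
A minimal sketch of what the Elo adjustment might look like, under the simplest possible reading: the standard Elo expected-score formula is kept and the extended result is plugged into the usual update unchanged. The function names and K = 20 are illustrative assumptions, not something the proposal specifies, and rating tools used for engine testing may work quite differently.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B (between 0 and 1)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, score_a: float,
               k: float = 20.0) -> float:
    """Return A's new rating after one game.

    ASSUMPTION: score_a is fed in unchanged, so it may be 0, 0.5, 1, 2 or 4
    instead of the classical 0, 0.5 or 1. K = 20 is an arbitrary choice.
    """
    return rating_a + k * (score_a - expected_score(rating_a, rating_b))
```

Between equally rated engines, a plain win moves the winner by +10 at K = 20 and a contra win by +30, which is one way the scale could widen.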

I leave it to readers to argue whether
a) this widens out the Elo scale
b) used in training, generates a more aggressive, unbalanced playing style
c) used in tournaments, generates more unbalanced, aggressive games. Penalises, in effect, endless draw sequences.

If (b) is true, it’s likely engine programmers will implement it, even if only for self-play.
chrisw

Re: Proposal for an extended Elo and tournament rating system.

Post by chrisw »

Replying to self ....

Possible results for Elo calculations (and training adjustments): -4, -2, -1, 0, 0.5, 1, 2, 4
Something (possibly the UI) needs to keep track of contra status: who holds it (Black, White or nobody) and its value (2 or 4).
Contra/recontra are offered as part of BestMove, e.g. e2e4 contra
If there is a pending contra offer, acceptance or decline comes with the next BestMove, e.g. e7e5 or "resigns"; if there is no "resigns", acceptance is implied.
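
As a rough sketch of the message format described above, here is how a GUI might parse such a reply. Standard UCI has no "contra", "recontra" or "resigns" tokens, so everything beyond `bestmove <move> [ponder <move>]` is a hypothetical extension, and the `BestMove` tuple and `parse_bestmove` function are names invented for illustration.

```python
from typing import NamedTuple, Optional

DOUBLING_TOKENS = {"contra", "recontra"}  # hypothetical extension tokens


class BestMove(NamedTuple):
    move: Optional[str]      # e.g. "e2e4", or None if the engine resigns
    doubling: Optional[str]  # "contra", "recontra", or None
    resigns: bool            # True if a pending offer is declined


def parse_bestmove(line: str) -> BestMove:
    """Parse e.g. 'bestmove e2e4 contra' or 'bestmove resigns'."""
    tokens = line.split()
    if len(tokens) < 2 or tokens[0] != "bestmove":
        raise ValueError(f"not a bestmove line: {line!r}")
    if tokens[1] == "resigns":
        return BestMove(move=None, doubling=None, resigns=True)
    # Any further tokens (such as 'ponder e7e5') are skipped unless they
    # are one of the doubling tokens.
    doubling = next((t for t in tokens[2:] if t in DOUBLING_TOKENS), None)
    return BestMove(move=tokens[1], doubling=doubling, resigns=False)
```

A plain move in reply to a pending offer is then read as acceptance, since no "resigns" was sent.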

Engine mod: if holding contra status, then draw eval = loss; if challenged by contra, then draw = win.
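
The bookkeeping and the draw rule can be sketched together. This is only one reading of the result list: it assumes unchallenged games keep 1 / 0.5 / 0, a declined offer scores +1 / -1, an accepted contra +2 / -2 and a recontra +4 / -4, with a draw counting as a loss for whichever side doubled last. `ContraState`, `draw_value` and `score_game` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ContraState:
    """Who doubled last and the current stake (ASSUMED bookkeeping)."""
    holder: Optional[str] = None  # "white", "black", or None
    stake: int = 1                # 1 = no contra, 2 = contra, 4 = recontra

    def accept_offer(self, offering_side: str) -> None:
        """Record an accepted contra (stake 2) or recontra (stake 4)."""
        if self.stake >= 4:
            raise ValueError("no doubling beyond recontra")
        if self.holder == offering_side:
            raise ValueError("only the challenged side may recontra")
        self.holder = offering_side
        self.stake *= 2


def draw_value(side: str, state: ContraState) -> int:
    """Value of a draw for `side`: -1 = loss, 0 = draw, +1 = win."""
    if state.holder is None:
        return 0
    return -1 if state.holder == side else 1


def score_game(state: ContraState, outcome: str,
               offer_declined: bool = False) -> Tuple[float, float]:
    """Return (white_points, black_points).

    outcome is "white", "black" or "draw". offer_declined means the game
    ended by resigning in reply to an offer, at the stake before the offer.
    """
    if state.holder is None and not offer_declined:
        classical = {"white": (1.0, 0.0), "black": (0.0, 1.0),
                     "draw": (0.5, 0.5)}
        return classical[outcome]
    if outcome == "draw":
        if state.holder is None:
            raise ValueError("a declined offer cannot end in a draw")
        # A draw is a loss for the side that doubled last.
        winner = "black" if state.holder == "white" else "white"
    else:
        winner = outcome
    points = float(state.stake)
    return (points, -points) if winner == "white" else (-points, points)
```

For example, after White gives contra and Black accepts, a drawn game scores (-2.0, 2.0) under these assumptions.
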
Ras
Posts: 2628
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: Proposal for an extended Elo and tournament rating system.

Post by Ras »

chrisw wrote: Mon Oct 14, 2024 11:29 am
Contra and re-contra, or double and re-double if you prefer.
This is somewhat similar to the doubling cube in backgammon, where it is hugely important. The problem is that crucial elements that make it so relevant in backgammon are missing in chess.

In backgammon,
  • you must not offer too early because the element of luck is involved. In chess, an equal position does not suddenly turn into a win/loss, so you can basically offer if you have a minor advantage.
  • you must not offer too late, or else you give your opponent the opportunity to resign into a simple loss (1 point) when you could have gotten to a gammon (2 points) or even a backgammon (3 points). In chess, a win is always 1 point regardless, so offering "too late" won't cost you points, it merely shortens the game.

The main change would be that the search score, so far only cosmetic, becomes meaningful, because you would compare it against offer/resignation thresholds. Simple example: many mid-range engines correctly score KN:K as a draw, but fail to do so with KN:KP. In actual gameplay, KP will of course hold the draw, but if KN offers a double, KP will mistakenly resign because -200 cp is usually a loss. For top-range engines, the difference is e.g. between just holding a fortress and also understanding (via scoring) that it is a fortress.
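
The threshold comparison described above might look roughly like this. The function names and the 150 cp cut-offs are made-up illustrations, not tuned values; scores are in centipawns from the deciding engine's point of view.

```python
def should_offer_contra(score_cp: int, offer_above_cp: int = 150) -> bool:
    """Offer a double once our own search score clears the threshold.

    ASSUMPTION: 150 cp is an arbitrary placeholder threshold.
    """
    return score_cp > offer_above_cp


def respond_to_contra(score_cp: int, resign_below_cp: int = -150) -> str:
    """Answer a pending offer: 'resigns' if our score looks lost, else 'accept'.

    ASSUMPTION: -150 cp is an arbitrary placeholder threshold.
    """
    return "resigns" if score_cp < resign_below_cp else "accept"
```

With these numbers, an engine that scores the drawn KN:KP ending at -200 cp from the pawn side answers `respond_to_contra(-200)` with "resigns", which is exactly the mistake in the example: the move choice was fine, the score was not.
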
Rasmus Althoff
https://www.ct800.net
chrisw

Re: Proposal for an extended Elo and tournament rating system.

Post by chrisw »

Ras wrote: Mon Oct 14, 2024 1:47 pm
chrisw wrote: Mon Oct 14, 2024 11:29 am
Contra and re-contra, or double and re-double if you prefer.
This is somewhat similar to the doubling cube in backgammon where it is hugely important. The problem is that crucial elements that make it so relevant in backgammon are missing in chess.

In backgammon,
  • you must not offer too early because the element of luck is involved. In chess, an equal position does not suddenly turn into a win/loss, so you can basically offer if you have minor advantage.

    Not quite, you need to be careful, because a minor advantage may still only be a draw, which would turn into a loss score (-2) for the side that gave the contra.
  • you must not offer too late, or else you give your opponent the opportunity to resign into a simple loss (1 point) when you could have gotten to a gammon (2 points) or even a backgammon (3 points). In chess, a win is always 1 point regardless, so offering "too late" won't cost you points, it merely shortens the game.

    Correct observation, although if he resigns rather than playing on and possibly drawing, then that's a potential loss for him.

The main change would be that the so far only cosmetic search score becomes meaningful because you would compare that against offer/resignation thresholds. Simple example: many mid-range engines score KN:K correctly as draw, but fail to do so with KN:KP. In actual gameplay, KP will ofc hold the draw, but if KN offers a double, KP will mistakenly resign because -200 cps is usually a loss. For top range engines, the difference is e.g. between just holding a fortress or also understanding (via scoring) that it is a fortress.
The thresholds and the eval accuracy are going to be critical: fortress detection for sure, but there will probably be others. Anyway, the engine will need to show more intelligence and will be rewarded with more ExtendedElo.

Do you have an opinion about the effect of the game scoring change on self-play or gauntlet learning? I intuit wilder gameplay, but probably only experiments would confirm it or not.
Ras

Re: Proposal for an extended Elo and tournament rating system.

Post by Ras »

chrisw wrote: Mon Oct 14, 2024 2:10 pm
Not quite, you need to be careful, because a minor advantage may still only be a draw, which would turn into a loss score (-2) for the side that gave the contra.
You're right, I misunderstood that part.
correct observation, although if he resigns rather than plays on and possibly draws, then that's a potential loss for him
By "too late", I mean at a point where it's already clear what the result will be. There's no way to hold KQ:K to a draw, as an extreme example.
Do you have an opinion about the game scoring change effect on self-play or gauntlet learning? I intuit more wild gameplay, but possibly only experiments would confirm or not.
You can now gain more than one point per won game and set that off against several losses. However, that requires superior handling of the doubling (which is why the doubling cube is so important in backgammon). Otherwise, you're back to square one. Assuming equal proficiency in doubling, I don't see aggressiveness as an advantage in itself unless it also leads to superior gameplay, and we would have seen that in classic chess Elo already.
chrisw

Re: Proposal for an extended Elo and tournament rating system.

Post by chrisw »

Proficiency in doubling will presumably come down to threshold tuning and an “accurate” game-prediction eval. Wild engines may get a bit flamboyant with the contra cube? A complexity score component (as per SF) might also be useful, with fn(complexity, eval) as some kind of contra decision mechanism. Either way, it looks like fun, and contra adds to PGN analysis. I’m on holiday, but might try implementing a crude contra in UCI and, as a first shot, testing how always-accept, always-decline and random accept/decline policies work out.
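
A crude sketch of those first-shot experiments: three acceptance policies with a common signature, plus one possible shape for fn(complexity, eval). All names, the 50% coin flip, the 150 cp threshold and the direction of the complexity term (demanding a bigger margin in complex positions) are assumptions made for illustration.

```python
import random
from typing import Callable

# A policy decides whether to accept a pending contra offer,
# given our score in centipawns and a random number generator.
Policy = Callable[[int, random.Random], bool]


def always_accept(score_cp: int, rng: random.Random) -> bool:
    return True


def always_decline(score_cp: int, rng: random.Random) -> bool:
    return False


def random_accept(score_cp: int, rng: random.Random) -> bool:
    # ASSUMPTION: a fair coin; any other probability would do as well.
    return rng.random() < 0.5


def offer_contra(eval_cp: int, complexity: float,
                 threshold_cp: float = 150.0,
                 complexity_weight: float = 1.0) -> bool:
    """Hypothetical fn(complexity, eval): offer only if the eval still
    clears the threshold after subtracting a complexity penalty."""
    return eval_cp - complexity_weight * complexity > threshold_cp
```

Each policy can be passed to a match runner as a plain function, so swapping them between test runs needs no other code changes.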