armageddon in norway chess

lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: armageddon in norway chess

Post by lkaufman »

Laskos wrote: Wed Jun 05, 2019 8:50 am
lkaufman wrote: Wed Jun 05, 2019 3:57 am
Laskos wrote: Wed Jun 05, 2019 12:29 am
lkaufman wrote: Tue Jun 04, 2019 11:34 pm So Norway chess round 1 concluded with 5 draws, but White won 4-1 in the 10-7 (+3 sec inc from move 60) Armageddon game, Black getting less time but draw odds. This suggests that Armageddon is valid in longer games than blitz, as was also shown in some U.S. Championship playoff games at odds like one hour to 22 or so minutes. Perhaps some organizer will propose dispensing with the normal games and just playing Armageddon with White having the normal 2 hours plus 30" inc and Black getting just half of those numbers, or even 1/3, whatever makes the results close to 50-50. This could be the answer to draws!
Naturally, this could work in computer events as well, although the time odds might have to be a bit steeper. Perhaps someone will conduct some tests, although they wouldn't be quite accurate since the engines wouldn't know about the draw odds.
Can't the desire for draws be simulated in engines like Komodo by negative contempt? Imagine LTC 1/8 time and contempt -30 for Black.

For fast games (bullet), maybe 1/2 time and same -30 contempt. Just speculations.

And something like vice versa for White
Yes, of course I thought of that too. It's not perfect since a small value won't do enough while a big one weakens play too much, but I agree it's better than doing nothing. But this only works for engines that have Contempt and do it well, so it's not something we could do to create a rating list for Armageddon, but certainly it's okay for testing the idea with Komodo. I don't think we would have to cut the time as low as 1/8 for Black with White getting 2 hours + 30", at least with typical PCs with 4 cores only. It may be different for Komodo vs. Komodo than for Komodo vs. Houdini or an older Stockfish of similar rating; self-play is more drawish.
Yes, I was thinking of 1/8 for LTC TCEC conditions.

But it's a hard pick; I don't think a serious rating list is possible even solely between SF, Komodo and Houdini. The time odds and contempt have to be picked very carefully case by case, as this "method" might even invert ratings and cause many undesired, hard-to-control distortions. Engine contempt is in itself a distortion of what humans do in these Armageddon conditions. It's very hard to find what "fair conditions" are and how to translate them to humans and vice versa, and "unfairness" here is ubiquitous (there is only one sweet point of "fairness" in a continuum of "unfairness", and it depends anyway on particular conditions).

For engines probably the best thing would be to pick unbalanced openings in the range of 0.7-0.8 (Komodo) eval, play side and reversed and use pentanomial variance for the paired games, which gives in these conditions some 1.5-1.6 smaller error margins than the naive trinomial variance. The draw rate would decrease from say 75% (or 85% in TCEC conditions) with balanced openings to some 45%-50%, the pentanomial error margins won't increase compared to the balanced case despite having much fewer draws, and the Elo difference would almost double. This methodology is sound, easy to apply fairly and is pretty rigorous, and my above speculations are much more well founded. The number of games needed to discern the better engine outside error margins is theoretically close to the minimum (it can be shown with some rigor). Also, the unbalance in the range of 0.7-0.9 Komodo eval is almost universally optimal across time controls and conditions where draw rates are high from balanced openings.

But I will check a simulation with Komodo of this Armageddon solution to the draw problem.
I totally agree with your solution in principle, and I also agree that with top engines the time odds needed to make Armageddon fair would be too large to be reasonable at serious time limits. The problem is that we don't want engines to be playing something too far removed from what humans call chess at the highest level. I think that top humans would consider positions in the .7 to .8 range to be practically winning, and would not consider this "fair" even if it is proven so with engine vs engine matches. Also, the initial symmetry of chess is aesthetically appealing, and if you have to remove or blunder a pawn or make silly moves for one side to get the unbalanced score it won't catch on for human play. So suppose it is proven that 3 to 1 time odds (2 hours + 30" to 40 min + 10") is roughly even with Black getting draw odds in top-level human play, and suppose that this catches on and becomes popular (we need a name for it!). Then I think it might be ok to run engine competitions the same way, even though Black might score 60 or 65%, as long as all events were double round with each engine getting Black once from the same opening, as is the case in almost all computer events or rating lists now. Each game would not be fair (they aren't now anyway, White is better!), but each pair of games would be fair. Anyway, this only matters if the idea catches on.
Komodo rules!
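
Editorial sketch: the "pentanomial vs. trinomial" point quoted in the post above can be illustrated with a few lines of Python. This is only a minimal sketch under stated assumptions, not the method any rating list actually uses: the function names `trinomial_se` and `pentanomial_se` are hypothetical, the variance is the plain population variance, and the sample data is made up purely to show the effect (an unbalanced opening where whoever has White wins).

```python
import math

def trinomial_se(game_scores):
    """Standard error of the mean score, treating every game as independent."""
    n = len(game_scores)
    mean = sum(game_scores) / n
    var = sum((s - mean) ** 2 for s in game_scores) / n
    return math.sqrt(var / n)

def pentanomial_se(pairs):
    """Standard error of the mean per-game score, computed over game pairs.

    Each pair holds one engine's scores in the two games played from the
    same opening (once as White, once as Black), so the pair average takes
    one of five values: 0, 0.25, 0.5, 0.75 or 1.
    """
    pair_means = [(a + b) / 2 for a, b in pairs]
    n = len(pair_means)
    mean = sum(pair_means) / n
    var = sum((m - mean) ** 2 for m in pair_means) / n
    return math.sqrt(var / n)

# Made-up illustration: 50 openings, and in each one the side with White wins.
pairs = [(1.0, 0.0)] * 50
games = [score for pair in pairs for score in pair]

print("trinomial SE:  ", trinomial_se(games))
print("pentanomial SE:", pentanomial_se(pairs))
```

In this extreme made-up case every game is decisive, yet every pair is a 1-1 split, so the paired estimate has no spread at all while the per-game estimate looks noisy. Real matches fall somewhere in between.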
whereagles
Posts: 565
Joined: Thu Nov 13, 2014 12:03 pm

Re: armageddon in norway chess

Post by whereagles »

I like the Altibox chess rules, but they make the play a bit bipolar. Games start out slow, but at some stage they speed up quite a lot due to the unusual time controls... and if there is a draw, Armageddon makes play go from bipolar to outright schizophrenic :roll:

No wonder some players are trying to finish the (so-called) classic game asap :D
Nordlandia
Posts: 2821
Joined: Fri Sep 25, 2015 9:38 pm
Location: Sortland, Norway

Re: armageddon in norway chess

Post by Nordlandia »

In the Armageddon games, my impression is that play speeds up almost exponentially, simply because there is no increment. Most games are decided before move 60; as far as I know, the 3-second increment is only added after move 61.

Levon Aronian vs Alexander Grischuk || Altibox Norway Chess 2019 || Armageddon

https://youtu.be/sjizmeNC8Bg?t=197
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: armageddon in norway chess

Post by Laskos »

lkaufman wrote: Thu Jun 06, 2019 12:34 am
I totally agree with your solution in principle, and I also agree that with top engines the time odds needed to make Armageddon fair would be too large to be reasonable at serious time limits. The problem is that we don't want engines to be playing something too far removed from what humans call chess at the highest level. I think that top humans would consider positions in the .7 to .8 range to be practically winning, and would not consider this "fair" even if it is proven so with engine vs engine matches. Also, the initial symmetry of chess is aesthetically appealing, and if you have to remove or blunder a pawn or make silly moves for one side to get the unbalanced score it won't catch on for human play. So suppose it is proven that 3 to 1 time odds (2 hours + 30" to 40 min + 10") is roughly even with Black getting draw odds in top-level human play, and suppose that this catches on and becomes popular (we need a name for it!). Then I think it might be ok to run engine competitions the same way, even though Black might score 60 or 65%, as long as all events were double round with each engine getting Black once from the same opening, as is the case in almost all computer events or rating lists now. Each game would not be fair (they aren't now anyway, White is better!), but each pair of games would be fair. Anyway, this only matters if the idea catches on.
Yes, played side and reversed, any chosen time odds + scoring is still fair. For example, in individually unfair 2x time odds (and scoring) at bullet, "strong" Komodo got 75:25 on one side and 50:50 on the other side of the odds against the "weak" Komodo, all in all 125/200, which is not bad. Pentanomial variance can be used in these paired games, and even if the draw rate is greatly diminished, the error margins are not larger than in the drawish normal match. That normal match ended in +18 -6 =76, or 56/100. We see that the Elo difference increased twofold in these paired handicap games (using that scoring). It is crucial that the games are paired (now they are not, right?).
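
Editorial sketch: the "twofold" claim above can be checked from the two scores quoted in the post (125/200 and 56/100). The snippet assumes the usual logistic Elo model; the function name `elo_diff` is hypothetical and this is not necessarily the exact tool used for the original numbers.

```python
import math

def elo_diff(score):
    """Elo difference implied by a score fraction under the logistic model."""
    return 400 * math.log10(score / (1 - score))

paired_odds = elo_diff(125 / 200)  # 75:25 on one side plus 50:50 on the other
normal = elo_diff(56 / 100)        # +18 -6 =76 in the normal match

print(f"paired handicap match: {paired_odds:.1f} Elo")
print(f"normal match:          {normal:.1f} Elo")
print(f"ratio:                 {paired_odds / normal:.2f}")
```

Under that model the handicap match comes out at a little under 90 Elo and the normal match at a little over 40, so the ratio is slightly above two.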
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: armageddon in norway chess

Post by lkaufman »

Laskos wrote: Thu Jun 06, 2019 12:04 pm
Yes, played side and reversed, any chosen time odds + scoring is still fair. For example, in individually unfair 2x time odds (and scoring) at bullet, "strong" Komodo got 75:25 on one side and 50:50 on the other side of the odds against the "weak" Komodo, all in all 125/200, which is not bad. Pentanomial variance can be used in these paired games, and even if the draw rate is greatly diminished, the error margins are not larger than in the drawish normal match. That normal match ended in +18 -6 =76, or 56/100. We see that the Elo difference increased twofold in these paired handicap games (using that scoring). It is crucial that the games are paired (now they are not, right?).
In the Norway event they are only used as tiebreakers when the normal game is drawn, with the same colors repeated. Probably they expected it to favor Black a bit (although it hasn't so far), which is "fair" as an offset to White's advantage in the first game. Actually this is a test of an idea I promoted about 25 years ago, to break draws by playing a second game with the same colors, with Black winning draws. The only difference is that I imagined both games would be played at the same time control with no time odds; here they shorten the time drastically and give 10 to 7 odds. My argument was that if we assume 50% draws, 30% White wins, and 20% Black wins, a rough estimate of the results in closely matched master-level play, White would win 30% the first time and 30% of 50% the second time for a total of 45%, no more unfair than White's normal edge. With the time odds, it's probably quite close to 50-50 overall. But I would like the time limits to be equal for the two games, or at least not less than half time for White in the second game compared to the first. But for a pure Armageddon event, two-game matches are essential.
Komodo rules!
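
Editorial sketch: the 45% figure in the post above follows directly from the assumed 30/50/20 split. The snippet below just replays that arithmetic; the function name `tiebreak_totals` is hypothetical, and it assumes the second game has the same result probabilities as the first (no time odds), as in the original proposal.

```python
def tiebreak_totals(p_white, p_draw, p_black):
    """Overall win chances when a drawn first game is replayed with the same
    colors and Black is awarded the win if the replay is also drawn."""
    white_total = p_white + p_draw * p_white
    black_total = p_black + p_draw * (p_black + p_draw)
    return white_total, black_total

white_total, black_total = tiebreak_totals(0.30, 0.50, 0.20)
print(f"White: {white_total:.2f}  Black: {black_total:.2f}")
```

With these inputs White ends up at 45% and Black at 55%, the mirror image of a modest White edge.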
Ovyron
Posts: 4556
Joined: Tue Jul 03, 2007 4:30 am

Re: armageddon in norway chess

Post by Ovyron »

Something I haven't seen mentioned in all these discussions is: do the players like it? It shouldn't matter how many draws there are or what kind of games are produced for the audience; I think the future of Armageddon chess will depend on the opinions of the players involved, and yet I don't know what they think about it.
Your beliefs create your reality, so be careful what you wish for.
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: armageddon in norway chess

Post by lkaufman »

Ovyron wrote: Fri Jun 07, 2019 5:53 am Something I haven't seen mentioned in all these discussions is: do the players like it? It shouldn't matter how many draws there are or what kind of games are produced for the audience; I think the future of Armageddon chess will depend on the opinions of the players involved, and yet I don't know what they think about it.

I'm pretty sure Carlsen likes it, it appears to have been designed to showcase his talents. None of the players has any experience with Armageddon except 5' games, so their opinion will be interesting only after the event. I'm sure some will like the format, some won't, and some will like the idea but not some of the details. After seeing a few rounds and having been involved with it in the past, I think that a very good format for a top event would be for the players to play two Armageddon games each day, with total time per game equal to 2 hours plus 30" per move, the standard time per side now. White gets twice the time of Black, who gets draw odds. If you do the math, each game is 80' + 20" for White and 40' + 10" for Black. Probably Black will win more than White, but since each pairing is one of each color, that's fine. Total playing time same as normal now, every game decisive, close to standard time for White, rapid for Black. For top engines this would favor Black rather heavily, but we'll cross that bridge when we come to it. We would see quite different openings than in standard games now, White would avoid anything drawish.
Komodo rules!
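
Editorial sketch: the "if you do the math" step in the post above is a 2:1 split of a combined budget of 2 hours plus 30" per move. A minimal way to reproduce it, with the hypothetical helper `split_time` and exact fractions to avoid rounding:

```python
from fractions import Fraction

def split_time(total, white_ratio=2):
    """Split a combined budget so White gets white_ratio times Black's share."""
    black = Fraction(total) / (white_ratio + 1)
    return white_ratio * black, black

white_main, black_main = split_time(120)  # minutes of base time per game
white_inc, black_inc = split_time(30)     # seconds of increment per move

print(f"White: {white_main}' + {white_inc}\"")
print(f"Black: {black_main}' + {black_inc}\"")
```

Passing `white_ratio=3` gives the steeper 3-to-1 split discussed earlier in the thread, under the same combined budget.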
Ozymandias
Posts: 1532
Joined: Sun Oct 25, 2009 2:30 am

Re: armageddon in norway chess

Post by Ozymandias »

lkaufman wrote: Tue Jun 04, 2019 11:34 pm Perhaps some organizer will propose dispensing with the normal games and just playing Armageddon with White having the normal 2 hours plus 30" inc and Black getting just half of those numbers, or even 1/3, whatever makes the results close to 50-50. This could be the answer to draws!
Naturally, this could work in computer events as well, although the time odds might have to be a bit steeper.
If we assume that chess is a draw, and that we're already there (both assumptions seem reasonable), then you obviously need to make black weaker to increase the number of wins. By how much? As much as possible, while maintaining the options, for the best engine/book, to finish an event undefeated.
Guenther
Posts: 4605
Joined: Wed Oct 01, 2008 6:33 am
Location: Regensburg, Germany
Full name: Guenther Simon

Re: armageddon in norway chess

Post by Guenther »

After 13 games in 4 rounds of Armageddon it is 7:6 in favour of White now, still a very tiny sample of course.

Round 1: 4:1
Round 2: 1:2
Round 3: 0:1
Round 4: 2:2
What we can say by now is that adding the increment already from move 50 instead of move 60 would maybe be a bit smarter.
https://rwbc-chess.de

trollwatch:
Chessqueen + chessica + AlexChess + Eduard + Sylwy
Guenther
Posts: 4605
Joined: Wed Oct 01, 2008 6:33 am
Location: Regensburg, Germany
Full name: Guenther Simon

Re: armageddon in norway chess

Post by Guenther »

Guenther wrote: Sun Jun 09, 2019 9:53 am After 13 games in 4 rounds of Armageddon it is 7:6 in favour of White now, still a very tiny sample of course.
...
after 6 rounds:

R:    W:B
----------------
1:    4:1
2:    1:2
3:    0:1
4:    2:2
5:    1:4
6:    2:3
----------------
all: 10:13
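
Editorial sketch: the totals in the table above can be re-added, and the "tiny sample" caveat made concrete, with a short script. The per-round numbers are copied from the table; the two-sided binomial test against a 50-50 split is an added assumption for illustration, not something the poster computed.

```python
from math import comb

# Armageddon results per round as (White wins, Black wins), from the table.
rounds = {1: (4, 1), 2: (1, 2), 3: (0, 1), 4: (2, 2), 5: (1, 4), 6: (2, 3)}

white = sum(w for w, _ in rounds.values())
black = sum(b for _, b in rounds.values())
games = white + black

# Two-sided exact binomial p-value for the smaller tally under a fair 50-50 split.
low = min(white, black)
tail = sum(comb(games, k) for k in range(low + 1)) / 2 ** games
p_value = min(1.0, 2 * tail)

print(f"White {white} : Black {black} over {games} games")
print(f"two-sided p-value vs. 50-50: {p_value:.2f}")
```

The p-value comes out far above any conventional threshold, so a 10:13 split is entirely consistent with an even format at this sample size.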
https://rwbc-chess.de

trollwatch:
Chessqueen + chessica + AlexChess + Eduard + Sylwy