Chess324

lkaufman · Post by **lkaufman** » Thu Sep 01, 2022 6:39 pm

lkaufman wrote: ↑Wed Aug 31, 2022 7:57 pm Chess324 now being used in CCC tournament! https://www.chess.com/computer-chess-championship. Dragon currently in first place (8 player multiple round robin), but it's early. Plenty of decisive games already, as hoped, though none yet between engines in the top half.
Chessqueen wrote: ↑Thu Sep 01, 2022 1:54 am It will be either Stockfish or Dragon the winner as usual, but I only see 5 engines competing: Stockfish, Dragon, Lc0, Berserk and Koivisto.
lkaufman wrote: ↑Thu Sep 01, 2022 4:19 am There are eight engines competing; you are probably just not scrolling down to the bottom of the list. So far about 75% draws, not too bad for super hardware and top eight engines.
Chessqueen wrote: ↑Thu Sep 01, 2022 12:56 pm But the majority of the wins are with White, since White has the first move https://www.chess.com/computer-chess-championship
lkaufman wrote: ↑Thu Sep 01, 2022 5:12 pm Why the word "But"? This is normal and exactly as intended, same as with UHO openings in normal chess. Some of the positions are close to equal and therefore likely to end in draws, some are fairly near the win/draw line and therefore nearly as likely to be decisive as drawn. Naturally most of the positions near the win/draw line are favorable for the first player, White. Why is that a problem?
Chessqueen wrote: ↑Thu Sep 01, 2022 5:59 pm If the main intention of Chess324 is to make it better than Chess960, you have accomplished it.
lkaufman wrote: ↑Thu Sep 01, 2022 6:07 pm It is certainly less drawish than 960 without giving either player a won position initially except for a very few cases. But it does put much more emphasis on the need to play two game matches from each position, as some of them are quite favorable for one side. It is the only variant that I know of which solves both the preparation and draw problems of high-level chess with no rule changes from normal chess, only the start positions.
Chessqueen wrote: ↑Thu Sep 01, 2022 6:29 pm This position that is being played currently is -1.05 for Black but I believe that Dragon can draw
https://www.chess.com/computer-chess-championship

[pgn][Event "Chess324"]
[Site "DESKTOP-OFQ3C0P"]
[Date "2022.09.01"]
[Round "?"]
[White "Koivisto_8.0-x64-windows-avx2"]
[Black "Dragon-2.6.1-64bit-avx2"]
[Result "*"]
[BlackElo "2000"]
[Time "11:30:21"]
[TimeControl "900+10"]
[SetUp "1"]
[FEN "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNNQKBBR w - - 7 1"]
[Termination "unterminated"]
[PlyCount "0"]
[/pgn]

You should state whether an eval is SF, Dragon, or something else. For SF an opening eval needs to be more than 1.5 (at least) to be winning. For Dragon somewhat less, depending on which version. Maybe 1.3 is typical. For "true" eval (i.e. a healthy pawn up = 1.00 as specified by UCI), a winning eval is above about 0.70 or 0.75.
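The point about engine-specific scales can be made concrete with a small sketch. The thresholds below are only the rough figures given in this post (Stockfish above 1.5, Dragon around 1.3, "true" eval around 0.70); the function name and the dictionary are made up for illustration, and the real cut-offs depend on engine version.

```python
# Rough "winning" thresholds in pawns, taken from the post above.
# These are approximate and version-dependent, not official engine constants.
WIN_THRESHOLDS = {
    "stockfish": 1.5,  # "more than 1.5 (at least)"
    "dragon": 1.3,     # "maybe 1.3 is typical"
    "true": 0.70,      # healthy pawn up = 1.00; winning above about 0.70 or 0.75
}


def looks_winning(eval_pawns: float, source: str) -> bool:
    """Return True if the eval exceeds the rough winning threshold for its source.

    The sign is ignored, so the check applies to an advantage for either side.
    """
    return abs(eval_pawns) > WIN_THRESHOLDS[source]


# The -1.05 quoted above reads very differently depending on where it came from.
for source in WIN_THRESHOLDS:
    print(source, looks_winning(-1.05, source))
```

So an unlabeled -1.05 would be below the winning line on the Stockfish or Dragon scale, but above it on the "true" scale.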

Chessqueen · Post by **Chessqueen** » Fri Sep 02, 2022 12:51 am

lkaufman wrote: ↑Thu Sep 01, 2022 6:39 pm
You should state whether an eval is SF, Dragon, or something else. For SF an opening eval needs to be more than 1.5 (at least) to be winning. For Dragon somewhat less, depending on which version. Maybe 1.3 is typical. For "true" eval (i.e. a healthy pawn up = 1.00 as specified by UCI), a winning eval is above about 0.70 or 0.75.
I noticed that Stockfish is rated 3862 and Dragon only 3832, but where did they get those ratings in Chess324? Both of those ratings should have question marks, since it is the first time that all those engines play Chess324. Other ridiculous ratings are Lc0 = 3822 and Berserk = 3682, but we can all see that Berserk plays Chess324 much better than Lc0, since Lc0 has NOT done NN training with Chess324 yet: https://www.chess.com/computer-chess-championship

lkaufman · Post by **lkaufman** » Fri Sep 02, 2022 2:14 am

Chessqueen wrote: ↑Fri Sep 02, 2022 12:51 am
I noticed that Stockfish is rated 3862 and Dragon only 3832, but where did they get those ratings in Chess324? Both of those ratings should have question marks, since it is the first time that all those engines play Chess324. Other ridiculous ratings are Lc0 = 3822 and Berserk = 3682, but we can all see that Berserk plays Chess324 much better than Lc0, since Lc0 has NOT done NN training with Chess324 yet: https://www.chess.com/computer-chess-championship
They are obviously not chess324 ratings, since they had those ratings at the start and this was the first chess324 event ever. I think they combine all events on CCC (standard and FRC), but I'm not certain of that. It would make some sense to have chess960 ratings and merge chess324 with them, although that too could be criticized. Your comment implies that Berserk has trained on Chess324 data, but that seems very unlikely to me since it was very obscure until I posted about it here. It is strange that Lc0 is performing so poorly relative to engines that are clearly weaker in normal chess. Maybe it means that the giant nets in Lc0 act sort of like a super-opening book, and that without that benefit the A/B engines with smaller nets are really stronger than giant nets with just MCTS and no A/B.

Chessqueen · Post by **Chessqueen** » Fri Sep 02, 2022 2:27 am

lkaufman wrote: ↑Fri Sep 02, 2022 2:14 am
They are obviously not chess324 ratings, since they had those ratings at the start and this was the first chess324 event ever. I think they combine all events on CCC (standard and FRC), but I'm not certain of that. It would make some sense to have chess960 ratings and merge chess324 with them, although that too could be criticized. Your comment implies that Berserk has trained on Chess324 data, but that seems very unlikely to me since it was very obscure until I posted about it here. It is strange that Lc0 is performing so poorly relative to engines that are clearly weaker in normal chess. Maybe it means that the giant nets in Lc0 act sort of like a super-opening book, and that without that benefit the A/B engines with smaller nets are really stronger than giant nets with just MCTS and no A/B.
It would be better if all engines started with the same rating of 3550, with their ratings then adjusted continuously as they win or lose, so that they converge on their new Chess324 ratings.
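As a rough illustration of that suggestion, here is a minimal sketch using the standard Elo update. How CCC actually computes its ratings is not stated in this thread; the K-factor of 10 and the function names are assumptions chosen only for the example.

```python
START_RATING = 3550.0  # common starting rating proposed above
K_FACTOR = 10.0        # assumed K-factor; not a CCC setting


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update(rating_a: float, rating_b: float, score_a: float) -> tuple[float, float]:
    """Return both new ratings after one game (score_a is 1, 0.5 or 0)."""
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + K_FACTOR * (score_a - exp_a)
    new_b = rating_b + K_FACTOR * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


# Two engines start level; the first wins one game and they draw the next.
a, b = update(START_RATING, START_RATING, 1.0)
print(a, b)
a, b = update(a, b, 0.5)
print(a, b)
```

With equal starting ratings the first few results move the numbers quickly, which is the "question mark" period mentioned earlier in the thread.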

Plutie · Post by **Plutie** » Fri Sep 02, 2022 5:39 am

lkaufman wrote: ↑Fri Sep 02, 2022 2:14 am
They are obviously not chess324 ratings, since they had those ratings at the start and this was the first chess324 event ever. I think they combine all events on CCC (standard and FRC), but I'm not certain of that. It would make some sense to have chess960 ratings and merge chess324 with them, although that too could be criticized. Your comment implies that Berserk has trained on Chess324 data, but that seems very unlikely to me since it was very obscure until I posted about it here. It is strange that Lc0 is performing so poorly relative to engines that are clearly weaker in normal chess. Maybe it means that the giant nets in Lc0 act sort of like a super-opening book, and that without that benefit the A/B engines with smaller nets are really stronger than giant nets with just MCTS and no A/B.

current working theory - the bad performance so far would be because we submitted an untested branch which ended up having a pretty bad bug. whether that's actually why leela has performed so poorly remains to be seen, but it's the most probable explanation, considering the analysis I was running on the side during games with a known good version. - the playing leela was updated to a fixed version around game 154.

dkappe · Post by **dkappe** » Fri Sep 02, 2022 6:28 am

Plutie wrote: ↑Fri Sep 02, 2022 5:39 am current working theory - the bad performance so far would be because we submitted an untested branch which ended up having a pretty bad bug. whether that's actually why leela has performed so poorly remains to be seen, but it's the most probable explanation, considering the analysis I was running on the side during games with a known good version. - the playing leela was updated to a fixed version around game 154.

Why was an untested branch submitted?

Chessqueen · Post by **Chessqueen** » Fri Sep 02, 2022 11:59 am

dkappe wrote: ↑Fri Sep 02, 2022 6:28 am
Plutie wrote: ↑Fri Sep 02, 2022 5:39 am current working theory - the bad performance so far would be because we submitted an untested branch which ended up having a pretty bad bug. whether that's actually why leela has performed so poorly remains to be seen, but it's the most probable explanation, considering the analysis I was running on the side during games with a known good version. - the playing leela was updated to a fixed version around game 154.
Why was an untested branch submitted?

Lc0 should replay all the games that it played before game 154.

==> https://www.chess.com/computer-chess-championship

pohl4711 · Post by **pohl4711** » Fri Sep 02, 2022 1:10 pm

lkaufman wrote: ↑Fri Aug 12, 2022 5:47 am There has been some discussion about how to improve chess960 (Fischerandom Chess) to address the fact that when top engines play against each other on good hardware at Rapid or slower time controls almost all the games end in draws, just as in normal chess (without forced unbalanced openings). Scrapping the symmetry requirement leads to some positions where one side is quite clearly winning.
I believe I have found a solution that is aesthetically pleasing, doesn't require special castling rules, and will dramatically lower draw percentages without any clearly won positions. I call it "Chess324". All rules are the same as in normal chess, including castling, only the start position is modified. The kings and rooks are placed on their normal positions. All the other pieces for White and Black are placed randomly, with no symmetry requirement, with the only restriction being that for each side the bishops must be on opposite colored squares. Unless I have miscalculated, there are 18 permutations for each side, making 324 total possible positions (including 18 symmetrical ones that are legal in chess960 of which 1 is the normal start position of chess).
In order to determine whether these positions are playable, I checked out the most promising-looking ones for White by checking whether White's advantage ever exceeds Black's advantage in normal chess after the Grob (1.g4?) is played. There has been much discussion in the past over whether the Grob is losing or not, and I doubt that anyone really knows the answer; the Hiarcs database has Black winning 49% of the games, Lc0 gives Black 54% winning chance, and Stockfish and Dragon give evals suggesting that it is more likely to be a win than a draw but is very near the line. I checked all the promising positions I could think of with recent versions of Stockfish, Dragon, and Lc0, and in no case did I find one that produced an advantage larger than Black gets with the Grob (one position was tied per Lc0 but less per SF and Dragon). Of course the evals are all over the place, sometimes even Black is better, sometimes it's about even but not "balanced", sometimes one side is much better, but never clearly winning (at least not as clearly winning as the Grob as far as I was able to tell). Since many evals clearly favor one side, chess324 should be played in pairs of games, each side having White from the same position once. With humans, that's not essential, just recommended; with engines it would be necessary.
This version has huge advantages over chess960. First, no special castling rules, any engine or GUI or human can play with no instruction after seeing the initial position. Second, since all but 18 of the 324 positions are asymmetrical, opening play should be much more interesting and complex. Third, the normal positioning of the rooks and kings and normal castling makes the game feel closer to normal chess. Fourth, matches of up to 648 games can be played with no repeat positions, generally enough for most purposes. Most important, no matter how many cores or how much time the engines get, there should be plenty of decisive games for the foreseeable future since many positions are at least not too far from the win/draw line. The stronger engine will score 1.5 out of 2 in many of these positions for many years to come, unless chess is truly solved some day.
It is quite possible that a few of the initial positions may ultimately be judged to be won for White, but I am confident that even if they are "won", they will be near enough to the draw line to be playable with any current hardware or engines.
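The count in the quoted post (18 arrangements per side, 324 positions, 18 of them symmetrical) can be checked with a short enumeration. This is only a sketch of the rules as described above: the helper names are made up, and the `KQkq` castling field in the FEN is an assumption based on the statement that normal castling applies.

```python
from itertools import permutations

FREE_FILES = "bcdfg"  # a/h keep the rooks, e keeps the king


def back_ranks():
    """Return all legal back ranks (files a-h) as strings like 'RNBQKBNR'."""
    ranks = set()
    for perm in set(permutations("NNBBQ")):
        placed = dict(zip(FREE_FILES, perm))
        bishop_files = [f for f, piece in placed.items() if piece == "B"]
        # On one rank, two squares have opposite colors exactly when
        # their file letters differ in parity.
        if (ord(bishop_files[0]) - ord(bishop_files[1])) % 2 == 0:
            continue
        ranks.add("R" + placed["b"] + placed["c"] + placed["d"]
                  + "K" + placed["f"] + placed["g"] + "R")
    return sorted(ranks)


def start_fen(white_rank: str, black_rank: str) -> str:
    """Build a start FEN; the castling field 'KQkq' is an assumption."""
    return f"{black_rank.lower()}/pppppppp/8/8/8/8/PPPPPPPP/{white_rank} w KQkq - 0 1"


ranks = back_ranks()
positions = [(w, b) for w in ranks for b in ranks]
symmetrical = [p for p in positions if p[0] == p[1]]
print(len(ranks), len(positions), len(symmetrical))  # prints: 18 324 18
```

Playing each position once with each color gives 2 × 324 = 648 games, which matches the match length mentioned in the quote.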

I am working on a new version: All 324 positions combined with 2 pawn-plies (one step). Example: 1.a3 h6 or 1.c3 f6 and so on...
This allows 8*8 = 64 combinations, which means 64 x 324 = 20736 different opening lines. I have already built this file and shuffled it randomly, using SCID. Now I am evaluating all end positions with KomodoDragon (pgnscanner tool) at 20 secs/position on my Ryzen 12-core machine. That will take around 5 days to finish. With this many evaluated lines, it should be possible to build (for example) a 500-line file with unbalanced lines (in the eval range of my UHO openings), or other files (balanced, or better for Black, or whatever). I will deliver the raw data, too, of course, with every line containing the search depth and eval of the end position in the annotator tag, like I always do. So everybody can filter lines by eval as they like.
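The arithmetic here is easy to reproduce. The sketch below builds the 64 one-step pawn prefixes, multiplies by the 324 start positions, and estimates the run time at 20 seconds per position, assuming positions are evaluated one after another. The filter function, its eval window and the sample records in the usage line are purely hypothetical; the actual UHO eval range is not given in this post.

```python
from itertools import product

FILES = "abcdefgh"
N_START_POSITIONS = 324
SECONDS_PER_POSITION = 20


def pawn_prefixes():
    """All pairs of single-step pawn moves, e.g. '1.a3 h6'."""
    return [f"1.{w}3 {b}6" for w, b in product(FILES, repeat=2)]


def filter_by_eval(lines, low, high):
    """Keep (line_id, eval) pairs whose eval lies within [low, high]."""
    return [(line_id, ev) for line_id, ev in lines if low <= ev <= high]


prefixes = pawn_prefixes()
total_lines = len(prefixes) * N_START_POSITIONS
days = total_lines * SECONDS_PER_POSITION / 86400
print(len(prefixes), total_lines, days)  # prints: 64 20736 4.8

# Hypothetical records and window, only to show how the filter is used.
print(filter_by_eval([("x", 0.1), ("y", 0.9), ("z", 1.5)], 0.5, 1.0))
```

The 4.8 days agrees with the "around 5 days" estimate.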

If all works as expected, the release will be in 8-10 days from now on my website.

Stay tuned.

Plutie · Post by **Plutie** » Fri Sep 02, 2022 3:02 pm

dkappe wrote: ↑Fri Sep 02, 2022 6:28 am
Plutie wrote: ↑Fri Sep 02, 2022 5:39 am current working theory - the bad performance so far would be because we submitted an untested branch which ended up having a pretty bad bug. whether that's actually why leela has performed so poorly remains to be seen, but it's the most probable explanation, considering the analysis I was running on the side during games with a known good version. - the playing leela was updated to a fixed version around game 154.
Why was an untested branch submitted?

honestly not sure, I wasn't in charge of the submission, but in the end, this is just a bonus (albeit a pretty interesting one). in the past, we have submitted some untested versions (technically, all of our DAG submissions were experimental, to a degree), but this specific submission was a change to nodes with the goal to reduce memory usage. turns out it had some weird issues with random eval flips (seen in games 4, 39, and a few others), as well as potential missed wins? (game 166).
either way, it's been a good way to discover issues, but this has been the worst result we've seen in a while for sure. previously, the worst issue we had was leela missing 3fold draws.

AndrewGrant · Post by **AndrewGrant** » Fri Sep 02, 2022 3:04 pm

The mid-event "update" was actually a mid-event "revert one commit", so I felt happy to do it. Not a super serious event, so no harm done really by having some bugginess.

Chess324 is ... very drawish it seems. Now an open question: compare the exit values of Chess324 to FRC to DFRC. I feel like FRC is far less drawish than Chess324 -- which makes me think either the rook/king placement is _very_ important, or that more randomization just equals more noise to produce bad openings...
