Jouni wrote: ↑Mon Sep 05, 2022 7:59 pm
According to Discord, the version that is playing is buggy!
According to Plutie, it was updated after game 154, but it is still playing horribly in Chess324. Here is what he wrote:
Current working theory: the bad performance so far would be because we submitted an untested branch which ended up having a pretty bad bug. Whether that's actually why Leela has performed so poorly remains to be seen, but it's the most probable explanation, considering the analysis I was running on the side during games with a known good version. The playing Leela was updated to a fixed version around game 154.
For the record, that was a bit of a hasty statement. It's definitely still not fixed; the update was for one bug that was found in the meantime. It's still possible to reproduce the issue with a few positions where Leela blundered, though.
This position is from game 166, where Leela went from an easily won position to a draw with 45...Ng3. The first image is from dag-bord-lf-se-2, a known working version; the second is from dag-size, which was submitted.
Thanks for showing.
I tried the position with the two compiles I still have installed on my PC.
Both play the correct 46...Bxd6 at once and keep it in the output with a clearly winning eval for the few minutes I let them run. With MultiPV=2, 46...Ng3? as second best has a drawish eval from the very start.
Regards
Last edited by peter on Tue Sep 06, 2022 10:17 am, edited 1 time in total.
Lazy_Frank wrote: ↑Tue Sep 06, 2022 9:50 am
CPU Leela gives close to 80% confidence for Bxd6.
Therefore, Leela CPU should be given another chance against the top 6 to see how it plays in comparison. Another thing I noticed is that the percentage of draws should be lower at the very top: for instance, between Stockfish and Dragon there should be more decisive games and fewer draws, whereas between Dragon and Ethereal the percentage of draws should increase considerably. In positions where Ethereal has the first move, which creates a space advantage, Dragon will start from a negative evaluation and will have to fight hard to get draws, even if it does NOT lose any games.
Sorry for that stupid comment; it is too late for LCO now.
It's up to the CCC organizers and admins, I think.
I believe that an LCO CPU vs Ethereal match in Chess324 would be interesting.
After 93 games in this playoff between the second and third place finishers in the main event, the score so far is 19 wins for Dragon to zero for Ethereal, which is a tad under 80% draws. This is a remarkable improvement over the 98+% draws in the 600-game FRC match a year ago between Dragon and Stockfish. It seems that Chess324 is much less drawish than Chess960 at this top level. Furthermore, the fact that Ethereal has not won a single game means that this excellent 20+% decisive rate is not due to any of the positions being easily won, at least none of the ones played so far.
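For anyone who wants to check the arithmetic behind those percentages, here is a minimal sketch in Python. The function name `draw_rate` is just an illustrative helper, and the only inputs are the figures quoted above (93 games, 19 Dragon wins, 0 Ethereal wins).

```python
def draw_rate(games: int, decisive: int) -> float:
    """Return the fraction of games that ended in a draw."""
    if games <= 0 or not 0 <= decisive <= games:
        raise ValueError("need games > 0 and 0 <= decisive <= games")
    return (games - decisive) / games


# Figures quoted in the post above.
games = 93
dragon_wins = 19
ethereal_wins = 0

decisive = dragon_wins + ethereal_wins
rate = draw_rate(games, decisive)

print(f"draws:    {games - decisive}/{games} = {rate:.1%}")      # 74/93 = 79.6%
print(f"decisive: {decisive}/{games} = {1 - rate:.1%}")          # 19/93 = 20.4%
```

So 74 draws out of 93 is about 79.6%, which matches "a tad under 80%", and the decisive rate of about 20.4% matches "20+%".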
I believe that the percentage of draws would get lower between the top two engines, Stockfish vs Dragon, probably to around 72%, with more decisive games.