Dragon 3.1 Released at KomodoChess.com

Eduard · Post by **Eduard** » Thu Aug 04, 2022 9:43 pm

You can also choose normal openings that are not easy to play. For example Scandinavian, Pirc, Philidor, Alekhine just to name a few.

lkaufman · Post by **lkaufman** » Fri Aug 05, 2022 6:06 am

Eduard wrote: ↑Thu Aug 04, 2022 9:43 pm You can also choose normal openings that are not easy to play. For example Scandinavian, Pirc, Philidor, Alekhine just to name a few.

Such openings are in the category of "unbalanced human openings"; openings that some human masters play on occasion (more so in the previous century than now), but that engines consider very bad, even if not quite losing. They were once "normal" in human play but are clearly no longer normal in engine or 2800 elo human play. So engine tests using them are ideal for showing up (magnifying) elo differences, but are not at all representative of what elo differences would be if the engines chose their own openings or else chose only from ones played often at top level in year 2022. This is why it is no longer meaningful to speak of elo differences without specifying test conditions, especially the nature of the opening book used for the test. The gap between Stockfish and Dragon in Rapid on four threads might be only five elo or so now with only "best" openings in the book, but with openings such as those you name it might be more like fifty elo. Same further down the list.

Eduard · Post by **Eduard** » Fri Aug 05, 2022 8:22 am

The engines have evolved. Above a certain search depth, and with the most played openings by humans, the TOP 3 engines will draw 95% of the time. What else do you want to test?

By the way, there are only a few thousand grandmasters in the world, but many millions of amateurs. If an amateur wants to copy a GM, it won't go well. Which engine is best for an amateur?

On InfinityChess we ran an engine tournament with prizes in June 2022 called the "WhiteBlackChallengeEngineTournament" (WBCET). The idea was to give White an advantage of around +0.80 (Stockfish analysis). Twelve variations were allowed; further moves may be added in the book:

Position 1 - 1.e4 d5 +0.81/40 - SF 15
Position 2 - 1.e4 a6 +0.61/40 - SF 15
Position 3 - 1.e4 Nf6 +0.77/40 - SF 15
Position 4 - 1.d4 g6 +0.66/40 - SF 15
Position 5 - 1.d4 Nc6 +0.86/40 - SF 15
Position 6 - 1.d4 c5 +0.77/40 - SF 15
Position 7 - 1.c4 Nc6 +0.65/40 - SF 15
Position 8 - 1.c4 d5 +0.79/40 - SF 15
Position 9 - 1.c4 b6 +0.61/40 - SF 15
Position 10 - 1.Nf3 b6 +0.85/40 - SF 15
Position 11 - 1.Nf3 f5 +0.78/40 - SF 15
Position 12 - 1.Nf3 h6 +0.72/40 - SF 15

Unfortunately, each of us did our job very well! Almost nobody could win. The few wins came in games where gross opening errors were made. Two machines with 64 cores took part, and neither could win a single game! A +0.80 advantage wasn't enough to win against 4 cores!

You are welcome to participate in one of the upcoming tournaments (I hope in autumn) with the Dragon engine and show us how good it is.

All WBCET games in PGN:

https://filehorst.de/d/eHueHnry

I just want to confirm that even a good advantage in such unbalanced openings is hardly enough to win. Known variations lead to 98% draws in InfinityChess tournaments! That's why I only play in the WBCET there.

lkaufman · Post by **lkaufman** » Fri Aug 05, 2022 4:11 pm

Eduard wrote: ↑Fri Aug 05, 2022 8:22 am The engines have evolved. Above a certain search depth, and with the most played openings by humans, the TOP 3 engines will draw 95% of the time. What else do you want to test?

By the way, there are only a few thousand grandmasters in the world, but many millions of amateurs. If an amateur wants to copy a GM, it won't go well. Which engine is best for an amateur?

On InfinityChess we ran an engine tournament with prizes in June 2022 called the "WhiteBlackChallengeEngineTournament" (WBCET). The idea was to give White an advantage of around +0.80 (Stockfish analysis). Twelve variations were allowed; further moves may be added in the book:

Position 1 - 1.e4 d5 +0.81/40 - SF 15
Position 2 - 1.e4 a6 +0.61/40 - SF 15
Position 3 - 1.e4 Nf6 +0.77/40 - SF 15
Position 4 - 1.d4 g6 +0.66/40 - SF 15
Position 5 - 1.d4 Nc6 +0.86/40 - SF 15
Position 6 - 1.d4 c5 +0.77/40 - SF 15
Position 7 - 1.c4 Nc6 +0.65/40 - SF 15
Position 8 - 1.c4 d5 +0.79/40 - SF 15
Position 9 - 1.c4 b6 +0.61/40 - SF 15
Position 10 - 1.Nf3 b6 +0.85/40 - SF 15
Position 11 - 1.Nf3 f5 +0.78/40 - SF 15
Position 12 - 1.Nf3 h6 +0.72/40 - SF 15

Unfortunately, each of us did our job very well! Almost nobody could win. The few wins came in games where gross opening errors were made. Two machines with 64 cores took part, and neither could win a single game! A +0.80 advantage wasn't enough to win against 4 cores!

You are welcome to participate in one of the upcoming tournaments (I hope in autumn) with the Dragon engine and show us how good it is.

All WBCET games in PGN:

https://filehorst.de/d/eHueHnry

This primarily demonstrates that Stockfish evals are nearly double the truth (Komodo evals are also way too high, less so than SF); a clean pawn up in the opening is definitely winning at top engine level, so the win/draw line is about 3/4 or maybe 0.7 of a pawn. So these "0.80" evals are really roughly 0.40 plus (in true pawn units), not close to the win/draw line. I think you need a slightly worse choice of Black defenses, maybe averaging around +1.00 in SF eval (so about 0.50 in true eval) to get the win percentage up to a decent number (still way below half). If Stockfish would just divide displayed eval by 2 everyone would see the truth and not be fooled into thinking that such "+0.80" openings are nearly winning.
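The arithmetic behind this can be checked against the twelve listed positions. The snippet below is only a rough sketch: the 0.5 conversion factor and the 0.7 win/draw line are the estimates given in this post, not measured constants, and the variable names are made up for illustration.

```python
# SF 15 depth-40 evals of the 12 WBCET positions, as listed in the quoted post.
sf_evals = [0.81, 0.61, 0.77, 0.66, 0.86, 0.77,
            0.65, 0.79, 0.61, 0.85, 0.78, 0.72]

# Assumptions taken from the post, not measured values:
DISPLAY_TO_TRUE = 0.5  # "Stockfish evals are nearly double the truth"
WIN_DRAW_LINE = 0.7    # "about 3/4 or maybe 0.7 of a pawn"

mean_displayed = sum(sf_evals) / len(sf_evals)
mean_true = mean_displayed * DISPLAY_TO_TRUE

print(f"mean displayed eval: {mean_displayed:.2f}")
print(f"mean eval in true pawn units: {mean_true:.2f}")
print(f"share of the win/draw line reached: {mean_true / WIN_DRAW_LINE:.0%}")
```

Under those assumptions the average displayed eval of about 0.74 corresponds to roughly 0.37 in true pawn units, only about half of the way to the estimated win/draw line.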

I just want to confirm that even a good advantage in such unbalanced openings is hardly enough to win. Known variations lead to 98% draws in InfinityChess tournaments! That's why I only play in the WBCET there.

Eduard · Post by **Eduard** » Fri Aug 05, 2022 5:20 pm

Unfortunately, Stockfish's evals are too high, and it's getting worse and worse. Stockfish 15 was still moderate. I could show many games from the server every day where Stockfish dev evaluated a position at well over +1.50 and still didn't win. The hardware is 16 to 64 fast cores and the opponents are slower.

I wanted to ask here in the forum whether it is possible to change the eval display of Stockfish without changing the search. Then I would create my own fish. There are Shashchess and Brainlearn, but unfortunately these engines go to the other extreme and evaluate too low.

CornfedForever · Post by **CornfedForever** » Fri Aug 05, 2022 8:13 pm

Eduard wrote: ↑Fri Aug 05, 2022 5:20 pm Unfortunately, Stockfish's evals are too high, and it's getting worse and worse. Stockfish 15 was still moderate. I could show many games from the server every day where Stockfish dev evaluated a position at well over +1.50 and still didn't win. The hardware is 16 to 64 fast cores and the opponents are slower.

I wanted to ask here in the forum whether it is possible to change the eval display of Stockfish without changing the search. Then I would create my own fish. There are Shashchess and Brainlearn, but unfortunately these engines go to the other extreme and evaluate too low.

Not sure exactly what you mean, so, question: Isn't the actual evaluation 'number' ONLY relevant in relation to the same engine's eval of other lines it is considering?

lkaufman · Post by **lkaufman** » Fri Aug 05, 2022 11:09 pm

CornfedForever wrote: ↑Fri Aug 05, 2022 8:13 pm
Eduard wrote: ↑Fri Aug 05, 2022 5:20 pm Unfortunately, Stockfish's evals are too high, and it's getting worse and worse. Stockfish 15 was still moderate. I could show many games from the server every day where Stockfish dev evaluated a position at well over +1.50 and still didn't win. The hardware is 16 to 64 fast cores and the opponents are slower.

I wanted to ask here in the forum whether it is possible to change the eval display of Stockfish without changing the search. Then I would create my own fish. There are Shashchess and Brainlearn, but unfortunately these engines go to the other extreme and evaluate too low.
Not sure exactly what you mean, so, question: Isn't the actual evaluation 'number' ONLY relevant in relation to the same engine's eval of other lines it is considering?

Of course that is true in principle, but for human users we expect that a clear pawn up in the middlegame with no positional advantage to either side should show an eval near +1.00, as is the UCI specification. But SF will show roughly +2.00 now for such positions. I'm sure that they could cut the displayed eval by dividing by 2 quite easily with no effect on search or speed, and it would conform to UCI specs and to human expectations. It only matters for less sophisticated users who don't realize that +2 means one pawn advantage!
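For anyone who wants to try the "divide by 2" idea without touching the engine, here is a minimal sketch of one way to do it from the outside: rewrite the `score cp` field of the UCI `info` lines the engine prints before they reach the GUI. This is not Stockfish code and not how the developers would do it internally; the function name, the default factor of 0.5, and the sample line are all assumptions for illustration. Mate scores are left alone.

```python
import re

# Centipawn score field of a UCI "info" line, e.g. "score cp 200".
# "score mate N" does not match and is therefore passed through unchanged.
_SCORE_CP = re.compile(r"\bscore cp (-?\d+)\b")

def rescale_info_line(line: str, factor: float = 0.5) -> str:
    """Return the info line with its centipawn score multiplied by `factor`.

    Only the displayed number changes; the engine's search is untouched
    because the rewrite happens on its output text.
    """
    def _scale(match: re.Match) -> str:
        scaled = round(int(match.group(1)) * factor)
        return f"score cp {scaled}"
    return _SCORE_CP.sub(_scale, line)

# Hypothetical engine output line, made up for this example.
sample = "info depth 30 multipv 1 score cp 200 nodes 1000000 pv e2e4 e7e5"
print(rescale_info_line(sample))
```

A small wrapper script sitting between the GUI and the engine could apply this to every line of engine output; the factor could just as well be 0.6 or any other value a user prefers.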

CornfedForever · Post by **CornfedForever** » Sun Aug 07, 2022 4:29 am

lkaufman wrote: ↑Fri Aug 05, 2022 11:09 pm
Of course that is true in principle, but for human users we expect that a clear pawn up in the middlegame with no positional advantage to either side should show an eval near +1.00, as is the UCI specification. But SF will show roughly +2.00 now for such positions. I'm sure that they could cut the displayed eval by dividing by 2 quite easily with no effect on search or speed, and it would conform to UCI specs and to human expectations. It only matters for less sophisticated users who don't realize that +2 means one pawn advantage!

Yes, thinking of the fighting units (even individual pawns) as expressions of a single pawn valued at 1.0 is merely a convention to help guide chess players...at least until they know better.

In no real way is a single pawn ever intrinsically worth 1.0. This is because a unit's value depends largely on the others and on the squares they (and it) control now and in the foreseeable future...and on all that their working together may entail.

Since no position is more likely to be free of positional compensation or advantage than the starting position minus one pawn, I removed the b7 pawn and switched on the latest Stockfish development version with White to move. It is giving +2.57. I doubt Dragon (which I do not have) would say much different...probably over +2.0 as well. Either way, they are just numbers, and possibly neither is more likely to be 'true' than the other.

The contours of the game have yet to take shape, but in addition to the physical b7 pawn being gone, a6 and all the squares down to a1 are (potentially, at the very least) weakened. The same goes for the c-file, but to a lesser extent. Queenside castling is likely out. How can one begin to value the loss as merely 1.0 when the future is so uncertain?

It's discussions like these that always draw me back to...a certain 'mockumentary' about a fictional rock band, where the guitarist's amplifier has markings from 1 through...11, for when he needs that extra little 'oomph'.

lkaufman · Post by **lkaufman** » Sun Aug 07, 2022 6:12 am

CornfedForever wrote: ↑Sun Aug 07, 2022 4:29 am
lkaufman wrote: ↑Fri Aug 05, 2022 11:09 pm
Of course that is true in principle, but for human users we expect that a clear pawn up in the middlegame with no positional advantage to either side should show an eval near +1.00, as is the UCI specification. But SF will show roughly +2.00 now for such positions. I'm sure that they could cut the displayed eval by dividing by 2 quite easily with no effect on search or speed, and it would conform to UCI specs and to human expectations. It only matters for less sophisticated users who don't realize that +2 means one pawn advantage!
Yes, thinking of the fighting units (even individual pawns) as expressions of a single pawn valued at 1.0 is merely a convention to help guide chess players...at least until they know better.

In no real way is a single pawn ever intrinsically worth 1.0. This is because a unit's value depends largely on the others and on the squares they (and it) control now and in the foreseeable future...and on all that their working together may entail.

Since no position is more likely to be free of positional compensation or advantage than the starting position minus one pawn, I removed the b7 pawn and switched on the latest Stockfish development version with White to move. It is giving +2.57. I doubt Dragon (which I do not have) would say much different...probably over +2.0 as well. Either way, they are just numbers, and possibly neither is more likely to be 'true' than the other.

The contours of the game have yet to take shape, but in addition to the physical b7 pawn being gone, a6 and all the squares down to a1 are (potentially, at the very least) weakened. The same goes for the c-file, but to a lesser extent. Queenside castling is likely out. How can one begin to value the loss as merely 1.0 when the future is so uncertain?

It's discussions like these that always draw me back to...a certain 'mockumentary' about a fictional rock band, where the guitarist's amplifier has markings from 1 through...11, for when he needs that extra little 'oomph'.

The first move is a big deal in chess. I would say the "c" pawn is the most neutral one to remove; its removal aids the queen's development (and arguably also the knight's development, since its natural square c6 won't block the "c" pawn if it is not there!), but it does split the remaining pawns into two groups, so I would say it's very close to a pure one pawn loss. So averaging removing the "c2" pawn and the "c7" pawn evals (from the point of view of the superior side) is the most appropriate figure to compare to the ideal value of 1.00 (ideal by the definition in UCI). For Dragon 3.1 on 16 threads searching 30 ply, I get 1.58 and 2.18, averaging 1.88 advantage. For a very recent Stockfish (July 24) doing same, I get 1.68 and 2.74, averaging 2.21. So even dividing by 2 still puts the eval on the high side of the desired 1.0, but close enough, while for Dragon taking 60% would make the evals similar to SF evals divided by 2.
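The averages and the two proposed scalings can be reproduced in a few lines. This only redoes the arithmetic with the four evals quoted above; the dictionary layout and names are illustrative.

```python
# Depth-30 evals quoted above: (c2 pawn removed, c7 pawn removed),
# each from the point of view of the side that is a pawn up.
evals = {
    "Dragon 3.1": (1.58, 2.18),
    "Stockfish (July 24)": (1.68, 2.74),
}

averages = {name: sum(pair) / len(pair) for name, pair in evals.items()}
for name, avg in averages.items():
    print(f"{name}: average pawn-up eval {avg:.2f}")

# The two rescalings suggested in the post.
sf_halved = averages["Stockfish (July 24)"] / 2
dragon_scaled = averages["Dragon 3.1"] * 0.60

print(f"Stockfish / 2:     {sf_halved:.2f}")
print(f"Dragon 3.1 * 0.60: {dragon_scaled:.2f}")
```

Both rescaled figures come out at about 1.1, slightly above the ideal 1.00 and close to each other, which is the point being made.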

Nordlandia · Post by **Nordlandia** » Sun Aug 07, 2022 3:06 pm

Hello folks. It has been a long time. I just checked: c-pawn odds in exchange for no Black kingside castling is assessed at about -1 according to SF dev, and D3.1 gives an eval of -0.75 to 0.8.

Something to try in engine games.

Remove the c2 pawn and disable the kingside castling flag for Black.
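To set this up in a GUI or an engine match, the position can be written as a FEN string. The sketch below builds it from the standard starting FEN with plain string handling; the function name is made up, and it assumes the input is the normal starting position.

```python
START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"

def c2_odds_no_black_kingside(fen: str = START_FEN) -> str:
    """Remove White's c2 pawn and Black's kingside castling right.

    Assumes `fen` is the standard starting position, so rank 2 is
    exactly "PPPPPPPP".
    """
    board, side, castling, ep, halfmove, fullmove = fen.split()
    ranks = board.split("/")  # ranks[0] is rank 8, ranks[6] is rank 2
    # Files a-b keep their pawns, file c becomes one empty square, d-h keep theirs.
    ranks[6] = "PP1PPPPP"
    # Lowercase "k" in the castling field is Black's kingside right.
    castling = castling.replace("k", "") or "-"
    return " ".join(["/".join(ranks), side, castling, ep, halfmove, fullmove])

print(c2_odds_no_black_kingside())
```

The resulting string can be pasted into most GUIs or used as the single start position of an engine match.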
