Dragon 3.1 Released at KomodoChess.com

Eduard
Posts: 1439
Joined: Sat Oct 27, 2018 12:58 am
Location: Germany
Full name: N.N.

Re: Dragon 3.1 Released at KomodoChess.com

Post by Eduard »

You can also choose normal openings that are not easy to play. For example, the Scandinavian, Pirc, Philidor, and Alekhine, to name a few.
lkaufman
Posts: 6259
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Dragon 3.1 Released at KomodoChess.com

Post by lkaufman »

Eduard wrote: Thu Aug 04, 2022 9:43 pm You can also choose normal openings that are not easy to play. For example, the Scandinavian, Pirc, Philidor, and Alekhine, to name a few.
Such openings are in the category of "unbalanced human openings": openings that some human masters play on occasion (more so in the previous century than now), but that engines consider very bad, even if not quite losing. They were once "normal" in human play but are clearly no longer normal in engine or 2800 Elo human play. So engine tests using them are ideal for showing up (magnifying) Elo differences, but are not at all representative of what Elo differences would be if the engines chose their own openings or else chose only from ones played often at top level in 2022. This is why it is no longer meaningful to speak of Elo differences without specifying test conditions, especially the nature of the opening book used for the test. The gap between Stockfish and Dragon in Rapid on four threads might be only five Elo or so now with only "best" openings in the book, but with openings such as those you name it might be more like fifty Elo. The same applies further down the rating list.
Komodo rules!
Eduard
Posts: 1439
Joined: Sat Oct 27, 2018 12:58 am
Location: Germany
Full name: N.N.

Re: Dragon 3.1 Released at KomodoChess.com

Post by Eduard »

The engines have evolved. Above a certain search depth, and with the most played openings by humans, the TOP 3 engines will draw 95% of the time. What else do you want to test?

By the way, there are only a few thousand grandmasters in the world, but many millions of amateurs. If an amateur wants to copy a GM, it won't go well. Which engine is best for an amateur?

On InfinityChess we ran an engine prize tournament in June 2022 called the "WhiteBlackChallengeEngineTournament". The idea was to give White an advantage of around +0.80 (Stockfish analysis). Twelve variations were allowed; further moves may be added in the book:

Position 1 - 1.e4 d5 +0.81/40 - SF 15
Position 2 - 1.e4 a6 +0.61/40 - SF 15
Position 3 - 1.e4 Nf6 +0.77/40 - SF 15
Position 4 - 1.d4 g6 +0.66/40 - SF 15
Position 5 - 1.d4 Nc6 +0.86/40 - SF 15
Position 6 - 1.d4 c5 +0.77/40 - SF 15
Position 7 - 1.c4 Nc6 +0.65/40 - SF 15
Position 8 - 1.c4 d5 +0.79/40 - SF 15
Position 9 - 1.c4 b6 +0.61/40 - SF 15
Position 10 - 1.Nf3 b6 +0.85/40 - SF 15
Position 11 - 1.Nf3 f5 +0.78/40 - SF 15
Position 12 - 1.Nf3 h6 +0.72/40 - SF 15

Unfortunately, each of us did our job very well! Almost nobody could win. The few wins came in games where gross opening errors were made. Two machines with 64 cores took part. Neither machine could win a single game!! A +0.80 advantage wasn't enough to win against 4 cores!

You are welcome to participate in one of the upcoming tournaments (I hope in autumn) with the Dragon engine and show us how good it is.

All WBCET games in PGN:

https://filehorst.de/d/eHueHnry

I just want to confirm that even a good advantage in such unbalanced openings is hardly enough to win. Known variations lead to 98% draws in InfinityChess tournaments! That's why I only play in the WBCET there.
lkaufman
Posts: 6259
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Dragon 3.1 Released at KomodoChess.com

Post by lkaufman »

Eduard wrote: Fri Aug 05, 2022 8:22 am The engines have evolved. Above a certain search depth, and with the most played openings by humans, the TOP 3 engines will draw 95% of the time. What else do you want to test?

By the way, there are only a few thousand grandmasters in the world, but many millions of amateurs. If an amateur wants to copy a GM, it won't go well. Which engine is best for an amateur?

On InfinityChess we ran an engine prize tournament in June 2022 called the "WhiteBlackChallengeEngineTournament". The idea was to give White an advantage of around +0.80 (Stockfish analysis). Twelve variations were allowed; further moves may be added in the book:

Position 1 - 1.e4 d5 +0.81/40 - SF 15
Position 2 - 1.e4 a6 +0.61/40 - SF 15
Position 3 - 1.e4 Nf6 +0.77/40 - SF 15
Position 4 - 1.d4 g6 +0.66/40 - SF 15
Position 5 - 1.d4 Nc6 +0.86/40 - SF 15
Position 6 - 1.d4 c5 +0.77/40 - SF 15
Position 7 - 1.c4 Nc6 +0.65/40 - SF 15
Position 8 - 1.c4 d5 +0.79/40 - SF 15
Position 9 - 1.c4 b6 +0.61/40 - SF 15
Position 10 - 1.Nf3 b6 +0.85/40 - SF 15
Position 11 - 1.Nf3 f5 +0.78/40 - SF 15
Position 12 - 1.Nf3 h6 +0.72/40 - SF 15

Unfortunately, each of us did our job very well! Almost nobody could win. The few wins came in games where gross opening errors were made. Two machines with 64 cores took part. Neither machine could win a single game!! A +0.80 advantage wasn't enough to win against 4 cores!

You are welcome to participate in one of the upcoming tournaments (I hope in autumn) with the Dragon engine and show us how good it is.

All WBCET games in PGN:

https://filehorst.de/d/eHueHnry

This primarily demonstrates that Stockfish evals are nearly double the truth (Komodo evals are also way too high, less so than SF); a clean pawn up in the opening is definitely winning at top engine level, so the win/draw line is about 3/4 or maybe 0.7 of a pawn. So these "0.80" evals are really roughly 0.40 plus (in true pawn units), not close to the win/draw line. I think you need a slightly worse choice of Black defenses, maybe averaging around +1.00 in SF eval (so about 0.50 in true eval) to get the win percentage up to a decent number (still way below half). If Stockfish would just divide displayed eval by 2 everyone would see the truth and not be fooled into thinking that such "+0.80" openings are nearly winning.
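
As a rough check of that arithmetic, here is a minimal Python sketch. It assumes the rule of thumb from this post (displayed Stockfish eval is about double the "true" pawn value) and takes 0.70 as the win/draw line; both numbers are estimates from the discussion, not measured constants, and the variable names are only illustrative.

```python
# Stockfish 15 depth-40 evals for the twelve WBCET openings quoted above.
SF_EVALS = {
    "1.e4 d5": 0.81, "1.e4 a6": 0.61, "1.e4 Nf6": 0.77,
    "1.d4 g6": 0.66, "1.d4 Nc6": 0.86, "1.d4 c5": 0.77,
    "1.c4 Nc6": 0.65, "1.c4 d5": 0.79, "1.c4 b6": 0.61,
    "1.Nf3 b6": 0.85, "1.Nf3 f5": 0.78, "1.Nf3 h6": 0.72,
}

SCALE = 0.5           # assumption: displayed SF eval is roughly 2x the true value
WIN_DRAW_LINE = 0.70  # assumption: lower end of the 0.70-0.75 estimate


def true_eval(displayed: float, scale: float = SCALE) -> float:
    """Convert a displayed eval into estimated 'true pawn units'."""
    return displayed * scale


mean_displayed = sum(SF_EVALS.values()) / len(SF_EVALS)
mean_true = true_eval(mean_displayed)

# Openings whose rescaled eval reaches the assumed win/draw line.
openings_past_line = [
    name for name, ev in SF_EVALS.items() if true_eval(ev) >= WIN_DRAW_LINE
]

print(f"mean displayed eval: {mean_displayed:.2f}")
print(f"mean rescaled eval:  {mean_true:.2f}")
print(f"openings at or past the win/draw line: {openings_past_line}")
```

Under these assumptions none of the twelve openings comes close to the win/draw line, which is consistent with the very high draw rate reported for the tournament.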

I just want to confirm that even a good advantage in such unbalanced openings is hardly enough to win. Known variations lead to 98% draws in InfinityChess tournaments! That's why I only play in the WBCET there.
Komodo rules!
Eduard
Posts: 1439
Joined: Sat Oct 27, 2018 12:58 am
Location: Germany
Full name: N.N.

Re: Dragon 3.1 Released at KomodoChess.com

Post by Eduard »

Unfortunately, Stockfish's evals are too high, and it's getting worse and worse. Stockfish 15 was still moderate. I could show many games from the server every day where Stockfish dev. evaluated well over +1.50 and still didn't win. The hardware is 16 to 64 fast cores and the opponents are slower.

I wanted to ask here in the forum whether it is possible to change the eval display of Stockfish without changing the search. Then I would create my own fish. There are Shashchess and Brainlearn, but unfortunately these engines go to the other extreme and evaluate too low.
CornfedForever
Posts: 648
Joined: Mon Jun 20, 2022 4:08 am
Full name: Brian D. Smith

Re: Dragon 3.1 Released at KomodoChess.com

Post by CornfedForever »

Eduard wrote: Fri Aug 05, 2022 5:20 pm Unfortunately, Stockfish's evals are too high, and it's getting worse and worse. Stockfish 15 was still moderate. I could show many games from the server every day where Stockfish dev. evaluated well over +1.50 and still didn't win. The hardware is 16 to 64 fast cores and the opponents are slower.

I wanted to ask here in the forum whether it is possible to change the eval display of Stockfish without changing the search. Then I would create my own fish. There are Shashchess and Brainlearn, but unfortunately these engines go to the other extreme and evaluate too low.
Not sure exactly what you mean, so, question: Isn't the actual evaluation 'number' ONLY relevant in relation to the same engine's eval of other lines it is considering?
lkaufman
Posts: 6259
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Dragon 3.1 Released at KomodoChess.com

Post by lkaufman »

CornfedForever wrote: Fri Aug 05, 2022 8:13 pm
Eduard wrote: Fri Aug 05, 2022 5:20 pm Unfortunately, Stockfish's evals are too high, and it's getting worse and worse. Stockfish 15 was still moderate. I could show many games from the server every day where Stockfish dev. evaluated well over +1.50 and still didn't win. The hardware is 16 to 64 fast cores and the opponents are slower.

I wanted to ask here in the forum whether it is possible to change the eval display of Stockfish without changing the search. Then I would create my own fish. There are Shashchess and Brainlearn, but unfortunately these engines go to the other extreme and evaluate too low.
Not sure exactly what you mean, so, question: Isn't the actual evaluation 'number' ONLY relevant in relation to the same engine's eval of other lines it is considering?
Of course that is true in principle, but for human users we expect that a clear pawn up in the middlegame with no positional advantage to either side should show an eval near +1.00, as is the UCI specification. But SF will show roughly +2.00 now for such positions. I'm sure that they could cut the displayed eval by dividing by 2 quite easily with no effect on search or speed, and it would conform to UCI specs and to human expectations. It only matters for less sophisticated users who don't realize that +2 means one pawn advantage!
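
For anyone who wants to try this without touching engine source, one possible approach (a sketch only, not how Stockfish or Dragon does anything internally) is to filter the engine's UCI output and rescale the `score cp` field before the GUI sees it. The function name `rescale_info_line` and the default factor of 0.5 are assumptions for illustration; mate scores and non-`info` lines are passed through unchanged, so search and move choice are unaffected.

```python
import re

# Matches the centipawn score field in a UCI "info" line, e.g. "score cp 200".
_SCORE_CP = re.compile(r"\bscore cp (-?\d+)")


def rescale_info_line(line: str, factor: float = 0.5) -> str:
    """Return the UCI line with any 'score cp N' value multiplied by factor.

    Only the displayed number changes; the line is otherwise untouched.
    'score mate N' and lines that are not 'info' lines are returned as-is.
    """
    if not line.startswith("info "):
        return line

    def _scale(match: re.Match) -> str:
        scaled = round(int(match.group(1)) * factor)
        return f"score cp {scaled}"

    return _SCORE_CP.sub(_scale, line)


if __name__ == "__main__":
    sample = "info depth 30 score cp 200 nodes 1000 pv e2e4"
    print(rescale_info_line(sample))
```

A wrapper script would read the engine's stdout line by line, pass each line through this function, and forward the result to the GUI, while forwarding the GUI's commands to the engine unmodified.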
Komodo rules!
CornfedForever
Posts: 648
Joined: Mon Jun 20, 2022 4:08 am
Full name: Brian D. Smith

Re: Dragon 3.1 Released at KomodoChess.com

Post by CornfedForever »

lkaufman wrote: Fri Aug 05, 2022 11:09 pm
Of course that is true in principle, but for human users we expect that a clear pawn up in the middlegame with no positional advantage to either side should show an eval near +1.00, as is the UCI specification. But SF will show roughly +2.00 now for such positions. I'm sure that they could cut the displayed eval by dividing by 2 quite easily with no effect on search or speed, and it would conform to UCI specs and to human expectations. It only matters for less sophisticated users who don't realize that +2 means one pawn advantage!
Yes, thinking of the fighting units (even individual pawns) as expressions of a single pawn valued at 1.0, is merely a convention to help guide chess players...at least until they know better.

In no real way is a single pawn ever intrinsically worth 1.0. This is because a unit's value is largely dependent on others and the squares they (and it) control now and in the foreseeable future...and all that which their working together may entail.

As no position is more likely to have no positional compensation/advantage than the starting position...minus 1 pawn, I removed the b7 pawn and switched on the latest Stockfish development version with White to move. It is giving +2.57. I doubt Dragon (which I do not have) would say much different...probably over +2.0 as well. Either way, they are just numbers, possibly neither more likely 'true' than the other.

The contours of the game have yet to take shape, but in addition to the physical b7 pawn being gone, a6 and all the squares down to a1 are (potentially, at the very least) weakened. The same goes for the c file, but to a lesser extent. Queenside castling is likely out. How can one begin to value the loss as merely 1.0 when the future is so uncertain?

It's discussions like these which always draw me back to...a certain 'mockumentary' of a fictional rock band, where the guitarist's amplifier has markings 1 thru...11, for when he needs that extra little 'oomph'.
lkaufman
Posts: 6259
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Dragon 3.1 Released at KomodoChess.com

Post by lkaufman »

CornfedForever wrote: Sun Aug 07, 2022 4:29 am
lkaufman wrote: Fri Aug 05, 2022 11:09 pm
Of course that is true in principle, but for human users we expect that a clear pawn up in the middlegame with no positional advantage to either side should show an eval near +1.00, as is the UCI specification. But SF will show roughly +2.00 now for such positions. I'm sure that they could cut the displayed eval by dividing by 2 quite easily with no effect on search or speed, and it would conform to UCI specs and to human expectations. It only matters for less sophisticated users who don't realize that +2 means one pawn advantage!
Yes, thinking of the fighting units (even individual pawns) as expressions of a single pawn valued at 1.0, is merely a convention to help guide chess players...at least until they know better.

In no real way is a single pawn ever intrinsically worth 1.0. This is because a unit's value is largely dependent on others and the squares they (and it) control now and in the foreseeable future...and all that which their working together may entail.

As no position is more likely to have no positional compensation/advantage than the starting position...minus 1 pawn, I removed the b7 pawn and switched on the latest Stockfish development version with White to move. It is giving +2.57. I doubt Dragon (which I do not have) would say much different...probably over +2.0 as well. Either way, they are just numbers, possibly neither more likely 'true' than the other.

The contours of the game have yet to take shape, but in addition to the physical b7 pawn being gone, a6 and all the squares down to a1 are (potentially, at the very least) weakened. The same goes for the c file, but to a lesser extent. Queenside castling is likely out. How can one begin to value the loss as merely 1.0 when the future is so uncertain?

It's discussions like these which always draw me back to...a certain 'mockumentary' of a fictional rock band, where the guitarist's amplifier has markings 1 thru...11, for when he needs that extra little 'oomph'.
The first move is a big deal in chess. I would say the "c" pawn is the most neutral one to remove; its removal aids the queen's development (and arguably also the knight's development, since its natural square c6 won't block the "c" pawn if it is not there!), but it does split the remaining pawns into two groups, so I would say it's very close to a pure one pawn loss. So averaging removing the "c2" pawn and the "c7" pawn evals (from the point of view of the superior side) is the most appropriate figure to compare to the ideal value of 1.00 (ideal by the definition in UCI). For Dragon 3.1 on 16 threads searching 30 ply, I get 1.58 and 2.18, averaging 1.88 advantage. For a very recent Stockfish (July 24) doing same, I get 1.68 and 2.74, averaging 2.21. So even dividing by 2 still puts the eval on the high side of the desired 1.0, but close enough, while for Dragon taking 60% would make the evals similar to SF evals divided by 2.
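
The averaging and scaling in this post can be reproduced with a few lines of Python. The figures are the ones quoted above; the helper name `pawn_odds_average` is just illustrative, and the 0.5 and 0.6 factors are the suggested display scalings from this post, not engine settings.

```python
def pawn_odds_average(eval_a: float, eval_b: float) -> float:
    """Average the two c-pawn-removed evals (each from the superior side's view)."""
    return (eval_a + eval_b) / 2


# Evals quoted above: 30 ply on 16 threads, one figure per removed c-pawn.
dragon_avg = pawn_odds_average(1.58, 2.18)
sf_avg = pawn_odds_average(1.68, 2.74)

# Suggested display scalings, compared against the ideal value of 1.00.
sf_scaled = sf_avg * 0.5
dragon_scaled = dragon_avg * 0.6

print(f"Dragon 3.1 average: {dragon_avg:.2f}, scaled by 0.6: {dragon_scaled:.3f}")
print(f"Stockfish average:  {sf_avg:.2f}, scaled by 0.5: {sf_scaled:.3f}")
```

Both scaled figures land a little above 1.0 and close to each other, which is the point being made: halving the Stockfish number and taking 60% of the Dragon number give roughly comparable displays.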
Komodo rules!
Nordlandia
Posts: 2822
Joined: Fri Sep 25, 2015 9:38 pm
Location: Sortland, Norway

Re: Dragon 3.1 Released at KomodoChess.com

Post by Nordlandia »

Hello folks. It's been a long time. I just checked: giving up the c pawn in exchange for Black having no kingside castling is assessed at about -1 according to SF dev, and D3.1 gives an eval of -0.75 to -0.8.

Something to try in engine games.

Remove the c2 pawn and disable the kingside castling flag for Black.
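
For setting up test positions like the ones discussed in this thread, here is a small Python sketch that edits the standard starting FEN. The helper names `remove_piece` and `drop_castling` are made up for this example, and it assumes White is to move, as in the normal starting position.

```python
START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"


def remove_piece(fen: str, square: str) -> str:
    """Return the FEN with whatever stands on `square` (e.g. 'c2') removed."""
    board, rest = fen.split(" ", 1)
    ranks = board.split("/")              # ranks[0] is rank 8, ranks[7] is rank 1
    file_idx = "abcdefgh".index(square[0])
    rank_idx = 8 - int(square[1])

    # Expand digits into one '.' per empty square so the file index lines up.
    cells = []
    for ch in ranks[rank_idx]:
        cells.extend("." * int(ch) if ch.isdigit() else ch)
    cells[file_idx] = "."

    # Compress runs of empty squares back into digits.
    out, run = "", 0
    for cell in cells:
        if cell == ".":
            run += 1
        else:
            if run:
                out += str(run)
                run = 0
            out += cell
    if run:
        out += str(run)

    ranks[rank_idx] = out
    return "/".join(ranks) + " " + rest


def drop_castling(fen: str, flags: str) -> str:
    """Return the FEN with the given castling flags (e.g. 'k') removed."""
    fields = fen.split(" ")
    rights = "".join(c for c in fields[2] if c not in flags)
    fields[2] = rights or "-"
    return " ".join(fields)


# c2 pawn removed, Black without the kingside castling right.
odds_fen = drop_castling(remove_piece(START_FEN, "c2"), "k")
print(odds_fen)
```

The resulting string can be pasted into a GUI or sent to an engine with the UCI `position fen ...` command; `remove_piece(START_FEN, "b7")` gives the b7-pawn position mentioned earlier in the thread.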