Komodo Dragon 2 released.

lkaufman
Posts: 6259
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Komodo Dragon 2 released.

Post by lkaufman »

Cornfed wrote: Fri May 07, 2021 8:58 pm
Werewolf wrote: Fri May 07, 2021 6:55 pm
pohl4711 wrote: Fri May 07, 2021 12:16 pm
lkaufman wrote: Tue May 04, 2021 5:53 pm Komodo Dragon 2 is released today at komodochess.com.
7000 games testrun of KomodoDragon 2.0 avx2 finished. Testrun of KomodoDragon 2.0 MCTS is running.

https://www.sp-cc.de

(Perhaps you have to clear your browsercache or reload the website)
Some nice new ideas with the multi PV but ultimately...6 elo over Dragon 1. :(
I am surprised they called it Dragon 2 instead of 1.5 or some such.
So far the elo gains quoted by testers are 6, 9, and 18 (using Ordo, which is "real elo"), averaging 11. But the main justification for calling it Dragon 2 is not these 11 elo points but the 100 elo point gains in both Standard Dragon using MultiPV and MCTS Dragon (windows version). For many users, these two features justify calling it a new version number. If you don't use either MultiPV or MCTS, then it's just a small upgrade.
Komodo rules!
Modern Times
Posts: 3749
Joined: Thu Jun 07, 2012 11:02 pm

Re: Komodo Dragon 2 released.

Post by Modern Times »

lkaufman wrote: Sat May 08, 2021 1:04 am
So far the elo gains quoted by testers are 6, 9, and 18 (using Ordo, which is "real elo"), averaging 11.
http://talkchess.com/forum3/viewtopic.php?f=7&t=73761

Ordo comes in for some criticism, from Daniel for example. I've yet to see a convincing argument that it is better than bayeselo, although the very complex maths and statistics are beyond my knowledge. Ordo dramatically expands ratings; I hesitate to use the word "exaggerate", but that is how it feels. They are simply different and you take your pick: one is not more "real" or better than the other. I certainly prefer bayeselo.
gaard
Posts: 463
Joined: Mon Jun 07, 2010 3:13 am
Location: Holland, MI
Full name: Martin W

Re: Komodo Dragon 2 released.

Post by gaard »

mjlef wrote: Fri May 07, 2021 11:54 pm
gaard wrote: Fri May 07, 2021 4:55 am
mjlef wrote: Thu May 06, 2021 2:23 pm
Cornfed wrote: Wed May 05, 2021 11:59 pm
mjlef wrote: Wed May 05, 2021 2:12 am
Werewolf wrote: Tue May 04, 2021 10:10 pm
Lion wrote: Tue May 04, 2021 10:04 pm Dragon 2 is now the favorite against Stockfish 13 on typical modern hardware.


Rgds
Is it? They seem to imply that with a pv of one the improvement is small, but the difference between Dragon 1 and SF 13 is 79 elo:

http://www.cegt.net/40_40%20Rating%20Li ... liste.html
The above is a fragment of the whole sentence. Larry wrote "We believe that the MultiPV search is now more effective than the Stockfish MultiPV search, and that if MultiPV is set to more than 4 Dragon 2 is now the favorite against Stockfish 13 on typical modern hardware."

So if you set both Dragon 2 and Stockfish to use MultiPV of 4 or more, Dragon 2 will be stronger in games, which basically means you will get better analysis quicker with Dragon 2 displaying 4 or more lines. Basically we improved MultiPV a lot in Dragon 2 (and frankly the old MultiPV scheme was not very efficient). I suppose the main reason people use strong programs is for analysis, so the improvements should help.

Mark
If during play (or analysis) an engine chooses the move it thinks best...what does it matter that it's displaying 3, 4, or 5 PVs? Unless the extra lines displayed 'hurt' the overall evaluation of all 5 PVs? I believe that is what you are saying, yes (?) - that the extra computations hurt the choice of best move for Stockfish more so than for KomodoDragon?

If so, is that due to search depth of each line or something with the evaluation function?
The main improvement is speed, so a MultiPV search now normally takes less time to reach a given depth. Note that setting MultiPV to 2 or higher can improve the move selection, because sometimes the second search (the one that looks for the second-best move) actually returns a line with a better score than the first. Modern alpha-beta searches are highly selective, and move order changes can produce better lines with additional search. So basically you get better analysis with the new Dragon MultiPV search mostly because it is faster and on average a higher depth is reached in the same search time. Technically, I changed the old alpha-beta search scheme, which basically kept the alpha-beta window open for the first MultiPV line, to a scheme that does multiple exclusion searches, which ended up being faster.
Can you show that the 2nd, 3rd, 4th, 5th, etc., PV lines are any better than those produced by Stockfish? If not then the claim that Dragon 2 is stronger than Stockfish when MultiPV >= 5 is argumentative at best.
We are only saying that if you set both programs for a high enough MultiPV, then Dragon does better. The overall MultiPV is finding better first moves. Our existing tester does not support playing 2nd or 3rd best moves to do the experiment you propose. But we do know MultiPV is much more efficient in Dragon 2 than in earlier versions. I think a lot of people use these engines in MultiPV mode for analysis, and this should be useful to them. I do not think there will ever be a MultiPV rating list. :-)
I understand what you're saying, which is that in head-to-head games versus Stockfish, Dragon comes out on top when MultiPV >= 5. But who uses engines in that way, and why would someone want to? Would it not be trivial for you to modify your tester to make Dragon and Stockfish play the Nth best move? Otherwise, I don't see how this is a meaningful comparison of Dragon and Stockfish.
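For readers unfamiliar with the "multiple exclusion searches" idea mentioned in the quoted post above, here is a minimal toy sketch in Python. It is not Komodo's or Stockfish's code: the tree, the move names, and the functions `negamax` and `multipv_by_exclusion` are all made up for illustration, and a real engine would do this inside iterative deepening with pruning, transposition tables, and so on. The only point shown is that line k is found by searching the root again with the k-1 moves already reported left out.

```python
from typing import Dict, List, Set, Tuple, Union

INF = 10**9

# Hypothetical toy game tree: a leaf is a static score from the point of view
# of the side to move at that leaf; an inner node maps move names to subtrees.
Tree = Union[int, Dict[str, "Tree"]]


def negamax(node: Tree, alpha: int, beta: int) -> int:
    """Plain fail-soft alpha-beta in negamax form over the toy tree."""
    if isinstance(node, int):
        return node
    best = -INF
    for child in node.values():
        score = -negamax(child, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # cutoff: this node is already too good for the opponent
    return best


def multipv_by_exclusion(root: Dict[str, Tree], n: int) -> List[Tuple[str, int]]:
    """Return the n best root moves by running one root search per line,
    each time excluding the moves already reported."""
    lines: List[Tuple[str, int]] = []
    excluded: Set[str] = set()
    for _ in range(min(n, len(root))):
        best_move, best_score, alpha = "", -INF, -INF
        for move, child in root.items():
            if move in excluded:
                continue  # the exclusion step
            score = -negamax(child, -INF, -alpha)
            if score > best_score:
                best_move, best_score = move, score
                alpha = score
        excluded.add(best_move)
        lines.append((best_move, best_score))
    return lines


# Made-up three-move position used only to exercise the sketch.
TOY_ROOT: Dict[str, Tree] = {
    "e4": {"e5": 30, "c5": 20},
    "d4": {"d5": 25, "Nf6": 35},
    "a3": {"e5": -10, "d5": 5},
}

if __name__ == "__main__":
    for rank, (move, score) in enumerate(multipv_by_exclusion(TOY_ROOT, 3), 1):
        print(rank, move, score)
```

Each exclusion search here keeps a normal narrowing window for its own best move, which is the contrast with holding one wide window open for every line at once.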
Cornfed
Posts: 511
Joined: Sun Apr 26, 2020 11:40 pm
Full name: Brian D. Smith

Re: Komodo Dragon 2 released.

Post by Cornfed »

lkaufman wrote: Sat May 08, 2021 1:04 am
Cornfed wrote: Fri May 07, 2021 8:58 pm

I am surprised they called it Dragon 2 instead of 1.5 or some such.
So far the elo gains quoted by testers are 6, 9, and 18 (using Ordo, which is "real elo"), averaging 11. But the main justification for calling it Dragon 2 is not these 11 elo points but the 100 elo point gains in both Standard Dragon using MultiPV and MCTS Dragon (windows version). For many users, these two features justify calling it a new version number. If you don't use either MultiPV or MCTS, then it's just a small upgrade.
Yes, you make a very good point. Thanks!
lkaufman
Posts: 6259
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Komodo Dragon 2 released.

Post by lkaufman »

gaard wrote: Sat May 08, 2021 3:29 am
I understand what you're saying, which is that in head-to-head games versus Stockfish, Dragon comes out on top when MultiPV >= 5. But who uses engines in that way, and why would someone want to? Would it not be trivial for you to modify your tester to make Dragon and Stockfish play the Nth best move? Otherwise, I don't see how this is a meaningful comparison of Dragon and Stockfish.
It could be done, but it's not "trivial". Our tester was developed by our Webmaster Jesse Gersenson, so he would have to do it, but he works full time for chess.com so I doubt he'd have the time. While I agree that just comparing the top move doesn't tell the whole story, comparing the top five is also not fair, because getting the score of the best move right is far more important for most users than getting the score of the fifth move right, especially in positions where the fifth move might just be a blunder. Ideally you might want the weighted results of the top five moves, perhaps weighted 5-4-3-2-1 or something like that. But this is getting to be a project.
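To make the 5-4-3-2-1 idea concrete, here is a small hypothetical sketch in Python. The function name `weighted_multipv_score` and the notion of a "per-line result" (for example, the score fraction an engine obtains when it is forced to play its Nth-ranked move) are assumptions for illustration only; no such tester exists as far as this thread says.

```python
from typing import Sequence


def weighted_multipv_score(
    per_line_results: Sequence[float],
    weights: Sequence[float] = (5, 4, 3, 2, 1),
) -> float:
    """Weighted average of per-line results, first line weighted most.

    per_line_results[i] is a hypothetical result (0.0 to 1.0) measured for
    the (i+1)-th PV line; the default weights are the 5-4-3-2-1 suggestion.
    """
    if len(per_line_results) != len(weights):
        raise ValueError("need exactly one result per weight")
    total_weight = sum(weights)
    return sum(w * r for w, r in zip(weights, per_line_results)) / total_weight


if __name__ == "__main__":
    # Made-up numbers: strong first line, weaker lower-ranked lines.
    print(weighted_multipv_score([0.55, 0.52, 0.50, 0.48, 0.45]))
```

With these weights the first line counts for a third of the total (5 out of 15) and the fifth line for only a fifteenth, which matches the point that getting the best move right matters most.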
Komodo rules!
lkaufman
Posts: 6259
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Komodo Dragon 2 released.

Post by lkaufman »

Modern Times wrote: Sat May 08, 2021 3:05 am
lkaufman wrote: Sat May 08, 2021 1:04 am
So far the elo gains quoted by testers are 6, 9, and 18 (using Ordo, which is "real elo"), averaging 11.
http://talkchess.com/forum3/viewtopic.php?f=7&t=73761

Ordo comes in for some criticism, from Daniel for example. I've yet to see a convincing argument that it is better than bayeselo, although the very complex maths and statistics are beyond my knowledge. Ordo dramatically expands ratings; I hesitate to use the word "exaggerate", but that is how it feels. They are simply different and you take your pick: one is not more "real" or better than the other. I certainly prefer bayeselo.
I can't agree with this, because Ordo produces the same rating differences as the normal Elo formula for a match, whereas bayeselo contracts them and gives different results depending on draw percentages. Bayeselo may have some nice properties and I won't say it is inferior to Ordo, but Ordo is basically normal Elo, and Bayeselo is something else. Now it happens that because of the contraction of ratings by Bayeselo, CCRL, which uses Bayeselo, actually is closer to predicting the rating differences engines would have against humans, but that is just a coincidence. Engine vs engine ratings are expanded compared to the ratings they would get vs. humans, and since Bayeselo contracts them, it corrects for this. But that is just a lucky byproduct of using Bayeselo for engine vs engine ratings.
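For reference, the "normal Elo formula for a match" can be sketched as below. This is only the textbook logistic Elo relation, not the actual code of Ordo or Bayeselo, and the function names are made up for the example. Note that only the score fraction enters the formula, so two matches with the same score but different draw percentages give the same difference.

```python
import math


def match_score(wins: int, draws: int, losses: int) -> float:
    """Score fraction of a match, counting a draw as half a point."""
    games = wins + draws + losses
    if games == 0:
        raise ValueError("no games played")
    return (wins + 0.5 * draws) / games


def elo_diff_from_score(score: float) -> float:
    """Elo difference implied by a score fraction strictly between 0 and 1."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


def expected_score(diff: float) -> float:
    """Inverse relation: expected score fraction for a given Elo difference."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


if __name__ == "__main__":
    # Same 75% score with many draws or with none: same Elo difference (about 191).
    print(elo_diff_from_score(match_score(50, 50, 0)))
    print(elo_diff_from_score(match_score(75, 0, 25)))
```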
Komodo rules!
gaard
Posts: 463
Joined: Mon Jun 07, 2010 3:13 am
Location: Holland, MI
Full name: Martin W

Re: Komodo Dragon 2 released.

Post by gaard »

lkaufman wrote: Sat May 08, 2021 4:46 am
gaard wrote: Sat May 08, 2021 3:29 am
I understand what you're saying, which is that in head-to-head games versus Stockfish, Dragon comes out on top when MultiPV >= 5. But who uses engines in that way, and why would someone want to? Would it not be trivial for you to modify your tester to make Dragon and Stockfish play the Nth best move? Otherwise, I don't see how this is a meaningful comparison of Dragon and Stockfish.
It could be done, but it's not "trivial". Our tester was developed by our Webmaster Jesse Gersenson, so he would have to do it, but he works full time for chess.com so I doubt he'd have the time. While I agree that just comparing the top move doesn't tell the whole story, comparing the top five is also not fair, because getting the score of the best move right is far more important for most users than getting the score of the fifth move right, especially in positions where the fifth move might just be a blunder. Ideally you might want the weighted results of the top five moves, perhaps weighted 5-4-3-2-1 or something like that. But this is getting to be a project.
It's a project, no doubt. However, you provided no reason for why this comparison to Stockfish is meaningful nor how it will aid in analysis for those interested in anything but the first move when MultiPV >= 5. I don't mean to be argumentative (I am already a customer), I just don't think this marketing fluff is necessary or productive. Dragon is already a top 2 engine.
Krzysztof Grzelak
Posts: 1585
Joined: Tue Jul 15, 2014 12:47 pm

Re: Komodo Dragon 2 released.

Post by Krzysztof Grzelak »

One of the worst versions I've ever seen. Gentlemen from Komodo, think about whether you know how to program, because in my opinion you do not. Find another job and forget about programming.
Amstaff
Posts: 149
Joined: Thu Nov 19, 2009 4:58 pm
Location: College Station, Texas

Re: Komodo Dragon 2 released.

Post by Amstaff »

Hello I'm sorry but I do not see how to turn on MultiPV in the engine options for Komodo Dragon 2.
Thanks for any info.
Gerald
peter
Posts: 3410
Joined: Sat Feb 16, 2008 7:38 am
Full name: Peter Martan

Re: Komodo Dragon 2 released.

Post by peter »

Amstaff wrote: Sat May 08, 2021 12:26 pm Hello I'm sorry but I do not see how to turn on MultiPV in the engine options for Komodo Dragon 2.
It is not in the UCI options of most GUIs, except in analysis mode: there most GUIs offer a special command for more output lines and principal variations, but not for game playing.

One exception is Arena, which has an option named MultiPV in the engine settings for every installed UCI engine. In the Shredder GUI you can manually add the line

MultiPV=x

to the .eng file, where x is the number of principal lines computed in analysis as well as in game-playing mode.
A pity that this does not work in, for example, the Fritz .uci file. Regards,
Peter.
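For anyone driving the engine directly instead of through a GUI, MultiPV is an ordinary UCI option set with the standard `setoption` command. The following is a minimal Python sketch that only builds the command text; the helper name `multipv_commands` and the chosen search limit are assumptions for illustration, and whether a particular GUI sends this option in game mode is up to that GUI.

```python
from typing import List


def multipv_commands(lines: int, movetime_ms: int = 1000) -> List[str]:
    """Build the UCI commands that ask an engine for `lines` PV lines
    on the starting position (hypothetical helper, text only)."""
    if lines < 1:
        raise ValueError("MultiPV must be at least 1")
    return [
        "uci",                                     # handshake, engine lists its options
        f"setoption name MultiPV value {lines}",   # standard UCI option syntax
        "isready",                                 # wait until the option is applied
        "position startpos",
        f"go movetime {movetime_ms}",              # search for a fixed time
    ]


if __name__ == "__main__":
    # These lines could be typed or piped into the engine's console.
    print("\n".join(multipv_commands(4)))
```

The engine then reports each line in its `info` output with a `multipv` index, so line 1 is its preferred move and higher indices are the alternatives.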