Komodo Dragon 2 released.
Moderator: Ras
-
- Posts: 6259
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Komodo Dragon 2 released.
pohl4711 wrote: ↑Fri May 07, 2021 12:16 pm
7000 games testrun of KomodoDragon 2.0 avx2 finished. Testrun of KomodoDragon 2.0 MCTS is running.
https://www.sp-cc.de
(Perhaps you have to clear your browser cache or reload the website)

Werewolf wrote: ↑Fri May 07, 2021 6:55 pm
Some nice new ideas with the multi PV but ultimately...6 elo over Dragon 1.

Cornfed wrote: ↑Fri May 07, 2021 8:58 pm
I am surprised they called it Dragon 2 instead of 1.5 or some such.

So far the elo gains quoted by testers are 6, 9, and 18 (using Ordo, which is "real elo"), averaging 11. But the main justification for calling it Dragon 2 is not these 11 elo points but the 100 elo point gains in both Standard Dragon using MultiPV and MCTS Dragon (Windows version). For many users, these two features justify calling it a new version number. If you don't use either MultiPV or MCTS, then it's just a small upgrade.
Komodo rules!
-
- Posts: 3749
- Joined: Thu Jun 07, 2012 11:02 pm
Re: Komodo Dragon 2 released.
http://talkchess.com/forum3/viewtopic.php?f=7&t=73761
Ordo comes in for some criticism, from Daniel for example. I've yet to see a convincing argument that it is better than bayeselo, although the very complex maths and statistics are beyond my knowledge. Ordo dramatically expands ratings; I hesitate to use the word "exaggerate", but that is how it feels. They are simply different and you take your pick: one is not more "real" or better than the other. I certainly prefer bayeselo.
-
- Posts: 463
- Joined: Mon Jun 07, 2010 3:13 am
- Location: Holland, MI
- Full name: Martin W
Re: Komodo Dragon 2 released.
Werewolf wrote: ↑Tue May 04, 2021 10:10 pm
Is it? They seem to imply that with a pv of one the improvement is small, but the difference between Dragon 1 and SF 13 is 79 elo:
http://www.cegt.net/40_40%20Rating%20Li ... liste.html

mjlef wrote: ↑Wed May 05, 2021 2:12 am
The above is a fragment of the whole sentence. Larry wrote "We believe that the MultiPV search is now more effective than the Stockfish MultiPV search, and that if MultiPV is set to more than 4 Dragon 2 is now the favorite against Stockfish 13 on typical modern hardware."
So if you set both Dragon 2 and Stockfish to use MultiPV of 4 or more, Dragon 2 will be stronger in games, which basically means you will get better analysis quicker with Dragon 2 displaying 4 or more lines. Basically we improved MultiPV a lot in Dragon 2 (and frankly the old MultiPV scheme was not very efficient). I suppose the main reason people use strong programs is for analysis, so the improvements should help.
Mark

Cornfed wrote: ↑Wed May 05, 2021 11:59 pm
If during play (or analysis) an engine chooses the move it thinks best...what does it matter that it's displaying 3,4,5 PV? Unless the extra lines displayed 'hurt' the overall evaluation of all 5 PV? I believe that is what you are saying, yes (?) - that the extra computations hurt the choice of best move for Stockfish more so than KomodoDragon?
If so, is that due to search depth of each line or something with the evaluation function?

mjlef wrote: ↑Thu May 06, 2021 2:23 pm
The main improvement is speed, so a MultiPV search now normally takes less time to reach a given depth. Note that setting MultiPV to 2 or higher can improve the move selection because sometimes the second search to get the second best move actually returns a move line with a score better than the first. Modern alpha-beta searches are highly selective and move order changes can produce better lines with additional search. So basically you get better analysis with the new Dragon MultiPV search mostly because it is faster and on average a higher depth is reached in the same search time.
Technically, I changed the old alpha-beta search scheme, which basically kept the alpha-beta window open for the first MultiPV line, to a scheme that does multiple exclusion searches, which ended up being faster.

gaard wrote: ↑Fri May 07, 2021 4:55 am
Can you show that the 2nd, 3rd, 4th, 5th, etc., PV lines are any better than those produced by Stockfish? If not then the claim that Dragon 2 is stronger than Stockfish when MultiPV >= 5 is argumentative at best.

mjlef wrote: ↑Fri May 07, 2021 11:54 pm
We are only saying that if you set both programs for a high enough MultiPV, then Dragon does better. The overall MultiPV is finding better first moves. Our existing tester does not support playing 2nd or 3rd best moves to do the experiment you propose. But we do know MultiPV is much more efficient in Dragon 2 than in earlier versions. I think a lot of people use these engines in MultiPV mode for analysis, and this should be useful to them. I do not think there will ever be a MultiPV rating list.

I understand what you're saying, which is that in head-to-head games versus Stockfish, Dragon comes out on top when MultiPV >= 5. But who uses engines in that way, and why would someone want to? Would it not be trivial for you to modify your tester to make Dragon and Stockfish play the Nth best move? Otherwise, I don't see how this is a meaningful comparison of Dragon and Stockfish.
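A minimal sketch of the "multiple exclusion searches" idea mjlef describes above: search the root for the best move, then search again with that move excluded to get the second line, and so on. The toy position, the two-ply minimax and every name here are assumptions made up for illustration; this is not Dragon's code and says nothing about how its search is actually implemented.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical toy position: each root move maps to the scores (from the
# root side's point of view) of the positions after each opponent reply.
TOY_ROOT: Dict[str, List[int]] = {
    "e4": [30, 25, 40],
    "d4": [20, 35, 28],
    "Nf3": [15, 18, 22],
    "c4": [10, 45, 12],
}


def search_root(root: Dict[str, List[int]],
                excluded: Set[str]) -> Tuple[Optional[str], Optional[int]]:
    """Two-ply minimax over the root moves that are not excluded.

    A move's score is the minimum over the opponent's replies; the best
    move is the one that maximises that minimum.
    """
    best_move: Optional[str] = None
    best_score: Optional[int] = None
    for move, reply_scores in root.items():
        if move in excluded:
            continue  # this move already has its own PV line
        score = min(reply_scores)
        if best_score is None or score > best_score:
            best_move, best_score = move, score
    return best_move, best_score


def multipv_by_exclusion(root: Dict[str, List[int]],
                         multipv: int) -> List[Tuple[str, int]]:
    """Produce up to `multipv` lines by repeated exclusion searches."""
    lines: List[Tuple[str, int]] = []
    excluded: Set[str] = set()
    for _ in range(min(multipv, len(root))):
        move, score = search_root(root, excluded)
        lines.append((move, score))
        excluded.add(move)  # the next search must ignore this move
    return lines


if __name__ == "__main__":
    for rank, (move, score) in enumerate(multipv_by_exclusion(TOY_ROOT, 3), 1):
        print(f"multipv {rank}: {move} (score {score})")
```

Because this toy search is exact, a later line can never score higher than an earlier one. The effect mjlef mentions, where the second search sometimes returns a better score than the first, comes from the selectivity of a real alpha-beta search and is not modelled here.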
-
- Posts: 511
- Joined: Sun Apr 26, 2020 11:40 pm
- Full name: Brian D. Smith
Re: Komodo Dragon 2 released.
lkaufman wrote: ↑Sat May 08, 2021 1:04 am
So far the elo gains quoted by testers are 6, 9, and 18 (using Ordo, which is "real elo"), averaging 11. But the main justification for calling it Dragon 2 is not these 11 elo points but the 100 elo point gains in both Standard Dragon using MultiPV and MCTS Dragon (windows version). For many users, these two features justify calling it a new version number. If you don't use either MultiPV or MCTS, then it's just a small upgrade.

Yes, you make a very good point. Thanks!
-
- Posts: 6259
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Komodo Dragon 2 released.
gaard wrote: ↑Sat May 08, 2021 3:29 am
I understand what you're saying, which is that in head-to-head games versus Stockfish, Dragon comes out on top when MultiPV >= 5. But who uses engines in that way, and why would someone want to? Would it not be trivial for you to modify your tester to make Dragon and Stockfish play the Nth best move? Otherwise, I don't see how this is a meaningful comparison of Dragon and Stockfish.

It could be done, but it's not "trivial". Our tester was developed by our Webmaster Jesse Gersenson, so he would have to do it, but he works full time for chess.com so I doubt he'd have the time. While I agree that just comparing the top move doesn't tell the whole story, comparing the top five is also not fair, because getting the score of the best move right is far more important for most users than getting the score of the fifth move right, especially in positions where the fifth move might just be a blunder. Ideally you might want the weighted results of the top five moves, perhaps weighted 5-4-3-2-1 or something like that. But this is getting to be a project.
Komodo rules!
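The 5-4-3-2-1 weighting suggested above could be sketched as a plain weighted average. This is only one reading of the idea: it assumes a tester that can make both engines play their Nth best move and report one match score per rank, which the thread says does not exist yet. The function name and the sample scores are invented for illustration and are not test results.

```python
from typing import Sequence


def weighted_multipv_result(scores_by_rank: Sequence[float],
                            weights: Sequence[float] = (5, 4, 3, 2, 1)) -> float:
    """Weighted average of per-rank match scores.

    scores_by_rank[0] is the match score (0.0 to 1.0) when both engines
    play their best move, scores_by_rank[1] when both play their second
    best move, and so on. Higher ranks count more.
    """
    if len(scores_by_rank) != len(weights):
        raise ValueError("need exactly one score per weight")
    total_weight = sum(weights)
    return sum(w * s for w, s in zip(weights, scores_by_rank)) / total_weight


if __name__ == "__main__":
    # Invented placeholder scores, only to show the call.
    made_up_scores = [0.55, 0.52, 0.50, 0.48, 0.45]
    print(weighted_multipv_result(made_up_scores))
```

With these weights the best move contributes a third of the total (5 out of 15) and the fifth move one fifteenth, which matches the point that getting the best move right matters most.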
-
- Posts: 6259
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Komodo Dragon 2 released.
Modern Times wrote: ↑Sat May 08, 2021 3:05 am
http://talkchess.com/forum3/viewtopic.php?f=7&t=73761
Ordo comes in for some criticism, from Daniel for example. I've yet to see a convincing argument that it is better than bayeselo although the very complex maths and statistics are beyond my knowledge. Ordo dramatically expands ratings, I hesitate to use the word "exaggerate" but that is how it feels. They are simply different, you take your pick, one is not more "real" than the other or better than the other. I certainly prefer bayeselo.

I can't agree with this, because Ordo produces the same rating differences as the normal Elo formula for a match, whereas bayeselo contracts them and gives different results depending on draw percentages. Bayeselo may have some nice properties and I won't say it is inferior to Ordo, but Ordo is basically normal Elo, and Bayeselo is something else. Now it happens that because of the contraction of ratings by Bayeselo, CCRL, which uses Bayeselo, actually is closer to predicting the rating differences engines would have against humans, but that is just a coincidence. Engine vs engine ratings are expanded compared to the ratings they would get vs. humans, and since Bayeselo contracts them, it corrects for this. But that is just a lucky byproduct of using Bayeselo for engine vs engine ratings.
Komodo rules!
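For reference, the "normal Elo formula for a match" mentioned above is the logistic relation between a match score fraction and a rating difference. The sketch below only shows that relation; the function names are my own, and it does not model how Ordo or Bayeselo fit a whole rating list or how Bayeselo treats draws.

```python
import math


def elo_diff_from_score(score: float) -> float:
    """Rating difference implied by a match score fraction (0 < score < 1)."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


def expected_score(elo_diff: float) -> float:
    """Expected score fraction for a player rated `elo_diff` above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


if __name__ == "__main__":
    # An even match implies no rating difference.
    print(elo_diff_from_score(0.5))
    # The two functions are inverses of each other.
    print(expected_score(elo_diff_from_score(0.6)))
```

A draw counts as half a point in the score fraction, so under this formula two matches with the same score but different draw rates give the same rating difference.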
-
- Posts: 463
- Joined: Mon Jun 07, 2010 3:13 am
- Location: Holland, MI
- Full name: Martin W
Re: Komodo Dragon 2 released.
lkaufman wrote: ↑Sat May 08, 2021 4:46 am
It could be done, but it's not "trivial". Our tester was developed by our Webmaster Jesse Gersenson, so he would have to do it, but he works full time for chess.com so I doubt he'd have the time. While I agree that just comparing the top move doesn't tell the whole story, comparing the top five is also not fair, because getting the score of the best move right is far more important for most users than getting the score of the fifth move right, especially in positions where the fifth move might just be a blunder. Ideally you might want the weighted results of the top five moves, perhaps weighted 5-4-3-2-1 or something like that. But this is getting to be a project.

It's a project, no doubt. However, you provided no reason for why this comparison to Stockfish is meaningful nor how it will aid in analysis for those interested in anything but the first move when MultiPV >= 5. I don't mean to be argumentative (I am already a customer), I just don't think this marketing fluff is necessary or productive. Dragon is already a top 2 engine.
-
- Posts: 1585
- Joined: Tue Jul 15, 2014 12:47 pm
Re: Komodo Dragon 2 released.
One of the worst versions I've ever seen. Gentlemen from Komodo, think about whether you know how to program, because in my opinion you do not. Better to find another job and forget about programming.
-
- Posts: 149
- Joined: Thu Nov 19, 2009 4:58 pm
- Location: College Station, Texas
Re: Komodo Dragon 2 released.
Hello, I'm sorry, but I do not see how to turn on MultiPV in the engine options for Komodo Dragon 2.
Thanks for any info.
Gerald
-
- Posts: 3410
- Joined: Sat Feb 16, 2008 7:38 am
- Full name: Peter Martan
Re: Komodo Dragon 2 released.
Not in the UCI options of most GUIs, except in analysis mode: there, most GUIs offer a special command for more output lines and principal variations, but not for game playing.
One exception is Arena, which has an option named MultiPV in the engine settings of every installed UCI engine. In the Shredder GUI you can manually add the line
MultiPV=x
to the .eng file, with x being the number of principal variations computed in analysis as well as in game-playing mode.
Pity that this doesn't work, on the other hand, in e.g. the Fritz .uci file.
Regards,
Peter.
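Whatever the GUI exposes, what it sends to the engine underneath is a plain UCI text command. A small sketch, assuming the engine offers the standard UCI option named MultiPV: it builds the setoption command and reads the multipv field back from an engine info line. The helper names and the sample info line are made up for illustration, not captured from Dragon.

```python
from typing import Dict, List, Optional, Union

InfoFields = Dict[str, Union[int, str, List[str]]]


def multipv_command(n: int) -> str:
    """UCI command that asks the engine to report n lines."""
    if n < 1:
        raise ValueError("MultiPV must be at least 1")
    return f"setoption name MultiPV value {n}"


def parse_multipv_info(line: str) -> Optional[InfoFields]:
    """Extract rank, score and moves from a UCI 'info' line.

    Returns None for lines without the multipv, score and pv fields.
    """
    tokens = line.split()
    if not tokens or tokens[0] != "info":
        return None
    if not all(key in tokens for key in ("multipv", "score", "pv")):
        return None
    score_at = tokens.index("score")
    return {
        "rank": int(tokens[tokens.index("multipv") + 1]),
        "score_type": tokens[score_at + 1],        # "cp" or "mate"
        "score_value": int(tokens[score_at + 2]),
        "moves": tokens[tokens.index("pv") + 1:],  # pv is the last field
    }


if __name__ == "__main__":
    print(multipv_command(5))
    # Made-up sample line in the shape the UCI protocol describes.
    sample = "info depth 20 multipv 2 score cp 15 nodes 123456 pv d2d4 d7d5"
    print(parse_multipv_info(sample))
```

In a GUI that hides the option during games, the command would have to be sent by the GUI itself or through its engine configuration file, as described above for Arena and Shredder.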