Thanks again for all the fantastic testing and information you are sharing!
pohl4711 wrote: ↑Mon May 24, 2021 10:28 am
I did a huge, experimental round-robin tournament with 3 engines (Stockfish 13, KomodoDragon 1.0 and KomodoDragon 2.0), each with 4 different MultiPV settings (1-4, where 1 is the normal, default playing mode). Goal: to measure how much Elo is lost by calculating more than one PV line, whether Dragon 2.0 loses less Elo here than Dragon 1.0, and whether Dragon 2.0 is on the same Elo level as Stockfish 13 when both engines are running with MultiPV=4.
Tournament with 2'+1'' thinking time on an old quad-core CPU, single-thread mode, no ponder, no bases (except for cutechess-cli, to end a game), cutechess-cli, classical 4-move-deep openings played by humans (out of Megabase 2020, both players 2400 Elo or better). 100 rounds = 6600 games played.
See the results (and download the games) on my website:
https://www.sp-cc.de/experiments.htm
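The 6600-game total can be reproduced with a quick count. This is only a sketch under two assumptions that the post does not spell out: every engine/MultiPV combination counts as its own participant, and every pairing plays one game per round.

```python
from itertools import combinations

# Engines and MultiPV settings as listed in the post.
engines = ["Stockfish 13", "KomodoDragon 1.0", "KomodoDragon 2.0"]
multipv_settings = [1, 2, 3, 4]
rounds = 100

# Assumption: each (engine, MultiPV) combination is a separate participant.
participants = [(engine, multipv) for engine in engines for multipv in multipv_settings]

# Assumption: round robin with one game per pairing per round.
pairings = list(combinations(participants, 2))
total_games = len(pairings) * rounds

print(len(participants), len(pairings), total_games)
```

With 12 participants there are 66 pairings, and 66 x 100 rounds gives the 6600 games mentioned in the post.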
Komodo Dragon 2 released.
Moderators: hgm, Rebel, chrisw
-
- Posts: 253
- Joined: Mon Nov 16, 2020 12:13 pm
- Full name: Manuel Rivera
Re: Komodo Dragon 2 released.
Raspberry Pi4 bot : https://lichess.org/@/BetterAnalyze
-
- Posts: 2460
- Joined: Sat Sep 03, 2011 7:25 am
- Location: Berlin, Germany
- Full name: Stefan Pohl
Re: Komodo Dragon 2 released.
My pleasure. To be honest, this idea was not mine. I was asked about MultiPV testing through the contact function of my website... After sleeping on it, I thought this was an interesting request, because a lot of people use engines in MultiPV mode for analyzing human games. And I could not find valid testing results for this engine mode. So I did it myself.
Interesting fact: in my test tournament, KomodoDragon 2.0 loses only around 30 Elo for each +1 increase in MultiPV, while SF 13 loses around 60 Elo, so if MultiPV is set to 5-6, KomodoDragon 2.0 should be at the same Elo level as SF 13. But in reality, I believe most people use engines for analyzing with MultiPV 2, 3 or 4. Nevertheless, the small loss of only around 30 Elo per step for KomodoDragon 2.0 is quite impressive.
-
- Posts: 511
- Joined: Sun Apr 26, 2020 11:40 pm
- Full name: Brian D. Smith
Re: Komodo Dragon 2 released.
Nice to see actual 'flesh on the bone' of the general idea that KomDragon loses less Elo the greater the MultiPV setting!
So, is it fair to say
SF 13, 4pv = 3544
KDragon 2, 4pv = 3499
is the break, and that maybe 5pv and greater leads to equality or a slight edge for KDragon 2? I mean, you did not try 5pv, but one might think that is the case given the trendline.
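The trendline question can be checked with simple arithmetic. The sketch below uses only the figures quoted in this thread (the two MultiPV=4 ratings and the rough per-step losses of 60 and 30 Elo) and assumes the loss stays linear beyond MultiPV=4, which was not tested.

```python
# Figures quoted in the thread: ratings at MultiPV=4 and the
# approximate Elo loss for each +1 increase in MultiPV.
sf13_at_4 = 3544
dragon2_at_4 = 3499
sf13_loss_per_step = 60
dragon2_loss_per_step = 30


def projected_elo(elo_at_4, loss_per_step, multipv):
    """Linear extrapolation from the MultiPV=4 rating (an assumption, not a measurement)."""
    return elo_at_4 - loss_per_step * (multipv - 4)


# The gap at MultiPV=4 shrinks by the difference in loss per step.
gap_at_4 = sf13_at_4 - dragon2_at_4
gap_closed_per_step = sf13_loss_per_step - dragon2_loss_per_step
crossover_multipv = 4 + gap_at_4 / gap_closed_per_step

print("projected crossover at MultiPV =", crossover_multipv)
for multipv in (4, 5, 6):
    print(
        multipv,
        projected_elo(sf13_at_4, sf13_loss_per_step, multipv),
        projected_elo(dragon2_at_4, dragon2_loss_per_step, multipv),
    )
```

Under that linear assumption the two lines cross at MultiPV 5.5: SF 13 would still be slightly ahead at 5 (3484 vs 3469) and slightly behind at 6 (3424 vs 3439), which fits the "5-6" estimate given earlier in the thread.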
-
- Posts: 5966
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
Re: Komodo Dragon 2 released.
Our own testing had them dead even with MultiPV=5, but I tested on four threads, while this test is single-threaded. I don't know which program this would favor, but any difference in results would probably be due to this difference in the testing.
Komodo rules!
-
- Posts: 450
- Joined: Mon Jun 07, 2010 3:13 am
- Location: Holland, MI
- Full name: Martin W
Re: Komodo Dragon 2 released.
I think a more useful metric would be comparing (say) the 5th best moves, when MultiPV = 5. Or at least, min(4, Number of Legal Moves), in base 0. Otherwise, it is quite trivial to make Stockfish search MultiPV moves 2+ with reduced depth, to produce a first move that is better. It's that fact which makes this current metric suspect, or at least, less than convincing.
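To make the "compare the N-th best move" idea concrete, here is a minimal sketch that picks the N-th ranked move out of UCI `info` lines, falling back to the last reported line when fewer than N lines exist. The function name `nth_best_move` and the sample lines are hypothetical; only the `multipv` and `pv` fields of the UCI `info` output are assumed.

```python
import re


def nth_best_move(info_lines, n=5):
    """Return the first move of the n-th ranked PV line.

    Falls back to the lowest-ranked line available when fewer than n
    lines were reported (for example, few legal moves). Returns None
    when no MultiPV line is found at all.
    """
    best_by_rank = {}
    for line in info_lines:
        # UCI info lines carry the rank after "multipv" and the line after "pv".
        match = re.search(r"\bmultipv (\d+)\b.*?\bpv (\S+)", line)
        if match:
            # Later lines overwrite earlier ones, so the deepest result wins.
            best_by_rank[int(match.group(1))] = match.group(2)
    if not best_by_rank:
        return None
    ranks = sorted(best_by_rank)
    return best_by_rank[ranks[min(n, len(ranks)) - 1]]


# Hypothetical engine output, for illustration only.
sample = [
    "info depth 10 multipv 1 score cp 30 pv e2e4 e7e5",
    "info depth 10 multipv 2 score cp 25 pv d2d4 d7d5",
    "info depth 10 multipv 3 score cp 20 pv g1f3 d7d5",
]
print(nth_best_move(sample, n=2))
print(nth_best_move(sample, n=5))
```

A test built on this would play the N-th ranked move instead of the top move, so an engine could not gain from searching the lower-ranked lines at reduced depth.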
-
- Posts: 47
- Joined: Sun May 23, 2021 6:04 pm
- Full name: Jacques Ress
Re: Komodo Dragon 2 released.
the_real_greco wrote: ↑Mon May 17, 2021 3:22 am
Krzysztof Grzelak wrote: ↑Sun May 16, 2021 11:09 pm
If not you understand what I am writing, about it I am sorry.
There are people out there (including me and you) who have interest in engine tournaments or correspondence chess. We only get excited when an engine is stronger compared to other engines. But we are the minority.
For everybody else, it's more important to have an engine that is easier to use. Or can come up with new ideas. Or something. Whether Dragon 2 was +100 or +0 Elo against Dragon 1, doesn't really matter.
As a correspondence player, I don't look for the best engine in the ultra-fast time control engine rating lists!
Even if Stockfish and some derivatives seem the best in many situations, they are not the best in some others, where Leela and, why not, Dragon can be better for a good reason: they are different and not derivatives...
I don't think I am alone in thinking that?
Regards
Jacques
IM ICCF player
-
- Posts: 5624
- Joined: Wed Sep 05, 2018 2:16 am
- Location: Moving
- Full name: Jorge Picado
Re: Komodo Dragon 2 released.
According to Leonardo Ljubicic, the 28th World Champion in Correspondence Chess: "As much as my time allows I try to follow the latest in chess engine development. From what I’ve seen, the best engines of today are Komodo and Stockfish. Both have their virtues and ... well, almost no weaknesses. Stockfish calculates variations fast, and excels in tactics and attacking, while Komodo is solid in style, and its positional play is second to none. They are very close in strength continuously developed further. Both are an excellent choice for serious correspondence chess"
https://en.chessbase.com/post/better-th ... ubicic-1-2
You would probably like to play a game similar to the one Chrisw is playing at knight odds versus Dragon2 MCTS, and see how you do against the Dragon Beast.
[pgn] [Event "knight odds match"]
[Site "?"]
[Date "2021.05.24"]
[Round "1"]
[White "KomodoDragon2, MCTS."]
[Black "Whittington, Chris"]
[Result "*"]
[Annotator "Kaufman,Larry"]
[SetUp "1"]
[FEN "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/R1BQKBNR w KQkq - 0 1"]
[PlyCount "39"]
[EventDate "2021.05.20"]
[EventType "match"]
1. e4 Nf6 2. e5 Nd5 3. d4 d6 4. Nf3 Bg4 5. Be2 Nc6 6. c4 Nb6 7. e6 fxe6 8. O-O
g6 9. a4 a5 10. d5 Bxf3 11. Bxf3 exd5 12. cxd5 Ne5 13. Be4 Bg7 14. f4 Ned7 15.
f5 O-O 16. Ra3 Ne5 17. Rg3 Qe8 18. b3 c6 19. dxc6 bxc6 20. Rf2 e6 [/pgn]
Who is 17 years old GM Gukesh 2nd at the Candidate in Toronto?
https://indianexpress.com/article/sport ... t-9281394/
-
- Posts: 47
- Joined: Sun May 23, 2021 6:04 pm
- Full name: Jacques Ress
Re: Komodo Dragon 2 released.
I agree with Leonardo of course, except for one thing: since the NNUE versions, Stockfish is really stronger!
I would say Leela is perhaps more useful in some positions than Dragon now (see the TCEC superfinal, where the time control and hardware are interesting even for an ICCF player, to help evaluate engines a little)?
For the moment, I cannot say whether Dragon is better with MultiPV, but work on MultiPV is a very good thing for us (engines are sometimes "closed" to one line which is not always the best, and MultiPV can help in this case, for example).
IM ICCF player
-
- Posts: 47
- Joined: Sun May 23, 2021 6:04 pm
- Full name: Jacques Ress
Re: Komodo Dragon 2 released.
Note: Here is a well-known position to illustrate:
1rb1qrk1/2b2pp1/p3pBn1/3pP1Pp/1ppP4/2P1QN2/PP3P1P/R2BR1K1 w - - 0 1
Only Lc0 can find it quickly, because it simply occurred in one of its games! Stockfish sees nothing if you don't help it, and the same goes for Dragon...
I found only one derivative which finds the good move at low depth (27, or 30 at most), but I always use MultiPV???
IM ICCF player
-
- Posts: 511
- Joined: Sun Apr 26, 2020 11:40 pm
- Full name: Brian D. Smith
Re: Komodo Dragon 2 released.
And yet...it is not the 'best engine'!
People do like the 'exceptions that prove the rule'...