Fritz vs. Dragon
Moderators: hgm, Dann Corbit, Harvey Williamson
-
Fritz 0
- Posts: 144
- Joined: Fri Mar 11, 2022 12:10 pm
- Full name: Branislav Đošić
Fritz vs. Dragon
I have noticed that Dragon 2.5.1 set to level 25 reaches an average search depth of 7-8 ply in the middlegame, so I decided to run a test. I ran a match between the ancient Fritz 9 at a fixed depth of 7 ply and Dragon 2.5.1 level 25. The result is somewhat surprising to me. After 3500+ games Fritz has a +73 Elo difference (60.4% - 39.6%). If we assume that Dragon 2.5.1 level 25 is at least 2200 Elo FIDE at classical time control (2500 rapid minus 300 as a safety margin), is it possible that Fritz 9 at only 7 ply is close to 2300? It seems rather high to me. Any thoughts on this?
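For readers who want to check the numbers, the conversion from a match score to an Elo difference can be sketched as below. This is only an illustration under the standard logistic Elo model; the function names are made up for this note, and the error margin is a rough upper bound that assumes no draws (the draw ratio of the match was not reported, and draws would narrow the interval).

```python
import math

def elo_diff_from_score(score: float) -> float:
    """Elo difference implied by an average score fraction (0 < score < 1)."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))

def expected_score(elo_diff: float) -> float:
    """Expected score fraction for a player who is elo_diff points stronger."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def elo_interval(score: float, games: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% interval for the Elo difference.

    Assumption: every game is a win or a loss, which gives the largest
    possible variance, so the real interval is likely somewhat narrower.
    """
    se = math.sqrt(score * (1.0 - score) / games)
    return (elo_diff_from_score(score - z * se),
            elo_diff_from_score(score + z * se))

fritz_score = 0.604  # Fritz 9 at 7 ply vs. Dragon 2.5.1 level 25
print(round(elo_diff_from_score(fritz_score)))  # 73
print(elo_interval(fritz_score, 3500))
```

With 3500 games the interval is roughly a dozen Elo either side of +73, so the surprising result is unlikely to be a statistical accident.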
-
lkaufman
- Posts: 5942
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
Re: Fritz vs. Dragon
This is important to me as I am determined to get the Elo ratings (same as the old Skill ratings with two trailing zeroes) "right" for Dragon 3, meaning that they should be pretty even matches at 15' + 10" Rapid with humans of the same FIDE rating, at least for players of say 1600 and above, roughly the cutoff for meaningful FIDE ratings in general (lower ratings may belong to kids or others with few games and be unreliable). We have enough human data with dev version 2883 to say that it is at least roughly correct, perhaps a bit stronger than the stated Elo but not by much. The 2600 level was tested especially vs. GM Alex Lenderman in Rapid. We had to make it somewhat weaker at a specified Elo than released versions to make this so; Elo 2600 on this is between skill 25 and skill 26 on Dragon 2.5.1. So skill 25 on Dragon 2.5.1 should be around 2550 FIDE Rapid or perhaps 2300 FIDE Classical.
I don't have Fritz 9, but I used Benjamin 1.0 for a similar test to yours. Benjamin 1.0 is a modernized version by Ed Schröder of his best engine from 2001 or 2002, Rebel Century. It is somewhat stronger but not radically different; I believe it doesn't use later innovations like LMR. It is rated (on CCRL Rapid) within one Elo of Fritz 8 and 42 Elo behind Fritz 9. I ran 1000 games between Dragon 2883 and Benjamin 1.0 with both doing 7 ply, and Dragon won by 204.6 Elo. Then I ran Dragon 2.5.1 Skill 25 vs. Dragon 2883 set to 7 ply, and the result was that Skill 25 lost by 39.3 Elo. So Skill 25 on Dragon 2.5.1 should be about 165 Elo stronger than 7 ply on Benjamin 1, which would put Benjamin 1 at about 2135 FIDE Classical at 7 ply. Fritz 9, being 42 Elo stronger, should be about 2177 (though fixed depth Elo may differ; there may be more extensions in Fritz, for example). Even this seems high to me, but not by a lot; it is likely that old engines with minimal positional strength would do worse vs. humans than Dragon with an NN would, even if they are "equal" against each other. I can't really explain why your result for Fritz 9 was so drastically better than the implied result for Benjamin 1, even with the 42 Elo added, unless Fritz 9 has a lot of extensions that make it very strong at a given fixed depth. But I am reasonably certain that the Skill levels in the mid 20s in Dragon 2.5.1 are a bit stronger than intended; they would almost surely win Rapid matches (15' + 10") against equally rated human players, though by small margins.
Komodo rules!
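The chain of estimates in the post above can be written out step by step. This is just a sketch of the arithmetic as reported; the variable names are invented for this note, and it rests on the assumption (implicit in the post) that Elo differences measured in separate engine matches can simply be added and then mapped onto a human FIDE scale.

```python
# Match results reported in the post (Elo differences).
dragon2883_d7_over_benjamin_d7 = 204.6  # Dragon 2883 at 7 ply beat Benjamin 1.0 at 7 ply
dragon2883_d7_over_skill25 = 39.3       # Dragon 2883 at 7 ply beat Dragon 2.5.1 Skill 25

# Assumption: Elo differences are additive across matches.
skill25_over_benjamin_d7 = round(
    dragon2883_d7_over_benjamin_d7 - dragon2883_d7_over_skill25
)  # about 165

# The post's estimate for Dragon 2.5.1 Skill 25 at classical time control.
skill25_classical = 2300

benjamin_d7_classical = skill25_classical - skill25_over_benjamin_d7
# Engine rated 42 Elo above Benjamin 1.0 on CCRL Rapid, assuming the same gap at 7 ply.
plus_42_classical = benjamin_d7_classical + 42

print(skill25_over_benjamin_d7, benjamin_d7_classical, plus_42_classical)
# 165 2135 2177
```

Each step carries its own uncertainty (1000-game matches, a full-strength CCRL gap applied at fixed depth), so the final figure is a rough indication rather than a measurement.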
-
Fritz 0
- Posts: 144
- Joined: Fri Mar 11, 2022 12:10 pm
- Full name: Branislav Đošić
Re: Fritz vs. Dragon
Thank you Larry for your input, although I'm even more confused now. This would mean that Fritz 9 at 7 ply is well above 2300 FIDE Classical! I am sure that fixed depth Elo differs; for example, I know that Fritz 9 is significantly stronger than Fritz 12 at a fixed depth of 6-8 ply (I mean, 6 vs. 6, 7 vs. 7, 8 vs. 8), despite Fritz 12 being much stronger in full strength mode.
I will run the same test with Fritz 8, that is, Fritz 8 at 7 ply vs. Dragon 2.5.1 level 25. If the result is similar to Fritz 9's, then it's clear that the fixed depth strength ratio between two engines can be totally different from their maximum strength ratio.
-
Chessqueen
- Posts: 5481
- Joined: Wed Sep 05, 2018 2:16 am
- Location: Moving
- Full name: Jorge Picado
Re: Fritz vs. Dragon
What have you found out so far? This is a very interesting test that determines rating based on plies and NOT on engine versions.
Fritz 0 wrote: ↑Mon Mar 21, 2022 9:22 am
Thank you Larry for your input, although I'm even more confused now. This would mean that Fritz 9 at 7 ply is well above 2300 FIDE Classical! I am sure that fixed depth Elo differs; for example, I know that Fritz 9 is significantly stronger than Fritz 12 at a fixed depth of 6-8 ply (I mean, 6 vs. 6, 7 vs. 7, 8 vs. 8), despite Fritz 12 being much stronger in full strength mode.
I will run the same test with Fritz 8, that is, Fritz 8 at 7 ply vs. Dragon 2.5.1 level 25. If the result is similar to Fritz 9's, then it's clear that the fixed depth strength ratio between two engines can be totally different from their maximum strength ratio.
Forget about memorization of Opening Theories https://www.youtube.com/watch?v=DN3381sdcdY
-
Fritz 0
- Posts: 144
- Joined: Fri Mar 11, 2022 12:10 pm
- Full name: Branislav Đošić
Re: Fritz vs. Dragon
Well, the test of Fritz 8 at 7 ply vs. Dragon 2.5.1 level 25 is over. I ran more than 4000 games to make the result more reliable. Fritz 8 won by 70 Elo, so very close to Fritz 9. That means that Fritz 8 at 7 ply is definitely much stronger than Benjamin 1.0 at the same depth, although they are almost equal at maximum strength.
Regarding Larry's remark that Dragon possibly performs better against humans than older AB engines would, I think it's probably true, because I highly doubt that Fritz 9 or Fritz 8 at 7 ply is equal to a 2370 FIDE human player at classical time control. On the other hand, who knows. I cannot promise, but I will try to test it when I have enough time. A few years ago I played some 10 or 15 serious long games against Fritz 9 at 6 ply and the score was about even, maybe a small plus for Fritz. Now I can find the Dragon 2.6 Elo level that is as strong as Fritz 9 at 6 ply, play some 90'+30'' games against it, and see the result.
-
Fritz 0
- Posts: 144
- Joined: Fri Mar 11, 2022 12:10 pm
- Full name: Branislav Đošić
Re: Fritz vs. Dragon
I've done some more testing of Dragon vs. Fritz at fixed depth. Last night I ran a 5000-game match between Fritz 9 at 4 ply and Dragon 2.6 set to 2000 Elo (I couldn't use Dragon 2.6.1 because for some reason it doesn't work properly in my Fritz interface). Well, if the previous results were surprising, this one is absolutely shocking. Fritz won by 115 Elo! I have no other explanation except that the estimated Elo levels in Dragon 2.6 are too low. There's no chance that Fritz 9 at only 4 ply is 2115 Elo FIDE Rapid, which translates to almost 1900 Classical. I have played countless games against it in the past and it's simply impossible that it is even close to that. I believe it cannot be above 1600-1700 at best, since I can beat it at least 8-2 in a 10-game match if I am fully concentrated and take 1.5-2 hours per game. It's just too weak at such a low depth, both positionally and tactically. I would highly appreciate Larry's opinion on this.
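To put the two claims in this post side by side, here is a small sketch using the standard logistic Elo formula. The helper names are made up for this note, and the 8-2 figure is the poster's own estimate from long games, not a recorded match, so the output is only indicative.

```python
import math

def expected_score(elo_diff: float) -> float:
    """Expected score fraction for the side that is elo_diff points stronger."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def elo_diff_from_score(score: float) -> float:
    """Elo difference implied by a score fraction (0 < score < 1)."""
    return 400.0 * math.log10(score / (1.0 - score))

# Engine match: Fritz 9 at 4 ply finished 115 Elo above Dragon 2.6 set to 2000.
fritz_match_score = expected_score(115)   # roughly a 66% score
fritz_implied_rapid = 2000 + 115          # 2115 if the Dragon level is taken at face value

# Human estimate: winning 8-2 corresponds to an 80% score.
human_edge = elo_diff_from_score(8 / 10)  # roughly 241 Elo

print(round(fritz_match_score, 2), fritz_implied_rapid, round(human_edge))
```

An 8-2 score implies a gap of roughly 240 Elo between the human and Fritz 9 at 4 ply, which is why the poster's own experience points to a much lower rating than the 2115 suggested by the engine match.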
-
lkaufman
- Posts: 5942
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
Re: Fritz vs. Dragon
I don't have Fritz 9, but as I suggested before, it is likely that it would perform better vs. an NN engine (at low fixed depth for both) than it would against a human, compared to that same NN engine. What I would suggest is that you play some games at 15' + 10" increment against both Fritz 9 at 4 ply and Dragon 2.6 at 2000 Elo and compare your results. Please list all your relevant ratings (chess.com, lichess, ICC, FIDE, national, etc., at various time controls). It's better to test at the time control actually specified for the rating rather than at Classical, where you have to guess what the Elo drop should be. Perhaps you are not good at fast chess, I don't know, but usually the drop in strength from Classical to 15' + 10" Rapid is reasonably constant among humans; I mean it may vary by 50 Elo or so, rarely by a hundred or more.
Komodo rules!
-
Fritz 0
- Posts: 144
- Joined: Fri Mar 11, 2022 12:10 pm
- Full name: Branislav Đošić
Re: Fritz vs. Dragon
Unfortunately, it is not possible to set a time control at fixed depth in the Fritz GUI. I am around 1900 in both FIDE Classical and chess.com Rapid (15+10). What I can do is play some more 15+10 games against Dragon 2.6 at Elo 1900 and/or 2000 and share the results. So far I have played two games against Dragon 2.5.1 level 20 and the result is 1-1 (a win and a loss). After that I played one game against Dragon 2.6 at Elo 2000 and won as Black. I have always considered myself a poor player at faster time controls.