My conclusion after playing Vs Top Engines with Rook Odds


lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: My conclusion after playing Vs Top Engines with Rook Odds

Post by lkaufman »

Chessqueen wrote: Wed Jun 09, 2021 2:30 pm
lkaufman wrote: Tue Jun 08, 2021 6:40 pm It all depends on time limit, but at standard Rapid (15' + 10"), somewhere around 2000 elo is probably enough to win a rook odds match from any engine. Looks like our next official human match will be knight odds (with Armageddon rule) vs. IM Andras Toth (2377 FIDE, age 40) June 19 and 20. We have never won a (Rapid) match at knight odds from anyone over 2300 FIDE so this will be a challenge.

Hopefully Komodo Dragon 2 MCTS performs better than Komodo MCTS did back in April 2020 in the practice games vs. IM Andras Toth ==>
Well, I think he was playing the free Komodo version (10 or 11 then), no MCTS, default settings, no book, and probably running on a laptop, so vastly inferior to current Dragon MCTS on a 32 core threadripper with appropriate settings and a small book. Still, I don't know who is the favorite. Knight odds is at least a thousand elo handicap.
Komodo rules!
Chessqueen
Posts: 5580
Joined: Wed Sep 05, 2018 2:16 am
Location: Moving
Full name: Jorge Picado

Re: My conclusion after playing Vs Top Engines with Rook Odds

Post by Chessqueen »

lkaufman wrote: Wed Jun 09, 2021 5:27 pm
Well, I think he was playing the free Komodo version (10 or 11 then), no MCTS, default settings, no book, and probably running on a laptop, so vastly inferior to current Dragon MCTS on a 32 core threadripper with appropriate settings and a small book. Still, I don't know who is the favorite. Knight odds is at least a thousand elo handicap.
I just realized that, but last year you released version 11 of Komodo for free, so more likely it was 140 Elo weaker than Komodo Dragon 2. You already tested Komodo Dragon 2 at knight odds vs. Zahak 3, which is rated around 2408 by CCRL. I know that Dragon 4.6 is rated 2417 and Zahak 2408, and you tested Komodo versus Zahak, so give Dragon 4.6 the same 40-game test vs. Komodo Dragon 2.

OK, I ran 44 knight odds games (b1 and g1, ChrisW book, middle of the file) between KomodoDragon2 and Zahak3 (64 bit) on my i9 laptop, a very fast machine, which favors the program getting the handicap compared to slower hardware. It was quite a close match. Dragon won by 21 wins to 17 losses with four draws. I would say that this is a very good result for Zahak, because in previous testing, at the only marginally faster time limit of 10' + 5", Dragon performed close to 2600 on the CCRL rapid scale giving knight odds this way. I didn't see a Rapid rating for Zahak (that's probably why I missed the rating before; I must have looked at Rapid rather than Blitz); perhaps it gets better with more time?
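For readers who want to turn a match tally like this into a rating gap, here is a minimal sketch using the standard logistic Elo formula. The function name `elo_diff` is made up for illustration. Note that the reported tallies (21 wins, 17 losses, 4 draws) add up to 42 games rather than 44, so the sketch simply uses the 42 reported results, and the number it gives is the gap with the knight handicap in place, not the gap between the engines at normal chess.

```python
import math

def elo_diff(wins, losses, draws):
    """Elo difference implied by a match score (standard logistic model)."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games  # fraction of points scored
    if score <= 0.0 or score >= 1.0:
        # A 0% or 100% score has no finite Elo equivalent.
        raise ValueError("Elo difference is unbounded for a shutout")
    return -400.0 * math.log10(1.0 / score - 1.0)

# Tallies as reported in the post (they sum to 42 games).
diff = elo_diff(21, 17, 4)
print(f"Elo difference at knight odds: {diff:+.0f}")
```

With these numbers the handicapped Dragon comes out roughly 33 Elo ahead, which is well inside the error margin for so few games.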


Dragon 4.6

Rank  Engine                   Score    Dr             Za             S-B
1     Dragon 4.6               10.0/10  · ·· ·· ·· ··  1111111111     0.00
2     Zahak-windows-amd64-3.0  0.0/10   0000000000     · ·· ·· ·· ··  0.00


10 games played / Tournament is finished

Tournament start: 2021.06.01, 23:27:40
Latest update: 2021.06.09, 17:05:19
Site/ Country: MININT-UB2PIMJ, United States
Level: Tournament Game in 5 Minutes
Hardware: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz with 15.9 GB Memory
Operating system: Windows 10 Enterprise Professional (Build 9200) 64 bit
PGN-File: C:\Program Files (x86)\Arena\Arena.pgn
Table created with: Arena 3.5.1
Do NOT worry and be happy, we all live a short life :roll:
lkaufman

Re: My conclusion after playing Vs Top Engines with Rook Odds

Post by lkaufman »

Chessqueen wrote: Thu Jun 10, 2021 12:15 am
I just realized that, but last year you released version 11 of Komodo for free, so more likely it was 140 Elo weaker than Komodo Dragon 2. You already tested Komodo Dragon 2 at knight odds vs. Zahak 3, which is rated around 2408 by CCRL. I know that Dragon 4.6 is rated 2417 and Zahak 2408, and you tested Komodo versus Zahak, so give Dragon 4.6 the same 40-game test vs. Komodo Dragon 2.

A ten to zero result for equal rated engines is statistically almost impossible; it probably means something is wrong, either a bad copy of Zahak, or perhaps it time forfeits in games without increment, or something set wrong by the GUI. I downloaded that old Dragon 4.6 but I get an error when I try to run it. But against several engines rated way above both of these, Dragon 2 performed about 2600 on CCRL scale giving knight odds at 10 min plus 5 sec. Remember this is just single thread Dragon 2; we'll have 32 cores for the human match. But humans just play much better given knight odds than engines, so there's no real way to predict the result from engine testing.
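To put a rough number on "statistically almost impossible", here is a small sketch of the arithmetic. It assumes the two engines are exactly equal and that games are independent; the `draw_rate` parameter and the 30% value are illustrative assumptions, not measured figures for these engines.

```python
def shutout_probability(games, draw_rate=0.0):
    """Chance that one given engine wins every game of a match
    between two equal engines, assuming independent games."""
    win_prob = (1.0 - draw_rate) / 2.0  # equal engines split the decisive games
    return win_prob ** games

# No draws at all: the most favorable case for a shutout.
print(shutout_probability(10))        # 0.5 ** 10, i.e. 1 in 1024
# With an assumed 30% draw rate the chance is far smaller still.
print(shutout_probability(10, 0.3))
```

Even in the no-draw case a 10-0 sweep by a given engine is about a one-in-a-thousand event, so a setup problem is the more plausible explanation.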
amanjpro
Posts: 883
Joined: Sat Mar 13, 2021 1:47 am
Full name: Amanj Sherwany

Re: My conclusion after playing Vs Top Engines with Rook Odds

Post by amanjpro »

lkaufman wrote: Thu Jun 10, 2021 3:23 am
A ten to zero result for equal rated engines is statistically almost impossible; it probably means something is wrong, either a bad copy of Zahak, or perhaps it time forfeits in games without increment, or something set wrong by the GUI. I downloaded that old Dragon 4.6 but I get an error when I try to run it. But against several engines rated way above both of these, Dragon 2 performed about 2600 on CCRL scale giving knight odds at 10 min plus 5 sec. Remember this is just single thread Dragon 2; we'll have 32 cores for the human match. But humans just play much better given knight odds than engines, so there's no real way to predict the result from engine testing.

Zahak's default hash size is rather small (10 MB), maybe that could be a reason?
lkaufman

Re: My conclusion after playing Vs Top Engines with Rook Odds

Post by lkaufman »

amanjpro wrote: Thu Jun 10, 2021 3:33 am
Zahak's default hash size is rather small (10 MB), maybe that could be a reason?
Doubling of hash is generally worth something like 6 to 8 elo I think, so this might account for 20 or 25 elo, but hardly a 10 to 0 shutout. There must be another issue on top of this.
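The estimate above can be sketched as a quick calculation. The 10 MB default comes from the thread and the 6 to 8 Elo per doubling is the rule of thumb quoted above; the 128 MB comparison size and the function name `hash_elo_gain` are assumptions for illustration only, since the thread does not say what hash size the other engine used.

```python
import math

def hash_elo_gain(default_mb, target_mb, elo_per_doubling):
    """Rough Elo gain from enlarging the hash table, assuming a
    constant gain per doubling (a rule of thumb, not a measurement)."""
    doublings = math.log2(target_mb / default_mb)
    return doublings * elo_per_doubling

# 10 MB is Zahak's default per the thread; 128 MB is a hypothetical target.
low = hash_elo_gain(10, 128, 6)
high = hash_elo_gain(10, 128, 8)
print(f"Estimated gain: {low:.0f} to {high:.0f} Elo")
```

Under these assumptions the gain lands in the low-to-high twenties, the same ballpark as the 20 to 25 Elo mentioned above and nowhere near enough to explain a shutout.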
Chessqueen

Re: My conclusion after playing Vs Top Engines with Rook Odds

Post by Chessqueen »

lkaufman wrote: Thu Jun 10, 2021 4:06 am
Doubling of hash is generally worth something like 6 to 8 elo I think, so this might account for 20 or 25 elo, but hardly a 10 to 0 shutout. There must be another issue on top of this.
I posted this but did NOT realize it was a shutout; I did not even pay attention. I will run another tournament match, and probably other people here will run another short 10-game match to see and compare the results.
Chessqueen

Re: My conclusion after playing Vs Top Engines with Rook Odds

Post by Chessqueen »

Chessqueen wrote: Thu Jun 10, 2021 5:18 am
I posted this but did NOT realize it was a shutout; I did not even pay attention. I will run another tournament match, and probably other people here will run another short 10-game match to see and compare the results.
:roll: