Chessqueens tests

Chessqueen
Posts: 5479
Joined: Wed Sep 05, 2018 2:16 am
Location: Moving
Full name: Jorge Picado

Chessqueens tests

Post by Chessqueen »

Thanks to you, I have lately been testing with 400 games, which should be a good small sample for predicting the outcome of 800 or 1000 games. What I have noticed so far is that if two engines are equal after the first 40 games, the score pretty much remains equal throughout, no matter whether you test them over 40, 400, or even 1000 games.

What you have to take into consideration is that some openings, or uncalibrated openings, are one-sided and benefit certain engines even when the rating difference is 200 points, like a game I saw between RubiChess and Stockfish 3 JA: no matter whether they played blitz or a longer T/C, the result was the same. Therefore, when two engines like these are equally strong, only a bad opening can cause one side to win fast. The only reason Micro-Max beat Glaurung was the bad opening, and the same happened when Stockfish 3 JA drew vs. RubiChess: forum3/viewtopic.php?f=2&t=80224. For instance, take this bad opening and use any two engines rated about 200 points apart, and the weaker engine can at least draw. If you want, give it to RubiChess vs. the latest version of Stockfish and you will see the result :roll:

1. d4 Nf6 2. c4 g6 3. g3 Bg7 4. Bg2 O-O 5. Nf3 d5 6. O-O dxc4 7. Nbd2 b5 8.
Ne5 Nd5 9. b3 Nc3 10. Qe1 Qxd4 11. Nxf7 Nd5 12. Ng5 h6 13. bxc4 bxc4 14. e3
Qe5 15. Nxc4 Qxg5 16. e4 Qh5 17. exd5 Bxa1 18. d6 exd

Rank Engine Score St Ve S-B
1 Stockfish_5_x64_modern 26.0/50 · =1==01=========================0=110====1===0=1===? 624.00
2 Velvet-v4.0.0-x86_64-avx2 24.0/50 =0==10=========================1=001====0===1=0===? · 624.00

50 of 400 games played
Tournament start: 2022.06.28, 01:58:17
Latest update: 2022.07.07, 07:55:42
Site/ Country: DESKTOP-OFQ3C0P, United States
Level: Blitz 5/3
Hardware: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz with 16.0 GB Memory
Operating system: Windows 10 Home Home Edition (Build 9200) 64 bi
Table created with: Arena 3.5.1
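
To put a number on how much a score like 26.0/50 can tell us, here is a minimal sketch that turns win/draw/loss counts into an Elo difference with an approximate 95% interval. The function names are hypothetical (they are not part of Arena), the counts are read off the result string in the table above (6 wins, 40 draws, 4 losses for Stockfish 5), and the interval uses a plain normal approximation, which is only a rough guide for small samples.

```python
import math


def elo_from_score(score: float) -> float:
    """Elo difference implied by a score fraction strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def match_summary(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (score, elo, elo_low, elo_high) from the first engine's view.

    Assumes the score and both interval ends stay strictly between 0 and 1.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, where a game yields 1, 0.5 or 0 points.
    var = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / n
    margin = z * math.sqrt(var / n)  # normal approximation of the score error
    return (
        score,
        elo_from_score(score),
        elo_from_score(score - margin),
        elo_from_score(score + margin),
    )


# Counts taken from the crosstable above: +6 =40 -4 for Stockfish 5.
score, elo, low, high = match_summary(6, 40, 4)
print(f"score {score:.3f}, Elo {elo:+.1f}, 95% interval {low:+.1f} to {high:+.1f}")
```

With these counts the interval comes out at roughly 40 Elo on either side and includes zero, so 50 games alone cannot separate the two engines. The margin shrinks with the square root of the number of games, so 400 games would narrow it to roughly a third of that if the draw ratio stays similar.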

[pgn][Event "Computer chess game"]
[Site "DESKTOP-OFQ3C0P"]
[Date "2022.07.03"]
[Round "?"]
[White "Stockfish 22050407_x64_avx2"]
[Black "RubiChess-20220223_x86-64-avx2"]
[Result "?"]
[BlackElo "3145"]
[ECO "D77"]
[Opening "Neo-Grünfeld, 6.O-O dxc4"]
[Time "17:17:34"]
[WhiteElo "3465"]
[TimeControl "900+10"]
[Termination "normal"]
[PlyCount "197"]
[WhiteType "program"]
[BlackType "program"]
1. d4 Nf6 2. c4 g6 3. g3 Bg7 4. Bg2 O-O 5. Nf3 d5 6. O-O dxc4 7. Nbd2 b5 8.
Ne5 Nd5 9. b3 Nc3 10. Qe1 Qxd4 11. Nxf7 Nd5 12. Ng5 h6 13. bxc4 bxc4 14. e3
Qe5 15. Nxc4 Qxg5 16. e4 Qh5[/pgn]
Forget about memorization of Opening Theories https://www.youtube.com/watch?v=DN3381sdcdY

Re: StockNemo 2.0.0.5 test

Post by Chessqueen »

For instance, take this bad opening and use any two engines rated about 140 points apart, and the weaker engine can at least draw. If you want, give it to RubiChess vs. the latest version of Stockfish, or vs. Komodo-12.1.1, and you will see the result :roll:

[pgn][Event "Computer chess game"]
[Site "DESKTOP-OFQ3C0P"]
[Date "2022.07.03"]
[Round "?"]
[White "Stockfish_15_x64_avx2"]
[Black "Komodo-12.1.1-64bit-bmi2"]
[Result "1/2-1/2"]
[BlackElo "3430"]
[Time "13:05:38"]
[WhiteElo "3555"]
[TimeControl "900+10"]
[Termination "adjudication"]
[PlyCount "131"]
[WhiteType "program"]
[BlackType "program"]


1. d4 Nf6 2. c4 g6 3. g3 Bg7 4. Bg2 O-O 5. Nf3 d5 6. O-O dxc4 7. Nbd2 b5 8.
Ne5 Nd5 9. b3 Nc3 10. Qe1 Qxd4 11. Nxf7 Nd5 12. Ng5 h6 13. bxc4 bxc4 14. e3
Qe5 15. Nxc4 Qxg5 16. e4 Qh5 17. exd5 Bxa1 18. d6 exd6 19. Bxa8 Bh3 20. Bg2
Bxg2 21. Qe6+ Kh7 22. Kxg2 Bg7 23. f3 Qb5 24. Ne3 Re8 25. Qf7 Qd7 26. Qc4
c5 27. Rd1 Qe6 28. Qxe6 Rxe6 29. f4 Nc6 30. Kf1 Nb4 31. a3 Nc6 32. Nd5 Re4
33. Kf2 Kg8 34. Nf6+ Bxf6 35. Rxd6 Bd4+ 36. Kf3 Re1 37. Bd2 Rf1+ 38. Ke2
Rf2+ 39. Ke1 Kf7 40. Rxd4 Rxh2 41. Rd6 Nd4 42. Bc3 Nc2+ 43. Kf1 Ne3+ 44.
Ke1 Ng4 45. Rc6 c4 46. Rxc4 Rh3 47. Rc5 Rxg3 48. f5 gxf5 49. Rxf5+ Ke6 50.
Rc5 h5 51. Kf1 h4 52. Be1 Rh3 53. Rg5 Nf6 54. Rg7 Rxa3 55. Bxh4 Nd5 56. Bd8
Ra2 57. Rh7 Ke5 58. Ke1 Kd4 59. Kd1 a5 60. Bxa5 Rxa5 61. Ke2 Ra2+ 62. Kf3
Ra3+ 63. Kf2 Nf6 64. Rh4+ Ne4+ 65. Ke2 Re3+ 66. Kf1 1/2-1/2[/pgn]
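
For context on what a draw means at these rating gaps, here is a small sketch of the standard logistic Elo expectation, applied to the WhiteElo and BlackElo tags of the game above. The function name is hypothetical, and the ratings are simply the ones given in the PGN tags; the post does not say which rating list they come from.

```python
def expected_score(elo_a: float, elo_b: float) -> float:
    """Expected score of player A against player B under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((elo_b - elo_a) / 400.0))


# Ratings as given in the PGN tags above (WhiteElo 3555, BlackElo 3430).
white_elo, black_elo = 3555, 3430
print(f"expected score for White: {expected_score(white_elo, black_elo):.3f}")

# The rating gaps mentioned in the posts, for comparison.
for gap in (140, 200):
    print(f"gap {gap}: expected score for the stronger engine {expected_score(gap, 0):.3f}")
```

Under this model the stronger engine is expected to score well above 50% at all of these gaps, so a draw in a single game is a better result than the ratings predict for the weaker side, though one game says little by itself.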