Over the past couple of days I've been experimenting with many different ideas to increase Blunder's strength, and so far only two have held up: knight outposts and simple dynamic time management. Both have contributed about 30 Elo in gauntlet testing, which is definitely nice to see. It's especially nice to see Blunder finally move beyond its old strictly linear time management. It was pretty frustrating to watch earlier versions in situations where, if they had just been a little more patient and spent a bit longer at a critical point in the game, they would've found the best move.
While trying to make progress for the next release, one interesting thing I've noticed while developing Blunder is that I've found much more Elo in improving the search than in improving the evaluation. Or, put a bit more accurately, I've had an easier time finding Elo gains in the search than in the evaluation.
Of course, Blunder's search is far, far from perfect, but when comparing it with say, version 5.0.0, the latest version can often search more than twice as deep and often spots winning tactics long before 5.0.0 can spot them. So I'm quite happy with the progress I've made on the search. But the evaluation has proven to be a much more fickle beast, at least for me.
Blunder's evaluation has also definitely improved from previous versions, and positionally it's a much, much better player. But overall I've found it much more difficult to implement meaningful improvements in the evaluation (beyond obvious ones like PST, tapered eval, etc.). Basic mobility took a long time to get working right, and it wasn't a win until I Texel-tuned all of the evaluation parameters from scratch. And in recent versions, pawn structure and king safety were minimal gains (16 and 14 Elo in the end respectively, if I recall correctly). And while I can tell Blunder's positional play is improving (it'll sometimes opt to double its opponent's pawns, and it'll start placing its knights and lining up its bishops against the enemy king in anticipation of a chance to attack), I can also tell the evaluation is far from optimized.
Essentially, most of Blunder's 2400 rating comes from it being quite good at spotting winning tactics, and its positional play is good enough to get it to the point where it can often spot these tactics. However, it still bothers me to have an engine that tactically plays at a very high level but positionally plays quite weakly, as I demonstrated in another thread: forum3/viewtopic.php?f=7&t=78724&p=912835#p912835. So I'll likely be putting some effort into improving the evaluation for the next release of Blunder.
Nevertheless, I think others are right when they mention how you can't focus too much on getting the most Elo possible from X feature, and it's important to keep making the engine a little smarter and a little better with every addition to the evaluation. And with my engine now having a texel tuner, I can also re-tune the evaluation from scratch every now and then to "sync" up all the evaluation parameters and hopefully gain some strength, as I'm currently doing. For this current tuning process, I've added a couple of other eval terms that I consider important, and have tweaked some details so hopefully, the evaluation overall is more granular and accurate, especially with determining king-safety.
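For readers unfamiliar with the method, here is a minimal sketch of the textbook Texel tuning loop: map each evaluation to an expected score with a sigmoid, measure the mean squared error against the game results, and nudge each parameter by one step, keeping only changes that lower the error. Everything here (the linear `evaluate`, the feature layout, the value of `K`) is an illustrative assumption, not Blunder's actual tuner.

```python
K = 1.0  # scaling constant, normally fitted to the dataset first (assumed value)

def sigmoid(score_cp, k=K):
    """Map a centipawn score to an expected game result in [0, 1]."""
    return 1.0 / (1.0 + 10.0 ** (-k * score_cp / 400.0))

def evaluate(features, weights):
    """Hypothetical linear evaluation: dot product of features and weights."""
    return sum(f * w for f, w in zip(features, weights))

def mean_squared_error(dataset, weights):
    """dataset is a list of (features, result) with result 1.0, 0.5 or 0.0."""
    total = 0.0
    for features, result in dataset:
        total += (result - sigmoid(evaluate(features, weights))) ** 2
    return total / len(dataset)

def tune(dataset, weights, max_iterations=100):
    """Local search: try +1 and -1 on every weight, keep what lowers the error."""
    weights = list(weights)
    best = mean_squared_error(dataset, weights)
    for _ in range(max_iterations):
        improved = False
        for i in range(len(weights)):
            for step in (1, -1):
                weights[i] += step
                err = mean_squared_error(dataset, weights)
                if err < best:
                    best = err
                    improved = True
                    break
                weights[i] -= step  # undo a step that did not help
        if not improved:
            break  # one full pass without improvement: local minimum reached
    return weights, best
```

Because every pass re-evaluates the whole dataset once per parameter, the quality and size of the position set dominate both the runtime and the result, which is why the choice of dataset matters so much.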
If this current tuning process proves to be a bust, I'll likely look into getting a new dataset for Blunder, as the only one I've been able to see consistent improvements with is the one by Zurichess released several years ago. Yesterday I tried tuning everything from scratch with a dataset from Ethereal, and while the positional play seems much better, it proved to be a 30-40 Elo regression in strength. I've also still tried extracting positions from self-play and grandmaster games, which have had similar issues. So I'll probably do some experimenting to discover the best criteria for extracting quiet positions, as I think the theory works out, so the only issue is likely the positions I'm choosing are bad.
One last thing I've realized is that the search and evaluation likely conflict whenever I change the evaluation, which would make sense. Blunder's search contains several pruning techniques that rely on the static evaluation, so adding something like king safety makes the engine smarter, but the gain can be negated because the pruning techniques, whose margins were tuned for the old evaluation, become inaccurate and/or ineffective. So I'll probably also look at places where I can tweak search parameters (perhaps more aggressive margins, since Blunder now better understands the true value of a position, so more pruning can be achieved).
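To make the coupling concrete, here is a small sketch of one common technique of this kind, reverse futility pruning (also called static null-move pruning). The function name, the margin of 85 centipawns per ply and the depth limit are hypothetical values chosen for illustration; which techniques and margins Blunder actually uses is not stated here.

```python
def can_prune_reverse_futility(static_eval, beta, depth,
                               margin_per_ply=85, max_depth=6,
                               in_check=False, is_pv_node=False):
    """Return True if the node may be cut off without searching any moves.

    The idea: if the static evaluation is so far above beta that even after
    subtracting a safety margin it still fails high, assume the search would
    fail high too. The margin is in the same units as the evaluation, so any
    change to the evaluation's scale or accuracy changes what a 'safe'
    margin is. All default values are assumptions for this sketch.
    """
    if in_check or is_pv_node or depth > max_depth:
        return False
    return static_eval - margin_per_ply * depth >= beta
```

With these numbers, a static evaluation of 300 against a beta of 100 prunes at depth 2 (300 - 170 = 130) but not at depth 3 (300 - 255 = 45). Lowering the margin to 60 makes depth 3 prune as well, which is the kind of "more aggressive margin" that only pays off if the evaluation can be trusted.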
P.S: Here was an interesting game played between Blunder and Barbarossa. I thought it was particularly fun to see Blunder recognize the winning strategy of creating a little "fortress" in the corner of the board to essentially "trap" and neutralize Barbarossa's king in the endgame, leaving it free to walk and promote its passed pawn:
[pgn]
[Event "?"]
[Site "?"]
[Date "2021.12.06"]
[Round "?"]
[White "Barbarossa 0.6.0"]
[Black "Blunder dev"]
[Result "0-1"]
[ECO "C43"]
[GameDuration "00:57:42"]
[GameEndTime "2021-12-06T08:57:16.514 Central Standard Time"]
[GameStartTime "2021-12-06T07:59:33.882 Central Standard Time"]
[Opening "Petrov"]
[PlyCount "200"]
[TimeControl "40/600+2"]
[Variation "Modern attack, Symmetrical Variation"]
1. e4 {+0.40/15 16s} e5 {-0.27/18 16s} 2. Nf3 {+0.36/14 4.8s} Nf6 {-0.30/17 16s}
3. d4 {+0.32/11 13s} Nxe4 {-0.18/17 16s} 4. Bd3 {+0.32/15 7.1s}
d5 {-0.25/17 16s} 5. Nxe5 {+0.36/16 57s} Bd6 {-0.39/16 16s}
6. O-O {+0.44/13 44s} O-O {-0.38/17 16s} 7. Nd2 {+0.16/15 90s} f5 {-0.33/18 16s}
8. c4 {+0.52/12 6.3s} Bxe5 {-0.15/19 16s} 9. dxe5 {+0.84/16 43s}
Nc6 {-0.15/17 16s} 10. Nf3 {+0.88/14 6.2s} Be6 {-0.15/19 16s}
11. Qe2 {+0.80/14 6.3s} Nb4 {-0.21/19 16s} 12. Rd1 {+1.00/16 11s}
Nxd3 {+0.02/17 16s} 13. Rxd3 {+0.84/17 29s} c6 {-0.10/18 16s}
14. Bf4 {+0.72/15 43s} Qe7 {-0.05/16 16s} 15. cxd5 {+0.80/16 30s}
Bxd5 {-0.04/17 16s} 16. Rc1 {+0.92/12 11s} Rad8 {-0.06/18 17s}
17. b3 {+0.88/14 31s} c5 {-0.14/15 17s} 18. Rcd1 {+0.88/12 19s}
Bc6 {-0.01/18 17s} 19. Rxd8 {+0.60/14 37s} Rxd8 {+0.03/17 17s}
20. Rxd8+ {+0.44/11 17s} Qxd8 {+0.11/16 17s} 21. h3 {+0.64/11 3.6s}
g5 {+0.13/16 17s} 22. Bc1 {+0.36/14 28s} Qd5 {+0.09/15 17s}
23. Bb2 {+0.28/14 12s} b5 {+0.20/17 17s} 24. Qe1 {+0.36/11 9.6s}
b4 {+0.41/18 17s} 25. Qe2 {+0.16/12 13s} Kf8 {+0.24/16 17s}
26. e6 {+0.28/11 10.0s} g4 {+0.07/17 17s} 27. hxg4 {-0.04/16 16s}
fxg4 {+0.14/19 17s} 28. Ne1 {+0.24/15 9.6s} Qxe6 {+0.59/19 17s}
29. Qe3 {+0.16/15 9.6s} Ke8 {+0.51/16 17s} 30. Nd3 {+0.20/10 2.4s}
Qf5 {+0.30/15 17s} 31. Be5 {+0.64/14 7.4s} Kd8 {+0.23/14 17s}
32. Bb8 {+0.64/13 5.9s} a6 {0.00/16 18s} 33. Qh6 {+0.88/12 6.5s}
Bb5 {-0.17/18 18s} 34. Nf4 {+0.72/12 6.9s} Qf7 {-0.17/16 18s}
35. Be5 {+0.88/12 1.6s} Ke8 {-0.09/16 18s} 36. Qb6 {+1.00/12 4.8s}
Qe7 {-0.09/19 18s} 37. Bc7 {+0.68/13 3.4s} Kf7 {+0.34/15 18s}
38. Nh5 {+0.24/13 4.6s} Qe6 {+0.51/15 19s} 39. Qb7 {+0.24/11 2.2s}
Nf6 {+0.38/16 19s} 40. Nxf6 {+0.40/13 1.6s} Kxf6 {+0.34/18 19s}
41. Qb8 {+0.28/15 41s} Bc6 {+0.34/16 16s} 42. Qf8+ {+0.36/15 47s}
Kg6 {+0.28/17 16s} 43. Bf4 {+0.36/15 13s} h5 {+0.33/18 16s}
44. Qh6+ {+1.08/14 7.1s} Kf5 {+0.03/20 16s} 45. Qg5+ {+1.04/14 58s}
Ke4 {+0.03/19 16s} 46. Bc1 {+1.04/12 23s} Qf5 {+0.17/18 16s}
47. Qh6 {+1.00/12 64s} Bb5 {-0.19/17 16s} 48. a4 {+0.80/13 66s}
bxa3 {+0.29/20 16s} 49. Bxa3 {+0.36/15 59s} Kd5 {+0.29/18 16s}
50. Qb6 {+0.28/14 48s} Kd4 {+0.35/18 16s} 51. Bb4 {0.00/15 40s}
Qe5 {+0.50/17 16s} 52. Ba5 {+0.60/12 6.7s} h4 {+0.39/17 16s}
53. Qd8+ {+0.60/14 3.4s} Ke4 {+0.33/17 16s} 54. Bc7 {+0.60/14 21s}
Qg7 {+0.33/18 16s} 55. Qd6 {+0.60/15 23s} Qg5 {+0.37/18 17s}
56. Qd1 {+0.72/12 3.5s} Bd3 {+0.34/15 17s} 57. g3 {+0.88/11 6.4s}
h3 {+0.50/17 17s} 58. Qe1+ {-0.32/13 24s} Kd4 {+1.10/17 17s}
59. Qa1+ {0.00/14 3.7s} Kd5 {+1.24/17 17s} 60. Bf4 {-0.36/15 20s}
Qe7 {+1.40/19 17s} 61. Qa5 {-0.68/16 16s} Bb5 {+1.66/16 17s}
62. Kh2 {-0.88/10 4.2s} Kc6 {+1.62/16 17s} 63. Qd2 {-1.04/11 5.5s}
Kb6 {+1.53/17 17s} 64. b4 {-0.52/14 8.2s} cxb4 {+1.81/18 17s}
65. Be3+ {-0.52/14 7.4s} Kb7 {+1.73/17 17s} 66. Qd4 {-0.56/15 6.4s}
Qe6 {+1.11/17 17s} 67. Qa7+ {-0.64/16 4.5s} Kc8 {+1.11/15 17s}
68. Qc5+ {-1.60/13 9.2s} Bc6 {+1.93/17 17s} 69. Qxb4 {-1.52/14 1.8s}
Qe4 {+2.14/17 17s} 70. Qf8+ {-1.72/16 3.3s} Be8 {+2.13/16 17s}
71. Qc5+ {-1.80/16 1.8s} Qc6 {+1.94/17 17s} 72. f3 {-1.92/20 3.2s}
Qxc5 {+3.00/27 18s} 73. Bxc5 {-2.04/16 3.0s} Bd7 {+3.55/29 18s}
74. fxg4 {-2.80/20 8.0s} Bxg4 {+3.85/28 18s} 75. Bd4 {-3.24/18 6.4s}
Kb7 {+4.01/43 18s} 76. Bf6 {-3.76/17 5.2s} a5 {+4.60/35 18s}
77. Bc3 {-3.84/16 2.5s} Kb6 {+3.94/34 18s} 78. Bf6 {-4.68/20 1.8s}
Kb5 {+5.19/38 19s} 79. Bd8 {-5.28/20 0.99s} Kb4 {+5.54/44 19s}
80. Kh1 {-5.04/19 2.7s} a4 {+4.69/34 19s} 81. Kh2 {-5.56/20 0.78s}
a3 {+4.92/29 16s} 82. Kh1 {-5.24/20 58s} Kb3 {+5.16/36 16s}
83. Kh2 {-5.24/20 1.3s} a2 {+10.37/35 16s} 84. Bf6 {-6.16/20 2.9s}
Be6 {+10.37/33 16s} 85. Kh1 {-5.60/20 7.9s} Kc2 {+8.82/31 16s}
86. Be5 {-7.16/20 1.1s} Kb1 {+10.62/42 16s} 87. Bc3 {-12.92/20 2.7s}
a1=R {+10.72/39 16s} 88. Bxa1 {-12.96/20 1.6s} Kxa1 {+10.72/37 16s}
89. g4 {-13.08/20 1.5s} Bxg4 {+10.64/32 16s} 90. Kh2 {-16.56/20 1.0s}
Kb2 {+10.78/27 16s} 91. Kg3 {-13.20/19 55s} Be6 {+M31/24 16s}
92. Kf2 {-13.12/19 28s} Kc3 {+M27/25 16s} 93. Kg3 {-16.60/17 92s}
Kd4 {+M25/24 16s} 94. Kf2 {-16.60/15 55s} Ke4 {+13.92/25 16s}
95. Kg1 {-16.64/14 50s} Ke3 {+M13/26 16s} 96. Kh1 {-M16/14 57s} Kf2 {+M9/27 17s}
97. Kh2 {-7.24/4 0.001s} Bg4 {+M7/28 17s} 98. Kh1 {-7.88/4 0.001s}
Kg3 {+M5/26 17s} 99. Kg1 {-M4/4 0.001s} h2+ {+M3/25 17s} 100. Kh1 {-M2/20 1.6s}
Bf3# {+M1/71 17s, Black mates} 0-1
[/pgn]
Progress on Blunder
Moderator: Ras
-
Rebel
- Posts: 7438
- Joined: Thu Aug 18, 2011 12:04 pm
- Full name: Ed Schröder
Re: Progress on Blunder
I will give it a third shot as soon as I have a PC free, but both times the engine that caused the disconnection was Blunder. I will remove Loki from the participation list.
BTW Guenther, I posted the gambit positions 4-5 months ago, you can get them here.
90% of coding is debugging, the other 10% is writing bugs.
-
algerbrex
- Posts: 608
- Joined: Sun May 30, 2021 5:03 am
- Location: United States
- Full name: Christian Dean
Re: Progress on Blunder
Thanks. At this point, I'm not really sure if this is a bug with Blunder or not. It sounds like it is, but I'll wait until this last test before I do any premature bug hunting.
-
Guenther
- Posts: 4718
- Joined: Wed Oct 01, 2008 6:33 am
- Location: Regensburg, Germany
- Full name: Guenther Simon
Re: Progress on Blunder
Ed, did you read my messages? They were not about disconnections, but about the 'illegal move Ndb4' warning, which never produced a game in your pgn.
I will check your download link. I guess (hope) the positions are still unchanged and used like that?
Edit:
Well, I downloaded your pgn and immediately found the illegal move in it as expected! (not sure why you did not check the pgn after my posts)
8...Ndb4 is illegal and cutechess refuses it (see source snippet for pgn reading posted by me)
[Event "YAT"]
[Site "Deventer"]
[Date "2021.04.21"]
[Round "45"]
[White ""]
[Black ""]
[Result "*"]
[BlackElo ""]
[WhiteElo ""]
1. e4 e5 2. Nf3 Nc6 3. Bc4 Nf6 4. Ng5 d5 5. exd5 Nxd5 6. Nxf7 Kxf7 7. Qf3+
Ke6 8. Nc3 Ndb4 9. O-O *
-
Rebel
Re: Progress on Blunder
Wrong knight, likely a typo. Good catch!
I looked into the PGNs to see whether it had consequences. Fortunately not: cutechess just ignores the illegal book move, see:
[pgn][Event "?"]
[Site "?"]
[Date "2021.12.06"]
[Round "6"]
[White "Jumbo_0.6.10"]
[Black "Blunder_7.3.0"]
[Result "1-0"]
[ECO "C57"]
[GameDuration "00:00:23"]
[GameEndTime "2021-12-06T15:31:38.816 West-Europa (standaardtijd)"]
[GameStartTime "2021-12-06T15:31:15.518 West-Europa (standaardtijd)"]
[Opening "Two knights defense"]
[PlyCount "94"]
[Termination "adjudication"]
[TimeControl "40/10"]
[Variation "Fegatello attack"]
1. e4 {book} e5 {book} 2. Nf3 {book} Nc6 {book} 3. Bc4 {book} Nf6 {book}
4. Ng5 {book} d5 {book} 5. exd5 {book} Nxd5 {book} 6. Nxf7 {book} Kxf7 {book}
7. Qf3+ {book} Ke6 {book} 8. Nc3 {book} Nb4 {+0.11/9 0.31s}
9. Qe4 {-1.67/8 0.28s} c6 {+0.10/7 0.30s} 10. a3 {-1.34/7 0.27s}
Qa5 {-0.59/9 0.30s} 11. Rb1 {-0.65/8 0.28s} Nxc2+ {-0.94/9 0.30s}
12. Qxc2 {+0.99/8 0.27s} Qc5 {-1.01/8 0.30s} 13. d3 {+1.26/7 0.26s}
Bd6 {-1.43/6 0.30s} 14. b4 {+2.05/9 0.28s} Qd4 {-1.70/9 0.31s}
15. Bb2 {+2.15/8 0.25s} Qg4 {-1.32/9 0.31s} 16. O-O {+2.24/7 0.22s}
Rf8 {-1.29/8 0.30s} 17. Rbe1 {+2.87/6 0.29s} a5 {-1.41/6 0.30s}
18. Bxd5+ {+3.64/8 0.28s} cxd5 {-2.66/8 0.30s} 19. Nb5 {+3.39/8 0.29s}
Rf5 {-2.85/9 0.30s} 20. Nxd6 {+3.72/8 0.27s} Kxd6 {-3.42/10 0.30s}
21. f4 {+4.19/7 0.29s} d4 {-3.43/7 0.30s} 22. Qc5+ {+4.80/7 0.29s}
Kd7 {-4.26/11 0.30s} 23. Qd5+ {+5.07/7 0.30s} Kc7 {-3.42/8 0.30s}
24. Rc1+ {+5.33/8 0.30s} Kb8 {-4.24/10 0.30s} 25. Rxc8+ {+5.81/8 0.30s}
Kxc8 {-4.59/10 0.30s} 26. Qg8+ {+6.28/9 0.30s} Kd7 {-4.37/10 0.30s}
27. Qxa8 {+6.06/9 0.16s} Rxf4 {-4.53/10 0.30s} 28. Qxb7+ {+6.06/7 0.33s}
Ke6 {-4.71/10 0.30s} 29. Rc1 {+6.65/8 0.33s} g6 {-4.64/11 0.30s}
30. Qc8+ {+6.71/8 0.33s} Kf6 {-4.90/9 0.30s} 31. Qh8+ {+6.91/9 0.34s}
Kg5 {-5.02/10 0.30s} 32. Qxe5+ {+6.99/9 0.30s} Rf5 {-5.14/11 0.30s}
33. Qxd4 {+7.11/9 0.34s} Qxd4+ {-5.06/11 0.30s} 34. Bxd4 {+7.26/9 0.35s}
axb4 {-5.10/11 0.30s} 35. axb4 {+7.54/9 0.36s} Kg4 {-5.28/11 0.30s}
36. h3+ {+8.02/9 0.38s} Kg5 {-5.38/9 0.30s} 37. Rc5 {+8.87/10 0.39s}
Kf4 {-6.27/15 0.30s} 38. Kf2 {+9.11/9 0.41s} Kg5+ {-6.57/17 0.30s}
39. Rxf5+ {+8.93/10 0.43s} Kxf5 {-6.74/20 0.30s} 40. b5 {+8.93/10 0.48s}
Ke6 {-6.88/20 0.099s} 41. b6 {+8.91/10 0.45s} Kd6 {-7.07/21 0.26s}
42. g4 {+8.98/9 0.21s} Kc6 {-6.64/17 0.26s} 43. g5 {+9.91/10 0.21s}
Kd6 {-9.23/22 0.26s} 44. Bc5+ {+10.55/10 0.16s} Kc6 {-9.24/27 0.26s}
45. d4 {+15.05/12 0.21s} Kb7 {-13.41/15 0.26s} 46. d5 {+16.29/12 0.12s}
h5 {-22.42/17 0.26s} 47. gxh6 {+22.13/10 0.23s}
Ka6 {-24.22/16 0.26s, White wins by adjudication} 1-0[/pgn]
-
lithander
- Posts: 923
- Joined: Sun Dec 27, 2020 2:40 am
- Location: Bremen, Germany
- Full name: Thomas Jahn
Re: Progress on Blunder
I really appreciate this kind of in-depth update!
Personally, the diminishing returns from new features eventually drove me to stop developing further at all. When all the low-hanging fruit is picked, except for a few shortcomings that are there "by design", like the rather slow speed stemming from the choice of language and basic architecture, it feels like optimizing around these glaring shortcomings is pointless.
So if I picked up chess programming again I'd start with a complete rewrite. Better performance would not only provide strength but also help with the time-consuming development that comes later, like playing enough matches to verify improvements, generating your own dataset of played games so you don't rely on an external dataset like the Zurichess set (which I also used), and so on. It seems that Blunder is only about 30% faster than MinimalChess, so maybe that's an avenue for improvement for you, too?
...and here is the game our engines played! At this level I need a 3rd AI to explain it to me^^
-
algerbrex
Re: Progress on Blunder
Thanks, Thomas! And I see we both sometimes have to use secondary analysis ahaha. It's a bit weird having a relatively strong chess engine while also being a complete patzer myself. I'll often have to analyze its games afterwards to see why it played a move that looked kinda stupid but was actually brilliant.
I agree, the diminishing returns can start to make continuing a chess engine quite daunting, especially once, as you said, you pick off all the low-hanging fruit (TT, killer moves, PSQT, mobility, null-move, LMR, etc.). Now I actually have to start being a serious, original programmer.
And Blunder's speed is definitely something that I think could be improved on. I actually tried incrementally updating PSQT + material in doMove/undoMove, and it definitely sped things up. But surprisingly I got 0 Elo gain, which is weird to me, and I'll have to look into that again.
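As a rough illustration of what incrementally updating PSQT + material means, here is a toy sketch: the position keeps a running score, and making or unmaking a move only adds and subtracts the terms of the pieces that changed instead of rescanning the board. The piece values, the `psqt_bonus` table and the `Position` class are all made up for this example and are not Blunder's code.

```python
PIECE_VALUE = {"P": 100, "N": 320, "R": 500}  # hypothetical material values

def psqt_bonus(piece, square):
    """Toy piece-square bonus rewarding central squares (0 = a1, 63 = h8)."""
    file, rank = square % 8, square // 8
    centre = 3 - max(abs(2 * file - 7), abs(2 * rank - 7)) // 2
    return centre * (10 if piece == "N" else 5)

def term(color, piece, square):
    """Material + PSQT for one piece, from White's point of view."""
    sq = square if color == "w" else square ^ 56  # mirror ranks for Black
    value = PIECE_VALUE[piece] + psqt_bonus(piece, sq)
    return value if color == "w" else -value

class Position:
    def __init__(self, pieces):
        self.pieces = dict(pieces)      # square -> (color, piece)
        self.score = self.full_score()  # running material + PSQT score

    def full_score(self):
        """Slow path: recompute the score from scratch."""
        return sum(term(c, p, sq) for sq, (c, p) in self.pieces.items())

    def do_move(self, frm, to):
        """Move a piece, updating the score by the affected terms only."""
        color, piece = self.pieces.pop(frm)
        captured = self.pieces.get(to)
        if captured is not None:
            self.score -= term(captured[0], captured[1], to)
        self.score += term(color, piece, to) - term(color, piece, frm)
        self.pieces[to] = (color, piece)
        return captured

    def undo_move(self, frm, to, captured):
        """Reverse do_move, restoring both the board and the score."""
        color, piece = self.pieces.pop(to)
        self.score += term(color, piece, frm) - term(color, piece, to)
        self.pieces[frm] = (color, piece)
        if captured is not None:
            self.pieces[to] = captured
            self.score += term(captured[0], captured[1], to)
```

A cheap sanity check during development is to assert that the running score equals `full_score()` after every make and unmake; a mismatch points to a move type (promotion, en passant, castling) that the incremental path forgot, none of which this sketch handles.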
As far as generating my own dataset goes, that has been an ongoing task of mine, with little success so far, due to bad criteria for selecting quiet positions, I think. Of course I could just bite the bullet and use qsearch, but having always used quiet positions to tune, I don't think I could go back from that kind of speed.
On a positive note, however, the retuning I did yesterday of all of Blunder's evaluation terms seemed to help, and after 314 iterations (the longest texel tuning run I've ever had) and 7-ish hours, it seemed I gained 23 Elo, at least against my gauntlet, which includes MinimalChess 0.6 actually! One nice thing I'm seeing is that it's much more aggressive now and is actively hunting down the enemy king, which is something I've always wanted to see. For example, here's a match where it's against 5.0.0, a version that had no king-safety whatsoever (posted here too for convenience):
[pgn]
[Event "?"]
[Site "?"]
[Date "2021.12.10"]
[Round "?"]
[White "Blunder dev"]
[Black "Blunder 5.0.0"]
[Result "1-0"]
[ECO "C45"]
[GameDuration "00:03:02"]
[GameEndTime "2021-12-10T23:48:22.969 Central Standard Time"]
[GameStartTime "2021-12-10T23:45:20.130 Central Standard Time"]
[Opening "Scotch"]
[PlyCount "55"]
[TimeControl "40/120"]
[Variation "Steinitz Variation"]
1. e4 e5 2. Nf3 Nc6 3. d4 exd4 4. Nxd4 Qh4 5. Nc3 Nxd4? ± {MISTAKE (+2.28)}
({(+0.49) The best move was} 5... Bb4 6. Be2 Qxe4 7. Nb5 Bxc3+ 8. bxc3 Kd8 9.
O-O Nf6 10. Bg5 d6 11. Bxf6+ gxf6 12. Bf3 Qf4 13. g3 Qf5 14. Rb1 Qc5 15. Re1 Ne5
16. Bg2 a6 17. Nd4 Qxc3 18. Rb3 Qa5 19. Bxb7 Bxb7 20. Rxb7 Qxa2) 6. Qxd4
{Critical move.} 6... Ne7 7. Bc4 c6 {INACCURACY (+2.90)} ({(+1.83) The best move
was} 7... d6 8. Be3 Nc6 9. Qd2 Be7 10. O-O-O Be6 11. Nd5 O-O-O 12. f4 Rhe8 13.
g3 Qh5 14. f5 Bxd5 15. Be2 Qh3 16. exd5 Ne5 17. Bb5 Rf8 18. Qe2 Qxf5 19. Bxa7
b6) 8. Be3 b5 9. Bb3 a5 10. a3 Ng6? {MISTAKE (+5.28)} ({(+2.68) The best move
was} 10... a4 11. Ba2 Qf6 12. e5 Qg6 13. g4 b4 14. axb4 h5 15. Ne4 Nd5 16. Bd2
Qxg4 17. f3 Qe6 18. O-O-O a3 19. Rhg1 axb2+ 20. Qxb2 h4 21. Nd6+ Bxd6 22. exd6)
11. Nxb5 Rb8 12. Nd6+ Bxd6 13. Qxd6 Ra8 {INACCURACY (+6.61)} ({(+5.06) The best
move was} 13... Rb5 14. O-O-O Rxb3 15. cxb3 Qe7 16. Qxe7+ Nxe7 17. Bc5 Ng6 18.
b4 h5 19. bxa5 Nf4 20. b4 Ba6 21. Kb2 h4 22. h3 Nxg2) 14. O-O-O a4 15. Ba2 Qg4
16. g3 Rf8 17. Rhe1 Qh3 18. Bb6 Rh8 19. Qc7 Ke7 20. f4 Qxh2 {INACCURACY
(+23.01)} ({(+13.51) The best move was} 20... Qg4 21. Rd2 Rf8 22. f5 Qg5 23.
fxg6 Ke8 24. Kb1 Qe7 25. gxh7 Rh8 26. Rf1 f5 27. e5 g6 28. Ba5 Ba6 29. Rf4 Rxh7
30. Bb4 Qf7 31. Bxf7+ Rxf7) 21. f5 Qh5 22. fxg6 hxg6 23. Rd5 f6 24. Rxh5 gxh5
25. e5 Ra5 26. Qd6+ Ke8 □ 27. exf6+ Re5 □ 28. Rxe5# 1-0
[/pgn]
This was the first time I've seen Blunder play so aggressively (even though the past couple of versions are rated 200+ points higher per CCRL), and it made me pretty happy. I made sure for the tuning last night I tweaked the king safety so that it was more fine-grained and hopefully accurate, and I'm finally starting to see pretty nice results. On the flip-side, however, I can now tell that because of the significant changes to the evaluation, Blunder's pruning and late-move reduction are now failing pretty badly in some cases, and seeing it miss certain ideas, I'm surprised I got any gain at all. For example, here's a game it played vs Nalwald 1.8:
[pgn]
[Event "?"]
[Site "?"]
[Date "2021.12.11"]
[Round "?"]
[White "Blunder dev"]
[Black "Nalwald 1.8"]
[Result "0-1"]
[ECO "C45"]
[GameDuration "00:06:00"]
[GameEndTime "2021-12-11T00:06:32.053 Central Standard Time"]
[GameStartTime "2021-12-11T00:00:31.684 Central Standard Time"]
[Opening "Scotch"]
[PlyCount "112"]
[TimeControl "40/120"]
[Variation "Steinitz Variation"]
1. e4 {+0.49/18 11s} e5 {-0.17/14 1.8s} 2. Nf3 {+0.70/17 4.7s}
Nc6 {-0.11/15 2.7s} 3. d4 {+0.55/17 3.6s} exd4 {+0.01/15 2.3s}
4. Nxd4 {+0.32/17 3.5s} Qh4 {-0.22/12 1.4s} 5. Nc3 {+0.91/17 5.9s}
Bb4 {+0.30/14 3.2s} 6. Be2 {+0.96/17 3.4s} Qxe4 {+0.80/15 2.0s}
7. Nb5 {+0.96/15 2.6s} Bxc3+ {+0.66/15 2.1s} 8. bxc3 {+0.53/15 3.4s}
Kd8 {+0.59/15 3.5s} 9. O-O {+0.55/15 3.3s} Nge7 {+0.56/14 2.2s}
10. Re1 {+1.51/15 7.2s} Qh4 {+0.47/14 6.1s} 11. Bf3 {+2.07/14 8.9s}
Qf6 {+0.53/15 3.2s} 12. Rb1 {+1.30/14 4.7s} Ne5 {+0.53/13 0.95s}
13. Ba3 {+1.21/14 4.5s} N7c6 {+0.35/13 3.7s} 14. Bd5 {+1.67/15 5.6s}
a6 {+0.30/13 5.5s} 15. Nd4 {+1.37/14 4.0s} d6 {+0.17/12 1.2s}
16. Qd2 {+1.29/13 2.9s} Na5 {+0.34/13 1.8s} 17. Rxe5 {+3.29/18 4.8s}
Qxe5 {+0.14/14 1.7s} 18. c4 {+2.98/17 2.0s} Re8 {-0.14/14 2.8s}
19. Nf3 {+3.23/16 1.5s} Qf5 {+0.20/14 2.7s} 20. Qxa5 {+3.50/17 4.0s}
Qxc2 {-0.06/15 2.0s} 21. Rc1 {+3.50/15 1.4s} Qxa2 {-0.24/14 2.3s}
22. Bxf7 {+3.23/14 1.4s} Re7 {-0.52/15 3.3s} 23. Bd5 {+3.79/15 1.8s}
Re2 {-0.59/14 1.1s} 24. Rf1 {+4.23/16 2.9s} Bd7 {-0.39/12 2.7s}
25. Qc3 {+4.33/17 1.7s} c5 {-0.58/14 3.9s} 26. Bc1 {+4.82/15 2.4s}
Kc7 {-0.92/15 3.6s} 27. Qxg7 {+5.21/14 1.2s} Qc2 {-1.23/15 4.0s}
28. Bxb7 {+5.41/14 1.5s} Rb8 {-1.26/17 2.3s} 29. Bxa6 {+3.69/17 1.7s}
Qg6 {-0.78/17 2.5s} 30. Qc3 {+5.97/14 1.4s} Ra2 {-0.78/16 2.7s}
31. Bb5 {+6.19/14 1.3s} Bxb5 {-0.78/15 1.5s} 32. cxb5 {+5.87/15 1.0s}
Rxb5 {-0.82/15 1.8s} 33. Qc4 {+6.07/14 1.0s} Rba5 {-0.98/16 3.7s}
34. Qb3 {+6.03/13 0.91s} Ra8 {-0.55/15 7.8s} 35. Re1 {+6.22/11 0.79s}
Ra1 {-0.60/14 1.5s} 36. Qc3 {+6.23/12 0.69s} Rb1 {-0.61/14 5.5s}
37. Nd2 {+6.54/13 0.61s} Rb4 {-0.67/13 8.4s} 38. Re7+ {+7.25/11 0.53s}
Kc6 {-1.82/13 1.5s} 39. Nc4 {+7.62/11 0.47s} Qg4 {-0.56/13 2.2s}
40. Ne3 {+7.10/11 0.41s} Qd4 {0.00/15 2.4s} 41. Qc2 {+7.09/14 8.8s}
Ra1 {0.00/15 3.6s} 42. Nf1 {+2.51/15 6.4s} h5 {0.00/14 1.4s}
43. g3 {+1.93/17 13s} h4 {0.00/12 2.9s} 44. Re8 {+0.93/16 12s}
Qd5 {0.00/13 8.3s} 45. Nd2 {+1.00/13 10s} h3 {+6.58/15 3.6s}
46. f3 {-13.23/16 9.0s} Qd4+ {+9.77/17 5.9s} 47. Kh1 {-15.70/18 7.9s}
Qf2 {+M19/11 0.11s} 48. Rc8+ {-27.91/22 6.9s} Kb6 {+M19/9 0.053s}
49. Nc4+ {-M16/16 6.0s} Rxc4 {+M19/4 0.052s} 50. Qb3+ {-M14/17 5.3s}
Rb4 {+M21/6 0.065s} 51. Rb8+ {-M12/17 3.5s} Ka7 {+M17/5 0.056s}
52. Ra8+ {-M10/16 1.5s} Kxa8 {+M17/4 0.056s} 53. Qd5+ {-M8/17 3.3s}
Ka7 {+M17/2 0.052s} 54. Qf7+ {-M6/17 1.8s} Kb6 {+M17/2 0.054s}
55. Qb7+ {-M4/16 1.7s} Kxb7 {+M3/1 0.054s} 56. f4 {-M2/16 1.3s}
Rxc1# {+M1/1 0.060s, Black mates} 0-1
[/pgn]
Throughout the game it was aggressive and had the advantage, but it blundered into a drawn position and then eventually into checkmate as well. Putting some of the above positions into 7.3.0, it finds the best moves almost instantly, and after some preliminary research I believe LMR is causing the issue, so I'll need to look into reworking it and its criteria. Of course, as I said, the new evaluation is a definite win and I'll be keeping it, but I'd also like not to blunder so badly, even in rare cases. So hopefully this'll keep me occupied over the next couple of days, and I can rework things to make Blunder both aggressive and more sound.
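For context on what "LMR and its criteria" usually covers, below is a sketch of a typical late-move-reduction rule: moves searched late in the move ordering get a reduced depth, using a logarithmic formula plus a set of conditions under which no reduction is applied. The formula constants, the thresholds and the parameter names are assumptions for illustration; they are not taken from Blunder.

```python
import math

def lmr_reduction(depth, move_index, is_quiet, in_check, gives_check,
                  is_pv_node, min_depth=3, full_depth_moves=4):
    """Return how many plies to reduce the search of this move by.

    move_index is the 0-based position of the move in the ordered move list.
    All thresholds and formula constants are hypothetical.
    """
    # Never reduce shallow nodes or the first few (best-ordered) moves.
    if depth < min_depth or move_index < full_depth_moves:
        return 0
    # Tactical situations are searched at full depth.
    if not is_quiet or in_check or gives_check:
        return 0
    reduction = int(0.75 + math.log(depth) * math.log(move_index) / 2.25)
    if is_pv_node:
        reduction -= 1  # be more careful on the principal variation
    # Keep at least one ply of real search after reducing.
    return max(0, min(reduction, depth - 2))
```

With these constants, a quiet move at index 20 and depth 8 is reduced by 3 plies (2 on a PV node). If the reduced search unexpectedly beats alpha, engines normally re-search the move at full depth, so overly aggressive conditions cost time on re-searches and, worse, can hide moves like the ones missed in the game above.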
P.S: Even if you're away from chess programming for a while, thanks again for the help with understanding texel tuning. Blunder wouldn't be where it is today without that help!
-
Rebel
Re: Progress on Blunder
Third crash after 1015 games.
PGN situation - http://rebel13.nl/pgn4web/match2.html
Click on square D8, it will show you the PGN, maybe it helps you to find the bastard.
Estimated rating for Blunder 2529 elo.
Games will count for the GRL.
Finished game 1027 (Blunder_7.3.0 vs Galjoen_0.41.1): 0-1 {Black wins by adjudication}
Started game 1034 of 1400 (Galjoen_0.41.1 vs Blunder_7.3.0)
Terminating process of engine Blunder_7.3.0(1981)
Finished game 991 (Jumbo_0.6.10 vs Blunder_7.3.0): 1-0 {Black disconnects}
Finished game 1022 (Foxsee_7.20.1 vs Blunder_7.3.0): * {No result}
Finished game 1021 (CT800_1.43 vs Blunder_7.3.0): * {No result}
Finished game 1007 (CT800_1.43 vs Blunder_7.3.0): * {No result}
Finished game 1031 (Monolith_0.3 vs Blunder_7.3.0): * {No result}
Finished game 1024 (Blunder_7.3.0 vs Monolith_0.3): * {No result}
Finished game 1030 (Orion_05 vs Blunder_7.3.0): * {No result}
Finished game 1008 (Foxsee_7.20.1 vs Blunder_7.3.0): * {No result}
Finished game 1023 (Blunder_7.3.0 vs Orion_05): * {No result}
Finished game 1028 (Blunder_7.3.0 vs CT800_1.43): * {No result}
Finished game 1029 (Blunder_7.3.0 vs Foxsee_7.20.1): * {No result}
Finished game 989 (Monolith_0.3 vs Blunder_7.3.0): * {No result}
Finished game 1009 (Blunder_7.3.0 vs Orion_05): * {No result}
Finished game 1016 (Orion_05 vs Blunder_7.3.0): * {No result}
Finished game 1020 (Galjoen_0.41.1 vs Blunder_7.3.0): * {No result}
Finished game 1034 (Galjoen_0.41.1 vs Blunder_7.3.0): * {No result}
Finished game 1026 (Blunder_7.3.0 vs Jumbo_0.6.10): * {No result}
Finished game 1018 (Jumbo_0.5.3 vs Blunder_7.3.0): * {No result}
Finished game 1033 (Jumbo_0.6.10 vs Blunder_7.3.0): * {No result}
Finished game 1032 (Jumbo_0.5.3 vs Blunder_7.3.0): * {No result}
Rank Name            Elo  +/-  Games  Score   Draw
   0 Blunder_7.3.0    33   19   1015  54.7%  18.7%
   1 CT800_1.43       10   52    144  51.4%  16.7%
   2 Orion_05          7   49    144  51.0%  25.7%
   3 Galjoen_0.41.1    0   52    146  50.0%  15.1%
   4 Monolith_0.3    -36   51    145  44.8%  19.3%
   5 Jumbo_0.5.3     -48   51    146  43.2%  19.2%
   6 Jumbo_0.6.10    -58   52    146  41.8%  19.2%
   7 Foxsee_7.20.1  -107   54    144  35.1%  16.0%
Finished match
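The Elo column in the output above appears consistent with the standard logistic Elo model, in which a score fraction s corresponds to a rating difference of -400 * log10(1/s - 1). The short sketch below applies that formula to two rows of the table; the function name is mine, and the assumption that the tool uses exactly this formula is based only on the numbers matching.

```python
import math

def elo_difference(score_fraction):
    """Rating difference implied by a score fraction under the logistic Elo model."""
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score_fraction - 1.0)

# Blunder scored 555.0 points in 1015 finished games (54.7%).
blunder = elo_difference(555.0 / 1015)
# Foxsee scored 50.5 points in 144 games (35.1%).
foxsee = elo_difference(50.5 / 144)
print(round(blunder), round(foxsee))
```

A 50% score gives a difference of exactly 0, and the relationship is steeper than linear away from 50%, which is why a 35.1% score already corresponds to a deficit of more than 100 Elo.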
Gambit Rating List
Running : gauntlet Blunder 7.3.0
Time Control : Time control : 40/120
Games : 1400
Results from file gauntlet-blunder.pgn:
No. Name              Win   Draw  Loss  Unf.  Score  Games      %
------------------------------------------------------------------
  1 Blunder 7.3.0    +460   =190  -365   *19  555.0   1015  54.7%
  2 CT800 1.43        +62    =24   -58    *3   74.0    144  51.4%
  3 Orion 05          +55    =37   -52    *4   73.5    144  51.0%
  4 Galjoen 0.41.1    +62    =22   -62    *2   73.0    146  50.0%
  5 Monolith 0.3      +51    =28   -66    *3   65.0    145  44.8%
  6 Jumbo 0.5.3       +49    =28   -69    *2   63.0    146  43.2%
  7 Jumbo 0.6.10      +47    =28   -71    *2   61.0    146  41.8%
  8 Foxsee 7.20.1     +39    =23   -82    *3   50.5    144  35.1%
Total Games: 1034
White Wins: 400 (38.7%)
Black Wins: 425 (41.1%)
Draws: 190 (18.4%)
Unfinished: 19 (1.8%)
Estimated ratings for this elo 2500 pool
 # PLAYER          :  RATING  POINTS  PLAYED  (%)
 1 CT800 1.43      :  2539.0    74.0     144   51
 2 Orion 05        :  2536.5    73.5     144   51
 3 Galjoen 0.41.1  :  2529.2    73.0     146   50
 4 Blunder 7.3.0   :  2529.2   555.0    1015   55
 5 Monolith 0.3    :  2492.8    65.0     145   45
 6 Jumbo 0.5.3     :  2480.9    63.0     146   43
 7 Jumbo 0.6.10    :  2471.1    61.0     146   42
 8 Foxsee 7.20.1   :  2421.3    50.5     144   35
-
Rebel
Re: Progress on Blunder
Saved the PGN as http://rebel13.nl/b/blunder.pgn as I need the PC for another match.