Crafty v22.2 is new. The evaluation has been almost completely rewritten. It had already been hacked on pretty heavily when I did the white/black duplication removal, which was a major change. This version now uses the Fruit MG/EG interpolation approach, which Tracy wanted to try. Whether it is the best solution is not clear. I personally like the older approach, because I believe that some scoring terms should scale up or down faster than others, whereas in the Fruit approach everything scales from the MG (middlegame) scores to the EG (endgame) scores as material comes off. But it is working, it is simpler, and we are attempting to make it work. We have spent a lot of time on each group of scoring parameters (but not all yet). I have an automated script that will vary one parameter (or a group, or an array) in a consistent way and then play 32,000 games to see what happens. We are now getting nice "hyperbolic" curves that show worse results with small values, rising to a peak at the optimal value, then dropping back down again as we exceed it. That gives us confidence that those scoring terms are right on the money. Not all have been done yet.
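For readers who have not seen the Fruit-style interpolation, here is a minimal sketch in Python. It is not Crafty's code: the phase weights, the `MAX_PHASE` constant, and the function names are all assumptions chosen for illustration. The only idea taken from the description above is that every term carries an MG score and an EG score, and the final value slides from one to the other as material comes off.

```python
# Illustrative phase weights per piece type (an assumption, not Crafty's values).
PHASE_WEIGHT = {"N": 1, "B": 1, "R": 2, "Q": 4}
# 4 knights + 4 bishops + 4 rooks + 2 queens with the weights above.
MAX_PHASE = 24


def game_phase(piece_counts):
    """Return a phase from 0 (bare endgame) to MAX_PHASE (full material).

    piece_counts maps a piece letter to the number of such pieces on the
    board for both sides combined. Promotions can push the raw sum past
    MAX_PHASE, so the result is clamped.
    """
    phase = sum(PHASE_WEIGHT[piece] * count for piece, count in piece_counts.items())
    return min(phase, MAX_PHASE)


def interpolate(mg_score, eg_score, phase):
    """Blend a middlegame and an endgame score linearly by game phase."""
    return (mg_score * phase + eg_score * (MAX_PHASE - phase)) / MAX_PHASE


if __name__ == "__main__":
    full_board = {"N": 4, "B": 4, "R": 4, "Q": 2}
    queens_off = {"N": 4, "B": 4, "R": 4, "Q": 0}
    # A hypothetical term worth 20 in the middlegame and 40 in the endgame.
    print(interpolate(20, 40, game_phase(full_board)))
    print(interpolate(20, 40, game_phase(queens_off)))
```

Because every term goes through the same linear blend, no single term can ramp faster than the others, which is exactly the limitation mentioned above.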
In addition, the search has changed a bit. We had futility/razoring already; I added extended futility and tuned all the margins in the same way. Razoring is a very minimal help, probably because of LMR. Futility and extended futility are minor improvements, but the margins that were best for us are smaller than what Heinz has in his book. We now do checks at the first layer of q-search nodes, and check evasions at the next layer if the move we try is a check. We now use null-move R=3 everywhere, as that was 2-3 Elo better than the adaptive null-move once we added q-search checks. Net, we gained about 10-15 Elo with the q-search checks and the null-move R=3 change. Parallel search is unchanged and still performs like dynamite. We now only use the "give-check" check extension; all the others have been removed. Since testing showed that 1.0 plies for give-check was optimal, I have also removed the fractional ply stuff completely, as the code is a little simpler and easier to read as a result. We will have this version on the ftp machine later this week.
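As a rough illustration of the pruning ideas mentioned above, here is a minimal Python sketch of a futility test and of the depth used for a null-move search with a fixed R. The margin values, the depth limits, and the function names are hypothetical; they are not Crafty's tuned numbers, and the real conditions in any engine are more involved.

```python
# Hypothetical margins in centipawns, indexed by remaining depth.
# Depth 1 is classic futility, depth 2 is extended futility.
FUTILITY_MARGIN = {1: 100, 2: 300}


def can_prune_futile(depth, static_eval, alpha, in_check, gives_check, is_capture):
    """Return True if a quiet move can be skipped as futile.

    The move is skipped when even the static evaluation plus an optimistic
    margin cannot reach alpha. Tactical situations are never pruned.
    """
    if in_check or gives_check or is_capture:
        return False
    margin = FUTILITY_MARGIN.get(depth)
    if margin is None:
        # Too far from the horizon for this sketch to prune.
        return False
    return static_eval + margin <= alpha


def null_move_depth(depth, r=3):
    """Remaining depth for the search after a null move, using a fixed R."""
    return depth - 1 - r


if __name__ == "__main__":
    print(can_prune_futile(1, -50, 100, False, False, False))
    print(null_move_depth(8))
```

Smaller margins make the test fire more often, so tuning them by playing games, as described above, is a trade-off between saved nodes and missed moves.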
For this event we once again ran on one node of our 70-node Intel cluster, with 8 cores per node: dual quad-core Xeons at 2.33 GHz. Typical search speeds are 18-20M nodes per second.
Round 1 vs Symbolic. We were in and out and back in the book for the first 10 moves or so, and ended up with a fairly significant edge out of book. By move 15 we had this sort of search:
Code:
  19    59.35   0.99   14. Nc3 Kh8 15. Re1 f5 16. Qg2 Nd7
                       17. Be3 Bd6 18. Bd4 fxg4 19. hxg4 Nc5
                       20. Kf1 Qh4 21. Ne4 Nxb3 22. axb3 Bxe4
                       23. dxe4
  19->   1:22   0.99   14. Nc3 Kh8 15. Re1 f5 16. Qg2 Nd7
                       17. Be3 Bd6 18. Bd4 fxg4 19. hxg4 Nc5
                       20. Kf1 Qh4 21. Ne4 Nxb3 22. axb3 Bxe4
                       23. dxe4 (s=2)
  time=1:43  mat=1  n=1881305784  fh=94%  nps=18.2M
  ext-> check=71.8M qcheck=84.3M reduce=1013.3M/173.4M
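The nps figure in the statistics line can be sanity-checked from the node count and the elapsed time. The short Python sketch below does that arithmetic. It assumes the clock is printed either as seconds with hundredths (59.35) or as minutes:seconds (1:43); the helper names are made up for this example.

```python
def clock_to_seconds(text):
    """Convert a clock string such as '59.35' or '1:43' to seconds."""
    if ":" in text:
        minutes, seconds = text.split(":")
        return int(minutes) * 60 + float(seconds)
    return float(text)


def nodes_per_second(nodes, clock):
    """Average search speed over the whole search."""
    return nodes / clock_to_seconds(clock)


if __name__ == "__main__":
    # Values copied from the statistics line above.
    speed = nodes_per_second(1881305784, "1:43")
    print(f"{speed / 1e6:.1f}M nodes per second")
```

The result lands a little above 18M, in line with the reported 18.2M. The small difference is expected, since the displayed time is rounded to whole seconds.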
Round 2. Chess Thinker. We came out of book in a position with a bad bishop. Ted pointed out that we had followed a bad transposition into a QGD line we knew we didn't want to play, and the black bishop on the queenside was a problem. Three moves out of book we were up material but down positionally, and that was the story of the game. The score continued to slip and slide toward white until we reached a position where the bad bishop was critical.
Round 3. Swaminathan. I actually do not remember what program he was running. We were white and came out of book at +0.5. Based on the way the scores steadily climbed, we were out-searching him badly. This is typical:
Code:
  20    43.97   1.20   24. Nc5 Rd6 25. Re5 Bg4 26. Re7 h5
                       27. c3 bxc3 28. bxc3 f4 29. Rxc7 Re8
                       30. Nd3 Re4 31. Rb2 Rf6 32. d5 Rf7
                       33. Rbb7 Rxc7 34. Rxc7
  20->  54.37   1.20   24. Nc5 Rd6 25. Re5 Bg4 26. Re7 h5
                       27. c3 bxc3 28. bxc3 f4 29. Rxc7 Re8
                       30. Nd3 Re4 31. Rb2 Rf6 32. d5 Rf7
                       33. Rbb7 Rxc7 34. Rxc7 (s=2)
  21     1:30   1.23   24. Nc5 Rd6 25. Re5 Bg4 26. Re7 h5
                       27. c3 bxc3 28. bxc3 Rc6 29. Rf2 Rf8
                       30. Rf4 h4 31. h3 Bxh3 32. Rxh4 Bg2
                       33. Rhh7 Rg6 34. Rxc7
  time=1:30  mat=0  n=1684030911  fh=91%  nps=18.6M
  ext-> check=53.0M qcheck=79.4M reduce=800.3M/156.1M
Round 4. DeltomateX (dirty). One of Pradu's friends, I understand. This was just another case of out-searching the opponent, and the game was again over prior to move 30, although it ended in mate at move 53.
Round 5. Rybka. The game we were really looking forward to.
Ruy Lopez, Crafty black. We came out of book at move 17 with an equal score. The first surprise came at move 25, when Rybka chose to start a somewhat risky trade of two pieces for a rook and 2 pawns (one could not be held, so we eventually got that one back). By move 33, white had connected passed pawns on d5 and e4, black had promptly blockaded both with a pair of knights, queens were gone, and it looked easy (to the humans watching) to draw. However, Crafty chose to be a bit more aggressive. Rather than sit back and wait for the draw (we always use contempt = 0, so no issues there), it decided to start some complications that, as a human, I liked. But a couple of our scoring terms had not been properly tuned (I had tried to finish that last week, but cluster issues with IBRIX made it impossible), so we ended up going into an unbalanced position where white made faster progress on the kingside than we made on the queenside. For us, the good news is that we were nowhere close to getting "out-searched", but we did make an evaluation mistake that eventually led to losing the game. I was happy not to suddenly "fall off a cliff". In general, our search depths were higher than what we were seeing reported from Rybka, as hard as his numbers are to understand, since the thing was literally spamming the game with every PV searched on each of the five cluster nodes, for every ply they searched...
Round 6, against Prophet, was anticlimactic. We castled on opposite sides early and it became a king-safety evaluation battle. We broke open his kingside with pawn pushes, and things fell apart. The score was +4 (Crafty was white) by move 20. Black was mated by move 40.
Impressions:
The new version looks to be very solid. There is still some eval tuning to be done, and a few eval terms are left to be added after the total rewrite already done. The search looks solid, and the speed is pretty amazing, since last year we were doing 13M nodes per second on this identical hardware. I am now officially looking at a "cluster Crafty". We have 70 nodes on this cluster. While I doubt I will be able to efficiently use all 70 very quickly, just using 5 turns into 100M nodes per second, which is very fast indeed. Of course 70 x 20M nodes per second would be even better.

More to come on this later. I can guarantee you, however, that I won't be splitting only at the root.
