Crafty in the ACCA

bob · Post by **bob** » Mon Nov 10, 2008 9:55 pm

We entered this year after quite a lot of testing and tuning. We finally produced a test methodology that is good enough to measure small Elo changes, so that we could then try tuning various features of the evaluation to see if there were better settings.
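To give a feel for what "good enough to measure small Elo changes" involves, here is a minimal sketch of the standard logistic Elo conversion with an approximate 95% margin. The function names and the win/draw/loss counts are hypothetical, chosen only for illustration; this is not the test script used for Crafty.

```python
import math

def elo_from_score(score):
    """Convert a score fraction (0 < score < 1) to an Elo difference."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_margin(wins, draws, losses, z=1.96):
    """Return (elo, low, high) using a normal approximation to the score."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    return (elo_from_score(score),
            elo_from_score(score - z * se),
            elo_from_score(score + z * se))

# Hypothetical match: 32,000 games with an exactly even result.
elo, low, high = elo_with_margin(11200, 9600, 11200)
print(f"{elo:+.1f} Elo, 95% interval [{low:+.1f}, {high:+.1f}]")
```

With these made-up counts the interval is only a few Elo wide, which is the kind of resolution needed before a 2-3 Elo change can be told apart from noise.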

Crafty v22.2 is new. The evaluation is almost completely rewritten. It had already been hacked up pretty heavily when I removed the white/black code duplication, which was a major change. This version now uses the Fruit MG/EG interpolation approach, which Tracy wanted to try. Whether it is the best solution is not clear. I personally like the older approach, because I believe that some scoring terms should scale up or down faster than others, whereas in the Fruit approach everything scales from the MG scores to the EG scores as material comes off. But it is working, it is simpler, and we are attempting to make it work. We have spent a lot of time on each group of scoring parameters (but not all of them yet). I have an automated script that will vary one parameter (or a group, or an array) in a consistent way and then play 32,000 games to see what happens. We are now getting nice "hyperbolic" curves that show worse results with small values, rising to a peak at the optimal value, then dropping back down as we exceed it, which gives us confidence that those scoring terms are right on the money.
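For readers unfamiliar with the MG/EG interpolation idea, a minimal sketch follows. It assumes a linear blend driven by remaining non-pawn material; the function names, the phase definition, and the numbers are illustrative assumptions, not Crafty's or Fruit's actual code.

```python
def game_phase(non_pawn_material, max_material):
    """1.0 with all material on the board, 0.0 when none is left."""
    return max(0.0, min(1.0, non_pawn_material / max_material))

def interpolate(mg_score, eg_score, phase):
    """Blend a middlegame and an endgame score by the current phase."""
    return mg_score * phase + eg_score * (1.0 - phase)

# Hypothetical term worth 40 in the middlegame and 10 in the endgame,
# evaluated with half of the non-pawn material traded off.
phase = game_phase(non_pawn_material=31, max_material=62)
print(interpolate(40, 10, phase))
```

Every term shares the same `phase` here, which is exactly the property questioned above: under this scheme no single term can scale faster or slower than the others.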

In addition, the search has changed a bit. We had futility/razoring already; I added extended futility and tuned all the margins in the same way. Razoring is a very minimal help, probably because of LMR. Futility and extended futility are minor improvements, but the margins that were best for us are smaller than what Heinz has in his book. We now do checks at the first layer of q-search nodes, and check evasions at the next layer if the move we try is a check. We now use null-move R=3 everywhere, as that was 2-3 Elo better than the adaptive null-move once we added q-search checks. Net, we gained about 10-15 Elo from the q-search checks and the null-move R=3 change. Parallel search is unchanged and still performs like dynamite. We now only use the "give-check" check extension; all the others have been removed. Since testing showed that 1.0 plies for give-check was optimal, I have also removed the fractional-ply stuff completely, as the code is a little simpler and easier to read as a result. We will have this version on the ftp machine later this week.
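A minimal sketch of the two pruning ideas mentioned above, futility tests near the leaves and a fixed null-move reduction of R=3. The margin values and function names are hypothetical placeholders, not Crafty's tuned settings or its actual code.

```python
NULL_MOVE_R = 3  # fixed reduction, as described in the post

# Hypothetical margins in centipawns, keyed by remaining depth:
# depth 1 = futility, depth 2 = extended futility.
FUTILITY_MARGIN = {1: 125, 2: 300}

def is_futile(depth, static_eval, alpha, in_check, gives_check, is_capture):
    """True if a quiet move this close to the leaves cannot reach alpha."""
    if in_check or gives_check or is_capture:
        return False  # tactical moves are never pruned this way
    margin = FUTILITY_MARGIN.get(depth)
    if margin is None:
        return False  # too far from the leaves for futility pruning
    return static_eval + margin <= alpha

def null_move_depth(depth):
    """Depth for the reduced search that follows a null move."""
    return max(0, depth - 1 - NULL_MOVE_R)

print(is_futile(1, -200, 0, False, False, False))
print(null_move_depth(10))
```

Smaller margins make `is_futile` fire more often, so they prune more aggressively; that is why the margins are worth tuning by games rather than taking from a reference.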

For this event we once again ran on one node of our 70-node Intel cluster, with 8 cores per node (dual quad-core Xeons, 2.33 GHz). Typical search speeds are 18-20M nodes per second.

Round 1 vs Symbolic. We were in and out and back in the book for the first 10 moves or so, and ended up with a fairly significant edge out of book. By move 15 we had this sort of search:

               19    59.35   0.99   14. Nc3 Kh8 15. Re1 f5 16. Qg2 Nd7
                                    17. Be3 Bd6 18. Bd4 fxg4 19. hxg4 Nc5
                                    20. Kf1 Qh4 21. Ne4 Nxb3 22. axb3 Bxe4
                                    23. dxe4
               19->   1:22   0.99   14. Nc3 Kh8 15. Re1 f5 16. Qg2 Nd7
                                    17. Be3 Bd6 18. Bd4 fxg4 19. hxg4 Nc5
                                    20. Kf1 Qh4 21. Ne4 Nxb3 22. axb3 Bxe4
                                    23. dxe4 (s=2)
              time=1:43  mat=1  n=1881305784  fh=94%  nps=18.2M 
              ext-> check=71.8M qcheck=84.3M reduce=1013.3M/173.4M

It was apparent that Steven was getting out-searched pretty badly with this 8-core hardware and that really made a difference. Score steadily climbed, +2.5 by move 22, +3.6 by move 30, +5 by move 35, and a mate found at move 44.

Round 2. Chess Thinker. We came out of book in a position with a bad bishop. Ted pointed out we had followed a bad transposition into a QGD line we knew we didn't want to play. And the black bishop on the queenside was a problem. 3 moves out of book and we were material up but down positionally, and that was the story of the game. We continued to see the score slip and slide toward white to reach a position where the bad bishop was critical.

Round 3. Swaminathan. I actually do not remember what program he was running. We were white and came out of book at +0.5, and based on the way the scores steadily climbed, we were out-searching him badly. This is typical:

               20    43.97   1.20   24. Nc5 Rd6 25. Re5 Bg4 26. Re7 h5
                                    27. c3 bxc3 28. bxc3 f4 29. Rxc7 Re8
                                    30. Nd3 Re4 31. Rb2 Rf6 32. d5 Rf7
                                    33. Rbb7 Rxc7 34. Rxc7
               20->  54.37   1.20   24. Nc5 Rd6 25. Re5 Bg4 26. Re7 h5
                                    27. c3 bxc3 28. bxc3 f4 29. Rxc7 Re8
                                    30. Nd3 Re4 31. Rb2 Rf6 32. d5 Rf7
                                    33. Rbb7 Rxc7 34. Rxc7 (s=2)
               21     1:30   1.23   24. Nc5 Rd6 25. Re5 Bg4 26. Re7 h5
                                    27. c3 bxc3 28. bxc3 Rc6 29. Rf2 Rf8
                                    30. Rf4 h4 31. h3 Bxh3 32. Rxh4 Bg2
                                    33. Rhh7 Rg6 34. Rxc7
              time=1:30  mat=0  n=1684030911  fh=91%  nps=18.6M
              ext-> check=53.0M qcheck=79.4M reduce=800.3M/156.1M

Game was over by move 30, but lasted another 40 moves before checkmate.

Round 4. DeltomateX (dirty). One of Pradu's friends, I understand. This was just a case of out-searching the opponent again, and the game was again over prior to move 30, although it ended in mate at move 53.

Round 5. Rybka. The game we were really looking forward to.

Ruy Lopez, Crafty black. We came out of book at move 17 with an equal score. The first surprise came at move 25, when Rybka chose to start a somewhat risky trade of two pieces for a rook and 2 pawns (one could not be held, so we eventually got that one back). By move 33, white had connected passed pawns on d5 and e4, black had promptly blockaded both with a pair of knights, queens were gone, and it looked easy (to the humans watching) to draw. However, rather than sit back and wait for the draw (we always use contempt = 0, so no issues there), Crafty chose to be a bit more aggressive and started some complications that, as a human, I liked. But a couple of our scoring terms had not been properly tuned (I had tried to finish that last week, but cluster issues with IBRIX made it impossible), so we ended up going into an unbalanced position where white made faster progress on the kingside than we made on the queenside. For us, the good news is that we were nowhere close to getting "out-searched". But we did make an evaluation mistake that eventually led to losing the game. I was happy to not suddenly "fall off a cliff", and in general our search depths were higher than what we were seeing reported from Rybka, as hard as his numbers are to understand, since the thing was literally spamming the game with every PV searched on each of the five cluster nodes, for every ply they searched...

Round 6 was anticlimactic against Prophet. We castled on opposite sides early and it became a king-safety evaluation battle. We broke open his kingside with pawn pushes, and things fell apart. Score was +4 (Crafty was white) by move 20. Black was mated by move 40.

Impressions:

The new version looks to be very solid. There is still some eval tuning to be done, and a few eval terms are left to be added after the total rewrite. The search looks solid, and the speed is pretty amazing, since last year we were doing 13M nodes per second on this same hardware. I am now officially looking at a "cluster crafty". We have 70 nodes on this cluster. While I doubt I will be able to efficiently use 70 very quickly, just using 5 turns into 100M nodes per second, which is very fast indeed. Of course 70 x 20M nodes per second would be even better.

More to come on this later. I can guarantee you I won't be splitting only at the root, however.

Zach Wegner · Post by **Zach Wegner** » Mon Nov 10, 2008 10:12 pm

bob wrote:I am now officially looking at a "cluster crafty".

It's about damn time.

I am eagerly looking forward to this project, and hope to see that monster running full force at one of the next tournaments.

bob · Post by **bob** » Mon Nov 10, 2008 10:30 pm

Zach Wegner wrote:
bob wrote:I am now officially looking at a "cluster crafty".
It's about damn time. I am eagerly looking forward to this project, and hope to see that monster running full force at one of the next tournaments.

I think 70 will be a challenge to use without a _lot_ of work. But I can see modest numbers (say 16) possibly. I'll discuss this more as time goes on. But I am thinking along the lines of Deep Blue's two-level search. If I can do 20 plies normally, I might try to cluster-split anywhere in the first 8, and do SMP splits only during the remainder of the tree. That idea probably makes the most sense to control the latency issues that exist even with the InfiniBand interconnect on the cluster...

The loss of hash sharing is an issue. I already have an idea for a small hash table for cluster splits, so that I send a position with the same hash signature to the same cluster node each time I reach it. That way the same sub-trees are searched on the same nodes, and we at least get some hash usage that makes sense...
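A minimal sketch of how the two ideas in this post might fit together: cluster splits only in the first few plies, plus a small table that remembers which node a hash signature was sent to. All names, the table size, the replacement policy, and the round-robin assignment are assumptions for illustration; none of it is Crafty code.

```python
CLUSTER_SPLIT_PLIES = 8  # cluster-split only this close to the root

def split_kind(ply):
    """Choose the split type from the distance to the root."""
    return "cluster" if ply < CLUSTER_SPLIT_PLIES else "smp"

class SplitAffinityTable:
    """Small always-replace table mapping a hash signature to a node."""

    def __init__(self, size, n_nodes):
        self.size = size
        self.n_nodes = n_nodes
        self.slots = [None] * size
        self.next_node = 0  # round-robin choice for unseen positions

    def node_for(self, signature):
        index = signature % self.size
        entry = self.slots[index]
        if entry is not None and entry[0] == signature:
            return entry[1]  # seen before: reuse the same node
        node = self.next_node
        self.next_node = (self.next_node + 1) % self.n_nodes
        self.slots[index] = (signature, node)
        return node

table = SplitAffinityTable(size=1024, n_nodes=5)
first = table.node_for(0xABCDEF)
other = table.node_for(0x123456)
print(split_kind(3), split_kind(12), first, other, table.node_for(0xABCDEF))
```

Because a repeated signature goes back to the node that searched it before, that node's local hash table has a chance of already holding results for the sub-tree.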

swami · Post by **swami** » Tue Nov 11, 2008 5:50 am

bob wrote:Round 3. Swaminathan. I actually do not remember what program he was running.

Whoa, Bob. You were operating Crafty and even present in the venue personally, and yet don't know what program I was running?!

Oh well... It's Slowchess, and yes Crafty did seem to outsearch Slowchess...

Slowchess has never been updated, for like, 3 years. And it's still a single processor.

bob · Post by **bob** » Tue Nov 11, 2008 7:05 am

swami wrote:
bob wrote:Round 3. Swaminathan. I actually do not remember what program he was running.
Whoa, Bob. You were operating Crafty and even present in the venue personally, and yet don't know what program I was running?!

Oh well... It's Slowchess, and yes Crafty did seem to outsearch Slowchess...

Slowchess has never been updated, for like, 3 years. And it's still a single processor.

I simply didn't remember the name, sorry. I tend to remember opponents by ICC handle rather than program name since most play under the program's name...

zullil · Post by **zullil** » Tue Nov 11, 2008 12:16 pm

bob wrote: Crafty v22.2 is new. The evaluation is almost completely rewritten.

bob wrote:In addition, the search has changed a bit.

Maybe this should be crafty-23.0!

bob wrote:We will have this version on the ftp machine later this week.

Looking forward to the release. Thanks for making Crafty available to all.

jarkkop · Post by **jarkkop** » Sat Nov 15, 2008 12:30 pm

Come on. The week is almost over.

Where is our new toy?

tano-urayoan · Post by **tano-urayoan** » Sun Nov 16, 2008 4:51 am

jarkkop wrote:Come on. The week is almost over.
Where is our new toy?

At least show good manners; a "please" does not hurt anybody.

bob · Post by **bob** » Sun Nov 16, 2008 7:06 am

jarkkop wrote:Come on. The week is almost over.
Where is our new toy?

I hope tomorrow. I am running last-minute validation games to make sure everything works. I had to rewrite the personality load/store/list code since all the scoring terms changed...

Graham Banks · Post by **Graham Banks** » Sun Nov 16, 2008 8:02 am

bob wrote:
jarkkop wrote:Come on. The week is almost over.
Where is our new toy?
I hope tomorrow. I am running last-minute validation games to make sure everything works. I had to rewrite the personality load/store/list code since all the scoring terms changed...

Take your time, Bob. Don't let people hurry you.
