Thinker output

Uri Blass · Post by **Uri Blass** » Wed Mar 25, 2009 8:33 am

CThinker wrote:
bob wrote: My only question is this: Why would you not collect the PV on the fly by backing it up along with the score? A PV harvested from the trans/ref table is inaccurate, and (IMHO) makes it harder to debug things. I often look at a PV and score from Crafty, then run down the PV and repeatedly use the score command to tweak things to bring the score closer to what seems reasonable. A hash PV is almost guaranteed to not lead you to the actual position that was scored, which makes debugging much more complicated...

And backing up the PV is not expensive.
It is all about simplicity. It may not be expensive as far as run-time is concerned, but it is still extra ugly code. To me, at least, it looks like that.

Even within Thinker, the implementation of search is hidden from the shell. The tree search is just one form of search. The book search is another. Both have completely different behavior. So, the shell does not care how they find the best move. The shell just knows that they will return one.

To the shell, the book search and the tree search are just "strategies" (in design pattern speak). So, even if I implement a PV collection in the tree search, would I want to expose that? That means the book search would have to return a PV also? That means that any "search strategy" always include a PV. I could design it that way, but that would be a very bad design. That may not be a big deal to others, but it is to me.

Just to give an idea of the simplicity that I am talking about, below is a comparison of the call graphs of Fruit 2.3.1, Crafty 22.9 and Thinker 5.4C.

Cheers...

http://www.winboardengines.de/thinker/C ... -2-3-1.bmp
http://www.winboardengines.de/thinker/C ... y-22-9.bmp
http://www.winboardengines.de/thinker/C ... ker54C.bmp

I can only say that I do not understand what these graphs mean and how you generate them,

When you talk about elegence then I do not understand what is the exact meaning of ut.
I do not know if there is some mathematical way to measure it.

Uri

Edsel Apostol · Post by **Edsel Apostol** » Wed Mar 25, 2009 9:22 am

Uri Blass wrote:
CThinker wrote:
bob wrote: My only question is this: Why would you not collect the PV on the fly by backing it up along with the score? A PV harvested from the trans/ref table is inaccurate, and (IMHO) makes it harder to debug things. I often look at a PV and score from Crafty, then run down the PV and repeatedly use the score command to tweak things to bring the score closer to what seems reasonable. A hash PV is almost guaranteed to not lead you to the actual position that was scored, which makes debugging much more complicated...

And backing up the PV is not expensive.
It is all about simplicity. It may not be expensive as far as run-time is concerned, but it is still extra ugly code. To me, at least, it looks like that.

Even within Thinker, the implementation of search is hidden from the shell. The tree search is just one form of search. The book search is another. Both have completely different behavior. So, the shell does not care how they find the best move. The shell just knows that they will return one.

To the shell, the book search and the tree search are just "strategies" (in design pattern speak). So, even if I implement a PV collection in the tree search, would I want to expose that? That means the book search would have to return a PV also? That means that any "search strategy" always include a PV. I could design it that way, but that would be a very bad design. That may not be a big deal to others, but it is to me.

Just to give an idea of the simplicity that I am talking about, below is a comparison of the call graphs of Fruit 2.3.1, Crafty 22.9 and Thinker 5.4C.

Cheers...

http://www.winboardengines.de/thinker/C ... -2-3-1.bmp
http://www.winboardengines.de/thinker/C ... y-22-9.bmp
http://www.winboardengines.de/thinker/C ... ker54C.bmp

I can only say that I do not understand what these graphs mean and how you generate them,

When you talk about elegence then I do not understand what is the exact meaning of ut.
I do not know if there is some mathematical way to measure it.

Uri

Hi Lance,

What software did you use in creating those graphs? I'm thinking of rewriting my engine using C++ in the near future (I'm currently using C) after I implement SMP and I'm planning to use some Design Patterns to make it more elegant and as simple as it could make it.

CThinker · Post by **CThinker** » Wed Mar 25, 2009 9:23 am

Uri Blass wrote:
CThinker wrote:
bob wrote: My only question is this: Why would you not collect the PV on the fly by backing it up along with the score? A PV harvested from the trans/ref table is inaccurate, and (IMHO) makes it harder to debug things. I often look at a PV and score from Crafty, then run down the PV and repeatedly use the score command to tweak things to bring the score closer to what seems reasonable. A hash PV is almost guaranteed to not lead you to the actual position that was scored, which makes debugging much more complicated...

And backing up the PV is not expensive.
It is all about simplicity. It may not be expensive as far as run-time is concerned, but it is still extra ugly code. To me, at least, it looks like that.

Even within Thinker, the implementation of search is hidden from the shell. The tree search is just one form of search. The book search is another. Both have completely different behavior. So, the shell does not care how they find the best move. The shell just knows that they will return one.

To the shell, the book search and the tree search are just "strategies" (in design pattern speak). So, even if I implement a PV collection in the tree search, would I want to expose that? That means the book search would have to return a PV also? That means that any "search strategy" always include a PV. I could design it that way, but that would be a very bad design. That may not be a big deal to others, but it is to me.

Just to give an idea of the simplicity that I am talking about, below is a comparison of the call graphs of Fruit 2.3.1, Crafty 22.9 and Thinker 5.4C.

Cheers...

http://www.winboardengines.de/thinker/C ... -2-3-1.bmp
http://www.winboardengines.de/thinker/C ... y-22-9.bmp
http://www.winboardengines.de/thinker/C ... ker54C.bmp
I can only say that I do not understand what these graphs mean and how you generate them,

When you talk about elegence then I do not understand what is the exact meaning of ut.
I do not know if there is some mathematical way to measure it.

Uri

These are "call graphs". Each node is a function, and each lines (edge) is a function call (from one function to another).

I found this example of a fully labeled one:
http://multimedia.cx/eggs/images/dcng-w ... lgraph.png

I don't think they are an indicator of elegance.

It can however give you an indication of the complexity of the program. The number of functions is a simple indicator, but not very accurate. But if your functions are bunched together and layered, and there are just few lines between those bunched-up (or layered) functions, then it indicates that program is modularized. That is, a group of functions (or a layer) perform a specific task, and there is a well defined small set of interfaces to them.

The Fruit and Crafty code definitely has a lot of functions, and so, you can't see each one of them anymore (they should show-up as rectangles, like in Thinker graph). This explains why the Fruit binary is 10 times the Thinker binary.

Now, I can't accurately comment on modularity of Fruit and Crafty because the individual lines are no longer visible. I just have a feeling that there are a lot of crisscrossing of lines there.

CThinker · Post by **CThinker** » Wed Mar 25, 2009 9:30 am

Edsel Apostol wrote:
Uri Blass wrote:
CThinker wrote:
bob wrote: My only question is this: Why would you not collect the PV on the fly by backing it up along with the score? A PV harvested from the trans/ref table is inaccurate, and (IMHO) makes it harder to debug things. I often look at a PV and score from Crafty, then run down the PV and repeatedly use the score command to tweak things to bring the score closer to what seems reasonable. A hash PV is almost guaranteed to not lead you to the actual position that was scored, which makes debugging much more complicated...

And backing up the PV is not expensive.
It is all about simplicity. It may not be expensive as far as run-time is concerned, but it is still extra ugly code. To me, at least, it looks like that.

Even within Thinker, the implementation of search is hidden from the shell. The tree search is just one form of search. The book search is another. Both have completely different behavior. So, the shell does not care how they find the best move. The shell just knows that they will return one.

To the shell, the book search and the tree search are just "strategies" (in design pattern speak). So, even if I implement a PV collection in the tree search, would I want to expose that? That means the book search would have to return a PV also? That means that any "search strategy" always include a PV. I could design it that way, but that would be a very bad design. That may not be a big deal to others, but it is to me.

Just to give an idea of the simplicity that I am talking about, below is a comparison of the call graphs of Fruit 2.3.1, Crafty 22.9 and Thinker 5.4C.

Cheers...

http://www.winboardengines.de/thinker/C ... -2-3-1.bmp
http://www.winboardengines.de/thinker/C ... y-22-9.bmp
http://www.winboardengines.de/thinker/C ... ker54C.bmp

I can only say that I do not understand what these graphs mean and how you generate them,

When you talk about elegence then I do not understand what is the exact meaning of ut.
I do not know if there is some mathematical way to measure it.

Uri
Hi Lance,

What software did you use in creating those graphs? I'm thinking of rewriting my engine using C++ in the near future (I'm currently using C) after I implement SMP and I'm planning to use some Design Patterns to make it more elegant and as simple as it could make it.

Those were generated with IDA.

I think you should be able to achieve in C whatever it is that you want to do in C++, especially in a simple program such as a chess engine. That's just my opinion.

Edsel Apostol · Post by **Edsel Apostol** » Wed Mar 25, 2009 9:53 am

CThinker wrote:
Edsel Apostol wrote:
Uri Blass wrote:
CThinker wrote:
bob wrote: My only question is this: Why would you not collect the PV on the fly by backing it up along with the score? A PV harvested from the trans/ref table is inaccurate, and (IMHO) makes it harder to debug things. I often look at a PV and score from Crafty, then run down the PV and repeatedly use the score command to tweak things to bring the score closer to what seems reasonable. A hash PV is almost guaranteed to not lead you to the actual position that was scored, which makes debugging much more complicated...

And backing up the PV is not expensive.
It is all about simplicity. It may not be expensive as far as run-time is concerned, but it is still extra ugly code. To me, at least, it looks like that.

Even within Thinker, the implementation of search is hidden from the shell. The tree search is just one form of search. The book search is another. Both have completely different behavior. So, the shell does not care how they find the best move. The shell just knows that they will return one.

To the shell, the book search and the tree search are just "strategies" (in design pattern speak). So, even if I implement a PV collection in the tree search, would I want to expose that? That means the book search would have to return a PV also? That means that any "search strategy" always include a PV. I could design it that way, but that would be a very bad design. That may not be a big deal to others, but it is to me.

Just to give an idea of the simplicity that I am talking about, below is a comparison of the call graphs of Fruit 2.3.1, Crafty 22.9 and Thinker 5.4C.

Cheers...

http://www.winboardengines.de/thinker/C ... -2-3-1.bmp
http://www.winboardengines.de/thinker/C ... y-22-9.bmp
http://www.winboardengines.de/thinker/C ... ker54C.bmp

I can only say that I do not understand what these graphs mean and how you generate them,

When you talk about elegence then I do not understand what is the exact meaning of ut.
I do not know if there is some mathematical way to measure it.

Uri
Hi Lance,

What software did you use in creating those graphs? I'm thinking of rewriting my engine using C++ in the near future (I'm currently using C) after I implement SMP and I'm planning to use some Design Patterns to make it more elegant and as simple as it could make it.
Those were generated with IDA.

I think you should be able to achieve in C whatever it is that you want to do in C++, especially in a simple program such as a chess engine. That's just my opinion.

Okay. I will take it as an advice.

By the way, what about your choice of protocol? UCI seems to be more elegant for me compared to Xboard.

Laszlo Gaspar · Post by **Laszlo Gaspar** » Wed Mar 25, 2009 3:11 pm

Dear Lance,

It seems the only feasible option for you to get the PV is to get it from the hash table. You already mentioned that you do this right now and you only get such messy PV because it is overwritten in the hash. You don't get just short PVs but it is totally messy. I think the purpose of the hash table is just to get the best move of the position any time. Constructing the PV is just this, getting the best move chain. If the most important moves are not there there is something wrong with it.

Last time I was studying MTD search and I observed this phenomenom myself: during the convergence of the null window towards the real score the already established best moves were overwritten with mess. It turned out that because of the nature of MTD a position is examined with different bounds on the same depth which causes the fail-high (best) moves to fail-low next time. And since I use the fail-soft method some 'better' moves where found which I wrote into the hash table (I realized that I have to store such alpha-nodes - which where beta-node previously - in the hash in the MTD search).
My solution was always keeping the hashmove when all moves failed low. I can imagine that the fail-hard method helps as well.

This was my experience, you might have other problem with the hash replacement policy (or not).

Anyway thank you for sharing your engine with us!

Best regards and good luck!

László

bob · Post by **bob** » Wed Mar 25, 2009 3:20 pm

Tord Romstad wrote:
bob wrote:
glorfindel wrote:So how fast does crafty run on a DS? Do you have an estimation of playing strength perhaps?
It's been a while, I believe it was in the 30K nodes per second range. I was asked to do this for a game company a couple of years ago. I had to get rid of rotated bitboards due to space and did a direct calculation to generate moves, which is about 10% slower than rotated/magic. I also had to remove lots of "extras" and use a _really_ small hash table.
30 kN/s? That's incredibly fast! The Nintendo DS, according to Wikipedia, has a 67 MHz ARM CPU. Glaurung, when running on the much faster 412 MHz ARM CPU of the Apple iPhone, searches about 12-15 kN/s. Of course, Crafty is a lot faster than Glaurung on all platforms, but not that much faster.

Do you remember what you used instead of rotated bitboards?

Tord

Yes, I just did direct calculations. The basic idea is to take each ray for a slider, find the first point where it is blocked, and zero any bits beyond that poing with a simple AND. I have masks like plus1dir[sq] and such that give the bits from a square for a specific direction. to generate moves (bits) from (say) a rook on E1, only considering the attacks up the file toward E8, I take plus8dir[E1] which gives me all the squares in that direction (E2, E3, ..., E8), which I AND with Occupied(). Then a LSB(result) gives me the first square that is occupied in that direction. Suppose this is E5. I now take that previous map of all squares from E2 to E8, and AND it with ~plus8dir[E5] (which is zero only on squares E6, E7 and E8, and I end up with the proper attack bits for that file. I repeat for the other 3 directions for a rook. This is only about 10% slower than rotated/magic without the huge memory footprint which was a killer on the DS.

bob · Post by **bob** » Wed Mar 25, 2009 3:21 pm

CThinker wrote:
Tord Romstad wrote:
bob wrote:
glorfindel wrote:So how fast does crafty run on a DS? Do you have an estimation of playing strength perhaps?
It's been a while, I believe it was in the 30K nodes per second range. I was asked to do this for a game company a couple of years ago. I had to get rid of rotated bitboards due to space and did a direct calculation to generate moves, which is about 10% slower than rotated/magic. I also had to remove lots of "extras" and use a _really_ small hash table.
30 kN/s? That's incredibly fast! The Nintendo DS, according to Wikipedia, has a 67 MHz ARM CPU. Glaurung, when running on the much faster 412 MHz ARM CPU of the Apple iPhone, searches about 12-15 kN/s. Of course, Crafty is a lot faster than Glaurung on all platforms, but not that much faster.

Do you remember what you used instead of rotated bitboards?

Tord
I think there is one extra zero there. My estimate is that it is something like 3 knps.

On a 625Mhz ARM (iPaq), Crafty searches at around 20 knps. That machine is about 7 or 8 times faster than a DS.

I will try to check. Memory can certainly be faulty. That dates back about 2 years ago, so the question is can I find the numbers. Off and looking right now...

OK, you are correct. Actual number was 2900 nodes per second. Had to find the old emails. Also, for fun, 300K on the Wii...

bob · Post by **bob** » Wed Mar 25, 2009 3:31 pm

CThinker wrote:
bob wrote:
CThinker wrote:
bob wrote:
Zach Wegner wrote:
CThinker wrote:The ways that other engines collect PV arrays look very hacky to me. They either allocate a square array, but use only half of it, or, allocates PV arrays on the stack on each call to a search function. Neither is acceptable to me. So, until I device a really clean solution, I am not inclined on polluting the Thinker code.
One idea from Vincent is to make an array pv_stack[MAX_PLY * MAX_PLY / 2]; and then keep a pointer to the first and last entry into it for each ply. Whenever you get a new PV, you save it starting at the first pointer, and then save the last entry. You then pass the entry past the last move (also you need a marker to signify the end of a PV) to the next ply.

ZCT used to do this but I got rid of it because of the copying of PVs in SMP. One CPU could be searching a node with current PV of length 1, then another CPU backs up a PV of length 3. You either have to truncate it or shift the current PV forward in the array to make room for it.
I like the approach I use, which is a pv[maxply][maxply]

At any depth, you only store into pv[ply][ply] for a terminal move. As you back up you only back up a part of this array. When you back up from 10 to 9, you copy everything from pv[10][0 to endofpv] and then save the current move in pv[9[[9] to go along with the partial PV being backed up... This is executed _very_ few times and costs nothing as a result.
Yup, I am aware of this approach, but it does not appeal to me. Half of that array is unused.
And that is a problem because... ???
Well, I guess to you there is no problem then. That's OK.

This is similar to fretting around because you waste up to 63 bytes of memory to force your hash table to sit on a 64-byte memory alignment to improve cache efficiency. Do you worry about that potential loss of 63 bytes of memory? It _is_ wasted...

the 1/2 used array makes the code clean and elegant and fast. All much more important than losing 1/2 of a 64xx64 = 4096 element array. 2048 x 4 = 8K bytes on machines that nowadays have at least 1-2 gigabytes of memory. On that scale, properly aligning the hash table is just as bad... Yet I'd hope everyone does that...

bob · Post by **bob** » Wed Mar 25, 2009 4:05 pm

Edsel Apostol wrote:
CThinker wrote:
Edsel Apostol wrote:
Uri Blass wrote:
CThinker wrote:
bob wrote: My only question is this: Why would you not collect the PV on the fly by backing it up along with the score? A PV harvested from the trans/ref table is inaccurate, and (IMHO) makes it harder to debug things. I often look at a PV and score from Crafty, then run down the PV and repeatedly use the score command to tweak things to bring the score closer to what seems reasonable. A hash PV is almost guaranteed to not lead you to the actual position that was scored, which makes debugging much more complicated...

And backing up the PV is not expensive.
It is all about simplicity. It may not be expensive as far as run-time is concerned, but it is still extra ugly code. To me, at least, it looks like that.

Even within Thinker, the implementation of search is hidden from the shell. The tree search is just one form of search. The book search is another. Both have completely different behavior. So, the shell does not care how they find the best move. The shell just knows that they will return one.

To the shell, the book search and the tree search are just "strategies" (in design pattern speak). So, even if I implement a PV collection in the tree search, would I want to expose that? That means the book search would have to return a PV also? That means that any "search strategy" always include a PV. I could design it that way, but that would be a very bad design. That may not be a big deal to others, but it is to me.

Just to give an idea of the simplicity that I am talking about, below is a comparison of the call graphs of Fruit 2.3.1, Crafty 22.9 and Thinker 5.4C.

Cheers...

http://www.winboardengines.de/thinker/C ... -2-3-1.bmp
http://www.winboardengines.de/thinker/C ... y-22-9.bmp
http://www.winboardengines.de/thinker/C ... ker54C.bmp

I can only say that I do not understand what these graphs mean and how you generate them,

When you talk about elegence then I do not understand what is the exact meaning of ut.
I do not know if there is some mathematical way to measure it.

Uri
Hi Lance,

What software did you use in creating those graphs? I'm thinking of rewriting my engine using C++ in the near future (I'm currently using C) after I implement SMP and I'm planning to use some Design Patterns to make it more elegant and as simple as it could make it.
Those were generated with IDA.

I think you should be able to achieve in C whatever it is that you want to do in C++, especially in a simple program such as a chess engine. That's just my opinion.
Okay. I will take it as an advice.

By the way, what about your choice of protocol? UCI seems to be more elegant for me compared to Xboard.

It depends. If you just want your engine to search when told to search, without regard to time or anything, then UCI is for you. If you want your engine to manage time itself, manage the book itself, manage learning itself, manage pondering itself, then UCI is not for you...

Thinker output

Re: Thinker output

Re: Thinker output

Re: Thinker output

Re: Thinker output

Re: Thinker output

Re: Thinker output

Re: Thinker output

Re: Thinker output

Re: Thinker output

Re: Thinker output