Repetition detection structure.

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Repetition detection structure.

Post by bob »

hgm wrote:If it was obvious to the evaluation that a pawn can be sacrificed to improve winning prospects, it wouldn't have taken 50 moves to discover it. So there still would be no reason to pay attention to the 50-move rule.

The only thing that evaluating 50-move draws can achieve is that the engine will start to make sacrifices that it thinks make its position worse, to avoid the draw. The others it will make spontaneously long before the 50-move curtain falls. With a very accurate evaluation making such sacrifices is an even worse idea than with a poor evaluation.

Evaluations (even poor ones) usually evaluate the difference between two related positions more accurately than their absolute evaluation.
This is all complete nonsense. I am three pawns ahead. I have to sacrifice one of them to win the game; otherwise I can't make progress, the 50-move rule kicks in, and the game ends as a draw. So you voluntarily take the draw and throw away the win? Or you are in a position where you can force mate in 10 moves, but the 50-move counter is at 45. So you choose to draw the game rather than push a pawn, reset the counter, and then finish mating your opponent? Never happens, you say? It helped me win the 1986 WCCC event, in a game against Bobby where we had to toss the pawn to win the game.

That's not a decision a rational chess programmer would make if the program is going to play serious chess. I have seen plenty of occasions where 50-move knowledge saved a win by avoiding a draw.

If you choose to fall into that abyss voluntarily, that's fine, but don't try to justify it as a reasonable choice. Why not eliminate the repetition check as well, since that will cause you to dump a pawn to avoid a forced repetition?
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Repetition detection structure.

Post by bob »

mathmoi wrote:
bob wrote: A hash table is doable, but there is an efficiency issue. When you start a node, you have to enter the position, then when you leave the node, you have to remove the position. There are other sticky issues with an SMP search: the repetition hash table has to be a reasonable size to prevent overwriting (which would be fatal), and it has to somehow handle the SMP search itself, which makes duplicating it expensive...

Repetition list scanning is not a big issue once you profile an engine and see how little time it actually spends in there, particularly since you don't even probe it in the q-search...
Hi M.Hyatt,

Thanks for the input. I did implement a position list, since the data you presented elsewhere in the thread seem to indicate there is not much to gain from using a hash table.
My only advice here would be to always test any suggestion you get. It could well be that you come up with an implementation of the repetition hash that works well for you. The list is the easier method to use, particularly if you choose to implement a parallel search at some point in the future. But you would be well advised to take all advice with a grain of salt and test for yourself before drawing a conclusion. That will lead to smoother development, and you will avoid incorporating poorly-tested ideas into your code that will need to be modified later.
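For what it's worth, the list approach can be as small as something like this (just a sketch with made-up names, not the actual Crafty code). rep_list[] holds the Zobrist keys of the ancestor positions on the current search path, rep_index is how many are stored, and fifty is the number of reversible half-moves since the last capture or pawn move; only those last positions can repeat, and only the ones with the same side to move, so we step back two at a time:

#include <stdint.h>

int RepetitionCheck(uint64_t key, const uint64_t *rep_list,
                    int rep_index, int fifty) {
  /* start 4 plies back: a position cannot repeat after only 2 half-moves */
  for (int i = rep_index - 4; i >= rep_index - fifty; i -= 2)
    if (rep_list[i] == key)
      return 1;               /* position already on the path: score a draw */
  return 0;
}

The irreversible-move cutoff is what keeps the loop short in practice: most of the time fifty is small, so only a handful of entries are ever compared.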
hgm
Posts: 27796
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Repetition detection structure.

Post by hgm »

bob wrote:why not eliminate the repetition check as well, since that will cause you to dump a pawn to avoid a forced repetition?
Because the repetition check causes a large increase of Elo. I only tend to drop features that lower the rating. Very irrational, of course, in your eyes. :lol: :lol: :lol:

But you can of course tell us exactly how much Elo Crafty derives from evaluating 50-move draws. You have the Elo microscope, after all... :roll:
hgm
Posts: 27796
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Repetition detection structure.

Post by hgm »

bob wrote:Ever consider that perhaps your idea will _not_ work? If you aren't willing to listen, blunder ahead until you see the light...
Of course I considered that; I consider it all the time. But not being able to split at a split point, because splitting at split points will in all cases cause a disastrous drop in performance, is certainly not a failure mode I would consider... :lol:
You hash to a zero entry. I execute a loop for one iteration. And your approach is supposedly faster? Please think again. How do you know your entry is zero? Perhaps a comparison? I do a single comparison to determine that the hash signature does not match the table entry. A single comparison is a single comparison. And I don't have to do a store to zero the entry when I exit; I just use a counter that is decremented for other reasons, so there is no repetition-specific cost in doing it.
And how do you know the element you compared with was the last one? Another comparison, perhaps? :roll:

As I don't waste time on updating a fifty-move counter for other purposes, for me the fact that it would have to be decremented after UnMake cancels out the fact that I have to clear the entry (which doesn't require a new lookup, as the address is already known). And even worse, the 50-move counter would also have to be incremented on MakeMove, so that would be twice as expensive as what I do now. Worse still, the 50-move counter would have to be _conditionally_ incremented, making it far, far more expensive.
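In outline, the table scheme I am describing is something like the following (a minimal sketch with invented names, not my actual engine code); one way to handle collisions is a short linear probe:

#include <stdint.h>

/* The table only ever contains the keys of positions on the current
 * search path, so a few hundred entries is plenty. */
#define REP_BITS 9
#define REP_SIZE (1 << REP_BITS)
#define REP_MASK (REP_SIZE - 1)

static uint64_t rep_tab[REP_SIZE];      /* zeroed once, before the search */

/* Called after MakeMove: returns -1 if the position already occurs on the
 * path (repetition), otherwise the slot that was claimed, so UnMake can
 * clear it again without a new lookup. */
int RepEnter(uint64_t key) {
  int i = (int)(key & REP_MASK);
  while (rep_tab[i]) {                  /* probe until an empty slot */
    if (rep_tab[i] == key)
      return -1;                        /* same key already on the path */
    i = (i + 1) & REP_MASK;
  }
  rep_tab[i] = key;
  return i;
}

/* Called from UnMake with the slot RepEnter returned. */
void RepLeave(int slot) {
  if (slot >= 0)
    rep_tab[slot] = 0;
}

Because entries are added on MakeMove and removed on UnMake in strict LIFO order, clearing a slot on the way out cannot break any probe chain.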
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Repetition detection structure.

Post by bob »

hgm wrote:
wgarvin wrote:If you can do that many useful optimizations in an hour, then gaining 7.5% that way might make sense.

On the other hand, if there's a routine taking 4.6% of the total, maybe its worth more attention than a routine taking only 0.5%?
That depends. If that routine contains 100x more code, all executed on every call, it would be very tough going to get significant reduction of it.

No matter what Bob got from his favorite books, the fact is that the way code lines are distributed over the routines you profile is completely irrelevant. That makes profiles such as the one Bob presented here pretty useless. What would be relevant is a breakdown of execution time line by line, or section by section for approximately equally-sized sections of code.

You are aware that I don't have any 1,000-line procedures in Crafty? That is not uncommon for most chess programs. And it is also not uncommon for a very small function (say Swap() or InCheck() or whatever) to be at the top of the list, where it is sometimes impossible to improve its speed. So what? You _still_ start at the top and work your way down; you do _not_ start at the bottom and work your way up. Which is what you are proposing, and it is, and always will be, the wrong way to optimize a program.


And even then, it is just as important how amenable the code is to improvement as how much time it consumes. Hash probes take a significant fraction of the time in my engine (I also probe in QS), but no matter how much I work on the code, I can't make them any faster, as they are purely limited by memory access time.
And the point of that would be? Once you discover that it is at the top of the usage list and that you can't improve it, you simply drop down to #2 on the list and repeat the analysis. You don't drop down to #40...

There are some unfortunate classes of software where *everything* takes like 0.5 - 1.0%, and then we have no choice but to try and optimize them all one by one---even if each one takes weeks rather than minutes. ("Uniformly slow code"). That doesn't appear to be the case here though. Optimizing the top 10 things in bob's list is probably a better use of the time.
This is exactly my point. Number 1 in Bob's list, 'Evaluate', is exactly of that type _within_ the routine. It will not be possible to gain much on it, and even if it is possible, it will involve an enormous amount of work. So the list is _very_ misleading.
That's incorrect. There is plenty of room for improvement. It has been put off for years because the evaluation has been very fluid. But as things are settling down, those parts get a thorough review for speed improvements. And some have already been made, with more on the way. This last lengthy effort was just to clean things up to make this kind of optimization easier.

Evaluate() will _not_ always be at the top of the list in Crafty, although all the components combined probably will be, by the nature of the problem. But one thing is for certain, until I consider those implementations to be optimal, I won't be dropping way down the list to improve something there that will have no impact on playing skill at all... and that has been my point all along.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Repetition detection structure.

Post by bob »

hgm wrote:
bob wrote:Ever consider that perhaps your idea will _not_ work? If you aren't willing to listen, blunder ahead until you see the light...
Of course I considered that, I consider it all the time. But not being able to split at a split point because spliiting at split points in all cases will cause a disastrous drop in performance will certainly not be a failure mode I would consider... :lol:
I have no idea what that means. There are well-known (sane) ways to split a tree, from the old principal-variation-splitting algorithm some of us used in the 1980 time-frame, to more sophisticated approaches like DTS used in the late 80's and beyond. There are well-known bad ways to split as well, and using the hash table is one of them. It sort of works for 2 processors, it is bad for 4, and it is simply lousy beyond 4.

And you have _yet_ to answer this problem with repetitions, which I will clearly define so we can get beyond this point...

One process searches starting at node A and continues for several plies until it reaches node B. It then continues searching beyond that node, eventually hitting node C. Meanwhile, another process decides to help this one out, whether via a shared hash table, a traditional split approach, or whatever, starting at node B. Now you have to solve the following problem, with 100% accuracy.

Whenever the original process does a repetition check, it has to check against the positions from A to B plus the positions from B down to whatever position it is currently on. It cannot check any positions that are not part of its own sub-tree below node B. So, simply stated, it must check the global set of positions from A to B, and its private set of positions below B, since another processor is also searching below B and would be visiting different positions that would not be repetitions on our branch.

The second process has started to search at B and continues on to D. When it does a repetition check, it must check against the positions between A and B (which are common with the other process) and then against the positions between B and D (which are _not_ in common with the other process).

How are you going to handle that? In my case, if I split at B, I just duplicate the repetition list that covers A-B and then let the splitting process add entries to its private copy; no other process can see the added entries. This guarantees 100% accuracy in repetition detection, with very little overhead (at a typical split point I copy just one entry, because I use the same idea that is used in the RepetitionCheck() code: there is no need to copy positions made prior to the last irreversible move). In a hash approach, you either have to duplicate the entire table to preserve correctness, or each entry has to be tagged with the process ID it belongs to so that two different processes will not match each other's positions and report false repetitions. And you still have to handle the A-B part of the path, where _both_ processors should match those positions.
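In outline, the copy at a split point amounts to no more than this (a sketch with illustrative names, not the real Crafty data structures):

#include <string.h>
#include <stdint.h>

/* Each thread owns a private repetition list; at a split point the helper
 * copies only the tail of the parent's path that can still be repeated,
 * i.e. the keys stored since the last irreversible move. */
typedef struct {
  uint64_t rep_list[256];   /* Zobrist keys of positions on this thread's path */
  int      rep_index;       /* number of keys currently stored */
} ThreadPath;

void CopyRepetitionList(const ThreadPath *parent, ThreadPath *helper, int fifty) {
  int n = fifty < parent->rep_index ? fifty : parent->rep_index;
  memcpy(helper->rep_list,
         parent->rep_list + parent->rep_index - n,
         (size_t)n * sizeof(uint64_t));
  helper->rep_index = n;    /* the helper appends its own positions after this */
}

The helper then adds and removes its own entries below B, and no other thread ever sees them.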

As I said, it is far messier than you suspect, but you actually have to write a parallel search before you begin to see the hidden issues. My first parallel search played in the 1978 ACM computer chess tournament, running on a Univac 1100/42 (a dual-CPU machine). Yours has yet to be designed, much less written. I suspect my experience here provides the better insight into the pitfalls lying ahead...

But go as you please... it is your time you are wasting...
You hash to a zero entry. I execute a loop for one iteration. And your approach is supposedly faster? Please think again. How do you know your entry is zero? Perhaps a comparison? I do a single comparison to determine that the hash signature does not match the table entry. A single comparison is a single comparison. And I don't have to do a store to zero the entry when I exit; I just use a counter that is decremented for other reasons, so there is no repetition-specific cost in doing it.
And how do you know the element you compared with was the last one? Another comparison, perhaps? :roll:
Perhaps a loop? Did you not use a loop? Do you have a more efficient way of doing a loop than I do? Say one without a test and branch?

Didn't think so.

Perhaps your usage of those silly emoticons is a better way of spending your time than having a technical discussion, since you just don't want to get the point.


As I don't waste time on updating a fifty-move counter for other purposes, for me the fact that it would have to be decremented after UnMake cancels out the fact that I have to clear the entry (which doesn't require a new lookup, as the address is already known). And even worse, the 50-move counter would also have to be incremented on MakeMove, so that would be twice as expensive as what I do now. Worse still, the 50-move counter would have to be _conditionally_ incremented, making it far, far more expensive.
That's stupid. If you don't want to implement all the rules of chess, that's fine, but it is not a point in your favor when comparing effort. The 50-move rule is necessary in a chess engine if it is going to compete with IM/GM-level players. If that is not your goal, dump the repetition check completely, as it is just as big a waste of time. If you are going to implement something half-assed, then don't try to compare it to something that works correctly. What I do is correct. What you do is a hack that works part-time. Uri pointed that out, as did I. So get off that discussion, because if you ever decide to develop something truly competitive, you will also be handling the 50-move rule counter, and that difference between our programs goes away.

While you are at it, you might eliminate under-promotions as well. That will also make you faster.

Boy, do these discussions go south quickly. I do something right, you choose to omit it, and then you claim that my doing it right makes my repetition code less efficient than yours???
hgm
Posts: 27796
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Repetition detection structure.

Post by hgm »

bob wrote:You _still_ start at the top and work your way down, you do _not_ start at the bottom and work your way up. Which is what you are proposing, and it is, and always will be, the wrong way to optimize a program.
Not at all! What I proposed was to make a list sorted by (executionTime * expectedGain / requiredProgrammingEffort), and work from the top of that list.
And the point of that would be? Once you discover that is at the top of the usage list, and that you can't improve it, you simply drop down to #2 on the list and repeat the analysis. You don't drop down to #40...
It is not my fault that what could be on top of my list is #40 on yours! :lol:
That's incorrect. There is plenty of room for improvement.
That is just one of the factors in my sort key. I couldn't be less interested in how your evaluation looks, but I think it is a safe bet that it exceeds the 3 lines of code needed to scan through the repetition table by a very substantial factor. And that factor is in the denominator. So it could very well be seriously counter-productive to tackle that code rather than stuff lower on your list. But hey, it is your time. Waste it as much as you like. Just don't waste mine! :P
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Repetition detection structure.

Post by bob »

hgm wrote:
bob wrote:You _still_ start at the top and work your way down, you do _not_ start at the bottom and work your way up. Which is what you are proposing, and it is, and always will be, the wrong way to optimize a program.
Not at all! What I proposed was to make a list sorted by (executionTime * expectedGain / requiredProgrammingEffort), and work from the top of that list.
And the point of that would be? Once you discover that is at the top of the usage list, and that you can't improve it, you simply drop down to #2 on the list and repeat the analysis. You don't drop down to #40...
It is not my fault that what could be on top of my list is #40 on yours! :lol:
That's incorrect. There is plenty of room for improvement.
That is just one of the factors in my sort key. I couldn't be less interested in how your evaluation looks, but I think it is a safe bet that it exceeds the 3 lines of code needed to scan through the repetition table by a very substantial factor. And that factor is in the denominator. So it could very well be seriously counter-productive to tackle that code rather than stuff lower on your list. But hey, it is your time. Waste it as much as you like. Just don't waste mine! :P
I really don't need to waste your time, you seem to be quite adept at that on your own, so I'll leave you to it.

For the final time, my optimization process goes like this:

1. Run a detailed profile (procedure-based to start with) to get a list of modules ordered from the heaviest CPU-cycle user down to the lightest.

2. Start at the top. If, after studying that module, it seems to be difficult or too time-consuming to improve it, then move on to the next entry in the list. Repeat step 2 until I reach modules that use so little time that improving them would be pointless.

That is what _almost_ everybody does. Nobody starts at #40 in the list; if the 39 above it can't be improved, then improving #40 gets you exactly nothing for your effort. To date, you have not shown that going from a list to hashing has any benefit at all. I ran a test which suggested there is no advantage to hashing over the repetition list, and the list has advantages for a parallel search and for simplicity, since it is 2 lines of code called from exactly one place in the search.

I don't know what more there is to say. You seem to imply that I said to do things that I never said to do. I did say you _always_ start at the top of the list. Now you are hedging your statement, since it was obviously wrong, and claiming that you factor in complexity as well. But you didn't claim that at first. And to understand complexity, you have to look at the modules carefully anyway. If you want to start at #40, that's your programming decision. But it definitely is _not_ a good one, particularly when your suggested change appears to be no better to boot...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Repetition detection structure.

Post by bob »

hgm wrote:
bob wrote:why not eliminate the repetition check as well, since that will cause you to dump a pawn to avoid a forced repetition?
Because the repetition check causes a large increase of Elo. I only tend to drop features that lower the rating. Very irrational, of course, in your eyes. :lol: :lol: :lol:

But you can of course tell us exactly how much Elo Crafty derives from evaluating 50-move draws. You have the Elo microscope, after all... :roll:
I do have one, but I don't use it to evaluate things that are part of the rules of the game. I couldn't care less what I would lose by removing under-promotions, because I have no intention of doing that. Ditto for the 50-move rule. If you would rather push a pawn, miss a 50-move draw in a losing position, and lose the game, that's OK. If you would rather not push a pawn and get a 50-move draw in a winning position, that is also OK.

I might be interested in testing the Elo gain (if any) from using endgame tables one day, but turning those on or off is not ignoring a rule of chess.

BTW, if your 50-move rule handling lowers your rating, you did it poorly. Incrementing a counter once per node is no more expensive than counting the node. Making the check is just as cheap. So there is no measurable cost, and a very definite measurable benefit (perhaps not large, but greater than zero). I just want to play the game according to the rules of chess, and use those rules to my benefit whenever possible.
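To make the point concrete, the whole of the bookkeeping is on the order of this (a sketch only, not the actual Crafty code):

/* MakeMove resets the counter of reversible half-moves on any capture or
 * pawn move and bumps it otherwise; the rule check itself is one compare. */
void UpdateFifty(int *fifty, int is_capture_or_pawn_move) {
  *fifty = is_capture_or_pawn_move ? 0 : *fifty + 1;
}

int FiftyMoveDraw(int fifty) {
  return fifty >= 100;      /* 100 reversible half-moves = 50 full moves */
}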

You do realize you are in the minority here once again, I assume? Hardly any program ignores this issue unless it is very new or low-rated... but don't let that stop you...
Uri Blass
Posts: 10282
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Repetition detection structure.

Post by Uri Blass »

I can only comment about the following:
"if you ever decide to develop something truly competitive, you will also be handling the 50 move rule counter"

I disagree with it.
I think it is better to handle the 50-move rule, but I also believe that it does not improve playing strength by many Elo points, so programs clearly can be competitive even without it.

Uri