hgm wrote:It is hard to believe that the replacement scheme could have much to do with it. As you can see below, my new engine finds the solution after searching only 28k nodes. Even with 16-byte entries that corresponds to only 0.4MB hash, or 0.1% loading factor. There should be virtually no collisions.
I was surprised as well. But the problem is not collisions, it's how results are stored when the hash key (and lock) match.
My replacement strategy (old version below) was to replace only when the ply being stored was higher than or equal to the ply already in the hash table. This prevented the winning score from ever entering the hash table on the winning line.
function StoreHash(value, flags, ply, move, force) {
    var hashNode = g_hashTable[g_hashKeyLow & g_hashMask];
    // Replace only if the slot is empty, the new result is at least as
    // deep, or the store is forced -- the keep-deeper rule discussed
    // above, which kept the newer winning scores out of the table.
    if (hashNode == null || ply >= hashNode.ply || force) {
        g_hashTable[g_hashKeyLow & g_hashMask] = new HashEntry(g_hashKeyHigh, value, flags, ply, move);
    }
}
I don't understand what you mean by 'ply'. If it has the usual meaning of a position's distance from the root of the tree (the root has ply == 0), then using it for replacement is unusual.
Most replacement schemes store the 'draft' of the position, which is the depth passed to search(), as in:
x = -search(depth - 1, -beta, -alpha, ply+1,...);
Replacement is normally based on a higher or at least an equal draft. Or are there alternatives?
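For reference, here is a minimal sketch of the usual convention (all names are hypothetical, and move make/unmake is elided): 'ply' counts moves from the root and grows as the search recurses, while 'depth' is the draft and shrinks toward the leaves, and it is the draft that gets stored:

int Quiesce(int alpha, int beta, int ply);                  /* assumed elsewhere */
void StoreHash(int value, int flags, int draft, int move);  /* assumed elsewhere */
int NextMove(void);                                         /* assumed elsewhere */

int search(int depth, int alpha, int beta, int ply)
{
    if (depth <= 0)
        return Quiesce(alpha, beta, ply);   /* draft exhausted */

    int bestMove = 0, move;
    while ((move = NextMove()) != 0) {      /* make/unmake elided */
        int score = -search(depth - 1, -beta, -alpha, ply + 1);
        if (score > alpha) { alpha = score; bestMove = move; }
        if (alpha >= beta) break;           /* cutoff */
    }
    /* store the draft ('depth'), not the distance from the root ('ply') */
    StoreHash(alpha, 0 /* bound flags elided */, depth, bestMove);
    return alpha;
}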
hgm wrote:It is hard to believe that the replacement scheme could have much to do with it. ...
It was hard to believe until I saw it with my own eyes. In the absence of collisions there is another problem, which may be the most serious one (I guess).
You have position A stored with a high depth but with a bound that you cannot use later, for instance >= DRAW. In another iteration, alpha is +0.02 and you visit this position again nearer the tips. You could use the old stored information, but the bound does not allow you to cut, so you have to search again. You get >= +0.03 and cut, and you try to store this information. However, the depth is not enough to beat the original entry, so it goes to the second slot. Whenever you come back to this position in this iteration, you would want to read the entry stored in the second slot, but... you most likely read the first one, conclude a cut is not possible, and never examine the second one. Therefore you have to search this position again every time you visit it (and that happens a lot in these endgames).
In this type of endgame, the later you visit a position, the more information you get for the same depth, so... a newer entry should beat a deeper one if we are talking about the same position. Also, the newer one may have better bounds.
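To illustrate the probing half of that problem, here is a minimal sketch with a hypothetical entry layout: if the probe gives up after the first key match instead of checking every slot for a usable bound, the newer ">= +0.03" entry sitting in the second slot never produces its cutoff:

typedef struct {
    unsigned key;
    int value, flags, draft;    /* flags: exact, lower or upper bound */
} HashEntry;

#define BOUND_LOWER 1           /* hypothetical flag value */

/* Scan every slot in the bucket, and keep looking when a matching
 * entry's bound is not usable, so a newer lower-bound entry in a later
 * slot can still produce the cutoff. */
int ProbeForCutoff(const HashEntry *bucket, int nslots, unsigned key,
                   int depth, int beta)
{
    for (int i = 0; i < nslots; i++) {
        const HashEntry *e = &bucket[i];
        if (e->key != key || e->draft < depth)
            continue;                       /* not this slot */
        if ((e->flags & BOUND_LOWER) && e->value >= beta)
            return 1;                       /* usable bound found: cut */
    }
    return 0;                               /* keep searching this node */
}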
gladius wrote:My replacement strategy (old version below) was to replace only when the ply being stored was higher than or equal to the ply already in the hash table. This prevented the winning score from ever entering the hash table on the winning line.
Ah, OK, you preferred to keep deeper over more recent. That is of course fatal: you will quickly poison the entire table with deep entries with useless bounds, and without a table you will never reach enough depth to ever replace them. You get a new search result for an existing hash entry only when that entry was not useful, because if it had been, you would not have had to search again.
It was hard to believe until I saw it with my own eyes. ...
// Is the node hashed?
for (p = pTT + 4; (p--) > pTT; ) {
    if (PROBE_HASH_FAILS(p, signature))
        continue;                       // not this slot, try the next
    // the entry's depth is stored in the 12 high bits of int dt_sno;
    // 'depth' is the current search depth
    if ((depth << 20) > (p->dt_sno & 0xfff00000)) {
        // the current search is deeper: this search is better
        goto UPDATE_SAME_NODE;
    }
    return;
}
The above code is what I edited for testing. My TT has a cluster of 4 entries, and all 4 slots may be used for probe/store, but a position may be stored in at most one slot. My original code takes depth + bound type + age into consideration. The code above replaces only if the current depth is greater than the depth of the old entry (without any other condition). It seems to work, as shown below. Maybe there are other aspects affecting your result.
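The original depth + bound-type + age rule isn't shown, but a hypothetical way to combine those three factors when choosing which of the four slots to evict might look like this (the weights are illustrative only, not the original code):

/* Hypothetical eviction preference: keep deeper drafts, prefer exact
 * bounds over mere upper/lower bounds, and prefer entries from the
 * current search over stale ones. Evict the slot scoring lowest. */
#define BOUND_EXACT 3   /* hypothetical flag value */

static int KeepScore(int draft, int boundType, int age, int currentAge)
{
    int score = draft * 4;                      /* depth dominates    */
    if (boundType == BOUND_EXACT) score += 2;   /* exact-bound bonus  */
    if (age == currentAge)        score += 1;   /* freshness tiebreak */
    return score;
}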
Many problems in a TT implementation can be recognised from an abnormal percentage of misses or useless hits.
That advice is still sound. Unfortunately I went on to say:
For instance, if your miss rate is > 50% in reasonably shallow (6-9 ply) searches from the starting position, something is wrong.
I since discovered an embarrassing bug in the hashing of the en passant square. Apparently, when converting from square to file, I sometimes like to write (enPassantSquare >> 3) instead of (enPassantSquare & 7). I wish I had written these functions much earlier:
static inline int File( int sq ) { return sq & 7; }
static inline int Rank( int sq ) { return sq >> 3; }
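With those helpers, the en passant term of the hash key becomes hard to get wrong. A sketch (the key array and state names are hypothetical, not the original code):

extern unsigned long long hashKey;        /* assumed engine state      */
extern unsigned long long epFileKeys[8];  /* hypothetical Zobrist keys */

void HashEnPassant(int enPassantSquare)
{
    hashKey ^= epFileKeys[File(enPassantSquare)];   /* correct: the file */
    /* the bug above: enPassantSquare >> 3 would hash the rank instead */
}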
More sensible results for a search from the opening:
I wouldn't be too concerned. Your engine finds the correct move with a big score, which is the main thing. As the other posts indicate, the replacement strategy has a big impact. I'd suggest the following:
Have two or more slots.
1. If you find the position in the hash table - always replace with latest (regardless of depth).
2. Replace the position with the lowest depth in the two slots.
This should reduce the search time to < 1 sec.
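A minimal sketch of that scheme, assuming a hypothetical entry layout: rule 1 fires whenever a key matches, and rule 2 picks the victim otherwise:

typedef struct {
    unsigned key;
    int value, flags, draft, move;
} HashEntry;

/* Two-slot store: always overwrite a matching position (rule 1),
 * otherwise evict whichever slot holds the shallower draft (rule 2). */
void StoreHash(HashEntry *slot, unsigned key,
               int value, int flags, int draft, int move)
{
    int victim;
    if (slot[0].key == key)      victim = 0;  /* same position: replace */
    else if (slot[1].key == key) victim = 1;  /* same position: replace */
    else                         /* evict the shallower of the two */
        victim = (slot[0].draft <= slot[1].draft) ? 0 : 1;

    slot[victim].key   = key;
    slot[victim].value = value;
    slot[victim].flags = flags;
    slot[victim].draft = draft;
    slot[victim].move  = move;
}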
I'm amazed the position has generated so much discussion, but then again it was one of my best positions for debugging the hash table.
Cheers,
Steve
Thanks Steve. That'll have to be a future improvement. I want to make sure everything's working fine with a basic hash before I try to add another slot.
I think having a proper q-search will help, too. I'm pretty inexperienced with programming, so even the basic things take a while...
bob wrote:This is not so much about finding Kb1 as getting to depth 30. If you have optimal move ordering, it will take until depth 26 or so to find Kb1. If your move ordering is worse, then you will actually find it at a shallower depth due to search grafting. I can explain how if you are interested. But John now gets to depth 30 in 1 second or so, which seems reasonable...
Ok, since Myrddin finds Kb1 at depth 21, I'm now very curious to know why this is possibly a bad thing.
Move ordering in Myrddin is, like the rest of the engine, fairly primitive. But it does use history and killers, plus the normal move-type ordering:
PV move
Hash move
Capture (MVV/LVA), with killers after good captures and before bad captures
Checks
Quiet moves
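For illustration, scores that would produce that ordering might look like the following sketch (all helper names are hypothetical, not Myrddin's actual code; the bands are wide enough that the MVV/LVA offsets never cross into a neighbouring band):

/* Hypothetical ordering scores: PV, hash move, good captures by
 * MVV/LVA, killers, bad captures by MVV/LVA, checks, then quiets
 * by history. */
int IsCapture(int move), WinsMaterial(int move), VictimValue(int move),
    AttackerValue(int move), GivesCheck(int move), HistoryScore(int move);

int ScoreMove(int move, int pvMove, int hashMove, int killer1, int killer2)
{
    if (move == pvMove)   return 1000000;
    if (move == hashMove) return 900000;
    if (IsCapture(move))  /* MVV/LVA within the good or bad band */
        return (WinsMaterial(move) ? 800000 : 600000)
               + 10 * VictimValue(move) - AttackerValue(move);
    if (move == killer1 || move == killer2)
        return 700000;    /* after good captures, before bad ones */
    if (GivesCheck(move)) return 500000;
    return HistoryScore(move);  /* quiets, assumed to stay below 500000 */
}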
jm
OK, here is how you find the solution.
I assume you have looked at the position and PV carefully and understand that, with best play by both sides, this position requires 13 non-capturing/non-checking moves by each side, and on the 27th ply (the first ply of q-search) white finally captures a pawn. I think this is covered in Newborn's book on computer chess, where he talks about his endgame-specific program called "Peasant"...
In 21 plies you can't possibly win a pawn, so how does the search see it?
Suppose your move ordering for black is really bad, so that at each ply white makes a good move and black makes a lousy move. By the time you get to (say) depth 19, the search discovers it can win a pawn. That is not so useful by itself, since black gets opportunities at each ply to find better moves to improve its score, right? But along comes the hash table, and along that PV you store the result that says "if I can reach this position, I win a pawn." And while with best play by both sides you can't force the win of a pawn, you will discover that with best play by both sides you _can_ reach one of those positions where you found you win a pawn when black had played poorly. You graft that score onto the current position.
Suppose we are at ply=21 for white, and from this position it takes another 6 plies to win the pawn. With a depth=21 search, you can't see this. But if black played like a patzer, you might have reached this position at ply=12 and seen "aha, I win a pawn." Remember you were still doing a 21-ply search, so from ply=12 you searched 9 plies further and spotted the pawn win. Now at depth=21 you force the game to reach that position you previously encountered at ply=12, and use that hash entry, which, by the way, has a "draft" of 9 plies, don't forget. You graft that onto the current 21-ply search, actually pull off a 30-ply search on iteration 21, and spot the win.
If you had not first searched lousy moves for black, then you would not know that position is won, and your 21-ply search would still show Kb2 as the best move, with white simply a single pawn ahead, as in the starting position.
Ugly, but effective.
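Mechanically, the graft comes in through the ordinary probe test. A sketch with hypothetical names (bound handling compressed into a comment): the entry stored with a draft of 9 satisfies any probe that needs 9 plies or fewer, so its deeper result gets spliced into the nominally shallower search:

typedef struct { unsigned key; int value, draft, flags; } HashEntry;

/* A draft-9 entry (stored when this position was reached early via bad
 * black play) passes this test whenever the remaining depth here is
 * <= 9, grafting the deeper result onto the current line. */
int ProbeHash(const HashEntry *e, unsigned key, int depth, int *value)
{
    if (e->key != key)    return 0;     /* different position        */
    if (e->draft < depth) return 0;     /* stored search too shallow */
    /* bound checks elided: exact scores are always usable, lower
     * bounds only against beta, upper bounds only against alpha */
    *value = e->value;
    return 1;
}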
Bob
Ok, I understand that explanation. Thanks, Bob!
jm
The bad thing is, the worse your move ordering is, the shallower the depth needed to solve problems like this. However, bad ordering hurts everything overall. So the plies go by more slowly, but for this particular type of position, where transpositions are incredibly common (the pawns are locked, so there is nothing but king moves until white breaks through and wins a pawn to open things up), you see the right answer at a shallower depth. But in a longer time.
Crafty finds this at depth=24, for example, but uses < 0.01 seconds to get there. Reductions hurt. Pruning hurts. But 24 plies in < 0.01 seconds is way better than finding it at depth=18 but needing much more time...
The depth is irrelevant, only the time matters in chess...
How many nodes is that? That will make it easier to normalize the comparison.
My program sees Kb1 in 54k nodes (ply 17), or 100k nodes (ply 23) if I use material only in the eval (64 MB of hash, but I don't think it is very relevant since only 2.8% is filled).
Are you using LMR?
Miguel
Yes, Crafty uses LMR, plus an aggressive forward-pruning approach in the last 4 plies of full-width search. Old versions of Crafty used to find it between 17 and 19 plies depending on hash size. The current version takes longer due to reductions + forward pruning.
To complete 24 plies, where the current version finds Kb1, requires a total of 83K nodes:
jacobbl wrote:Hm...
This was a bit painful to read.
My engine (Sjakk) finds the solution at depth 19, but I think that's just luck, because the evaluation doesn't rise before depth 23. What's nagging me is that I need 16.11 seconds to find it, and I'm using hash tables. I know my engine is slow (written in Visual Basic, and probably not very good code either....), but on this test it's doing about 243 kN/s. If I understand you correctly, I should have reached depth 23 much faster.
What am I doing wrong? Bugs in the hash tables? Bad move ordering? Something else?
I'd be grateful for any suggestions.
Very difficult to say. Crafty is searching about 6M nodes per second on this position using 1 cpu on my laptop (the output I posted says 1.0M, but that number is meaningless at such a short search time; after a few seconds it settles in at around 6M). So there is a significant speed difference, but I don't believe it explains the gap. It took Crafty 0.02 seconds and about 80K nodes total to find this. If yours takes many more nodes, I would first suspect that hashing is not working as expected...