"The wrong approach" seems a bit harsh. I like the sound of "conservative estimate" better.syzygy wrote:This always comes up, but it is the wrong approach. The hash table does not remember all previous birthdays, only those that are still present.AlvaroBegue wrote:Here's some relevant math behind the question: http://en.wikipedia.org/wiki/Birthday_attackcdani wrote:Someone has an idea of how often happen the hash collisions? I did some tries and I'm not able to catch an illegal move coming from hash. Anyway I keep checking them for legality.
Legality Check on TT move
Moderators: hgm, Rebel, chrisw
-
- Posts: 931
- Joined: Tue Mar 09, 2010 3:46 pm
- Location: New York
- Full name: Álvaro Begué (RuyDos)
Re: Legality Check on TT move
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Legality Check on TT move
I cheat a bit here. All I care is that I don't make/unmake a move that will break things. The right piece has to be on the source, the captured piece has to be on the destination else it has to be empty, for castling king has to be on e1/e8, and rook on correct square. I simply want to avoid screwing up the board by moving a piece and then unmoving to produce a different piece, etc. I can't even measure the cost in a profile run...hgm wrote:Note that this is only true if the hash table can store the entire tree. Usually more relevant is the probability the signature stored in the table accidentally matches that of a different position. E.g. with a 32-bit signature the probability for this is 1 in 4 billion (2^32). So with searches of 200M (different) nodes one in every 20 searches would contain one collision.AlvaroBegue wrote:Here's some relevant math behind the question: http://en.wikipedia.org/wiki/Birthday_attack
So if you use a 64-bit number for hashing and nothing else, a search with 200 million nodes in it will have a collision more then 0.1% of the time.
Of course a collision does not necessarily mean the hash move would be so wrong that it would crash the engine. E.g. the chance to accidentally expose yourself to check by a random move might be only 1%, so if that would be what crashes your engine, you should expect one crash in 2000 moves, i.e. once every 20 games (with 32-bit signature). This is unacceptably high, so I make sure the engine can handle King capture properly. (Which is a much cheaper test than verifying if the hash move is legal in every node.)
-
- Posts: 373
- Joined: Wed Mar 22, 2006 10:17 am
- Location: Novi Sad, Serbia
- Full name: Karlo Balla
Re: Legality Check on TT move
Here you can find some good analysis. "Memory versus Search in Games": https://project.dke.maastrichtuniversit ... thesis.pdfAlvaroBegue wrote:Here's some relevant math behind the question: http://en.wikipedia.org/wiki/Birthday_attackcdani wrote:Someone has an idea of how often happen the hash collisions? I did some tries and I'm not able to catch an illegal move coming from hash. Anyway I keep checking them for legality.
So if you use a 64-bit number for hashing and nothing else, a search with 200 million nodes in it will have a collision more then 0.1% of the time.
What I have done in my last checkers program is use a 24-bit hash for picking the 4-entry bucket where I'll store the position and a separate 64-bit hash for verifying it's the right entry. So that's an effective 88-bit hash.
With 88 bits, you'll get a collision with probability less than 10^-6 after 200 million stores. In order to get a probability of 0.1% with 88 bits, you need to store more than 8*10^12 positions.
Best Regards,
Karlo Balla Jr.
Karlo Balla Jr.
-
- Posts: 741
- Joined: Tue May 22, 2007 11:13 am
Re: Legality Check on TT move
I should add a bit more context: my engine plays draughts, not chess. In draughts, captures are mandatory. This means that legality testing needs to check whether there are captures pending. If there are captures, then legality testing pretty much requires generating all captures because it is also mandatory to capture the maximum number of pieces (and for some draughts variants, also the order in which pieces are captured). Storing all that info in the TT is prohibitively expensive. Furthermore, capture positions happen sufficiently often that always generating all moves is competitive with capture detection and capture generation in positions where captures are pending (and it vastly simplifies my code).cdani wrote:Most engines try to avoid generating all the moves to validate the hash move, because it's faster to test for legality. Many times the hash move will be good enough to return immediately, so it's better to avoid generating all.Rein Halbersma wrote: I only store the move index instead of the move in the hash table. Then I simply generate all moves, and play move "index" equal to "index = hashed % size;" where "hashed" is looked up from the TT and "size" is the size of the generated move list. This is guaranteed to work (I do check for division by zero, but that is an immediate loss for the side to move anyway).
However, I do like HGM's tip to avoid the % operator for simple overflow comparison of "index > size" and dropping into IID if it happens.
-
- Posts: 931
- Joined: Tue Mar 09, 2010 3:46 pm
- Location: New York
- Full name: Álvaro Begué (RuyDos)
Re: Legality Check on TT move
I recently wrote a Spanish-checkers engine and I also decided to store an index instead of a move. Move generation is not all that expensive, since there are typically something like 8 moves (more late in the game, when there are "damas", which move like bishops in chess).Rein Halbersma wrote: I should add a bit more context: my engine plays draughts, not chess. In draughts, captures are mandatory. This means that legality testing needs to check whether there are captures pending. If there are captures, then legality testing pretty much requires generating all captures because it is also mandatory to capture the maximum number of pieces (and for some draughts variants, also the order in which pieces are captured). Storing all that info in the TT is prohibitively expensive. Furthermore, capture positions happen sufficiently often that always generating all moves is competitive with capture detection and capture generation in positions where captures are pending (and it vastly simplifies my code).
For Spanish checkers the full move could be stored in a single 32-bit integer (the bitwise OR of the origin, the destination and the captured pieces). But still, I would rather use a single byte.
Of course you have to keep track of the permutation that is applied when the moves are ordered, so you can recover the index of the move in the original move list.
-
- Posts: 741
- Joined: Tue May 22, 2007 11:13 am
Re: Legality Check on TT move
For 10x10 international draughts, I would need at least 64-bits for a full move. There are some other variants (Italian, Thai) for which you need even larger move structs. E.g. Italian draughts requires that you also keep track of the order in which kings vs men are captured. And in Thai draughts, you immediately remove a capture piece during jumping, rather than afterwards. This means that you can land on a square which you also jumped during that move. I.e. you cannot have a single bitwise OR of jumped pieces and from/dest.AlvaroBegue wrote:I recently wrote a Spanish-checkers engine and I also decided to store an index instead of a move. Move generation is not all that expensive, since there are typically something like 8 moves (more late in the game, when there are "damas", which move like bishops in chess).Rein Halbersma wrote: I should add a bit more context: my engine plays draughts, not chess. In draughts, captures are mandatory. This means that legality testing needs to check whether there are captures pending. If there are captures, then legality testing pretty much requires generating all captures because it is also mandatory to capture the maximum number of pieces (and for some draughts variants, also the order in which pieces are captured). Storing all that info in the TT is prohibitively expensive. Furthermore, capture positions happen sufficiently often that always generating all moves is competitive with capture detection and capture generation in positions where captures are pending (and it vastly simplifies my code).
For Spanish checkers the full move could be stored in a single 32-bit integer (the bitwise OR of the origin, the destination and the captured pieces). But still, I would rather use a single byte.
Of course you have to keep track of the permutation that is applied when the moves are ordered, so you can recover the index of the move in the original move list.
-
- Posts: 931
- Joined: Tue Mar 09, 2010 3:46 pm
- Location: New York
- Full name: Álvaro Begué (RuyDos)
Re: Legality Check on TT move
I disagree. You may need more information to describe a move or a partial move while you are generating the moves, but once you have decided what moves are legal, a single bitboard is enough to fully describe the move in all variants I am aware of, including Italian and Thai.Rein Halbersma wrote:For 10x10 international draughts, I would need at least 64-bits for a full move. There are some other variants (Italian, Thai) for which you need even larger move structs. E.g. Italian draughts requires that you also keep track of the order in which kings vs men are captured. And in Thai draughts, you immediately remove a capture piece during jumping, rather than afterwards. This means that you can land on a square which you also jumped during that move. I.e. you cannot have a single bitwise OR of jumped pieces and from/dest.AlvaroBegue wrote:I recently wrote a Spanish-checkers engine and I also decided to store an index instead of a move. Move generation is not all that expensive, since there are typically something like 8 moves (more late in the game, when there are "damas", which move like bishops in chess).Rein Halbersma wrote: I should add a bit more context: my engine plays draughts, not chess. In draughts, captures are mandatory. This means that legality testing needs to check whether there are captures pending. If there are captures, then legality testing pretty much requires generating all captures because it is also mandatory to capture the maximum number of pieces (and for some draughts variants, also the order in which pieces are captured). Storing all that info in the TT is prohibitively expensive. Furthermore, capture positions happen sufficiently often that always generating all moves is competitive with capture detection and capture generation in positions where captures are pending (and it vastly simplifies my code).
For Spanish checkers the full move could be stored in a single 32-bit integer (the bitwise OR of the origin, the destination and the captured pieces). But still, I would rather use a single byte.
Of course you have to keep track of the permutation that is applied when the moves are ordered, so you can recover the index of the move in the original move list.
You may need to do some extra work to map the move to a string in the usual notation (e.g., "2x9x18x27", indicating the initial square, the intermediate landings and the final square), but that's an operation that is not needed for I/O and is not critical.
-
- Posts: 2554
- Joined: Fri Nov 26, 2010 2:00 pm
- Location: Czech Republic
- Full name: Martin Sedlak
Re: Legality Check on TT move
As I said before, I don't understand why you generate the whole chain and treat is as a single move. That's unnecessarily complicated. I would go for "level 1" moves only, extending as necessary.Rein Halbersma wrote: I should add a bit more context: my engine plays draughts, not chess. In draughts, captures are mandatory. This means that legality testing needs to check whether there are captures pending. If there are captures, then legality testing pretty much requires generating all captures because it is also mandatory to capture the maximum number of pieces (and for some draughts variants, also the order in which pieces are captured). Storing all that info in the TT is prohibitively expensive. Furthermore, capture positions happen sufficiently often that always generating all moves is competitive with capture detection and capture generation in positions where captures are pending (and it vastly simplifies my code).
However, I do like HGM's tip to avoid the % operator for simple overflow comparison of "index > size" and dropping into IID if it happens.
This should reduce complexity a lot while keeping "one move" really really small. But it's up to you
As for divisions/modulo: old(er) ARM processors don't have div instruction and everything has to be emulated. What this implies is obvious I guess.
-
- Posts: 5563
- Joined: Tue Feb 28, 2012 11:56 pm
Re: Legality Check on TT move
I don't like the sound of "wrong" either, but I've seen the birthday paradox come up so often in this context...AlvaroBegue wrote:"The wrong approach" seems a bit harsh. I like the sound of "conservative estimate" better.syzygy wrote:This always comes up, but it is the wrong approach. The hash table does not remember all previous birthdays, only those that are still present.AlvaroBegue wrote:Here's some relevant math behind the question: http://en.wikipedia.org/wiki/Birthday_attackcdani wrote:Someone has an idea of how often happen the hash collisions? I did some tries and I'm not able to catch an illegal move coming from hash. Anyway I keep checking them for legality.
I may have once come up with it myself as well in the days of rgcc. At least I remember working it out on a whiteboard.
As usual, it's the wrong approach!!Karlo Bala wrote:Here you can find some good analysis. "Memory versus Search in Games": https://project.dke.maastrichtuniversit ... thesis.pdf
The example on p. 21 is a program searching 100,000 nps for 2 hours, thereby searching 7.2 x 10^8 nodes. The assumption is that 30% is stored in the transposition table. With a hash key of 32 bits, the probability of at least two stored positions having the same hash key is calculated to be very close to 1. With 64 bits the probability was found to be 10^-3 with an expected number of collisions of 0.05.Dennis Breuker wrote:We note that the problem of calculating the probability that at least one error occurs (being 1 - P(no errors)), is analogous to the problem widely known as the birthday paradox (Feller, 1950), where the probability of at least two persons having the same birthday in a group of M persons (N being 365) has to be calculated.
That he is looking only at stores already shows the error of his ways. If you're only storing, you can never go wrong. Problems only occur when you probe, find a hit, but the stored position although having the same hash key is not the right one.
The thesis was written in 1998. At that time 16 MB was a reasonably big hashtable, so let's assume 2^20 (about 1 million) entries. With 32 bits for the hash key, that leaves 12 bits to distinguish between positions mapped to the same hash entry. Assuming a full hash table, that means a false hit every 2^12 = 4096 probes. With 100,000 nps during 2 hours that certainly will give a serious amount of false hits. With 64 bits for the hash key, we still have 44 bits to distinguish between positions mapped to the same hash entry. Again assuming a full hash table, we now get one false hit every 2^44 probes. If we probe at EVERY node for 2 hours at 100,000 nps, this leads to a probability of 0.00004092642 and an expected number of false hits of 0.00004092726 (using Google as calculator). Quite a difference with Breuker's result.
-
- Posts: 4366
- Joined: Fri Mar 10, 2006 5:23 am
- Location: http://www.arasanchess.org
Re: Legality Check on TT move
That sounds about right. And with modern multi-core hardware you can get to 200 million nodes pretty fast.
I used not to do the legality check but now I do. See http://arasanchess.org/blogs/dec13.html (entry for Dec. 11).
--Jon
I used not to do the legality check but now I do. See http://arasanchess.org/blogs/dec13.html (entry for Dec. 11).
--Jon