Killer Curiosity

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Killer Curiosity

Post by hgm »

Due to a bug I was sometimes overwriting the killer of the next ply (I tried a third killer that could be a capture, but had forgotten to increase the dimension of the killer array from 2 to 3 :oops: ). To my surprise, fixing the bug seemed to slow things down. So I started to experiment a little.

Clearing the killers for the next ply whenever I find a cut move on the current ply seemed to reduce tree size by something like 3%. Not using two equivalent killers, but reserving one killer slot exclusively for the refutation of the null move (cleared before starting each null-move search), and the other for refutations of other moves, reduced the tree size by another 3%.
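
For concreteness, a minimal sketch of this bookkeeping (all names here are invented for illustration, not taken from my actual code):

Code: Select all

/* one slot reserved for null-move refutations, one for the rest */
#define MAXPLY  64
#define NO_MOVE 0

typedef int Move;

Move nullKiller[MAXPLY+2];   /* refutations of null moves, per ply */
Move killer[MAXPLY+2];       /* refutations of real moves, per ply */

/* called when move m, played at 'ply', causes a beta cutoff */
void StoreKiller(int ply, Move m, int parentWasNull, int isCapture)
{
    if (!isCapture) {                   /* captures are not eligible */
        if (parentWasNull) nullKiller[ply] = m;
        else               killer[ply]     = m;
    }
    /* any cut move here makes the killers one ply deeper stale */
    nullKiller[ply+1] = killer[ply+1] = NO_MOVE;
}

/* and before every null-move search from 'ply':
       nullKiller[ply+1] = NO_MOVE;                                 */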

From taking stats on cut moves, it turns out the null-move killer hardly produced any cutoffs, about 50 times fewer than the normal killer:

Code: Select all

 d  tot.node null-cuts other-cut hash-cut 0-kill  killer 1st-mov     2nd   3rd
 0. 36009975 18436892  2153913     1970        0        7 2094640  52185   5458
 1.  3532329  2619598  4640805     6051        0    81579 4432204 117997  28309
 2.  1121716   837451   209357    28422        0    18804 178712  16911   4624
 3.   310864   217633    59290    19896      519     6500  51684   4062   1281
 4.   120869    95279    17860     6091      163     2707  14405   2127    539
 5.    30689    22245     4688     2553       42      638   4177    316     81
 6.    10160     7525     1995     1423        1      355   1733    162     41
 7.     2814     1983      520      473        2      116    501     11      2
 8.      887      676      175      155        0       46    161      6      3
 9.      148       86       53       53        0       27     53      0      0
10.       47       30       17       17        0        9     17      0      0
11.        0        0        1        1        0        0      1      0      0
So I am now getting the suspicion that the main positive effect of the null killer is that setting a slot apart for it, a slot that is often cleared and almost never filled (as most null moves are refuted by captures, which are not eligible as killers), prevents the second normal killer from being remembered, so that no time is wasted on searching it.

Is it really counter-productive to try a second killer? Who else has a similar experience?

Note that I have a very high first-move cut rate: 97% in QS (d=0, 90% of the nodes) and 95% at d=1, not counting null move cuts (in QS the null-move cut entry represents stand pats). This is almost always the best capture, as there usually is no hash move at d=0 or d=1. (At d=0 because they are encountered for the first time, while the d=1 nodes are usually nodes that were encountered as QS nodes in the previous iteration, where they failed high through stand pat in the face of a threat.)
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Killer Curiosity

Post by Don »

hgm wrote:Due to a bug I was sometimes overwriting the killer of the next ply (I tried a third killer that could be a capture, but had forgotten to increase the dimension of the killer array from 2 to 3 :oops: ). To my surprise, fixing the bug seemed to slow things down. So I started to experiment a little.

Clearing the killers for the next ply whenever I find a cut move on the current ply seemed to reduce tree size by something like 3%. Not using two equivalent killers, but reserving one killer slot exclusively for the refutation of the null move (cleared before starting each null-move search), and the other for refutations of other moves, reduced the tree size by another 3%.

From taking stats on cut moves, it turns out the null-move killer hardly produced any cutoffs, about 50 times fewer than the normal killer:

Code: Select all

 d  tot.node null-cuts other-cut hash-cut 0-kill  killer 1st-mov     2nd   3rd
 0. 36009975 18436892  2153913     1970        0        7 2094640  52185   5458
 1.  3532329  2619598  4640805     6051        0    81579 4432204 117997  28309
 2.  1121716   837451   209357    28422        0    18804 178712  16911   4624
 3.   310864   217633    59290    19896      519     6500  51684   4062   1281
 4.   120869    95279    17860     6091      163     2707  14405   2127    539
 5.    30689    22245     4688     2553       42      638   4177    316     81
 6.    10160     7525     1995     1423        1      355   1733    162     41
 7.     2814     1983      520      473        2      116    501     11      2
 8.      887      676      175      155        0       46    161      6      3
 9.      148       86       53       53        0       27     53      0      0
10.       47       30       17       17        0        9     17      0      0
11.        0        0        1        1        0        0      1      0      0
So I am now getting the suspicion that the main positive effect of the null killer is that setting a slot apart for it, a slot that is often cleared and almost never filled (as most null moves are refuted by captures, which are not eligible as killers), prevents the second normal killer from being remembered, so that no time is wasted on searching it.

Is it really counter-productive to try a second killer? Who else has a similar experience?

Note that I have a very high first-move cut rate: 97% in QS (d=0, 90% of the nodes) and 95% at d=1, not counting null move cuts (in QS the null-move cut entry represents stand pats). This is almost always the best capture, as there usually is no hash move at d=0 or d=1. (At d=0 because they are encountered for the first time, while the d=1 nodes are usually nodes that were encountered as QS nodes in the previous iteration, where they failed high through stand pat in the face of a threat.)
This seems to be quite a coincidence: I just had a very similar experience. My program supposedly sorts Killer A ahead of Killer B, but I discovered that my program NEVER did this properly. It gave BOTH moves the same sort score, so they got sorted together in no particular order.

When I fixed this, I expected to get a 3 or 4 percent speedup or something, but the program definitely slowed down just a little bit.

I use a stable sort when I do this, so my guess is that really one killer is doing most of the good work. When another move comes along, it is probably just refuting a specific move, and perhaps it's counter-productive to make it suddenly be the best killer. But if it is best, it would eventually get kicked up into the best slot. I'm just taking a wild guess.

What is a null killer? I don't use that. Is it the move that refutes a null move? How much does it help?
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Killer Curiosity

Post by hgm »

The null killer is indeed the non-capture refuting the null move in the sibling of the current node. I figured that a move that could refute the null move would do well against most real moves as well.

But, as you can see from the stats, it only produced a very low number of cutoffs, between 10 and 20 times fewer than the (now only) true killer.

I guess that in most cases where the null move fails low, the cut-side puts things right by executing a 'hostage' (a valuable hanging piece that it was not under pressure to take earlier), and hardly ever gets to doing moves that actually can be refuted. And certainly not by a killer.
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Killer Curiosity

Post by Don »

Don wrote: This seems to be quite a coincidence: I just had a very similar experience. My program supposedly sorts Killer A ahead of Killer B, but I discovered that my program NEVER did this properly. It gave BOTH moves the same sort score, so they got sorted together in no particular order.

When I fixed this, I expected to get a 3 or 4 percent speedup or something, but the program definitely slowed down just a little bit.

I use a stable sort when I do this, so my guess is that really one killer is doing most of the good work. When another move comes along, it is probably just refuting a specific move, and perhaps it's counter-productive to make it suddenly be the best killer. But if it is best, it would eventually get kicked up into the best slot. I'm just taking a wild guess.
Now that I think about it, I don't think I explained anything. Whether it's a stable sort or not is not relevant, since I'm not sorting a list that has any particular order other than the order the moves were generated in.
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Killer Curiosity

Post by michiguel »

Don wrote:
hgm wrote:Due to a bug I was sometimes overwriting the killer of the next ply (I tried a third killer that could be a capture, but had forgotten to increase the dimension of the killer array from 2 to 3 :oops: ). To my surprise, fixing the bug seemed to slow things down. So I started to experiment a little.

Clearing the killers for the next ply whenever I find a cut move on the current ply seemed to reduce tree size by something like 3%. Not using two equivalent killers, but reserving one killer slot exclusively for the refutation of the null move (cleared before starting each null-move search), and the other for refutations of other moves, reduced the tree size by another 3%.

From taking stats on cut moves, it turns out the null-move killer hardly produced any cutoffs, about 50 times fewer than the normal killer:

Code: Select all

 d  tot.node null-cuts other-cut hash-cut 0-kill  killer 1st-mov     2nd   3rd
 0. 36009975 18436892  2153913     1970        0        7 2094640  52185   5458
 1.  3532329  2619598  4640805     6051        0    81579 4432204 117997  28309
 2.  1121716   837451   209357    28422        0    18804 178712  16911   4624
 3.   310864   217633    59290    19896      519     6500  51684   4062   1281
 4.   120869    95279    17860     6091      163     2707  14405   2127    539
 5.    30689    22245     4688     2553       42      638   4177    316     81
 6.    10160     7525     1995     1423        1      355   1733    162     41
 7.     2814     1983      520      473        2      116    501     11      2
 8.      887      676      175      155        0       46    161      6      3
 9.      148       86       53       53        0       27     53      0      0
10.       47       30       17       17        0        9     17      0      0
11.        0        0        1        1        0        0      1      0      0
So I am now getting the suspicion that the main positive effect of the null killer is that setting a slot apart for it, a slot that is often cleared and almost never filled (as most null moves are refuted by captures, which are not eligible as killers), prevents the second normal killer from being remembered, so that no time is wasted on searching it.

Is it really counter-productive to try a second killer? Who else has a similar experience?

Note that I have a very high first-move cut rate: 97% in QS (d=0, 90% of the nodes) and 95% at d=1, not counting null move cuts (in QS the null-move cut entry represents stand pats). This is almost always the best capture, as there usually is no hash move at d=0 or d=1. (At d=0 because they are encountered for the first time, while the d=1 nodes are usually nodes that were encountered as QS nodes in the previous iteration, where they failed high through stand pat in the face of a threat.)
This seems to be quite a coincidence: I just had a very similar experience. My program supposedly sorts Killer A ahead of Killer B, but I discovered that my program NEVER did this properly. It gave BOTH moves the same sort score, so they got sorted together in no particular order.

When I fixed this, I expected to get a 3 or 4 percent speedup or something, but the program definitely slowed down just a little bit.

I use a stable sort when I do this, so my guess is that really one killer is doing most of the good work. When another move comes along, it is probably just refuting a specific move, and perhaps it's counter-productive to make it suddenly be the best killer. But if it is best, it would eventually get kicked up into the best slot. I'm just taking a wild guess.

What is a null killer? I don't use that. Is it the move that refutes a null move? How much does it help?
Do people use counters for the killers? I do, but I do not know how useful that is. I have not tested these things in a long time. Maybe a scheme like that would help?
(BTW, now that I read my own old implementation, I see that it can certainly be improved.)

This is what I do every time I add a killer to the killer table (I removed the debug code for readability). I have killer tables for each thread (I have not implemented SMP yet, but it is there because I will try that soon).
kt is the killer table, m is the move.

Code: Select all


typedef struct {
    move_t  mv;
    int     count;
} killer_cell_t;

typedef struct KILLERTABLE {
    killer_cell_t killer[MAXPLYSEARCH][2];
    move_t        tried [MAXPLYSEARCH][2];
} killertable_t;

killertable_t kt[MAXTHREADS];

void
killer_table_addmove (int tid, int ply, move_t m)
{
    killer_cell_t tmp;

    /* only quiet moves are eligible as killers */
    if (MV_TYPE(m) != NORMAL_MOVE || MV_TK(m) != NOPIECE)
        return;

    /* same as the first killer? just bump its counter */
    if (kt[tid].killer[ply][0].mv == m) {
        kt[tid].killer[ply][0].count++;
        return;
    }

    /* same as the second killer? bump its counter, and swap
       the two if the second caught up with the first */
    if (kt[tid].killer[ply][1].mv == m) {
        kt[tid].killer[ply][1].count++;
        if (kt[tid].killer[ply][1].count >= kt[tid].killer[ply][0].count) {
            tmp                    = kt[tid].killer[ply][1];
            kt[tid].killer[ply][1] = kt[tid].killer[ply][0];
            kt[tid].killer[ply][0] = tmp;
        }
        return;
    }

    /* first killer empty, so fill it */
    if (kt[tid].killer[ply][0].count == 0) {
        kt[tid].killer[ply][0].mv    = m;
        kt[tid].killer[ply][0].count = 1;
        return;
    }

    /* otherwise replace the second killer... */
    kt[tid].killer[ply][1].mv    = m;
    kt[tid].killer[ply][1].count = 1;

    /* ...and swap if its (fresh) counter already reaches the first one */
    if (kt[tid].killer[ply][1].count >= kt[tid].killer[ply][0].count) {
        tmp                    = kt[tid].killer[ply][1];
        kt[tid].killer[ply][1] = kt[tid].killer[ply][0];
        kt[tid].killer[ply][0] = tmp;
    }
}
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Killer Curiosity

Post by michiguel »

hgm wrote:The null killer is indeed the non-capture refuting the null move in the sibling of the current node. I figured that a move that could refute the null move would do well against most real moves as well.

But, as you can see from the stats, it only produced a very low number of cutoffs, between 10 and 20 times fewer than the (now only) true killer.

I guess that in most cases where the null move fails low, the cut-side puts things right by executing a 'hostage' (a valuable hanging piece that it was not under pressure to take earlier), and hardly ever gets to doing moves that actually can be refuted. And certainly not by a killer.
Do you have a special null-move search? Otherwise the null-move killer, as you call it, will be included among the killers anyway. Isn't that so?

Miguel
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Killer Curiosity

Post by bob »

Don wrote:
hgm wrote:Due to a bug I was sometimes overwriting the killer of the next ply (I tried a third killer that could be a capture, but had forgotten to increase the dimension of the killer array from 2 to 3 :oops: ). To my surprise, fixing the bug seemed to slow things down. So I started to experiment a little.

Clearing the killers for the next ply whenever I find a cut move on the current ply seemed to reduce tree size by something like 3%. Not using two equivalent killers, but reserving one killer slot exclusively for the refutation of the null move (cleared before starting each null-move search), and the other for refutations of other moves, reduced the tree size by another 3%.

From taking stats on cut moves, it turns out the null-move killer hardly produced any cutoffs, about 50 times fewer than the normal killer:

Code: Select all

 d  tot.node null-cuts other-cut hash-cut 0-kill  killer 1st-mov     2nd   3rd
 0. 36009975 18436892  2153913     1970        0        7 2094640  52185   5458
 1.  3532329  2619598  4640805     6051        0    81579 4432204 117997  28309
 2.  1121716   837451   209357    28422        0    18804 178712  16911   4624
 3.   310864   217633    59290    19896      519     6500  51684   4062   1281
 4.   120869    95279    17860     6091      163     2707  14405   2127    539
 5.    30689    22245     4688     2553       42      638   4177    316     81
 6.    10160     7525     1995     1423        1      355   1733    162     41
 7.     2814     1983      520      473        2      116    501     11      2
 8.      887      676      175      155        0       46    161      6      3
 9.      148       86       53       53        0       27     53      0      0
10.       47       30       17       17        0        9     17      0      0
11.        0        0        1        1        0        0      1      0      0
So I am now getting the suspicion that the main positive effect of the null killer is that setting a slot apart for it, a slot that is often cleared and almost never filled (as most null moves are refuted by captures, which are not eligible as killers), prevents the second normal killer from being remembered, so that no time is wasted on searching it.

Is it really counter-productive to try a second killer? Who else has a similar experience?

Note that I have a very high first-move cut rate: 97% in QS (d=0, 90% of the nodes) and 95% at d=1, not counting null move cuts (in QS the null-move cut entry represents stand pats). This is almost always the best capture, as there usually is no hash move at d=0 or d=1. (At d=0 because they are encountered for the first time, while the d=1 nodes are usually nodes that were encountered as QS nodes in the previous iteration, where they failed high through stand pat in the face of a threat.)
This seems to be quite a coincidence: I just had a very similar experience. My program supposedly sorts Killer A ahead of Killer B, but I discovered that my program NEVER did this properly. It gave BOTH moves the same sort score, so they got sorted together in no particular order.

When I fixed this, I expected to get a 3 or 4 percent speedup or something, but the program definitely slowed down just a little bit.

I use a stable sort when I do this, so my guess is that really one killer is doing most of the good work. When another move comes along, it is probably just refuting a specific move, and perhaps it's counter-productive to make it suddenly be the best killer. But if it is best, it would eventually get kicked up into the best slot. I'm just taking a wild guess.

What is a null killer? I don't use that. Is it the move that refutes a null move? How much does it help?
Here's what I _think_ you should do, and I am going to test this once these other tests are finished... I have two killers. When I go to store a killer, if it matches the first one in the list I do nothing. If it does not match the first one, I always overwrite the second. I have had a note for months to try an LRU-type replacement mechanism, so that I always try the one last stored, rather than the one at the top of the list, which is most popular. I had decided that since killers are somewhat local in nature, for most cases it makes sense to first try the one that worked the last time around, and then try the more common killer when that does not apply. There may well be a way to measure this, so that you try a killer that came from a position "close" to this one before trying one that came from some other part of the tree... That's also on my to-do list to try...
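
In pseudo-C, the store policy I described amounts to something like this (a sketch, names invented):

Code: Select all

#define MAXPLY 64
typedef int Move;

Move killer[MAXPLY][2];   /* [0] = 'popular' slot, [1] = last stored */

void StoreKiller(int ply, Move m)
{
    if (m == killer[ply][0])
        return;           /* already the primary killer: do nothing */
    killer[ply][1] = m;   /* otherwise always overwrite the second  */
}

/* the LRU idea: when ordering moves, try killer[ply][1] (the one
   stored last) before killer[ply][0], the more 'popular' one      */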
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Killer Curiosity

Post by bob »

hgm wrote:The null killer is indeed the non-capture refuting the null move in the sibling of the current node. I figured that a move that could refute the null move would do well against most real moves as well.
I don't agree there. Most null-move searches are refuted because the side on move is already down in material, and _any_ move that doesn't toss material away will cause a cutoff. It might even be better to check for move[ply-1] != null before storing a killer[ply]. I'll also test this...
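
I.e., something like this (a sketch; moveStack and NULL_MOVE are just illustrative names):

Code: Select all

/* inside the search, when move m has just failed high at 'ply' */
if (moveStack[ply-1] != NULL_MOVE)   /* did the parent make a real move? */
    StoreKiller(ply, m);             /* only then update the killer      */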



But, as you can see from the stats, it only produced a very low number of cutoffs, between 10 and 20 times fewer than the (now only) true killer.

I guess that in most cases where the null move fails low, the cut-side puts things right by executing a 'hostage' (a valuable hanging piece that it was not under pressure to take earlier), and hardly ever gets to doing moves that actually can be refuted. And certainly not by a killer.
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Killer Curiosity

Post by hgm »

I see what you mean, but in this particular engine I don't even do a null move when curEval < beta, so what you describe can't happen. The opponent really has to work to refute the null move.
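
In other words, the null-move try is gated roughly like this (a sketch with invented names; R is the usual depth reduction, and the exact conditions in my engine may differ):

Code: Select all

if (curEval >= beta) {       /* only then is the null move tried */
    MakeNullMove();
    score = -Search(ply+1, depth-1-R, -beta, 1-beta);
    UnmakeNullMove();
    if (score >= beta)
        return score;        /* null-move cutoff */
}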

As to the killers: the stats below are with the conventional dual-killer algorithm, with LRU replacement. (It is only for a single position, so one should not put too much value on it, but it still gives some indication.)

Code: Select all

 d  tot.node null-cuts other-cut hash-cut kill-1  kill-2 1st-mov     2nd   3rd
 0. 18574353  9156253   947419     1159       54       64 922963  21118   2567
 1.  2125229  1655035  2476247     3634    44993    11933 2377179  56905  23328
 2.   716146   558076   104229    23382     7916     2794  91784   6569   2557
 3.   231939   169150    36318    17082     3921     1153  32716   2047    618
 4.    97283    79073    11186     5177     1625      544   9239   1222    335
 5.    28147    21007     3716     2203      446      137   3365    200     68
 6.     9740     7387     1690     1317      275       79   1556     77     18
 7.     2792     1960      520      472      111       30    498     10      5
 8.      887      676      184      153       46       12    167      6      6
 9.      148       86       53       53       27        2     53      0      0
10.       47       30       17       17        9        3     17      0      0
11.        0        0        1        1        0        0      1      0      0
Kill-1 is the LRU killer, kill-2 the older one. You can see that kill-1 is 3-4 times as successful as kill-2. Note that this is not necessarily because it is tried first: they are actually tried in move-generator order, so the older killer might very well have been tried before the LRU one. (But the move generator generates moves piece by piece from the piece list, so it is likely to be pretty consistent in the order.)

(That the number of nodes is so much smaller here than in the previous case is not a consequence of the other killer algorithm, but because I have switched from alpha-beta to PVS in the meantime.)
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Killer Curiosity

Post by Don »

bob wrote:
Here's what I _think_ you should do and I am going to test this once these other tests are finished... I have two killers. When I go to store a killer, if it is the first one in the list (a match) I do nothing. If it does not match the first one, I always overwrite the second. I have had a note for months to try a LRU-type replacement mechanism, so that I always try the one last stored, rather than the one at the top of the list which is most popular. I had decided that since killers are somewhat local in nature, for most cases, it makes sense to first try the one that worked the last time around, and then try the more common killer when that does not apply. There may well be a way to measure this so that you try a killer that is from a position "close" to this before trying one that came from some other part of the three... That's also on my to-do list to try...
That does not seem to make logical sense to me, although nothing surprises me here. If I understand you correctly, the first move to become a killer will ALWAYS remain a killer, because if a different move produces a cutoff it is always stored in slot B.

In my implementation I also have 2 killers - A and B. If a move produces a cutoff and it's already killer A, I do nothing. Otherwise, it becomes the NEW killer A. The old killer A gets moved over to the killer B slot to make room and now it becomes the new killer B.
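
In code, my update is essentially this (a sketch; killerA and killerB are just illustrative names):

Code: Select all

void StoreKiller(int ply, Move m)
{
    if (m == killerA[ply])
        return;                   /* already killer A: do nothing */
    killerB[ply] = killerA[ply];  /* old A is demoted to slot B   */
    killerA[ply] = m;             /* the cut move becomes new A   */
}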

I think it's common for a move to be the best killer and occasionally get replaced for just one move. For instance, Nf3 might be best 90% of the time, but something gets attacked and must move, so the escape move gets to be killer A but really shouldn't be. So it may be worthwhile to explore a scheme where you use the killer that worked best the last N times, where N should be pretty small so that it adapts quickly.
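
A speculative sketch of what such a scheme could look like (all names invented, and completely untested):

Code: Select all

#define NRECENT 4                    /* N: how far back to look   */

Move recent[MAXPLY][NRECENT];        /* last N cut moves, per ply */
int  rpos[MAXPLY];

void RecordCutoff(int ply, Move m)
{
    recent[ply][rpos[ply]] = m;      /* ring buffer of cut moves  */
    rpos[ply] = (rpos[ply] + 1) % NRECENT;
}

int Hits(int ply, Move m)            /* recent success count of m */
{
    int i, n = 0;
    for (i = 0; i < NRECENT; i++)
        if (recent[ply][i] == m)
            n++;
    return n;
}

/* order killer A before killer B iff Hits(ply, A) >= Hits(ply, B) */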