What is the rating value of path dependent evaluation?

Uri Blass · Fri Sep 11, 2009 4:12 pm

Note that for Movei it seems to be worth 20-30 Elo at all time controls.

I wonder whether you have tried different ideas for adding path dependent evaluation to your program, and how many rating points you got from it.

Here are the CCRL and CEGT data for Movei.

The only difference between Movei 00.8.438 (10 10 10) and the default personality is that the default personality does not use path dependent evaluation.

I can add that I am not sure that 10 10 10 are the optimal values for Movei, and I never tested values bigger than 10 10 10, which would mean an even bigger path dependent evaluation (10 10 10 means that the maximal path dependent evaluation is 0.3 pawns).

CCRL 40/40

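| Rank | Engine | Rating | + | − | Score | Av.Op. | Draws | Games |
|---|---|---|---|---|---|---|---|---|
| 1 | Movei 00.8.438 (10 10 10) | 2775 | +14 | −14 | 50.6% | −6.5 | 38.0% | 1694 |
| | Movei 00.8.438 | 2746 | +19 | −19 | 50.2% | −1.1 | 37.7% | 953 |

LOS (first over second): 99.3%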

CCRL 40/4

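| Rank | Engine | Rating | + | − | Score | Av.Op. | Draws | Games |
|---|---|---|---|---|---|---|---|---|
| 1 | Movei 00.8.438 (10 10 10) | 2729 | +11 | −11 | 44.8% | +39.2 | 30.6% | 3311 |
| | Movei 00.8.438 | 2700 | +16 | −16 | 39.3% | +73.4 | 31.0% | 1479 |

LOS (first over second): 99.8%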

CEGT 40/20

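| Program | Elo | + | − | Games | Score | Av.Op. | Draws |
|---|---|---|---|---|---|---|---|
| Movei 0.08.438 P10 | 2672 | 17 | 17 | 1096 | 55.3% | 2634 | 35.3% |
| Movei 0.08.438 | 2650 | 14 | 14 | 1556 | 49.4% | 2654 | 31.9% |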
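
As a quick check of the 20-30 Elo figure, the rating gaps between the two personalities can be read straight off the three lists. A minimal C sketch: only the ratings come from the lists, while the struct and function names are made up for illustration.

```c
#include <assert.h>

/* One pair of list entries: rating with and without path dependent
   evaluation. The numbers are copied from the lists; the names
   rating_pair, movei_pairs and rating_gain are hypothetical. */
struct rating_pair {
    const char *list;
    int with_pde;     /* Movei 00.8.438 (10 10 10) / P10 */
    int without_pde;  /* Movei 00.8.438, default personality */
};

static const struct rating_pair movei_pairs[] = {
    { "CCRL 40/40", 2775, 2746 },
    { "CCRL 40/4",  2729, 2700 },
    { "CEGT 40/20", 2672, 2650 },
};

/* Rating gain attributed to path dependent evaluation on one list. */
int rating_gain(const struct rating_pair *p)
{
    assert(p);
    return p->with_pde - p->without_pde;
}
```

The gaps come out to 29, 29 and 22 Elo. The quoted error margins (11 to 19 Elo per entry) are not much smaller than the gaps, so the estimate is only as good as the number of games behind it.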

Gian-Carlo Pascutto · Fri Sep 11, 2009 10:35 pm

What path dependent evaluation are you talking about?

bob · Fri Sep 11, 2009 11:01 pm

Gian-Carlo Pascutto wrote: What path dependent evaluation are you talking about?

And does this mean your hashing is now path-dependent as well???

Uri Blass · Sat Sep 12, 2009 2:08 am

bob wrote:
Gian-Carlo Pascutto wrote: What path dependent evaluation are you talking about?
And does this mean your hashing is now path-dependent as well???

I will answer both of you.

1) Path dependent evaluation means that the evaluation has a component that depends not on the leaf position but on the sequence of moves leading to the leaf position.

In my case Movei evaluates every node, so I get a sequence of numbers rather than a single number for the leaf position, and, when path dependent evaluation is enabled, the final evaluation depends only on that sequence of numbers.

The idea I use is simply that an improvement in the evaluation is good. The static evaluation is often not correct, and if I can improve my position along some line then that is good, so I want to give a bonus for it. I believe this idea can be good for other games and not only for chess.

So I compare the evaluation at the leaf with the evaluation 2 plies earlier and give a bonus of 0.1 pawns if it has improved (or reduce the evaluation by 0.1 pawns if it has improved for the opponent).

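I then make the same kind of comparison between the evaluations 2 and 4 plies earlier, and between those 4 and 6 plies earlier. With the 10 10 10 setting, each comparison that continues the trend adds another 0.1 pawns.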

Here is the relevant code from movei

Code:

int calculbonus(int ply,int progress0,int progress1,int progress2)
{
	int score=0;
	if (ply>=4)
	{
		/* evaluation improved compared with 2 plies earlier */
		if (eval_dat[ply].evalfull>eval_dat[ply-2].evalfull)
		{
			score=progress0;
			/* and did not get worse between 4 and 2 plies earlier */
			if (eval_dat[ply-2].evalfull>=eval_dat[ply-4].evalfull)
			{
				score+=progress1;
				/* and did not get worse between 6 and 4 plies earlier */
				if ((ply>=6)&&(eval_dat[ply-4].evalfull>=eval_dat[ply-6].evalfull))
				{
					score+=progress2;
				}
			}
		}
		/* mirror case: evaluation got worse compared with 2 plies earlier */
		if (eval_dat[ply].evalfull<eval_dat[ply-2].evalfull)
		{
			score=-progress0;
			if (eval_dat[ply-2].evalfull<=eval_dat[ply-4].evalfull)
			{
				score-=progress1;
				if ((ply>=6)&&(eval_dat[ply-4].evalfull<=eval_dat[ply-6].evalfull))
					score-=progress2;
			}
		}
	}

	return score;
}

2) I do not hash the path, and I changed nothing in my hash code to get the 20-30 Elo improvement with progress0=10, progress1=10, progress2=10.

Of course there may be different ideas for path dependent evaluation, and I wonder whether people have tried other ideas and found them productive or counterproductive.

Uri
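
To make the scoring concrete, here is a minimal self-contained sketch of the same progress bonus, assuming the per-ply evaluations are kept in a plain array of centipawn scores. The names `progress_bonus` and `evals` are hypothetical and not part of Movei; the comparisons mirror `calculbonus`.

```c
#include <assert.h>

/* Hypothetical stand-in for Movei's eval_dat[ply].evalfull: evals[i] is the
   evaluation stored at ply i. Only entries two plies apart are compared,
   so each comparison involves the same side to move. */
int progress_bonus(const int *evals, int ply, int p0, int p1, int p2)
{
    int score = 0;

    assert(evals && ply >= 0);
    if (ply < 4)
        return 0;                                   /* not enough history */

    if (evals[ply] > evals[ply - 2]) {              /* improved over 2 plies */
        score = p0;
        if (evals[ply - 2] >= evals[ply - 4]) {     /* trend held before that */
            score += p1;
            if (ply >= 6 && evals[ply - 4] >= evals[ply - 6])
                score += p2;
        }
    } else if (evals[ply] < evals[ply - 2]) {       /* got worse over 2 plies */
        score = -p0;
        if (evals[ply - 2] <= evals[ply - 4]) {
            score -= p1;
            if (ply >= 6 && evals[ply - 4] <= evals[ply - 6])
                score -= p2;
        }
    }
    return score;
}
```

With evaluations 0, 5, 12, 20 at plies 0, 2, 4, 6 and all three weights set to 10, the call for ply 6 returns 30 centipawns, which is the 0.3 pawn maximum of the 10 10 10 setting.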

bob · Sat Sep 12, 2009 2:24 am

How do you solve the hashing problems this will incur? In your evaluation, P and P' are the same position but reached by different paths. You evaluate them differently and produce different scores. But when hashing, P and P' are identical and one of the scores is, by definition, wrong...

This looks like you are simply adding a random term to the evaluation here and there as a result...
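
The concern can be sketched with a toy transposition table that is keyed by position only. This is an illustration with hypothetical names (`tt_store`, `tt_probe`, `score_via_path`, a 64-bit `key`), not code from any engine in this thread, and it leaves out the depth and bound information a real table would store. Whichever path stores the position first determines the score that every later path sees.

```c
#include <assert.h>
#include <stdint.h>

#define TT_SIZE 1024u   /* toy table size, a power of two */

struct tt_entry {
    uint64_t key;    /* position signature only; the path is not part of it */
    int      score;  /* score as stored, including any path bonus */
    int      used;
};

static struct tt_entry tt[TT_SIZE];

void tt_store(uint64_t key, int score)
{
    struct tt_entry *e = &tt[key & (TT_SIZE - 1)];
    e->key = key;
    e->score = score;
    e->used = 1;
}

/* Returns 1 and writes *score on a hit, 0 on a miss. */
int tt_probe(uint64_t key, int *score)
{
    const struct tt_entry *e = &tt[key & (TT_SIZE - 1)];
    assert(score);
    if (!e->used || e->key != key)
        return 0;
    *score = e->score;
    return 1;
}

/* Score of a position reached along some path: use the table if possible,
   otherwise static evaluation plus the path bonus, and remember the result. */
int score_via_path(uint64_t key, int static_eval, int path_bonus)
{
    int score;
    if (tt_probe(key, &score))
        return score;            /* this path's own bonus is never applied */
    score = static_eval + path_bonus;
    tt_store(key, score);
    return score;
}
```

Reaching the same key first with a bonus of +30 and then with a bonus of -30 returns 80 both times for a static evaluation of 50, although the second path would have scored 20 on its own.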

Uri Blass · Sat Sep 12, 2009 2:40 am

In practice I do not solve the problem, and there may be cases where I return a wrong score for P' from the hash. Even so, I got a rating improvement of 20-30 Elo based on the CCRL and CEGT tests.

Originally I did not expect a rating improvement because of this problem (which is why the default personality of version 438 does not use path dependent evaluation, unlike previous versions), but I was surprised when testing suggested that I still gain rating, and CEGT and CCRL have played enough games to convince me that it is not luck.

Uri

Dann Corbit · Sat Sep 12, 2009 4:56 am

I think that sometimes a good positional move will slowly improve over time, whereas a game that is really drawn (for instance) will have a score that hits a fixed number and then stays pegged at that value.

Uri's idea captures not only the value of the score but also the direction of the score.

Sometimes, the "improved" directional score will get over-written with a version that does not have this information. This will result in a small loss of information.

There are many other things that will cause sudden changes in our evaluation of a position. So I think it is in line with many other things that can have the same effect, but for which the estimate that they provide is better than the estimate without the added information.

Suppose (for instance) that a search extension gets triggered and something vile or wonderful is found just over the horizon. This also causes a jump in score. It is also possible for this information to be overwritten. But the better information provided value while it was there.

My theory is that if the duration of the information is long enough to provide enough value to overcome the cost of calculation before being overwritten (on average) then it will result in improved game play.

Of course, things like this will make the engine's search less deterministic. Sometimes that is a good thing (opponents cannot set simple traps for your engine based on what it has learned before, because it is moody) and sometimes it is a bad thing (when trying to debug a problem).

Just a theory, but it makes sense to me.

bob · Sat Sep 12, 2009 7:09 am

The problem is, in quite a few positions, this is not "information lost in just a few positions". Take Fine 70. You get _way_ more from the hash than from the evaluation function.

Dann Corbit · Sat Sep 12, 2009 7:37 am

Presumably, most of his searches would see the winning move advancing and all the others static.

I would expect that the winning move might get different bonus values on many searches, but always a bonus.

And it would never be *worse* than the score without these bonus values.

Whereas all the other drawing scores would not advance and hence would never get any boost.

I expect that if Uri turned the feature on for Fine 70, it will solve faster than if the feature is turned off.

It should be an easy experiment to try.

Uri Blass · Sat Sep 12, 2009 8:32 am

bob wrote: The problem is, in quite a few positions, this is not "information lost in just a few positions". Take Fine 70. You get _way_ more from the hash than from the evaluation function.

Correct, but Movei is still going to solve it quickly.

The point is that the hash saves a lot of searching when it does not depend on the path.

I believe that even if you add a small random noise to the evaluation (not more than 0.3 pawns) and do not change your hash code, you are going to solve Fine 70 quickly.

I do not claim that adding a small random noise is a good way to improve the evaluation, only that it is not going to change much in Fine 70.

The point is that your cutoffs are consistent: you do not evaluate when you have a cutoff, so if you get a cutoff from the hash then you keep getting the same cutoff later on different paths (unless you search to a bigger depth).

A repetition is always evaluated as 0.00, and this does not depend on the hash.

Uri
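
The experiment Uri describes (bounded random noise in the evaluation, hash code untouched) could be set up along these lines. A sketch only: `noisy_eval`, `next_random` and `MAX_NOISE_CP` are hypothetical names, and the generator is a plain xorshift chosen to keep runs reproducible, not anything taken from Movei.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NOISE_CP 30   /* 0.3 pawns in centipawns, the bound from the post */

/* Small xorshift generator so runs are reproducible; state must be nonzero. */
static uint32_t next_random(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Static evaluation plus noise drawn from [-MAX_NOISE_CP, +MAX_NOISE_CP]
   (roughly uniform; the tiny modulo bias does not matter here). */
int noisy_eval(int static_eval, uint32_t *state)
{
    int noise;

    assert(state && *state != 0);
    noise = (int)(next_random(state) % (2u * MAX_NOISE_CP + 1u)) - MAX_NOISE_CP;
    return static_eval + noise;
}
```

Running a test position such as Fine 70 with `noisy_eval` in place of the plain evaluation, and an unchanged transposition table, would show how much a bounded perturbation of this size actually changes the time to solution.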
