lkaufman wrote:bob wrote:lkaufman wrote:bob wrote:
If I recall correctly, just checking the material score actually cost something like 20 Elo. The lazy eval test cost something like 7, and the full eval was a pure wash, no change...
I did leave my other constraints in place of course...
Of course using just material score or even lazy score will hurt if you don't use a margin, because there is a high probability that the static score is above beta if the lazy score is just slightly below beta. Or do you mean that you used lazy eval with the lazy margin subtracted from beta for the comparison? That would be the right way to do it. I'm pretty sure that the reason full eval didn't work for you is just the overhead of the eval, which doesn't apply to us as we need the score anyway for other uses. If you got a zero result despite the cost, then obviously the idea would help if there were no cost.
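To make the margin point concrete, here is a minimal Python sketch. It is not taken from any engine: the function names and the example numbers are hypothetical. The only thing carried over from the discussion is that the lazy margin is subtracted from beta before the comparison.

```python
def gate_without_margin(cheap_score: int, beta: int) -> bool:
    """Compare the cheap (material or lazy) score directly against beta."""
    return cheap_score >= beta


def gate_with_margin(cheap_score: int, beta: int, lazy_margin: int) -> bool:
    """Subtract the lazy margin from beta before comparing.

    The cheap score passes whenever the full static score could still
    reach beta once the terms the cheap score leaves out are added.
    """
    return cheap_score >= beta - lazy_margin


# Illustrative numbers only (centipawns), not tuned values.
beta = 100
cheap_score = 90   # just slightly below beta
lazy_margin = 50   # hypothetical bound on what the cheap score omits

rejected_without_margin = not gate_without_margin(cheap_score, beta)
accepted_with_margin = gate_with_margin(cheap_score, beta, lazy_margin)
```

With these numbers the plain comparison rejects a position whose full score may well be above beta, while the margin version lets it through.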
Yes to the margin idea. But these were pretty fast games. I tried 4 different margins and reported the results for the best one.
Just out of curiosity, what were the best margins for the two cases (lazy and material)?
Since you found that the cost of doing the eval exactly balanced out the benefit, this suggests that you should try using this restriction only when depth is not too low, perhaps above your 4 plies of futility. The cost becomes insignificant if you are far enough from the leaves, but I don't think the same holds for the benefit.
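A rough Python sketch of that depth limit, under stated assumptions: `restriction_allows`, `full_eval`, and `min_depth` are hypothetical names, and the threshold of 4 is simply the "4 plies of futility" figure from the text, to be treated as tunable.

```python
from typing import Callable

FUTILITY_PLIES = 4  # assumption: taken from "your 4 plies of futility"


def restriction_allows(
    depth: int,
    beta: int,
    full_eval: Callable[[], int],
    min_depth: int = FUTILITY_PLIES,
) -> bool:
    """Apply the 'static score >= beta' restriction only far from the leaves.

    At depth <= min_depth the evaluation is not called at all, so the
    search behaves as it did before the restriction was added and pays
    no extra eval cost near the leaves.
    """
    if depth <= min_depth:
        return True
    return full_eval() >= beta
```

Passing the evaluation as a callable keeps the cost visible: it is only paid on the branch where the restriction is active.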
I can give you the margins, but it will be misleading. In Crafty, we have three pieces of the eval.
1. material + a few things, followed by a potential lazy exit using a fairly wide margin.
2. material + a few things + pawn/passed pawn evaluation + things like trapped piece and such. Followed by a second potential lazy exit with a narrower margin.
3. full eval which includes everything above + the individual piece scoring terms and king safety...
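The three-stage layout above can be sketched roughly as follows. This is a Python illustration, not Crafty's code: the stage callables, the margin parameters, and the exact exit test (score further than the margin outside the alpha/beta window) are assumptions, and no real margin values are implied.

```python
from typing import Callable


def outside_window(score: int, alpha: int, beta: int, margin: int) -> bool:
    """True if score is at least `margin` outside the (alpha, beta) window."""
    return score + margin <= alpha or score - margin >= beta


def staged_eval(
    alpha: int,
    beta: int,
    stage1: Callable[[], int],  # material + a few things
    stage2: Callable[[], int],  # pawns, passed pawns, trapped pieces, ...
    stage3: Callable[[], int],  # piece scoring terms, king safety
    wide_margin: int,
    narrow_margin: int,
) -> int:
    """Accumulate the score stage by stage, exiting early when possible."""
    score = stage1()
    # First potential lazy exit: fairly wide margin, since most terms are missing.
    if outside_window(score, alpha, beta, wide_margin):
        return score

    score += stage2()
    # Second potential lazy exit: narrower margin, since fewer terms remain.
    if outside_window(score, alpha, beta, narrow_margin):
        return score

    return score + stage3()
```

Because each margin has to bound whatever the remaining stages can add, a margin tuned for one engine's split of terms says little about another engine with a different split.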
The margin for my "lazy eval" would certainly be different (and misleading, hence my hesitance to post it only to have someone try it and fail) because of what my "lazy" eval includes, vs what someone else might include...
I will try the depth limit to see if it changes anything. Just finished the 3 tests at longer games (60s + 1s) and there was no change. Standard 22.5R22 is still the best, R24 was 1 Elo worse (but within +/-4, so I consider them equal), and the material-only version was 15 Elo worse.