jdart wrote:
Right. 20 seconds is fast. Takes me maybe 10 minutes (on a big 24-core machine) but I am using a 2-ply search. I calculate the PV, and then do gradient descent based on the end-of-PV evals. Then periodically I re-calculate the PV as the parameters are tuned.
--Jon

Ferdy wrote:
Interesting, by "end-of-PV evals", did you use your static evaluation function to get the eval at end of the pv position?

Desperado wrote:
Maybe i should think about it twice, but the pv eval should be passed to the root as search result. So at first glance i don't know in what way the "eval at the end of the pv" is different to the search result score.
And isn't any line (including the pv of course) computed by the static evaluation at the final node ?!
So, what do i miss ?

AlvaroBegue wrote:
The trick is doing the gradient descent. While it would be possible to do it on the search function itself, it would be hard to make that efficient. So instead, you need to recover what position gave the eval that was propagated to the root, and then compute the gradient of the evaluation function at that node.

It is still confusing me because i see the following when i am interested in position x:
1.e4 x (1...e5 2.Nf3 Nc6 3.d4)
case 1: you have a static eval after 1.e4 that gives +20 points.
case 2: you have a static eval after d4 that gives +35 points.
case 3: you map the 35 points onto the position after e4 (the plain search result for e4).
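
The three cases can be written down as position/score pairs. This is only an illustrative sketch: the `Sample` type and its field names are hypothetical, and the scores are the ones from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One training pair (hypothetical structure, for illustration only)."""
    position: str  # the position the score is attached to
    score: int     # score in points, as in the example
    source: str    # "static" = static eval, "search" = backed-up search result


x = "after 1.e4"
pv_leaf = "after 1.e4 e5 2.Nf3 Nc6 3.d4"

case1 = Sample(position=x, score=20, source="static")        # static eval of x itself
case2 = Sample(position=pv_leaf, score=35, source="static")  # static eval of the PV leaf
case3 = Sample(position=x, score=35, source="search")        # PV-leaf eval mapped onto x

# Cases 2 and 3 carry the same score but attach it to different positions.
print(case2.score == case3.score, case2.position == case3.position)
```

Written this way, cases 1 and 2 each pair a position with its own static score, while case 3 pairs x with a score that was computed at a different position.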
a.
Cases 1 and 2 are identical in the sense that you map the static score directly onto the position it was computed for. I don't see a gain in just switching to a different position to get another position/score pair: 1. (e4, 20), 2. (d4, 35).
b.
Case 3 differs from cases 1 and 2 because the score that is mapped onto the position is a search result.
c.
The only thing that might make a difference in "a." is the choice of the position/score pair: it may be of somewhat higher quality because it was chosen as the PV ((d4, 35) compared to (e4, 20)).
Well, that is a good reason to check whether this improves the tuning algorithm, but it would only amount to a better selection of input data.
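
For reference, here is a minimal sketch of what the quoted remark about "the gradient of the evaluation function at that node" could look like. It assumes a linear evaluation and a squared-error loss, neither of which is stated in the thread; the feature vectors, weights, and target value are all made up so that the evals come out as 20 and 35.

```python
def evaluate(weights, features):
    """Linear static eval: dot product of weights and features (assumed form)."""
    return sum(w * f for w, f in zip(weights, features))


def gradient(weights, features, target):
    """Gradient of (eval - target)^2 with respect to the weights."""
    err = evaluate(weights, features) - target
    return [2.0 * err * f for f in features]


weights = [100.0, 10.0]     # hypothetical parameters being tuned
x_features = [0.0, 2.0]     # hypothetical features of x (after 1.e4): eval 20
leaf_features = [0.0, 3.5]  # hypothetical features of the PV leaf: eval 35
target = 50.0               # hypothetical training target

# The score 35 comes from the leaf, so the gradient uses the leaf's features.
grad_at_leaf = gradient(weights, leaf_features, target)
# Differentiating the static eval of x itself gives a different gradient.
grad_at_x = gradient(weights, x_features, target)

print(grad_at_leaf, grad_at_x)
```

Under these assumptions the two gradients differ, because the features entering the gradient belong to whichever position produced the score.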