Zenmastur wrote: Mon Jul 27, 2020 6:30 am
Depth 50? I thought about higher depths, but 50 plies is a pipe dream at best. The depths used thus far do seem way too shallow, but because of the number of positions needed I'm not sure how much deeper you can go and still be able to produce sufficient quantities. IIRC SF around the low teens or so seems to be a sweet spot for speed vs depth. I'd have to go back and look at my data as I don't recall exactly why I drew this conclusion. It had something to do with me mass analyzing and/or playing games to depth "x" for use in an opening book. I used a few tricks to speed things up a bit. But greater depth is definitely something to try. It will be interesting to see how much better a net gets just because the positions have been searched deeper. I'll be very surprised if there is a large improvement.
Regards,
Zenmastur
I know that layers of neurons in an NN are very different in many ways to the degrees of a polynomial, and it's naughty to compare the two, but given that I have a lot more expertise in polynomials than in NNs, and that I'm not a well behaved person, I'm going to do it anyway!
The normal way to fit polynomials is least squares: minimise the sum of the squared differences between the data points and the polynomial's value at each point. The higher the degree of the polynomial, the better the fit will be - but this comes at a price: as you add extra degrees to the polynomial, the resulting curve gets lumpy - and it's inevitable that some of the lumps are going to appear in places where you don't want them. The point is very well made in the polynomial fitting tool here. As you increase the degree of the polynomial on the tool (second control on the chart - a green right arrow), you can see the fit getting better and better!
However, when you get to degree 8 and the fit becomes perfect, you can see that the polynomial that made the fit is lumpy and wavy. If your data is exactly right, then this could well be correct - but usually the data is not exactly right, and in the case of chess position evaluations, which should be one of { win, draw, loss }, it ABSOLUTELY IS NOT exactly right!
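For anyone without the tool to hand, here is a rough NumPy sketch of the same effect. The nine data points are made up for illustration (a straight line plus hand-picked errors), not taken from the tool, and `sum_sq_residuals` is just a helper name I chose. It fits degrees 1 to 8 by least squares and prints the sum of squared residuals for each.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Made-up data: nine samples of the line y = 0.5 * x, each with a
# small hand-picked error added (an assumption, not real evaluations).
x = np.arange(9, dtype=float)
noise = np.array([0.3, -0.4, 0.2, -0.3, 0.4, -0.2, 0.3, -0.4, 0.2])
y = 0.5 * x + noise

def sum_sq_residuals(degree):
    """Least-squares fit of the given degree; return the sum of squared residuals."""
    p = Polynomial.fit(x, y, degree)
    return float(np.sum((p(x) - y) ** 2))

degrees = list(range(1, 9))
errors = [sum_sq_residuals(d) for d in degrees]

for d, e in zip(degrees, errors):
    print(f"degree {d}: sum of squared residuals = {e:.6g}")
```

With nine points, a degree-8 polynomial has nine coefficients, so it can pass through every point and the residual drops to zero apart from rounding. That says nothing about how the curve behaves between the points.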
This is why everyone says that using the lowest degree polynomial possible is the way to go, accepting some differences between the data points and the polynomial.
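To put a number on that trade-off, here is a small sketch under the same kind of assumption: the "true" relationship is taken to be a straight line, the errors on the nine samples are hand-picked, and `max_error_vs_truth` is a helper name of my own. It compares a degree-1 fit and a degree-8 fit against the true line on a dense grid, which includes the gaps between the samples.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Made-up example: the "true" relationship is the line y = 0.5 * x,
# and each sample carries a small hand-picked error (not real chess data).
x = np.arange(9, dtype=float)
noise = np.array([0.3, -0.4, 0.2, -0.3, 0.4, -0.2, 0.3, -0.4, 0.2])
y = 0.5 * x + noise

# Dense grid over the same range, to see what happens between the samples.
grid = np.linspace(0.0, 8.0, 161)
truth = 0.5 * grid

def max_error_vs_truth(degree):
    """Fit by least squares, then return the worst deviation from the true line."""
    p = Polynomial.fit(x, y, degree)
    return float(np.max(np.abs(p(grid) - truth)))

low = max_error_vs_truth(1)
high = max_error_vs_truth(8)

print(f"degree 1: worst error against the true line = {low:.3f}")
print(f"degree 8: worst error against the true line = {high:.3f}")
```

The degree-8 curve reproduces every error in the samples, so it cannot be closer to the true line than the largest of those errors, whereas the straight-line fit averages most of them away.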
If the "polynomial degree is like an NN layer" analogy has any truth at all, then too many layers would result in a lot of over-fitting, and an outcome curve that would be much too lumpy in places where you wouldn't want lumps.