You can hope for about 100-150 Elo through tuning material and PSTs. If you make your changes small enough to have any hope of convergence, like 10 Elo at a time, then you're looking at around 10k games per test because of the error margins, not 256.
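A quick way to see where the 10k figure comes from: the 95% error margin on a match score shrinks with the square root of the game count, and converting the score bounds to Elo gives the resolution. A minimal sketch, treating every game as a win/loss (the worst case for variance; draws shrink the margin further):

```python
import math

def elo_from_score(s):
    # logistic Elo model: expected score s maps to an Elo difference
    return -400 * math.log10(1 / s - 1)

def elo_margin(games, score=0.5, z=1.96):
    # 95% error margin in Elo for a match of `games` games,
    # assuming pure win/loss results (worst case for variance)
    se = math.sqrt(score * (1 - score) / games)
    return elo_from_score(score + z * se) - elo_from_score(score)

print(f"{elo_margin(256):.1f}")    # roughly +/- 43 Elo
print(f"{elo_margin(10000):.1f}")  # roughly +/- 7 Elo
```

So a 256-game match can only resolve differences on the order of 40 Elo, while something like 10k games is needed before the margin gets close to the size of a 10 Elo change.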
RFC: There Can Only Be One (Eval Optimization)
Moderator: Ras
- Posts: 2696
- Joined: Tue Aug 30, 2016 8:19 pm
- Full name: Rasmus Althoff
Re: RFC: There Can Only Be One (Eval Optimization)
Rasmus Althoff
https://www.ct800.net
- Posts: 253
- Joined: Mon Aug 26, 2019 4:34 pm
- Location: Clearwater, Florida USA
- Full name: JoAnn Peeler
Re: RFC: There Can Only Be One (Eval Optimization)
WRT convergence, I may be stacking the deck by including what I am calling an "exemplar" in my initial population. The exemplar is constructed from weights derived from other hand-crafted evaluations. This means that at first the exemplar will almost assuredly win most of the tournaments and, as part of the round of 16, will always contribute its weights back into the gene pool when it crosses over with another set of weights. But of course, I don't consider it a successful solution until the tournament winner is not the exemplar. I may also be polluting the final result, since the final solution may look a lot like the exemplar. Still, I think there is enough randomness involved that the solution will be unique, and if I'm not satisfied, I can always regenerate the gene pool and include the winner as the new exemplar to generate a better solution. Of course, that's theoretical until I know whether or not convergence can be achieved in some reasonable amount of time.

lithander wrote: ↑Sun Dec 18, 2022 10:10 pm
My gutfeel is that with 860 individual weights it could take much longer than 100 generations and rather large populations, because each individual weight has only a small impact on the fitness. But that wouldn't be a problem if the generations could be computed very quickly. Sadly, playing matches at a reasonable search depth always takes a lot of cycles. And so I've been wondering if a different fitness function could be used?
For example what I'm currently doing in Leorik is that I use gradient descent to minimize the mean-squared error of my evaluation function over a set of annotated positions. That converges to a usable set of weights within minutes. And now I'm wondering: What if I used GA instead of gradient descent? As a fitness function the MSE of the eval on a set of positions doesn't sound too bad.
On the other hand, creating a fitness function derived from the process used by Texel might work even better. The iterative process of reducing the sum of errors could instead be turned over to the stochastic process of the GA. And, as I have seen on other problem sets, it may produce good solutions much faster on average.
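The combination discussed above — keeping the GA (with an exemplar seeded into the population) but swapping game-playing tournaments for a Texel-style MSE-over-positions fitness — can be sketched in a few lines. Everything here is illustrative: the feature vectors, the scaling constant K, the population sizes, and the uniform-crossover/Gaussian-mutation operators are assumptions, and the "positions" are synthetic stand-ins for a real annotated set:

```python
import random
import math

random.seed(42)

N_WEIGHTS = 8    # toy stand-in for the ~860 real eval weights
POP_SIZE = 32
GENERATIONS = 40
K = 0.5          # Texel-style scaling constant (assumed value)

# A hidden target weight vector generates consistent labels for
# synthetic "annotated positions": feature vectors plus results in [0, 1].
target = [random.uniform(-1, 1) for _ in range(N_WEIGHTS)]

def evaluate(weights, features):
    # toy linear evaluation: dot product of weights and features
    return sum(w * f for w, f in zip(weights, features))

def sigmoid(score):
    # map an eval score to an expected game result in [0, 1]
    return 1.0 / (1.0 + 10 ** (-K * score))

positions = []
for _ in range(300):
    feats = [random.uniform(-1, 1) for _ in range(N_WEIGHTS)]
    positions.append((feats, sigmoid(evaluate(target, feats))))

def mse(weights):
    # Texel-style fitness: mean squared error of predicted vs. actual results
    err = sum((res - sigmoid(evaluate(weights, f))) ** 2 for f, res in positions)
    return err / len(positions)

def crossover(a, b):
    # uniform crossover: each gene comes from either parent
    return [random.choice(pair) for pair in zip(a, b)]

def mutate(w, rate=0.1, sigma=0.05):
    return [x + random.gauss(0, sigma) if random.random() < rate else x
            for x in w]

# Seed the population with an "exemplar" (here: a noisy copy of the target,
# standing in for weights borrowed from hand-crafted evals) plus randoms.
exemplar = [t + random.gauss(0, 0.3) for t in target]
population = [exemplar] + [
    [random.uniform(-1, 1) for _ in range(N_WEIGHTS)]
    for _ in range(POP_SIZE - 1)
]

for gen in range(GENERATIONS):
    population.sort(key=mse)                # lower MSE = fitter
    survivors = population[:POP_SIZE // 4]  # elitism, "round of 16"-style
    children = []
    while len(survivors) + len(children) < POP_SIZE:
        a, b = random.sample(survivors, 2)
        children.append(mutate(crossover(a, b)))
    population = survivors + children

best = min(population, key=mse)
print(f"exemplar MSE: {mse(exemplar):.6f}  best MSE: {mse(best):.6f}")
```

Because the fittest individual always survives, the final winner can never be worse than the seeded exemplar, which mirrors the concern above: the result may simply look a lot like the exemplar unless the GA finds genuinely better weights.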