You can hope for about 100-150 Elo through tuning material and PSTs. If you make your changes small enough to have any hope of convergence, like 10 Elo at a time, then you're looking at around 10k games per test because of the error margins, not 256.
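A quick way to see where the 10k figure comes from: the 95% error margin on a match score shrinks with the square root of the game count, and converting the score bounds to Elo gives the resolution. A minimal sketch, treating every game as a win/loss (the worst case for variance; draws shrink the margin further):

```python
import math

def elo_from_score(s):
    # logistic Elo model: expected score s maps to an Elo difference
    return -400 * math.log10(1 / s - 1)

def elo_margin(games, score=0.5, z=1.96):
    # 95% error margin in Elo for a match of `games` games,
    # assuming pure win/loss results (worst case for variance)
    se = math.sqrt(score * (1 - score) / games)
    return elo_from_score(score + z * se) - elo_from_score(score)

print(f"{elo_margin(256):.1f}")    # roughly +/- 43 Elo
print(f"{elo_margin(10000):.1f}")  # roughly +/- 7 Elo
```

So a 256-game match can only resolve differences on the order of 40 Elo, while something like 10k games is needed before the margin gets close to the size of a 10 Elo change.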
RFC: There Can Only Be One (Eval Optimization)
Moderator: Ras
- Posts: 2696
- Joined: Tue Aug 30, 2016 8:19 pm
- Full name: Rasmus Althoff
Re: RFC: There Can Only Be One (Eval Optimization)
Rasmus Althoff
https://www.ct800.net
- Posts: 253
- Joined: Mon Aug 26, 2019 4:34 pm
- Location: Clearwater, Florida USA
- Full name: JoAnn Peeler
Re: RFC: There Can Only Be One (Eval Optimization)
WRT convergence, I may be stacking the deck by including what I am calling an "exemplar" in my initial population. The exemplar is constructed from weights derived from other hand-crafted evaluations. This means that at first the exemplar will almost assuredly win most of the tournaments and, as part of the round of 16, will always contribute its weights back into the gene pool when it crosses over with another set of weights. But of course, I don't consider it a successful solution until the tournament winner is not the exemplar. I may also be polluting the final result, since the final solution may look a lot like the exemplar. Still, I think there is enough randomness involved that the solution will be unique, and if I'm not satisfied, I can always regenerate the gene pool and include the winner as the new exemplar to generate a better solution. Of course, that's theoretical until I know whether or not convergence can be achieved in some reasonable amount of time.

lithander wrote: ↑Sun Dec 18, 2022 10:10 pm
My gutfeel is that with 860 individual weights it could take much longer than 100 generations and rather large populations, because each individual weight has only a small impact on the fitness. But that wouldn't be a problem if the generations could be computed very quickly. Sadly, playing matches at a reasonable search depth always takes a lot of cycles. And so I've been wondering if a different fitness function could be used?
For example what I'm currently doing in Leorik is that I use gradient descent to minimize the mean-squared error of my evaluation function over a set of annotated positions. That converges to a usable set of weights within minutes. And now I'm wondering: What if I used GA instead of gradient descent? As a fitness function the MSE of the eval on a set of positions doesn't sound too bad.
On the other hand, creating a fitness function derived from the process used by Texel might work even better. The iterative process of reducing the sum of errors could instead be turned over to the stochastic process of the GA. And, as I have seen on other problem sets, it may produce good solutions much faster on average.
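The combination discussed above — keeping the GA (with an exemplar seeded into the population) but swapping game-playing tournaments for a Texel-style MSE-over-positions fitness — can be sketched in a few lines. Everything here is illustrative: the feature vectors, the scaling constant K, the population sizes, and the uniform-crossover/Gaussian-mutation operators are assumptions, and the "positions" are synthetic stand-ins for a real annotated set:

```python
import random
import math

random.seed(42)

N_WEIGHTS = 8    # toy stand-in for the ~860 real eval weights
POP_SIZE = 32
GENERATIONS = 40
K = 0.5          # Texel-style scaling constant (assumed value)

# A hidden target weight vector generates consistent labels for
# synthetic "annotated positions": feature vectors plus results in [0, 1].
target = [random.uniform(-1, 1) for _ in range(N_WEIGHTS)]

def evaluate(weights, features):
    # toy linear evaluation: dot product of weights and features
    return sum(w * f for w, f in zip(weights, features))

def sigmoid(score):
    # map an eval score to an expected game result in [0, 1]
    return 1.0 / (1.0 + 10 ** (-K * score))

positions = []
for _ in range(300):
    feats = [random.uniform(-1, 1) for _ in range(N_WEIGHTS)]
    positions.append((feats, sigmoid(evaluate(target, feats))))

def mse(weights):
    # Texel-style fitness: mean squared error of predicted vs. actual results
    err = sum((res - sigmoid(evaluate(weights, f))) ** 2 for f, res in positions)
    return err / len(positions)

def crossover(a, b):
    # uniform crossover: each gene comes from either parent
    return [random.choice(pair) for pair in zip(a, b)]

def mutate(w, rate=0.1, sigma=0.05):
    return [x + random.gauss(0, sigma) if random.random() < rate else x
            for x in w]

# Seed the population with an "exemplar" (here: a noisy copy of the target,
# standing in for weights borrowed from hand-crafted evals) plus randoms.
exemplar = [t + random.gauss(0, 0.3) for t in target]
population = [exemplar] + [
    [random.uniform(-1, 1) for _ in range(N_WEIGHTS)]
    for _ in range(POP_SIZE - 1)
]

for gen in range(GENERATIONS):
    population.sort(key=mse)                # lower MSE = fitter
    survivors = population[:POP_SIZE // 4]  # elitism, "round of 16"-style
    children = []
    while len(survivors) + len(children) < POP_SIZE:
        a, b = random.sample(survivors, 2)
        children.append(mutate(crossover(a, b)))
    population = survivors + children

best = min(population, key=mse)
print(f"exemplar MSE: {mse(exemplar):.6f}  best MSE: {mse(best):.6f}")
```

Because the fittest individual always survives, the final winner can never be worse than the seeded exemplar, which mirrors the concern above: the result may simply look a lot like the exemplar unless the GA finds genuinely better weights.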