Reinforcement learning to tune handcrafted evals

Discussion of chess software programming and technical issues.

Moderator: Ras

mar
Posts: 2655
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: Reinforcement learning to tune handcrafted evals

Post by mar »

dangi12012 wrote: Thu Dec 02, 2021 9:30 pm Nah I have been programming in different fields my whole life. https://github.com/Gigantua/Raytrace
lol, that is so underwhelming... so you are completely clueless even outside the domain of chess, interesting.
You overestimate yourself by several orders of magnitude; I'd say a textbook example of Dunning-Kruger.

(pushing binaries like pdb and obj to a github repo also speaks volumes)

btw calling a dumb raytracer a "game engine", good one!

so - you keep spamming the programming forum with nonsense, feeling the urge to comment on every thread - that is what I call trolling.

you called a forum member a racist for no reason at all (apparently that's a serious allegation), and you keep pointing fingers at others, calling them trolls,
when in reality you're the one trolling here - your behavior apparently annoys a lot of people.
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: Reinforcement learning to tune handcrafted evals

Post by dangi12012 »

mar wrote: Fri Dec 03, 2021 9:01 am
Salty mind. I feel sad for you.
Also, you didn't provide any information to the OP. You're just trying to derail the topic.

Moderators should enforce forum etiquette. It's so easy to criticize - while it's hard to build something yourself!

Back to topic: I gave advice and no one refuted it - so it is sound.
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
User avatar
j.t.
Posts: 263
Joined: Wed Jun 16, 2021 2:08 am
Location: Berlin
Full name: Jost Triller

Re: Reinforcement learning to tune handcrafted evals

Post by j.t. »

dangi12012 wrote: Fri Dec 03, 2021 1:44 pm Back to topic: I gave advice and no one refuted it - so it is sound.
Regarding the genetic optimization: I don't think this is directly related to reinforcement learning. Sure, you can combine genetic algorithms with reinforcement learning, but I believe that for many HCEs the gradient descent approach may be more performant. There is a reason why, in practice, most neural networks are tuned using backpropagation and not genetic algorithms.
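For a handcrafted eval, the gradient-descent tuning mentioned above is often done Texel-style: fit the eval's weights so that a sigmoid of the score predicts game results. A minimal Python sketch under stated assumptions - the feature vectors, results, loss, and learning rate here are illustrative toys, not any engine's actual tuner:

```python
import math

def texel_loss(weights, positions, k=1.0):
    """Mean squared error between sigmoid(eval score) and game results.

    positions: list of (features, result), where features is a vector of
    handcrafted-eval feature counts and result is 0, 0.5, or 1.
    """
    loss = 0.0
    for features, result in positions:
        score = sum(w * f for w, f in zip(weights, features))
        predicted = 1.0 / (1.0 + math.exp(-k * score))
        loss += (predicted - result) ** 2
    return loss / len(positions)

def gradient_descent(weights, positions, lr=0.1, steps=200, k=1.0):
    """Tune eval weights by following the analytic gradient of the loss."""
    n = len(positions)
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for features, result in positions:
            score = sum(w * f for w, f in zip(weights, features))
            p = 1.0 / (1.0 + math.exp(-k * score))
            # d/dw of (p - result)^2, via the sigmoid derivative p*(1-p)
            common = 2.0 * (p - result) * p * (1.0 - p) * k / n
            for i, f in enumerate(features):
                grad[i] += common * f
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights
```

Because the loss is differentiable in the weights, each step uses information from every training position at once, which is the usual argument for preferring it over purely random mutation.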
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: Reinforcement learning to tune handcrafted evals

Post by dangi12012 »

j.t. wrote: Fri Dec 03, 2021 2:28 pm
dangi12012 wrote: Fri Dec 03, 2021 1:44 pm Back to topic: I gave advice and no one refuted it - so it is sound.
Regarding the genetic optimization: I don't think this is directly related to reinforcement learning. Sure, you can combine genetic algorithms with reinforcement learning, but I believe that for many HCEs the gradient descent approach may be more performant. There is a reason why, in practice, most neural networks are tuned using backpropagation and not genetic algorithms.
Yes, they all gradually descend into one of the very many local optima. It's like climbing a mountain where you don't reach the top because you get stuck on the very first hill.
You need a lot of random starting values, run gradient descent from each of them, and then pick the best out of the pool.
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
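The multi-start idea above can be sketched in a few lines of Python. The 1-D objective here is a made-up stand-in for an engine's tuning loss (many local minima plus a global trend); the learning rate and start counts are likewise illustrative:

```python
import math
import random

def landscape(x):
    """A 1-D objective with many local minima (hypothetical stand-in
    for a tuning loss)."""
    return math.sin(5 * x) + 0.1 * x * x

def descend(x, lr=0.01, steps=500, eps=1e-6):
    """Plain gradient descent using a central-difference derivative."""
    for _ in range(steps):
        grad = (landscape(x + eps) - landscape(x - eps)) / (2 * eps)
        x -= lr * grad
    return x

def multi_start_descent(n_starts=20, lo=-5.0, hi=5.0, seed=42):
    """Run gradient descent from many random starting points and keep
    the best result, as described in the post above."""
    rng = random.Random(seed)
    candidates = [descend(rng.uniform(lo, hi)) for _ in range(n_starts)]
    return min(candidates, key=landscape)
```

A single descent from an unlucky start gets trapped in the first hill it rolls into; sampling many starts makes it likely at least one lands in a basin near the global minimum.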
Madeleine Birchfield
Posts: 512
Joined: Tue Sep 29, 2020 4:29 pm
Location: Dublin, Ireland
Full name: Madeleine Birchfield

Re: Reinforcement learning to tune handcrafted evals

Post by Madeleine Birchfield »

dangi12012 wrote: Wed Dec 01, 2021 4:48 pm Sure, I did that already. Normally you tune the weights of a neural network - but you can also tune the parameters of your algorithm via genetic optimisation.

It's really just a Darwinian approach with multiple populations (so as not to get stuck in a local minimum).

So you generate 5 populations of N engines with randomized seeds.
Then you find out which 10% performed best and let the rest die (die = copy the best 10% over the other 90% and slightly change some values again).
Then you repeat these steps and also enable crossover - where you take the best engine of each population, cross some of its "dna" (float values) with another's (this makes training faster), and copy the result over an engine marked for deletion.

This mirrors the real-world optimisation of life - reproduction, mutation, selection - and will optimise towards a score. Each population will get stuck in a (good) local minimum, but crossbreeding among populations will find even better solutions.
I'm already aware of what genetic algorithms and population-based training methods are, but they are not really related to the question at hand.
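For completeness, the quoted multi-population scheme (elitist selection, mutation, and crossover between population champions) can be sketched as follows. The fitness function, population sizes, mutation scale, and crossover schedule are all illustrative assumptions, not anyone's actual tuning setup:

```python
import random

def evolve(fitness, dim=4, n_pops=5, pop_size=20, generations=100,
           elite_frac=0.1, sigma=0.1, seed=1):
    """Multi-population genetic optimisation: keep the best 10% of each
    population, overwrite the rest with slightly mutated copies of the
    elites, and periodically cross the champions of different populations.
    """
    rng = random.Random(seed)
    pops = [[[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(pop_size)]
            for _ in range(n_pops)]
    n_elite = max(1, int(elite_frac * pop_size))
    for gen in range(generations):
        for pop in pops:
            pop.sort(key=fitness, reverse=True)  # best individuals first
            # selection + mutation: elites replace the rest, slightly perturbed
            for i in range(n_elite, pop_size):
                parent = pop[i % n_elite]
                pop[i] = [g + rng.gauss(0, sigma) for g in parent]
        if gen % 10 == 0:
            # crossover: mix each champion's "dna" with the next population's
            # champion and copy the child over an individual marked for deletion
            champs = [pop[0] for pop in pops]
            for pop, partner in zip(pops, champs[1:] + champs[:1]):
                child = [a if rng.random() < 0.5 else b
                         for a, b in zip(pop[0], partner)]
                pop[-1] = child
    return max((ind for pop in pops for ind in pop), key=fitness)
```

In an engine-tuning setting, `fitness` would be match performance of an engine built from the parameter vector, which is far more expensive to evaluate than the toy function used here.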