Mike Sherwin wrote: ↑Mon Oct 14, 2024 5:11 pm
> Thanks for the explanation. It still sounds static. Once the NN is trained with a billion positions it just creates static tables in that one huge set of generalized values is used for each specific position. That cannot be optimal. Spending some time in the beginning of a search to learn better values for the tables for the specific position on the board will destroy a static-only NN.

I guess the idea behind it is that a large 'library' of static evaluations (even if only of the simplistic PST type) can mimic a dynamic evaluation very well. If the billion positions it has been trained on cover nearly every situation one encounters in games, it can have learned and remembered the 'dynamic' evaluation that belongs to that situation. Combined with learning to recognize the characteristics that must be present to warrant use of a certain evaluation (e.g. many/few pieces present, good/poor King Safety, being ahead/behind in material, having a Pawn majority/minority...), it can then just draw the 'dynamic' evaluation parameters from the library rather than having to learn them from scratch during the search of the position.
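To make the idea concrete, here is a minimal sketch (with made-up centipawn values, not from any real engine) of a 'library' of piece-square tables bucketed by one simple characteristic, the number of pieces on the board. The engine just picks the bucket that matches the position instead of relearning values during the search:

```python
# A tiny 'library' of static PST evaluations, selected by a position
# characteristic. All numbers are illustrative, not tuned values.

def chebyshev_to_center(sq):
    """Distance (king moves) from square 0..63 to the nearest center square."""
    file, rank = sq % 8, sq // 8
    df = max(3 - file, file - 4)
    dr = max(3 - rank, rank - 4)
    return max(df, dr)

# Middlegame bucket: the king prefers to stay tucked on its back rank.
MIDGAME_KING = [-15 * (sq // 8) for sq in range(64)]
# Endgame bucket: the king prefers to centralize.
ENDGAME_KING = [-10 * chebyshev_to_center(sq) for sq in range(64)]

def king_bonus(square, piece_count):
    """Draw the 'dynamic' value from the library based on piece count."""
    table = ENDGAME_KING if piece_count <= 10 else MIDGAME_KING
    return table[square]
```

With enough buckets over enough characteristics, the table lookup starts to behave like a dynamic evaluation: `king_bonus(27, 6)` (king on d4 in an endgame) scores 0, while the same square with 32 pieces on the board is penalized for the exposed king.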
It is a bit like End-Game Tables. You can either let an engine search the position for some time in order to find the fastest path to checkmate, or you can pre-calculate that for every position and put it in a table, so that the engine only has to probe the table.
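The trade-off can be shown with a toy game instead of chess (a Nim-like pile game, purely for illustration): solving a position by search at probe time and probing a precomputed table give identical answers, but the table probe is a single lookup.

```python
# Toy illustration of the tablebase trade-off: a pile of n stones, a move
# takes 1 or 2 stones, and whoever takes the last stone wins.
from functools import lru_cache

@lru_cache(maxsize=None)
def search_win(n):
    """Derive the result by searching at probe time: does the side
    to move win with n stones left?"""
    if n == 0:
        return False  # the previous player took the last stone; we lose
    return any(not search_win(n - take) for take in (1, 2) if take <= n)

def build_table(max_n):
    """Pre-calculate the same information once, like an endgame tablebase."""
    table = [False] * (max_n + 1)
    for n in range(1, max_n + 1):
        table[n] = any(not table[n - take] for take in (1, 2) if take <= n)
    return table

TABLE = build_table(100)
# search_win(n) and TABLE[n] always agree, but probing TABLE is O(1).
```

Both methods agree everywhere (in this game, exactly the multiples of 3 are lost for the side to move); the only difference is when the work is done.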
Even though it might not be optimal, it appears to be a whole lot better than even the most extensive hand-crafted evaluation (HCE) anyone ever created.