I don’t think I said someone should use probabilities within the evaluation function, since I also realised it would probably be too costly to be worth it. If I were writing the evaluation function I might go for probabilities anyway (just for the fun of it) and not worry about the huge extra cost; I would probably just define a couple of operator overloads to keep it simple for me. What I said, or wanted to say, is that there is no problem using probabilities in a NegaMax alpha-beta framework, and that it could make the storage needed for a TT entry smaller, which might make it possible to squeeze an extra entry into every bucket. That should probably be worth a couple of extra Elo. The other advantage I realised is that it might be better to use probabilities in pruning or reduction thresholds, for example.

Alayan wrote: ↑Mon Jul 06, 2020 6:31 pm
The main point of evaluation is to produce position ordering. If position A has a better eval than position B, prefer position A.

Pio wrote: ↑Sun Jul 05, 2020 10:13 pm
That was not my point, however. My point is that it is simple to convert an existing alpha-beta engine to work with win probabilities (or should I say win-or-draw probabilities, to satisfy you). Using my way of probabilities in alpha-beta has many obvious advantages, as I mentioned in my previous post. An additional gain is that you can compress the storage space for the transposition table, since 10 bits should be more than enough and 8 bits might be sufficient. With 8 bits you could let 1 bit say whether the score is a special score or not, and use the remaining 7 bits for either the “win or draw” probability with a granularity of less than 1 %, or the distance to mate.
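For concreteness, here is a minimal sketch of the 8-bit packed score described just above. Everything in it (the struct name, the helpers, the exact quantisation) is my own illustrative assumption, not code from any existing engine:

```cpp
// Illustrative sketch only: packing a score into 8 bits as described above.
// The high bit flags a "special" (mate) score; the low 7 bits hold either a
// win-or-draw probability quantised to 1/127 steps (< 1% granularity) or the
// distance to mate in plies.
#include <cstdint>
#include <cmath>

struct PackedScore {
    std::uint8_t bits;

    static PackedScore fromProbability(double p) {        // p in [0, 1]
        return { static_cast<std::uint8_t>(std::lround(p * 127.0)) };
    }
    static PackedScore fromMateDistance(int plies) {      // plies in [0, 127]
        return { static_cast<std::uint8_t>(0x80 | (plies & 0x7F)) };
    }
    bool   isMate()       const { return (bits & 0x80) != 0; }
    int    mateDistance() const { return bits & 0x7F; }
    double probability()  const { return (bits & 0x7F) / 127.0; }
};
```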
The secondary point of evaluation is to guide search. Some feature might not be worth the weight it is assigned, and its value would be incorrect in a leaf node that gets backed up to the root, but it will push the search to spend more effort on positions with that feature, and descendant leaf nodes where the feature is no longer present will tell whether there was actually something there or not.
Internal winpct gives no advantage over standard internal units in either case. Increased granularity close to 0.00 isn't an advantage if your evaluation is too inaccurate to take meaningful advantage of it anyway. Meanwhile, you have just made serial computation of the position's evaluation a total headache, as you can't simply add winpct the way one can add cp. If you use conversion functions to go back and forth from an additive model (or something equivalent), then you're just wasting a lot of energy on useless computations. If you don't go with a linear model, there is nothing "simple" about converting an existing engine. It will be very complex, which also means hard to tune and improve, and you'll lose Elo, because even if the model allows values good enough to stay on par with the linear one, you won't find them.
Besides, centipawn output makes no false promise. It gives an estimated advantage, but if someone with a clue makes the mental effort to think of it in winning probabilities, contextual information will be used - type of position, engine depth, eval trends...
Meanwhile, with raw WDL output, many people make the gross mistake of forgetting context. The actual WDL values will be off for almost all situations. They are tuned for training conditions, but even so, they are only a guess. WDL may look more serious, but if the engine is missing something important, its numbers will be way off. And for any games played in different conditions, the WDL predictions are not applicable. If Stockfish proclaims +2 in a human blitz game position that White goes on to lose, the interpretation "White blundered a +2 position" still stands. If Stockfish were to proclaim a 95% win instead, the interpretation "White blundered a 95% win position" would be completely wrong, because in the context of that game there never was a position that was a 95% win for White.
That's completely irrelevant to the discussion at hand. The NN takes input features, then outputs an eval in SF-internal units. It doesn't need winpct to work at all, because winpct gives nothing for position ordering and search exploration - and to perform well with SF's search without heavy modifications, the eval needs similarities with SF's original eval.

Milos wrote: ↑Mon Jul 06, 2020 2:01 pm
No need, the NN eval for SF is already there, and at fixed nodes it is already noticeably stronger than the original SF eval.

syzygy wrote: ↑Sun Jul 05, 2020 10:47 pm
But I hope you do realise that you will have to rewrite Stockfish's evaluation almost completely. You can't just convert SF's current (usually additive) scoring components into probabilities or "probability components". (And this thread has "Stockfish" in the title.)
Converting SF's hand-written eval to use winpct instead of cp is what was suggested, and it's infeasible without severe Elo loss.
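To make the cp-to-winpct round trip that Alayan mentions concrete, here is a minimal sketch using the usual logistic model (the 400-point scale below is an assumption; every engine tunes its own curve). It also shows why probability-valued terms are not additive: "adding" two of them requires converting to an additive unit such as cp and back.

```cpp
// Sketch of a cp <-> win-probability conversion via a logistic model.
// The scale constant is an illustrative assumption, not an engine's value.
#include <cmath>

double cpToWinProb(double cp, double scale = 400.0) {
    return 1.0 / (1.0 + std::pow(10.0, -cp / scale));
}

double winProbToCp(double p, double scale = 400.0) {
    // p must be strictly between 0 and 1.
    return -scale * std::log10(1.0 / p - 1.0);
}

// Combining two probability-valued terms needs a round trip through cp:
double combine(double p1, double p2) {
    return cpToWinProb(winProbToCp(p1) + winProbToCp(p2));
}
```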
What I wanted to say with my posts is that there is nothing in an alpha-beta framework that prevents you from using probabilities, in the same way that there is nothing preventing a neural network from using alpha-beta instead of an MCTS-like algorithm.
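A minimal sketch of what I mean, using a completely hypothetical Position/Move API (none of these names come from a real engine): with scores in [0, 1], negamax negation simply becomes p -> 1 - p, and the alpha-beta window flips the same way.

```cpp
// Negamax alpha-beta with probability scores in [0, 1].
// Position, Move and the four helper functions are assumed to be provided
// by the engine; they are only declared here to keep the sketch short.
#include <vector>

struct Position { /* engine-specific state, assumed */ };
struct Move     { /* engine-specific move, assumed */ };

std::vector<Move> generateMoves(const Position&);
void   makeMove(Position&, const Move&);
void   unmakeMove(Position&, const Move&);
double evaluateWinProb(const Position&);   // P(win or draw) for the side to move

double negamaxProb(Position& pos, int depth, double alpha, double beta) {
    if (depth == 0)
        return evaluateWinProb(pos);

    double best = 0.0;                     // worst case for the side to move
    for (const Move& m : generateMoves(pos)) {
        makeMove(pos, m);
        // Child score is from the opponent's view; 1 - score converts it back.
        double score = 1.0 - negamaxProb(pos, depth - 1, 1.0 - beta, 1.0 - alpha);
        unmakeMove(pos, m);

        if (score > best)  best  = score;
        if (best  > alpha) alpha = best;
        if (alpha >= beta) break;          // beta cutoff, exactly as with cp scores
    }
    return best;                           // (no legal moves / mate handling in this sketch)
}
```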
Maybe I will make a neural network engine in the future. I have an idea of how to do it. I would not use many of the techniques that are used for image classification, and I think I could make a network that is very small but still okay. The big problem would be coming up with a good way to train it. Maybe it is best to start training it on known tablebase positions and mate-in-n problems, and after it has learned some basic knowledge of mating material and mating patterns it could learn from playing itself, weighting the positions close to the end of the game more.
If you want some ideas that have the potential to give big Elo gains, I could share them.