100 games is not enough to draw conclusions: the statistical uncertainty in the result (+/- 28.3) is larger than the measured Elo difference. I personally play around 2000 or more games in my tests, on multiple threads (my computer has 6 cores, 12 threads), and leave the computer running overnight at the expense of a higher electricity bill.

algerbrex wrote: ↑Fri Jul 16, 2021 1:02 pm

Score of BlunderOld vs Blunder 1.0.0: 28 - 41 - 131 [0.468]
...      BlunderOld playing White: 17 - 19 - 64 [0.490] 100
...      BlunderOld playing Black: 11 - 22 - 67 [0.445] 100
...      White vs Black: 39 - 30 - 131 [0.522] 200
Elo difference: -22.6 +/- 28.3, LOS: 5.9 %, DrawRatio: 65.5 %
200 of 200 games finished.
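For illustration, here is how an error margin like the +/- 28.3 above can be derived from the raw win/draw/loss counts. This is a sketch using a normal approximation of the score distribution; the exact method cutechess-cli uses may differ in detail:

```python
import math

def elo(score):
    """Convert a score fraction (0..1) to an Elo difference."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_error(wins, losses, draws):
    """Elo difference and an approximate 95% confidence margin."""
    n = wins + losses + draws
    s = (wins + 0.5 * draws) / n  # score fraction
    # per-game variance of the score around its mean
    var = (wins * (1.0 - s) ** 2
           + draws * (0.5 - s) ** 2
           + losses * s ** 2) / n
    se = math.sqrt(var / n)       # standard error of the mean score
    margin = (elo(s + 1.96 * se) - elo(s - 1.96 * se)) / 2.0
    return elo(s), margin

diff, margin = elo_with_error(28, 41, 131)
print(f"{diff:+.1f} +/- {margin:.1f}")
```

With the 200-game result quoted above this reproduces roughly -22.6 +/- 28.3, i.e. the error bar swallows the measured difference.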
I found the article on CPW pretty good and sufficient to implement the tuner. The only other resource I used was a set of quiet positions from zurichess available here: https://bitbucket.org/zurichess/tuner/downloads/
Do you have any specific questions? Imo it's not conceptually very difficult, nor does it require deep math skills, if you are okay with using simple local optimization. Local optimization really is just trying out different parameter values and keeping those that performed best.
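A minimal sketch of such a local search, assuming a `mean_squared_error(params)` helper (defined further below) that scores a full parameter set:

```python
def local_search(params, mean_squared_error, step=1):
    """Nudge each parameter up/down by `step`; keep changes that lower the error.
    Repeats full passes over the parameters until no nudge helps anymore."""
    best_error = mean_squared_error(params)
    improved = True
    while improved:
        improved = False
        for i in range(len(params)):
            for delta in (step, -step):
                params[i] += delta
                error = mean_squared_error(params)
                if error < best_error:
                    best_error = error
                    improved = True
                else:
                    params[i] -= delta  # revert the unhelpful change
    return params, best_error
```

This finds a local minimum only, but for PST tuning that is usually good enough as a first implementation.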
And what does "perform best" mean? After each change you iterate over all positions in your set of annotated positions and evaluate each one with the current set of PST values. Then you compare the evaluation of each position with the outcome of its game, and you try to improve the average predictive quality of your PST-based evaluation.
How do you improve the predictive quality of the PST? You try to minimize the mean squared error over all positions, where the per-position error is a number between 0 and 2 expressing how much the evaluation mispredicted the actual outcome of the game.
How do you compare an evaluation given in centipawns with the outcome of the game, given as -1, 0, or 1? That's where the sigmoid function comes into play: it maps the centipawn score into the interval (-1, 1). Look at a plot of the sigmoid, and see how it changes when you vary what the article calls K. So before you can start with tuning you need to settle on a value of K you want to use.
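Following the -1/0/+1 convention used here (note the CPW article uses 0/0.5/1 instead, which only shifts and scales the same idea), the sigmoid and the resulting error function might look like this. The `evaluate` callback and the choice of K are placeholders you must supply yourself:

```python
def sigmoid(cp, k):
    """Map a centipawn score to (-1, 1); K controls the steepness."""
    return 2.0 / (1.0 + 10.0 ** (-k * cp / 400.0)) - 1.0

def mean_squared_error(positions, evaluate, k):
    """positions: list of (position, outcome) pairs, outcome in {-1, 0, 1}.
    evaluate(position) returns the PST-based score in centipawns."""
    total = sum((outcome - sigmoid(evaluate(pos), k)) ** 2
                for pos, outcome in positions)
    return total / len(positions)
```

Note that `sigmoid(0, k)` is exactly 0 (a drawish score predicts a draw), and large positive or negative scores approach +1 or -1, matching the game outcomes.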
And then you just write code that changes the values in your PST until no further change gets the average error closer to 0. (Consider multithreading!) Then you're done.
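The multithreading hint can be as simple as splitting the position set into chunks and summing the errors in parallel. An illustrative sketch, with the positions pre-evaluated into (evaluation, outcome) pairs to keep it short; in CPython, threads only help if the evaluation releases the GIL, so a pure-Python tuner would use processes instead, but the split-and-sum idea carries over directly to a compiled engine language:

```python
from concurrent.futures import ThreadPoolExecutor

def error_sum(chunk):
    """Sum of squared errors over one chunk of (evaluation, outcome) pairs."""
    return sum((outcome - evaluation) ** 2 for evaluation, outcome in chunk)

def parallel_mse(pairs, workers=4):
    """Split the position set into `workers` chunks, sum errors in parallel."""
    chunks = [pairs[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(error_sum, chunks)) / len(pairs)
```

Since the error of each position is independent of the others, this parallelizes with essentially no coordination beyond the final sum.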
That's the gist of it.