Eval development: is it better to tune or add new terms?

sedicla · Post by **sedicla** » Sun Mar 17, 2013 9:01 pm

My engine has an Elo of about 2500. I was wondering whether I should focus more on tuning the current evaluation terms or on adding new ideas.
I am going to start a new development cycle, and I'm not sure how much I can improve by tuning the current items, or when I should stop. I have a feeling that I can improve, but maybe I will waste time and not improve that much. On the other hand, if I introduce new items, I will add more variables to the process. Anyway, I would appreciate it if anyone could comment on that...

ZirconiumX · Post by **ZirconiumX** » Sun Mar 17, 2013 10:23 pm

Tuning what you've got is the classical "beancounter" approach with a fast but dumb approach, whereas adding more untuned terms is the Diep approach - slow but smart.

The latter is not seen very often nowadays - it would be interesting to see a publicly released slow but smart engine.

Matthew:out

velmarin · Post by **velmarin** » Sun Mar 17, 2013 10:26 pm

It seems that some basic settings are essential.

Other ideas do not require much adjustment; they are rather hints to the engine: you can do this, evaluate this move, etc.

You answered it yourself: trying new things will also be more fun.

lucasart · Post by **lucasart** » Mon Mar 18, 2013 12:15 am

sedicla wrote:My engine has an Elo of about 2500. I was wondering whether I should focus more on tuning the current evaluation terms or on adding new ideas.
I am going to start a new development cycle, and I'm not sure how much I can improve by tuning the current items, or when I should stop. I have a feeling that I can improve, but maybe I will waste time and not improve that much. On the other hand, if I introduce new items, I will add more variables to the process. Anyway, I would appreciate it if anyone could comment on that...

Adding more and more eval terms is rarely the right way to go. You should have few terms, that are as orthogonal as possible and well tuned. Often a clumsy eval can be improved by simply removing useless and/or harmful terms.
Testing is crucial: every patch should be tested (non-functional patches should be verified by node count, and functional ones by playing thousands of games to make sure the improvement is measurable beyond the error bar).
Probably the most important eval terms, in my experience, are:
- mobility
- passed pawns
- king safety
- hanging pieces
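
To make "measurable beyond the error bar" concrete, here is a minimal Python sketch that turns a win/draw/loss record into an Elo difference with an approximate 95% interval. The function names, the logistic Elo model and the normal approximation are assumptions chosen for illustration, not taken from any particular testing tool.

```python
import math

def score_to_elo(score):
    """Convert a score fraction in (0, 1) to an Elo difference (logistic model)."""
    # Clamp so that 0% or 100% results do not divide by zero or take log(0).
    score = min(max(score, 1e-6), 1.0 - 1e-6)
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_interval(wins, draws, losses, z=1.96):
    """Return (elo, low, high) for a match result; z=1.96 is roughly 95%."""
    n = wins + draws + losses
    if n == 0:
        raise ValueError("no games played")
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (wins * (1.0 - score) ** 2
                + draws * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2) / n
    stderr = math.sqrt(variance / n)
    return (score_to_elo(score),
            score_to_elo(score - z * stderr),
            score_to_elo(score + z * stderr))

def is_measurable_gain(wins, draws, losses):
    """True only if the whole interval lies above zero Elo."""
    _, low, _ = elo_interval(wins, draws, losses)
    return low > 0.0
```

With this kind of check, a 52-48 result over 100 games is still inside the noise, which is why thousands of games are needed for small eval changes.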

lucasart · Post by **lucasart** » Mon Mar 18, 2013 12:19 am

ZirconiumX wrote:Tuning what you've got is the classical "beancounter" approach with a fast but dumb approach, whereas adding more untuned terms is the Diep approach - slow but smart.

I completely disagree with that statement. Tuning is smart, and adding is dumb! And since when is adding eval terms called the "Diep approach"? Tuning is much more than just optimizing values. It's also understanding interactions between eval features, and addressing those by trying to minimize the number of eval terms and making them more orthogonal.
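
One rough way to look at how orthogonal two terms are is to log their values over many positions and compute the correlation between them. A minimal Python sketch follows; the feature names and the sample numbers are made up for illustration, and linear correlation is only a crude proxy for the interactions described above.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equally long lists of feature values."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant feature carries no information to correlate.
        return 0.0
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-position values of two eval terms (made-up numbers).
mobility = [12, 18, 25, 30, 41]
space = [10, 17, 22, 31, 40]
r = pearson(mobility, space)
```

A value of r close to 1 or -1 would suggest the two terms largely measure the same thing, so one of them is a candidate for removal or merging.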

velmarin · Post by **velmarin** » Mon Mar 18, 2013 12:54 am

It works very well in Visual Studio: it compiled in three minutes,
and likewise with the Intel compiler.

With Wb2UCI it works perfectly in Fritz.

It will be nice to take a slow look at it; it seems very nice.

ZirconiumX · Post by **ZirconiumX** » Mon Mar 18, 2013 8:35 am

lucasart wrote:
ZirconiumX wrote:Tuning what you've got is the classical "beancounter" approach with a fast but dumb approach, whereas adding more untuned terms is the Diep approach - slow but smart.
I completely disagree with that statement. Tuning is smart, and adding is dumb! And since when is adding eval terms called the "Diep approach"? Tuning is much more than just optimizing values. It's also understanding interactions between eval features, and addressing those by trying to minimize the number of eval terms and making them more orthogonal.

By "fast but dumb" I mean a really selective search (fast), but little eval knowledge (dumb) - an extremely tactical search. Vice versa for "slow but smart"

I called it the Diep approach because I couldn't think of any other engines with a massive eval; perhaps "approach used by Diep" would have been better.

I didn't say anything against tuning, I actually tune quite a lot.

Matthew:out

Evert · Post by **Evert** » Mon Mar 18, 2013 11:03 am

One way to answer that is to try to get an understanding of what evaluation terms are present in programs stronger than yours. The easiest is of course to look at the source for open-source programs, but the problem with that is that the distinction between idea and implementation can become blurred and it's not so obvious what the idea is. I prefer reading descriptions of what's in there myself, and come up with my own way to implement it.

A perhaps more interesting way to go about this is to look at a few games. Perhaps it's obvious to you when you see the game where the engine makes a mistake (it fails to see that a passed pawn is dangerous, for instance, or it doesn't appreciate a king-side attack until it is too late), or perhaps you need to analyse the game with a stronger program. This can tell you whether there is something that is lacking in your evaluation (it can of course also tell you that a term that is there isn't actually doing what it should, or is hindered by another term). Be sure to test any changes you make properly though!

sedicla · Post by **sedicla** » Mon Mar 18, 2013 11:27 am

Thanks Matthew, Lucas and Jose. I understand what you say; maybe Matt used strong words to describe the 2 "schools" of engines:

1) the ones that have little knowledge in the eval and a fast search, and
2) the ones that have a lot of knowledge in the eval and a slower search.

I guess we can say some notable type 1 engines are Fruit and DiscoCheck, and type 2 includes Diep, Junior, ProDeo, etc.

I think at this point I have to decide where to go, type 1 or 2.
Type 1 would give a strong engine, but maybe type 2 would give me a not so strong engine that is more fun to write and maybe more pleasant for users....

I can mention that when doing king safety, at one point my engine was so aggressive that it would make sacrifices to get to the enemy king. It was unsound most of the time, but it was very fun to watch.

Thanks, I'll have to decide...

velmarin · Post by **velmarin** » Mon Mar 18, 2013 11:55 am

I have not looked at your code enough yet; I will.

One way to introduce those ideas is to apply them indirectly:
instead of adding to the score directly (opening, endgame),
collect them in an intermediate score in another variable (good_attack, for example); then, as certain goals are reached, that score is added in stages to the (opening, endgame) scores.
It's nothing new, but it seems very effective, and the settings need not be so drastic.
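
If I read the idea above correctly, it could be sketched like this in Python: attacking features are first collected in an intermediate counter (good_attack, the name from the post), and only when that counter passes certain thresholds is a bonus added to the opening and endgame scores. The feature names, weights, thresholds and bonus values are all hypothetical placeholders, not tuned numbers.

```python
# (threshold, (opening_bonus, endgame_bonus)), highest threshold first.
# All numbers are hypothetical placeholders, not tuned values.
ATTACK_STEPS = (
    (8, (60, 10)),
    (5, (30, 5)),
    (3, (10, 0)),
)

def collect_good_attack(attackers_near_king, open_files_near_king, missing_shield_pawns):
    """Accumulate attack hints in one intermediate variable (weights are assumed)."""
    return 2 * attackers_near_king + open_files_near_king + missing_shield_pawns

def attack_bonus(good_attack):
    """Map the intermediate score to an (opening, endgame) bonus in steps."""
    for threshold, bonus in ATTACK_STEPS:
        if good_attack >= threshold:
            return bonus
    # Below the lowest threshold nothing is added to the real score.
    return (0, 0)

def add_attack_term(opening, endgame, good_attack):
    """Add the stepped bonus to the existing opening and endgame scores."""
    opening_bonus, endgame_bonus = attack_bonus(good_attack)
    return opening + opening_bonus, endgame + endgame_bonus
```

Because small values of good_attack contribute nothing, individual weights inside collect_good_attack matter less, which may be what "settings need not be so drastic" refers to.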

It is clear that parameters can be refined by mass testing of games with different values, but besides being expensive (in money and boredom), that is very professional, and I do not like it. It takes away the charm.
