My engine has an Elo of about 2500. I was wondering whether I should focus more on tuning the current evaluation terms or on adding new ideas.
I am about to start a new development cycle, and I'm not sure how much I can gain by tuning the current terms, or when I should stop. I have a feeling that I can improve, but I may waste time without gaining much. On the other hand, introducing new terms adds more variables to the process. Anyway, I would appreciate any comments on this...
Eval development: is it better to tune or add new terms?
Moderators: hgm, Rebel, chrisw
-
- Posts: 178
- Joined: Sat Jan 08, 2011 12:51 am
- Location: USA
- Full name: Alcides Schulz
-
- Posts: 1334
- Joined: Sun Jul 17, 2011 11:14 am
Re: Eval development: is it better to tune or add new terms?
Tuning what you've got is the classical "beancounter" approach (fast but dumb), whereas adding more untuned terms is the Diep approach (slow but smart).
The latter is not seen very often nowadays - it would be interesting to see a publicly released slow but smart engine.
Matthew:out
Some believe in the almighty dollar.
I believe in the almighty printf statement.
-
- Posts: 1600
- Joined: Mon Feb 21, 2011 9:48 am
Re: Eval development: is it better to tune or add new terms?
It seems that some basic settings are essential.
Other ideas do not require much adjustment; they act more as hints to the engine: you can do this, evaluate this move, etc.
You answered your own question: trying new things will also be more fun.
-
- Posts: 3232
- Joined: Mon May 31, 2010 1:29 pm
- Full name: lucasart
Re: Eval development: is it better to tune or add new terms?
sedicla wrote: My engine has an Elo of about 2500. I was wondering whether I should focus more on tuning the current evaluation terms or on adding new ideas.
I am about to start a new development cycle, and I'm not sure how much I can gain by tuning the current terms, or when I should stop. I have a feeling that I can improve, but I may waste time without gaining much. On the other hand, introducing new terms adds more variables to the process. Anyway, I would appreciate any comments on this...
Adding more and more eval terms is rarely the right way to go. You should have few terms that are as orthogonal as possible and well tuned. Often a clumsy eval can be improved by simply removing useless and/or harmful terms.
Testing is crucial: every patch should be tested (non-functional patches by node count, and functional ones by playing thousands of games to make sure the improvement is measurable beyond the error bar).
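To make "measurable beyond the error bar" concrete, here is a minimal Python sketch, not taken from any engine in this thread. It turns a win/draw/loss record into an Elo difference with an approximate 95% interval, assuming the usual logistic Elo model and a normal approximation of the mean score. The function names are hypothetical.

```python
import math


def _clamp(p: float) -> float:
    """Keep a score strictly inside (0, 1) so the logarithm stays defined."""
    return min(max(p, 1e-9), 1.0 - 1e-9)


def elo_from_score(score: float) -> float:
    """Convert an average score per game into an Elo difference (logistic model)."""
    score = _clamp(score)
    return -400.0 * math.log10(1.0 / score - 1.0)


def match_result(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (elo, elo_low, elo_high) for a match of at least one game.

    z = 1.96 gives roughly a 95% interval under a normal approximation.
    """
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the outcome (win = 1, draw = 0.5, loss = 0).
    variance = (wins * (1.0 - score) ** 2
                + draws * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2) / games
    margin = z * math.sqrt(variance / games)
    return (elo_from_score(score),
            elo_from_score(score - margin),
            elo_from_score(score + margin))


def is_measurable(wins: int, draws: int, losses: int) -> bool:
    """True when the whole interval lies above zero Elo."""
    _, low, _ = match_result(wins, draws, losses)
    return low > 0.0


if __name__ == "__main__":
    # Same 52.5% score, different sample sizes.
    print(match_result(110, 200, 90), is_measurable(110, 200, 90))
    print(match_result(1100, 2000, 900), is_measurable(1100, 2000, 900))
```

With the same 52.5% score, 4000 games put the whole interval above zero while 400 games do not, which is why a few hundred games rarely settle a small eval change.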
Probably the most important eval terms, in my experience, are:
- mobility
- passed pawns
- king safety
- hanging pieces
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
-
- Posts: 3232
- Joined: Mon May 31, 2010 1:29 pm
- Full name: lucasart
Re: Eval development: is it better to tune or add new terms?
ZirconiumX wrote: Tuning what you've got is the classical "beancounter" approach with a fast but dumb approach, whereas adding more untuned terms is the Diep approach - slow but smart.
I completely disagree with that statement. Tuning is smart, and adding is dumb! And since when is adding eval terms called the "Diep approach"? Tuning is much more than just optimizing values. It's also understanding the interactions between eval features, and addressing those by trying to minimize the number of eval terms and making them more orthogonal.
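One rough way to check how orthogonal two terms are is to log each term's value over many positions and look at their correlation. The sketch below assumes such logs already exist as plain lists of equal length; the feature names and the 0.8 threshold are made up for illustration, and correlation only catches linear overlap between terms.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equally long, non-empty sequences.

    Returns 0.0 when either sequence is constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def correlated_pairs(features, threshold=0.8):
    """List (name_a, name_b, r) for every pair with |r| >= threshold.

    `features` maps a term name to its values, one per sampled position.
    """
    names = sorted(features)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(features[a], features[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs


if __name__ == "__main__":
    # Hypothetical per-position values of three eval terms.
    sample = {
        "mobility": [12, 18, 25, 31, 40],
        "space": [5, 8, 11, 14, 19],
        "passed_pawns": [30, 0, 45, 0, 10],
    }
    for a, b, r in correlated_pairs(sample):
        print(f"{a} and {b} overlap strongly (r = {r:.2f})")
```

A strongly correlated pair is only a hint that one of the two terms may be redundant; whether removing or merging it helps still has to be confirmed by playing games.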
-
- Posts: 1600
- Joined: Mon Feb 21, 2011 9:48 am
Re: Eval development: is it better to tune or add new terms?
It works very well in Visual Studio: it compiled in three minutes,
and likewise with the Intel compiler.
With Wb2UCI it works perfectly in Fritz.
It will be nice to look at it slowly; it seems very nice.
-
- Posts: 1334
- Joined: Sun Jul 17, 2011 11:14 am
Re: Eval development: is it better to tune or add new terms?
ZirconiumX wrote: Tuning what you've got is the classical "beancounter" approach with a fast but dumb approach, whereas adding more untuned terms is the Diep approach - slow but smart.
lucasart wrote: I completely disagree with that statement. Tuning is smart, and adding is dumb! And since when is adding eval terms called the "Diep approach"? Tuning is much more than just optimizing values. It's also understanding the interactions between eval features, and addressing those by trying to minimize the number of eval terms and making them more orthogonal.
By "fast but dumb" I mean a really selective search (fast), but little eval knowledge (dumb) - an extremely tactical search. Vice versa for "slow but smart".
I called it the Diep approach because I couldn't think of any other engines with a massive eval; perhaps "approach used by Diep" would have been better.
I didn't say anything against tuning, I actually tune quite a lot.
Matthew:out
-
- Posts: 2929
- Joined: Sat Jan 22, 2011 12:42 am
- Location: NL
Re: Eval development: is it better to tune or add new terms?
One way to answer that is to try to get an understanding of what evaluation terms are present in programs stronger than yours. The easiest is of course to look at the source for open-source programs, but the problem with that is that the distinction between idea and implementation can become blurred and it's not so obvious what the idea is. I prefer reading descriptions of what's in there myself, and come up with my own way to implement it.
A perhaps more interesting way to go about this is to look at a few games. Perhaps it's obvious to you when you see the game where the engine makes a mistake (it fails to see that a passed pawn is dangerous, for instance, or it doesn't appreciate a king-side attack until it is too late), or perhaps you need to analyse the game with a stronger program. This can tell you whether there is something that is lacking in your evaluation (it can of course also tell you that a term that is there isn't actually doing what it should, or is hindered by another term). Be sure to test any changes you make properly though!
-
- Posts: 178
- Joined: Sat Jan 08, 2011 12:51 am
- Location: USA
- Full name: Alcides Schulz
Re: Eval development: is it better to tune or add new terms?
Thanks Matthew, Lucas and Jose. I understand what you are saying; maybe Matt used strong words to describe the two "schools" of engines:
1) those with little knowledge in the eval and a fast search, and
2) those with a lot of knowledge in the eval and a slower search.
I guess some notable type 1 engines are Fruit and DiscoCheck, and type 2 includes Diep, Junior, ProDeo, etc.
I think at this point I have to decide where to go, type 1 or 2.
Type 1 would give me a strong engine, but maybe type 2 would give me a not-so-strong engine that is more fun to develop and maybe more pleasant for users...
I can mention that while working on king safety, at one point my engine was so aggressive that it would make sacrifices to get to the enemy king. It was unsound most of the time, but it was very fun to watch.
Thanks, I'll have to decide...
-
- Posts: 1600
- Joined: Mon Feb 21, 2011 9:48 am
Re: Eval development: is it better to tune or add new terms?
I have not looked at your code closely enough yet; I will.
One way to add those ideas is to do it indirectly: instead of scoring directly into (opening, endgame), accumulate an intermediate score in another variable (good_attack, for example); then, once certain goals are reached, add that score to the (opening, endgame) sections.
It's nothing new, but it seems very effective, and the settings need not be so drastic.
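The intermediate-variable idea can be sketched in Python as follows. This is only one reading of the description, not code from any engine mentioned here: the phase scale of 0 to 24, the threshold, and the bonus values are all assumptions, and `good_attack` is treated as a simple counter of attack features.

```python
MAX_PHASE = 24  # assumed scale: 24 = full opening material, 0 = bare endgame


def apply_attack_bonus(opening, endgame, good_attack,
                       threshold=3, opening_bonus=20, endgame_bonus=5):
    """Feed the intermediate good_attack counter into the two score sections.

    Nothing is added until the counter reaches the threshold, so a single
    weak attacking feature cannot swing the evaluation on its own.
    """
    if good_attack >= threshold:
        steps = good_attack - threshold + 1
        opening += opening_bonus * steps
        endgame += endgame_bonus * steps
    return opening, endgame


def tapered(opening, endgame, phase):
    """Blend the opening and endgame scores according to the game phase."""
    phase = max(0, min(MAX_PHASE, phase))
    return (opening * phase + endgame * (MAX_PHASE - phase)) // MAX_PHASE


def evaluate(opening, endgame, good_attack, phase):
    """Combine the base scores with the attack bonus, then taper."""
    opening, endgame = apply_attack_bonus(opening, endgame, good_attack)
    return tapered(opening, endgame, phase)


if __name__ == "__main__":
    # Same position scores, with and without enough attack features.
    print(evaluate(10, 10, good_attack=2, phase=18))
    print(evaluate(10, 10, good_attack=4, phase=18))
```

Because the bonus only kicks in past a threshold and is scaled by the phase, each individual attack feature can carry a small weight, which fits the remark that the settings need not be drastic.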
It is clear that mass testing of games with different parameters can refine things, but besides being expensive (in money and boredom), it is very professional, and I do not like it; it takes away the charm.