Evaluation values help
Moderators: hgm, Rebel, chrisw
-
- Posts: 895
- Joined: Mon Jan 15, 2007 11:23 am
- Location: Warsza
I recall a tournament where all the engines (mostly Polish ones, plus Glaurung by Tord Romstad) competed using the simplified evaluation function described here: https://chessprogramming.wikispaces.com ... n+function
From what I remember, Glaurung won as expected, but its advantage was clearly smaller than the advantage of normal Glaurung over the normal versions of the opposing programs.
-
- Posts: 199
- Joined: Sun Nov 03, 2013 9:32 am
Re: Evaluation values help
I ran a test on my program: I swapped the eval for a simple PST-only version.
Results on a middlegame position, searched to 11 ply:
              Complex       Simple   (Complex/Simple)
------------------------------------------------------
nodes:      3,289,145    4,773,066   (70%)
time:         73 secs      69 secs
move:            e3e4         e3e4
Bcuts:      2,549,650    3,761,488   (68%)
nullcuts:      11,784       15,049   (78%)
TThits:        38,948       51,745   (75%)
NPS:           45,057       69,175
So the simple version is obviously quicker in NPS. Bcuts, nullcuts and TT hits scale in roughly the same proportion as the total nodes visited, but the total search time is about the same. I can only assume that a more accurate eval function makes better search decisions.
Has anyone got any more thoughts on this, or an explanation?
Regards
Laurie
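The figures in the table above can be cross-checked with a few lines of Python. This is only a rough sketch: the helper names are made up for illustration, and `crude_ebf` uses the simple approximation nodes^(1/depth), which is an assumption here and not how any particular engine reports its branching factor.

```python
# Numbers copied from the 11-ply test above.
DEPTH = 11
complex_run = {"nodes": 3_289_145, "seconds": 73}
simple_run = {"nodes": 4_773_066, "seconds": 69}


def nps(run):
    """Nodes per second for one run."""
    return run["nodes"] / run["seconds"]


def crude_ebf(run, depth):
    """Very rough effective branching factor: nodes ** (1 / depth).

    This is an assumed approximation for illustration only.
    """
    return run["nodes"] ** (1.0 / depth)


for name, run in (("complex", complex_run), ("simple", simple_run)):
    print(f"{name:8s} nps={nps(run):10.0f}  crude EBF={crude_ebf(run, DEPTH):.3f}")

# Share of nodes the complex eval needed compared with the simple one.
print(f"node ratio complex/simple: {complex_run['nodes'] / simple_run['nodes']:.2f}")
```

Under this rough measure the complex eval searches a slightly narrower tree to the same depth, which is consistent with the guess that the better eval leads to better search decisions.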
-
- Posts: 12662
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: Evaluation values help
Is it possible that the advanced version has an eval term with the sign flipped (e.g. solving the term for white when it should be black)?
Try testing these positions, which are simply color inverted and rotated:
8/pp6/1p2p1p1/Pk4PP/8/8/1pPK4/8 w - - bm c4+; ce 11492; pm c4+; pv c4+; id "P1b";
8/6pp/1p1p2p1/PP4kP/8/8/4KPp1/8 w - - bm f4+; ce 11492; pm f4+; pv f4+; id "P1a";
8/4kpP1/8/8/pp4Kp/1P1P2P1/6PP/8 b - - bm f5+; ce 11492; pm f5+; pv f5+; id "P1c";
8/1Ppk4/8/8/pK4pp/1P2P1P1/PP6/8 b - - bm c5+; ce 11492; pm c5+; pv c5+; id "P1d";
You should have the same eval exactly for all of them. Search may differ a bit because move order might not be identical, but it should be extremely similar.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
-
- Posts: 12662
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: Evaluation values help
These are probably better, because they should show most features of an eval:
[d]r2kq3/p2nb3/8/8/6Pp/8/P2NB3/R2KQ3 b - g3 ce 63; pv Rc8 Ne4 Qh8 Qa5+ Rc7 Rc1 Qd4+ Nd2 Nc5 Qc3 Qg1+ Kc2 Qxc1+ Kxc1 Nd3+ Kc2 Nb4+ Kb3 Rxc3+ Kxc3 Nxa2+ Kb2 Nb4 Kc3 Kc7 Nf3 a5 g5 h3 Bf1 Nd5+ Kd4 Nf4 g6 a4 g7 Ne6+ Kc3 h2 g8=Q h1=Q Qxe6 Qxf3+ Bd3 ; bm Rc8; id "P1d";
[d]r2kq3/p2nb3/8/6pP/8/8/P2NB3/R2KQ3 w - g6 ce 63; pv Rc1 Ne5 Qh1 Qa4+ Rc2 Rc8 Qd5+ Nd7 Nc4 Qc6 Qg8+ Kc7 Qxc8+ Kxc8 Nd6+ Kc7 Nb5+ Kb6 Rxc6+ Kxc6 Nxa7+ Kb7 Nb5 Kc6 Kc2 Nf6 a4 g4 h6 Bf8 Nd4+ Kd5 Nf5 g3 a5 g2 Ne3+ Kc6 h7 g1=Q h8=Q Qxe3 Qxf6+ Bd6 ; bm Rc1; id "P1b";
[d]3qk2r/3bn2p/8/Pp6/8/8/3BN2P/3QK2R w Kk b6 ce 63; pv Rf1 Nd5 Qa1 Qh4+ Rf2 Rf8 Qe5+ Ne7 Nf4 Qf6 Qb8+ Kf7 Qxf8+ Kxf8 Ne6+ Kf7 Ng5+ Kg6 Rxf6+ Kxf6 Nxh7+ Kg7 Ng5 Kf6 Kf2 Nc6 h4 b4 a6 Bc8 Ne4+ Ke5 Nc5 b3 h5 b2 Nd3+ Kf6 a7 b1=Q a8=Q Qxd3 Qxc6+ Be6 ; bm Rf1; id "P1a";
[d]3qk2r/3bn2p/8/8/pP6/8/3BN2P/3QK2R b Kk b3 ce 63; pv Rf8 Nd4 Qa8 Qh5+ Rf7 Rf1 Qe4+ Ne2 Nf5 Qf3 Qb1+ Kf2 Qxf1+ Kxf1 Ne3+ Kf2 Ng4+ Kg3 Rxf3+ Kxf3 Nxh2+ Kg2 Ng4 Kf3 Kf7 Nc3 h5 b5 a3 Bc1 Ne5+ Ke4 Nc4 b6 h4 b7 Nd6+ Kf3 a2 b8=Q a1=Q Qxd6 Qxc3+ Be3 ; bm Rf8; id "P1c";
These can differ because castling is only available in half of them, but they should be identical in pairs.
-
- Posts: 199
- Joined: Sun Nov 03, 2013 9:32 am
Re: Evaluation values help
I tried a couple of these test positions with mixed results.
            Normal Eval    Simple Eval   (Simple/Normal)
---------------------------------------------------------
nodes:          8761148        3210065   (37%)
time:       154 seconds     48 seconds   (31%)
move:              D7B6           E7F6
nps:              56890          66876

nodes:         10295830        5268232   (51%)
time:       135 seconds     63 seconds   (47%)
move:              D2C3           D2C3
nps:              76265          77474
So it seems the reduced times are mostly due to the reduced number of nodes visited; NPS is pretty close. Once again, the different evals must reshape the tree.
I was expecting that the simple eval would just speed up the search, maybe allowing an extra ply or two, but it seems that it's not so straightforward.
Maybe I need to play 1000 games...
Laurie
-
- Posts: 411
- Joined: Thu Dec 30, 2010 4:48 am
Re: Evaluation values help
It can be helpful when thinking about the size of improvements to view your search with the simplistic formula
Time = Constant * (Branching Factor ^ Depth)
A simpler/faster evaluation function can only help the constant term.
A more complex but more complete evaluation may recognise the solution at an earlier depth, and it may reduce branching factor by helping make better pruning decisions. There is necessarily a lot more gain to be had here.
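To put rough numbers on the formula above, here is a small Python sketch. The function names and the branching factors tried are assumptions chosen for illustration; only the two NPS figures come from the 11-ply test earlier in the thread.

```python
import math


def search_time(constant, branching_factor, depth):
    """Time = Constant * (Branching Factor ^ Depth)."""
    return constant * branching_factor ** depth


def extra_depth_from_speedup(speedup, branching_factor):
    """Plies gained in the same time if the constant shrinks by `speedup`.

    Solves C * b^d == (C / speedup) * b^(d + x) for x.
    """
    return math.log(speedup) / math.log(branching_factor)


def time_ratio_from_branching(old_bf, new_bf, depth):
    """Relative search time at a fixed depth after the branching factor changes."""
    return (new_bf / old_bf) ** depth


# NPS of the simple eval divided by NPS of the complex eval (11-ply test).
speedup = 69_175 / 45_057

for bf in (2.0, 3.0, 4.0):  # assumed effective branching factors
    gain = extra_depth_from_speedup(speedup, bf)
    print(f"b={bf}: a {speedup:.2f}x faster eval buys about {gain:.2f} ply")

# A hypothetical 5% reduction of the branching factor, at depth 11.
print(f"time ratio: {time_ratio_from_branching(4.0, 3.8, 11):.2f}")
```

With any of these assumed branching factors the faster eval is worth less than one extra ply, while even a small change to the branching factor compounds over every ply of depth.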
-
- Posts: 12662
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: Evaluation values help
Did you test just the eval function with no search?
The first set should give you the exact same number for all 4 positions or something is broken.
The second set should give you two identical pairs of evaluations (unless the castle flag does not come into play, in which case all 4 of these should give identical evals).
The idea is to find out if there is an error in the eval or something badly asymmetrical.
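A symmetry check of this kind can also be automated by generating the mirrored and colour-flipped variants of a position and comparing the static scores. The Python below is a minimal sketch under stated assumptions: `toy_evaluate` is a made-up material-only stand-in for the engine's real eval, it is assumed to return the score from the side to move's point of view, and all function names are hypothetical.

```python
PIECE_VALUES = {"p": 100, "n": 300, "b": 300, "r": 500, "q": 900, "k": 0}  # toy values


def mirror_files(fen):
    """Reflect the board left to right (a-file <-> h-file)."""
    board, side, castling, ep = fen.split()[:4]
    if castling != "-":
        raise ValueError("file mirroring is only handled without castling rights")
    board = "/".join(rank[::-1] for rank in board.split("/"))
    if ep != "-":
        ep = "hgfedcba"["abcdefgh".index(ep[0])] + ep[1]
    return f"{board} {side} {castling} {ep}"


def flip_colors(fen):
    """Swap White and Black: reverse the ranks, swap case, change side to move."""
    board, side, castling, ep = fen.split()[:4]
    board = "/".join(rank.swapcase() for rank in reversed(board.split("/")))
    side = "b" if side == "w" else "w"
    if castling != "-":
        castling = "".join(sorted(castling.swapcase(), key="KQkq".index))
    if ep != "-":
        ep = ep[0] + str(9 - int(ep[1]))
    return f"{board} {side} {castling} {ep}"


def toy_evaluate(fen):
    """Stand-in eval: material balance seen from the side to move."""
    board, side = fen.split()[:2]
    white_minus_black = 0
    for ch in board:
        if ch.isalpha():
            value = PIECE_VALUES[ch.lower()]
            white_minus_black += value if ch.isupper() else -value
    return white_minus_black if side == "w" else -white_minus_black


def symmetry_report(evaluate, fen):
    """Evaluate a position and its symmetric variants; scores should all match."""
    variants = {"original": fen, "colors flipped": flip_colors(fen)}
    if fen.split()[2] == "-":  # mirroring is skipped when castling rights exist
        variants["mirrored"] = mirror_files(fen)
        variants["mirrored + flipped"] = flip_colors(mirror_files(fen))
    return {name: evaluate(variant) for name, variant in variants.items()}


report = symmetry_report(toy_evaluate, "8/pp6/1p2p1p1/Pk4PP/8/8/1pPK4/8 w - -")
print(report)
print("symmetric" if len(set(report.values())) == 1 else "ASYMMETRIC")
```

Replacing `toy_evaluate` with a call into the real static eval (no search) for both the simple and the complex version would show whether either one scores a position differently from its mirror image.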
-
- Posts: 199
- Joined: Sun Nov 03, 2013 9:32 am
Re: Evaluation values help
Sorry Dann, I misunderstood your intention.
I will do those tests later tonight.
Thanks
Laurie.
-
- Posts: 199
- Joined: Sun Nov 03, 2013 9:32 am
Re: Evaluation values help
Hi Dann,
I have tried these 4 positions:
8/pp6/1p2p1p1/Pk4PP/8/8/1pPK4/8 w
8/6pp/1p1p2p1/PP4kP/8/8/4KPp1/8 w
8/4kpP1/8/8/pp4Kp/1P1P2P1/6PP/8 b
8/1Ppk4/8/8/pK4pp/1P2P1P1/PP6/8 b
and all give the equivalent evaluation score, so I guess my eval function is symmetrical and correct.
Regards
Laurie.
-
- Posts: 12662
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: Evaluation values help
Is it the same for both the simple and the advanced eval?