Robert Pope wrote:
On a related note, I would like to run a large number of epd positions through one of the better engines and collect the static evaluation (or, barring that, a one ply search), so I can identify positions where my own program's evaluation is significantly different. Is there maybe already a tool that could do that?
I have done some tests, but up to now I don't have a tool to do it automatically for every engine.
uci eval command
-
- Posts: 855
- Joined: Sun May 23, 2010 1:32 pm
Re: UCI eval command.
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: uci eval command
michiguel wrote:
elcabesa wrote:
hi,
Stockfish implements a non-standard UCI command "eval" that helps with debugging the engine. I have implemented it in Vajolet too.
The eval debug command prints the static eval of the position and also some terms of the evaluation. Do you know any other engine (UCI if possible) implementing this command or something like it?
I know I can use a search at depth 1 and take the cp, but that's not the same as the static eval.
Gaviota, the commands are "score" and "items".
I am sure Crafty has a similar thing too.
That is not a "non-standard UCI command". It is just not a UCI command at all (any engine can have that regardless of protocol).
Miguel
I just did a quick check. Crafty has had this as far back as I looked. I looked at the 19.7/19.7beta versions since they were handy, and they had it, although there were no separate MG/EG/composite scores shown since those versions didn't do the Fruit-like interpolation. That was added in version 22.2, which also goes back many years. But it broke the scores out in more detail, as today's Crafty still does.
I did this for debugging more than anything else.
-
- Posts: 4833
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: UCI eval command.
Robert Pope wrote:
On a related note, I would like to run a large number of epd positions through one of the better engines and collect the static evaluation (or, barring that, a one ply search), so I can identify positions where my own program's evaluation is significantly different. Is there maybe already a tool that could do that?
I have such a tool; it only supports UCI engines.
What I did was collect pgn games and convert them to epd with the pgn2fen program. Then I send those positions to sf6 followed by the eval command, process the returned value, and append it to a new epd as opcode c7, like below.
Code: Select all
r1bqr1k1/ppp2ppp/2n2n2/2Pp4/3P1b2/5N2/PP1NBPPP/R1BQR1K1 w - - fmvn 12; hmvc 3; pm Nf1; c7 "-33";
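For anyone scripting this step, here is a minimal Python sketch of reading such an epd line and appending a score as a new opcode. It is not the actual tool; the names parse_epd and append_opcode are made up for illustration, and it assumes operands never contain a semicolon.

```python
def parse_epd(line):
    """Split an EPD line into its 4 position fields and a dict of opcodes.

    Assumption: opcodes are separated by ';' and no operand contains a ';'.
    """
    fields = line.strip().split(None, 4)
    position = " ".join(fields[:4])
    rest = fields[4] if len(fields) > 4 else ""
    opcodes = {}
    for op in rest.split(";"):
        op = op.strip()
        if not op:
            continue
        # First token is the opcode name, the remainder is its operand.
        name, _, value = op.partition(" ")
        opcodes[name] = value.strip().strip('"')
    return position, opcodes


def append_opcode(line, name, value):
    """Return the EPD line with one more quoted opcode at the end."""
    return f'{line.rstrip()} {name} "{value}";'
```

With the line above, parse_epd gives the opcodes fmvn, hmvc, pm and c7 as strings, so the c7 value still has to be converted with int() before comparing scores.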
After creating those epds, I run my engine through a program that parses them and compares my engine's score to the value in c7. My engine accepts the eval command and replies with:
evalscore cp 120
That is side POV. I have some conditions for when to output the interesting positions. If the eval of sf is negative, my eval is positive, the difference is 100 cp or more, and the eval of sf is within +/-500 cp, then save it.
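Those conditions can be written down as a small predicate. This is only a sketch in Python; the function name and the parameter names are hypothetical, and both scores are assumed to be centipawns from the side to move's point of view.

```python
def is_interesting(sf_cp, my_cp, min_diff=100, sf_limit=500):
    """True when sf is negative, my eval is positive, the gap is at least
    min_diff cp, and the sf eval is within +/- sf_limit cp."""
    return (
        sf_cp < 0 < my_cp
        and my_cp - sf_cp >= min_diff
        and abs(sf_cp) <= sf_limit
    )
```

Swapping the sign tests would catch the opposite case, where sf is positive and my eval is negative.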
Sample.
r3brk1/pp2qpp1/2n4B/2b1p3/8/3BP1Q1/PPP2PPP/R4RK1 b - - fmvn 17; hmvc 0; pm f6; c7 "-109"; c8 "168";
My eval is 168, sf is -109. Who is right?
Running sf in analysis mode (not yet automatic), it seems my eval is right. But that is only one example; there are a lot of cases where my engine is wrong.
Stockfish 6 64 POPCNT:
Code: Select all
23/37 00:20 22,931k 1,117k +1.90 1. ... f6 2.Be4 Rd8 3.Rad1 Bh5 4.Bd5+ Kh7 5.Qh3 Kxh6 6.g4 g6
24/37- 00:24 27,423k 1,126k +1.84 1. ... f6 2.Be4
It is better to say that I am bad at eval, because the engine will then possibly work its search harder to find the best move and improve its position.
Another example.
1R1b1r1k/3q2pp/3p4/p1p1p3/P3P3/1Q2Bn1P/1P3P2/2R2K2 w - - fmvn 41; hmvc 0; pm Ke2; c7 "-108"; c8 "50";
BTW I do not include positions where the pm in the epd is a capture, a promotion, or a check, which I detect through the characters x, = and +.
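The character test on pm is simple enough to sketch in one line of Python. The function name is hypothetical, and the move is assumed to be in SAN as in the epds above. Mate moves marked with # are not mentioned in the post, so they are not filtered here.

```python
def is_quiet_pm(pm):
    """True if the SAN move has none of 'x' (capture), '=' (promotion), '+' (check)."""
    return not any(ch in pm for ch in "x=+")
```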
-
- Posts: 4833
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: UCI eval command.
This will save the eval of sf in an epd. The eval is converted to cp and side POV, like the following. The sf eval is in the c7 opcode.
Code: Select all
r1bqr1k1/ppp2ppp/2n2n2/2Pp4/3P1b2/5N2/PP1NBPPP/R1BQR1K1 w - - fmvn 12; hmvc 3; pm Nf1; c7 "-33";
The original epd is an output from the pgn2fen program, like the following.
Code: Select all
r1bqr1k1/ppp2ppp/2n2n2/2Pp4/3P1b2/5N2/PP1NBPPP/R1BQR1K1 w - - fmvn 12; hmvc 3; pm Nf1;
It is important that the epd has fmvn, because the program will not save epds with fmvn below 12.
Sample command line in a batch file.
Code: Select all
SaveEngineEval -f "sample_from_pgn2fen.epd" -e "Stockfish 6.exe" -h 64 --output "sf_output_from_sample.epd"
To control num_threads use the option -t, as in
Code: Select all
SaveEngineEval -f "sample_from_pgn2fen.epd" -e "Stockfish 6.exe" -h 64 --output "sf_output_from_sample.epd" -t 1
It might happen that a future version of sf will change the default threads to more than 1.
-f is the option for the input epd file.
This program is only capable of reading the sf output string after the std uci position fen command and the eval command.
Code: Select all
Total Evaluation: 0.07 (white side)
Download:
http://www.mediafire.com/download/sn8op ... alPack.rar
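For readers who want to reproduce the conversion themselves, here is a rough Python sketch of turning that output line into cp from the side to move's point of view. It is not the code of SaveEngineEval. The function name is made up, the score is assumed to be in pawn units, and the handling of a "(black side)" label is an assumption, since only "(white side)" appears in the sample.

```python
import re

# Matches e.g. "Total Evaluation: 0.07 (white side)"
_TOTAL_RE = re.compile(
    r"Total Evaluation:\s*([+-]?\d+(?:\.\d+)?)\s*\((white|black) side\)"
)


def eval_to_side_pov_cp(engine_output, fen):
    """Return the total eval in centipawns for the side to move, or None
    if the engine output has no 'Total Evaluation' line."""
    match = _TOTAL_RE.search(engine_output)
    if match is None:
        return None
    cp = round(float(match.group(1)) * 100)  # pawns -> centipawns
    reported_side = "w" if match.group(2) == "white" else "b"
    side_to_move = fen.split()[1]  # second FEN field is 'w' or 'b'
    # Flip the sign when the score is reported for the other side.
    return cp if side_to_move == reported_side else -cp
```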
-
- Posts: 558
- Joined: Sat Mar 25, 2006 8:27 pm
Re: UCI eval command.
This is great, thanks! Only problem is it is choking on my c0 comments, but I can strip those out.
Code: Select all
1B1b1rk1/3P1b2/p4p1p/3N2p1/1Ppp4/7P/2P2PP1/RR4K1 b - - c0 Tytan 9.32-x64-npmess ver 0.9.2.5 4th division WBEC Edition 19 PHENOM_QUAD 2012.02.25; fmvn 12; hm 71; res 1-0;
-
- Posts: 4833
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: UCI eval command.
Robert Pope wrote:
This is great, thanks! Only problem is it is choking on my c0 comments, but I can strip those out.
Code: Select all
1B1b1rk1/3P1b2/p4p1p/3N2p1/1Ppp4/7P/2P2PP1/RR4K1 b - - c0 Tytan 9.32-x64-npmess ver 0.9.2.5 4th division WBEC Edition 19 PHENOM_QUAD 2012.02.25; fmvn 12; hm 71; res 1-0;
Put the fmvn in front like the following, as this is the output of the pgn2fen that I use. Later I will revise this so that fmvn is read wherever it is located.
Code: Select all
1B1b1rk1/3P1b2/p4p1p/3N2p1/1Ppp4/7P/2P2PP1/RR4K1 b - - fmvn 12; c0 "Tytan 9.32-x64-npmess ver 0.9.2.5 4th division WBEC Edition 19 PHENOM_QUAD 2012.02.25"; hm 71; res 1-0;
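Reading fmvn wherever it sits in the line can be done with a regular expression. The Python sketch below only illustrates the idea and is not how the tool does it. The function names are hypothetical, and a comment text that itself contains something like "fmvn 3;" would fool it.

```python
import re

# "fmvn <number>;" preceded by start of line, whitespace, or a semicolon
_FMVN_RE = re.compile(r"(?:^|[;\s])fmvn\s+(\d+)\s*;")


def find_fmvn(epd_line):
    """Return the full move number as an int, or None if there is no fmvn opcode."""
    match = _FMVN_RE.search(epd_line)
    return int(match.group(1)) if match else None


def meets_min_fmvn(epd_line, minimum=12):
    """True if the line has an fmvn opcode and it is not below the minimum."""
    fmvn = find_fmvn(epd_line)
    return fmvn is not None and fmvn >= minimum
```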
-
- Posts: 4833
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: UCI eval command.
Here is v2 with improved fmvn parsing.
Download:
http://www.mediafire.com/download/h62m3 ... v2Pack.rar
-
- Posts: 855
- Joined: Sun May 23, 2010 1:32 pm
Re: UCI eval command.
this post is just to report some results I got comparing the eval of engines
standard deviation of eval of a selected pool of positions
stockfish6 vs stockfish -> 0.20
critter vs stockfish6 -> 0.62
vajolet vs stockfish6 -> 0.7
gaviota vs stockfish6 -> 1.23
this is incidentally the same order of elo strength that we have in CCRL, can we draw any conclusion?? I don't think so
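The post does not say exactly how the standard deviation was computed. One plausible reading, sketched below in Python, is the standard deviation of the per-position difference between the two engines' static evals, in pawns. The function name is hypothetical, and the population form (pstdev) is an arbitrary choice.

```python
from statistics import pstdev


def eval_difference_stdev(reference_evals, other_evals):
    """Population standard deviation of (other - reference) over the same positions.

    Both lists must hold one eval per position, in the same order and unit.
    """
    if len(reference_evals) != len(other_evals):
        raise ValueError("both engines must be evaluated on the same positions")
    diffs = [other - ref for ref, other in zip(reference_evals, other_evals)]
    return pstdev(diffs)
```

Two engines with identical evals give 0.0, and a constant offset between them also gives 0.0, because only the spread of the differences counts.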
-
- Posts: 4833
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: UCI eval command.
elcabesa wrote:
this post is just to report some results I got comparing the eval of engines
standard deviation of eval of a selected pool of positions
stockfish6 vs stockfish -> 0.20
critter vs stockfish6 -> 0.62
vajolet vs stockfish6 -> 0.7
gaviota vs stockfish6 -> 1.23
this is incidentally the same order of elo strength that we have in CCRL, can we draw any conclusion?? I don't think so
Which version is stockfish in stockfish6 vs stockfish -> 0.20?
What do you mean by "same order of elo strength"?
-
- Posts: 855
- Joined: Sun May 23, 2010 1:32 pm
Re: UCI eval command.
Ferdy wrote:
elcabesa wrote:
this post is just to report some results I got comparing the eval of engines
standard deviation of eval of a selected pool of positions
stockfish6 vs stockfish -> 0.20
critter vs stockfish6 -> 0.62
vajolet vs stockfish6 -> 0.7
gaviota vs stockfish6 -> 1.23
this is incidentally the same order of elo strength that we have in CCRL, can we draw any conclusion?? I don't think so
Which version is stockfish in stockfish6 vs stockfish -> 0.20?
What do you mean by "same order of elo strength"?
1) stockfish 6 vs stockfish_14053109
2) I considered the Elo strength in CCRL 40 1 cpu. I took Stockfish's evaluation to be the strongest and estimated the strength of the others by looking at their standard deviation from Stockfish.
It seems that the higher the standard deviation, the weaker the engine.