uci eval command

elcabesa
Posts: 855
Joined: Sun May 23, 2010 1:32 pm

Re: UCI eval command.

Post by elcabesa »

Robert Pope wrote:On a related note, I would like to run a large number of epd positions through one of the better engines and collect the static evaluation (or, barring that, a one ply search), so I can identify positions where my own program's evaluation is significantly different. Is there maybe already a tool that could do that?
I have done some tests, but so far I don't have a tool that does this automatically for every engine.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: uci eval command

Post by bob »

michiguel wrote:
elcabesa wrote:hi,
Stockfish implements a non-standard UCI command, "eval", that helps with debugging the engine. I have implemented it in Vajolet too.

The eval command prints the static eval of the position and also prints some terms of the evaluation. Do you know any other engine (UCI if possible) implementing this command or something like it?

I know I can use a search at depth 1 and take the cp score, but it's not the same as the static eval :)
Gaviota, the commands are "score" and "items"

I am sure Crafty has a similar thing too.

That is not a "non standard UCI command". It is just not a UCI command at all (any engine can have that regardless of protocol).

Miguel
I just did a quick check. Crafty has had this as far back as I looked. I looked at versions 19.7 and 19.7beta since they were handy, and they had it, although they showed no separate MG/EG/composite scores since those versions didn't do the Fruit-like interpolation. That was added in version 22.2, which also goes back many years, and it broke the scores out in more detail, as today's Crafty still does.

I did this for debugging more than anything else.
Ferdy
Posts: 4833
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: UCI eval command.

Post by Ferdy »

Robert Pope wrote:On a related note, I would like to run a large number of epd positions through one of the better engines and collect the static evaluation (or, barring that, a one ply search), so I can identify positions where my own program's evaluation is significantly different. Is there maybe already a tool that could do that?
I have such a tool, it only supports uci engines.
What I did was collect PGN games and convert them to EPD with the pgn2fen program. I then send each position to sf6, send the eval command, process the returned value, and append it to a new EPD as opcode c7, as below.

r1bqr1k1/ppp2ppp/2n2n2/2Pp4/3P1b2/5N2/PP1NBPPP/R1BQR1K1 w - - fmvn 12; hmvc 3; pm Nf1; c7 "-33";
That value is from the side-to-move POV and in cp, unlike the raw sf6 output, which is from White's POV and in units of 1 pawn.
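The conversion described here can be sketched as follows. This is only an illustration of the arithmetic, not the code of the actual tool; the function name to_side_pov_cp is made up for the example, and it assumes the side to move is read from the second field of the FEN/EPD string.

```python
def to_side_pov_cp(white_pov_pawns: float, fen: str) -> int:
    """Convert a White-POV score in pawn units to side-to-move centipawns.

    Hypothetical helper: the name and signature are assumptions for
    illustration, not part of Stockfish or of the tool in this thread.
    """
    # The second whitespace-separated FEN/EPD field is "w" or "b".
    side_to_move = fen.split()[1]
    # 1 pawn = 100 cp; round to avoid float noise such as 109.00000000000001.
    cp = round(white_pov_pawns * 100)
    # Flip the sign when Black is to move so the score is from Black's view.
    return cp if side_to_move == "w" else -cp
```

For a black-to-move position, a White-POV total of 1.09 pawns would be stored as -109 under this convention.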
After creating those EPDs, I run my engine through a program that parses them and compares my engine's score with the value in c7. My engine accepts the eval command and replies with,
evalscore cp 120
That is side-to-move POV. I have some conditions for when to output the interesting positions: if the sf eval is negative, my eval is positive, the difference is 100 cp or more, and the sf eval is within +/-500 cp, then save the position.
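As a minimal sketch, the save condition just described could be written like this. The function name is_interesting and the parameter names are hypothetical, and the thread mentions "some conditions", so this covers only the one that is spelled out.

```python
def is_interesting(sf_cp: int, my_cp: int,
                   min_diff: int = 100, sf_limit: int = 500) -> bool:
    """Return True when the two evals disagree in sign by a wide margin.

    Both scores are assumed to be in centipawns from the side-to-move POV.
    The default thresholds are the ones quoted in the post.
    """
    return (
        sf_cp < 0                      # sf thinks the side to move is worse
        and my_cp > 0                  # my engine thinks it is better
        and my_cp - sf_cp >= min_diff  # the disagreement is at least 100 cp
        and abs(sf_cp) <= sf_limit     # sf eval is within +/-500 cp
    )
```

With the sample that follows (sf -109, my eval 168) the check passes: the signs differ, the gap is 277 cp, and -109 is inside +/-500.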
Sample.
r3brk1/pp2qpp1/2n4B/2b1p3/8/3BP1Q1/PPP2PPP/R4RK1 b - - fmvn 17; hmvc 0; pm f6; c7 "-109"; c8 "168";
My eval is 168, sf is -109. Who is right?
Running sf in analysis mode (not yet automated), it seems my eval is right. But that is only one example; there are a lot of cases where my engine is wrong.
Stockfish 6 64 POPCNT:

 23/37	00:20	 22,931k	1,117k	+1.90	1. ... f6 2.Be4 Rd8 3.Rad1 Bh5 4.Bd5+ Kh7 5.Qh3 Kxh6 6.g4 g6
 24/37-	00:24	 27,423k	1,126k	+1.84	1. ... f6 2.Be4
What I have observed from looking at these positions is that sf generally avoids saying that its position is good.
It may be better for the eval to say "I am bad", because the engine will then possibly work its search harder to find the best move and improve its position.

Another example.
1R1b1r1k/3q2pp/3p4/p1p1p3/P3P3/1Q2Bn1P/1P3P2/2R2K2 w - - fmvn 41; hmvc 0; pm Ke2; c7 "-108"; c8 "50";

BTW, I do not include positions where the pm in the EPD is a capture, a promotion, or a check, detected through the characters x, = and +.
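A sketch of that filter, assuming the pm operand is a single SAN move terminated by a semicolon as in the samples above. The helper names played_move and is_quiet_pm are invented for the example.

```python
import re
from typing import Optional

# Matches the pm opcode and captures its operand, e.g. "pm Nf1;" -> "Nf1".
_PM_RE = re.compile(r"\bpm\s+([^;\s]+)\s*;")


def played_move(epd: str) -> Optional[str]:
    """Return the SAN move stored in the pm opcode, or None if absent."""
    match = _PM_RE.search(epd)
    return match.group(1) if match else None


def is_quiet_pm(epd: str) -> bool:
    """True when pm exists and is not a capture, promotion, or check.

    Detection is purely textual, using the characters x, = and +
    mentioned in the post.
    """
    move = played_move(epd)
    if move is None:
        return False
    return not any(ch in move for ch in "x=+")
```

The check is on the SAN text only, so it relies on pgn2fen writing captures with "x", promotions with "=", and checks with "+".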
Ferdy
Posts: 4833
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: UCI eval command.

Post by Ferdy »

This will save the sf eval in an EPD. The eval is converted to cp and side-to-move POV, as in the following. The sf eval is in the c7 opcode.

r1bqr1k1/ppp2ppp/2n2n2/2Pp4/3P1b2/5N2/PP1NBPPP/R1BQR1K1 w - - fmvn 12; hmvc 3; pm Nf1; c7 "-33";
The original EPD is output from the pgn2fen program, like the following.

r1bqr1k1/ppp2ppp/2n2n2/2Pp4/3P1b2/5N2/PP1NBPPP/R1BQR1K1 w - - fmvn 12; hmvc 3; pm Nf1;
It is important that the EPD has fmvn, because the program will not save EPDs with fmvn below 12.

Sample command line in a batch file.

SaveEngineEval -f "sample_from_pgn2fen.epd" -e "Stockfish 6.exe" -h 64 --output "sf_output_from_sample.epd"
To control the number of threads, use the option -t, as in

SaveEngineEval -f "sample_from_pgn2fen.epd" -e "Stockfish 6.exe" -h 64 --output "sf_output_from_sample.epd" -t 1
It might happen that a future version of sf will change the default number of threads to more than 1.

-f is the option for input epd file.

This program can only read the following sf output string, printed after the standard UCI position fen command and the eval command.

Total Evaluation: 0.07 (white side)
Download:
http://www.mediafire.com/download/sn8op ... alPack.rar
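For readers who want to do the same parsing themselves, here is a minimal sketch that pulls the number out of the "Total Evaluation" line shown above. It is not the code of SaveEngineEval; the function name is hypothetical, and it assumes the line has exactly the sf6 wording quoted in this post (other engines or versions may print something different).

```python
import re
from typing import Iterable, Optional

# Expected shape, taken from the post: "Total Evaluation: 0.07 (white side)"
_TOTAL_RE = re.compile(
    r"^Total Evaluation:\s*([+-]?\d+(?:\.\d+)?)\s*\(white side\)"
)


def parse_total_evaluation(lines: Iterable[str]) -> Optional[float]:
    """Return the White-POV total in pawn units, or None if not found.

    `lines` is any iterable of engine output lines, for example the
    lines read from the engine's stdout after sending "eval".
    """
    for line in lines:
        match = _TOTAL_RE.match(line.strip())
        if match:
            return float(match.group(1))
    return None
```

The returned value is still White POV and in pawns, so it needs the sign flip and the factor of 100 before it is comparable with the c7 values.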
Robert Pope
Posts: 558
Joined: Sat Mar 25, 2006 8:27 pm

Re: UCI eval command.

Post by Robert Pope »

This is great, thanks! Only problem is it is choking on my c0 comments, but I can strip those out.

1B1b1rk1/3P1b2/p4p1p/3N2p1/1Ppp4/7P/2P2PP1/RR4K1 b - - c0 Tytan 9.32-x64-npmess ver 0.9.2.5 4th division WBEC Edition 19 PHENOM_QUAD 2012.02.25; fmvn 12; hm 71; res 1-0;
Ferdy
Posts: 4833
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: UCI eval command.

Post by Ferdy »

Robert Pope wrote:This is great, thanks! Only problem is it is choking on my c0 comments, but I can strip those out.

1B1b1rk1/3P1b2/p4p1p/3N2p1/1Ppp4/7P/2P2PP1/RR4K1 b - - c0 Tytan 9.32-x64-npmess ver 0.9.2.5 4th division WBEC Edition 19 PHENOM_QUAD 2012.02.25; fmvn 12; hm 71; res 1-0;
Put the fmvn in front, like the following, as this matches the output of the pgn2fen that I use. Later I will revise this so that fmvn is read wherever it is located.

1B1b1rk1/3P1b2/p4p1p/3N2p1/1Ppp4/7P/2P2PP1/RR4K1 b - - fmvn 12; c0 "Tytan 9.32-x64-npmess ver 0.9.2.5 4th division WBEC Edition 19 PHENOM_QUAD 2012.02.25"; hm 71; res 1-0;
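One way to read fmvn wherever it appears is to split the operations on semicolons that are outside quotes and look the opcode up by name. The sketch below does that; it is an assumption about how such parsing could be done, not the tool's own code, and the names split_epd and passes_fmvn_filter are made up for the example.

```python
from typing import Dict, List, Tuple


def _store(ops: Dict[str, str], text: str) -> None:
    """Split one "opcode operand" chunk and record it in ops."""
    text = text.strip()
    if not text:
        return
    opcode, _, operand = text.partition(" ")
    ops[opcode] = operand.strip().strip('"')


def split_epd(epd: str) -> Tuple[str, Dict[str, str]]:
    """Return (four-field position, {opcode: operand}) for one EPD line."""
    fields = epd.strip().split(None, 4)
    position = " ".join(fields[:4])
    rest = fields[4] if len(fields) > 4 else ""
    ops: Dict[str, str] = {}
    current: List[str] = []
    in_quotes = False
    for ch in rest:
        if ch == '"':
            in_quotes = not in_quotes
        # A semicolon ends an operation only when it is outside quotes.
        if ch == ";" and not in_quotes:
            _store(ops, "".join(current))
            current = []
        else:
            current.append(ch)
    _store(ops, "".join(current))
    return position, ops


def passes_fmvn_filter(epd: str, min_fmvn: int = 12) -> bool:
    """True when fmvn is present and at least min_fmvn (12 in the post)."""
    _, ops = split_epd(epd)
    return ops.get("fmvn", "").isdigit() and int(ops["fmvn"]) >= min_fmvn
```

Both orderings of the line above give fmvn = 12 with this approach, whether or not the c0 comment is quoted.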
Ferdy
Posts: 4833
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: UCI eval command.

Post by Ferdy »

Here is v2 with improved fmvn parsing.

Download:
http://www.mediafire.com/download/h62m3 ... v2Pack.rar
elcabesa
Posts: 855
Joined: Sun May 23, 2010 1:32 pm

Re: UCI eval command.

Post by elcabesa »

This post is just to report some results I got comparing the evals of engines.

Standard deviation of eval on a selected pool of positions:

stockfish6 vs stockfish -> 0.20
critter vs stockfish6 -> 0.62
vajolet vs stockfish6 -> 0.7
gaviota vs stockfish6 -> 1.23

This is incidentally the same order of Elo strength that we have in CCRL. Can we draw any conclusion? I don't think so.
Ferdy
Posts: 4833
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: UCI eval command.

Post by Ferdy »

elcabesa wrote:This post is just to report some results I got comparing the evals of engines.

Standard deviation of eval on a selected pool of positions:

stockfish6 vs stockfish -> 0.20
critter vs stockfish6 -> 0.62
vajolet vs stockfish6 -> 0.7
gaviota vs stockfish6 -> 1.23

This is incidentally the same order of Elo strength that we have in CCRL. Can we draw any conclusion? I don't think so.
Which version is stockfish in stockfish6 vs stockfish -> 0.20?

What do you mean by "same order of Elo strength"?
elcabesa
Posts: 855
Joined: Sun May 23, 2010 1:32 pm

Re: UCI eval command.

Post by elcabesa »

Ferdy wrote:
elcabesa wrote:This post is just to report some results I got comparing the evals of engines.

Standard deviation of eval on a selected pool of positions:

stockfish6 vs stockfish -> 0.20
critter vs stockfish6 -> 0.62
vajolet vs stockfish6 -> 0.7
gaviota vs stockfish6 -> 1.23

This is incidentally the same order of Elo strength that we have in CCRL. Can we draw any conclusion? I don't think so.
Which version is stockfish in stockfish6 vs stockfish -> 0.20?

What do you mean by "same order of Elo strength"?
1) stockfish 6 vs stockfish_14053109
2) I used the Elo strength in CCRL 40 1 cpu. I considered the Stockfish evaluation to be the strongest and estimated the strength of the others by looking at the standard deviation from Stockfish.
It seems that the higher the standard deviation, the weaker the engine.
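The post does not say exactly how the figure was computed. One plausible reading, shown here purely as an assumption, is the population standard deviation of the per-position difference between two engines' static evals over the same pool of positions. The function name is hypothetical.

```python
from statistics import pstdev
from typing import Sequence


def eval_difference_stdev(evals_a: Sequence[float],
                          evals_b: Sequence[float]) -> float:
    """Standard deviation of (a - b) over positions evaluated by both.

    Assumption: both sequences are in the same units and POV, and
    element i of each refers to the same position.
    """
    if len(evals_a) != len(evals_b):
        raise ValueError("both engines must be evaluated on the same positions")
    # Per-position disagreement between the two evaluations.
    diffs = [a - b for a, b in zip(evals_a, evals_b)]
    return pstdev(diffs)
```

Note that a constant offset between two engines contributes nothing to this number, since the standard deviation is taken around the mean difference; a root-mean-square difference would be the alternative if a systematic bias should count too.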