Thank you, I'm not very familiar with the xboard protocol.
I have just modified a Python script to parse Gaviota's score output and compare it with Stockfish's and Vajolet's.
Comparing the evals of 2 versions of Stockfish, Vajolet and Gaviota on a set of 40k positions, I found the following differences between evals:
stock vs stock
mean:-0.02218114175214394
stdev:0.2074777368943154
vajolet-stock
mean:-0.04282243147805616
stdev:1.2495892743322266
gaviota-stock
mean:-0.04624327391962333
stdev:1.4140427187737394
vajolet-gaviota
mean:-0.003420842441567176
stdev:0.9193769746719599
I only calculated the mean and standard deviation of the differences. That's not very meaningful, because I didn't first find the best interpolating line and then calculate the standard deviation around it, but I'll try that as soon as possible.
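For reference, here is a minimal sketch of both calculations: the plain mean/stdev of the differences, and the standard deviation around a least-squares line. It assumes the evals of two engines are already available as two equally long lists of floats, one entry per position. The function names and the sample numbers are made up for illustration and are not taken from the script mentioned above.

```python
import math
import statistics


def diff_stats(evals_a, evals_b):
    """Mean and sample standard deviation of the per-position differences a - b."""
    diffs = [a - b for a, b in zip(evals_a, evals_b)]
    return statistics.mean(diffs), statistics.stdev(diffs)


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys ~ slope * xs + intercept."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residual_stdev(xs, ys):
    """Standard deviation of ys around the fitted line (n - 2 degrees of freedom)."""
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    return math.sqrt(sum(r * r for r in residuals) / (len(xs) - 2))


if __name__ == "__main__":
    # Made-up evals for five positions, only to show the calls.
    engine_a = [0.10, -0.35, 1.20, 0.00, 2.50]
    engine_b = [0.25, -0.20, 0.90, 0.05, 1.80]
    print("mean, stdev of differences:", diff_stats(engine_a, engine_b))
    print("slope, intercept:", fit_line(engine_a, engine_b))
    print("stdev around the line:", residual_stdev(engine_a, engine_b))
```

The point of the second measure is that two engines can agree well up to a scale factor (one simply reports bigger numbers than the other). In that case the plain stdev of the differences looks large, while the stdev around the fitted line stays small.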
As you can see, the standard deviation between the 2 Stockfish versions is very low, as expected. I'd like to repeat the test with 2 very strong engines, but I don't know whether they implement the eval/score command.
As a last resort I can modify and compile Gull/Texel/Senpai myself.