Unfortunately those links don't seem to work for anyone except me. Here are the correct links:
giraffe_te
texelGi
Alternatively you can start from my public dropbox chess folder and browse from there.
Interesting results.

petero2 wrote:
I have now created a (hacky) version of texel that uses the giraffe evaluation function, available here.

petero2 wrote:
I created a modified version of Giraffe that uses the texel evaluation function. The source code is here.
I then played some test games using the following programs:

Code: Select all
giraffe    : Giraffe latest version from bitbucket.org (earlier version rated 2457 on CCRL 40/40)
giraffe_te : Same Giraffe version but using evaluation from latest texel development version

The result was as follows. The time control is expressed as base time in seconds + increment per move in seconds.

Code: Select all
prog1      tc1     prog2   tc2     elodiff draws depth1 depth2 nGames
giraffe_te 24+0.24 giraffe 24+0.24 -3      13%   17.1   14.1   16172
The following observations can be made:
* Using the texel evaluation function in giraffe has a very small effect on the playing strength, even though it makes giraffe search about 3 ply deeper.
* It would probably be interesting to insert the giraffe evaluation function in texel to be able to compare the evaluation functions in an engine that has a more conventional search function.
The giraffe evaluation function makes the NPS drop by roughly a factor of 10. By playing games between the original texel version and the modified version, I got the following results. The last match used a fixed 500000 nodes/move limit.

Code: Select all
prog1 tc1    prog2    tc2    elodiff draws depth1 depth2 nGames
texel 6+0.06 texel_gi 6+0.06  358     9%   12.0    8.8    438
texel 60+0.6 texel_gi 60+0.6  255    19%   16.1   12.5    406
texel 6+0.06 texel_gi 60+0.6 -118    21%   12.4   12.9   1396
texel 500kN  texel_gi 500kN  -104    23%   14.2   14.4   2066
From the results it can be seen that the giraffe evaluation function makes texel around 250-350 elo weaker, depending on the time control. This is caused by the giraffe evaluation function being very slow. If it were somehow possible to make the giraffe evaluation function run as fast as the texel evaluation function, the giraffe eval version would actually be around 100-120 elo stronger than the texel eval version.
Whether future hardware and software improvements will make it possible to run an ANN evaluator as quickly as a traditional evaluator remains to be seen.
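A back-of-envelope reading of the match tables above: the elo cost of the roughly 10x eval slowdown can be estimated as the gap between the equal-time result and the equal-nodes result for texel vs texel_gi. Pairing those two rows is my interpretation, not a claim made in the post itself:

```python
# Numbers taken from the match table above.
equal_time_elodiff = 358    # texel vs texel_gi, both at 6+0.06 (same wall clock)
equal_nodes_elodiff = -104  # texel vs texel_gi, both at 500 kN/move (same node count)

# With node counts equalized the slowdown no longer matters, so the
# difference between the two rows is roughly the cost of the slower eval.
slowdown_cost = equal_time_elodiff - equal_nodes_elodiff
print(slowdown_cost)
```

This lands in the neighborhood of 450-460 elo at hyper-bullet speeds, consistent with the 100-120 elo eval advantage being swamped by the speed handicap.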
lucasart wrote:
Perhaps the place where a DNN would be useful is the pawn evaluation, rather than the entire eval. That's because you have the pawn hash table to reduce the slowdown by an order of magnitude. Intuitively, the idea of using a DNN for analysing pawn patterns seems logical. If they are so good at image recognition, they should be good at evaluating pawn structures.

It would be nice, and surely there is something to be won. But in Andscacs I have several improvements to pawn structure evaluation that depend on the other pieces, so in the end, to have a good pawn eval, you need more information than the pawns alone.
lucasart wrote:
Interesting results.

petero2 wrote:
From the results it can be seen that the giraffe evaluation function makes texel around 250-350 elo weaker depending on time control. This is caused by the giraffe evaluation function being very slow. If it was somehow possible to make the giraffe evaluation function run as fast as the texel evaluation function, the giraffe eval version would actually be around 100-120 elo stronger than the texel eval version.
Whether future hardware and software improvements will make it possible to run an ANN evaluator as quickly as a traditional evaluator remains to be seen.

Perhaps the place where a DNN would be useful is the pawn evaluation, rather than the entire eval. That's because you have the pawn hash table to reduce the slowdown by an order of magnitude. Intuitively, the idea of using a DNN for analysing pawn patterns seems logical. If they are so good at image recognition, they should be good at evaluating pawn structures.

Possibly. It is worth noting however that texel is 250 elo behind stockfish 8 on CCRL 40/40. If we assume that half of that is caused by inferior evaluation, an estimate would be that a 10x faster giraffe eval function would be about the same strength as the stockfish evaluation function.
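The pawn-hash idea above can be sketched minimally: cache the output of a slow (e.g. NN-based) pawn evaluator keyed by the pawn-structure hash, so the expensive call only runs on a cache miss. The evaluator and key here are hypothetical stand-ins, not code from texel or giraffe:

```python
# Minimal sketch of a pawn hash table in front of a slow pawn evaluator.
# slow_pawn_eval and pawn_key are hypothetical placeholders.

PAWN_HASH_SIZE = 1 << 16                  # number of table slots
pawn_table = [None] * PAWN_HASH_SIZE      # each entry: (key, score) or None

def slow_pawn_eval(pawn_key):
    # stand-in for an expensive DNN pawn-structure evaluation
    return (pawn_key % 201) - 100

def pawn_eval(pawn_key):
    idx = pawn_key % PAWN_HASH_SIZE
    entry = pawn_table[idx]
    if entry is not None and entry[0] == pawn_key:
        return entry[1]                   # hit: pawn structure seen before
    score = slow_pawn_eval(pawn_key)      # miss: pay the full cost once
    pawn_table[idx] = (pawn_key, score)
    return score
```

Since the pawn structure changes on only a small fraction of moves in a search tree, hit rates are typically very high, which is where the claimed order-of-magnitude reduction in the slowdown would come from.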
jorose wrote:
This is really interesting to me. I find it really interesting that this factor 10 speed handicap seems to result in an Elo delta of 450 points.

Note that these were hyper-bullet games, though. The time control was 6s+0.06s/move, which corresponds to about 0.14 seconds/move on average.
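The 0.14 s/move figure can be sanity-checked with a quick calculation, assuming an average game length of about 75 moves per side (the game length is my assumption; the post only states the result):

```python
# Average thinking time per move under a base+increment time control:
# total budget = base + increment * moves, spread over the moves played.
base_time = 6.0     # seconds
increment = 0.06    # seconds per move
moves = 75          # assumed average moves per side per game

avg_seconds_per_move = (base_time + increment * moves) / moves
print(round(avg_seconds_per_move, 2))
```

With ~75 moves this comes out to exactly 0.14 s/move; longer games push the average toward the bare 0.06 s increment.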
How did you deal with things like futility and razoring margins?

I did not change any search-related things at all. The two evaluation functions may have been more compatible, though, than two evaluation functions chosen from random chess programs, because both the texel and the giraffe evaluation functions have been calibrated against an "estimated score" scale. For texel the formula is:

Code: Select all
expected score = 1/(1+10^(-1.13 * texel_score / 400))

For giraffe it is:

Code: Select all
expected score = (10000 + giraffe_score) / 20000

Equating the two expected scores gives the conversion from giraffe score to texel score (log is the natural logarithm):

Code: Select all
texel_score = -log(20000 / (10000 + giraffe_score) - 1) / log(10) / 1.13 * 400
           ~= -log(20000 / (10000 + giraffe_score) - 1) / .0065048
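The calibration and conversion formulas above can be written out as a small Python sketch (function names are mine; log is the natural logarithm, matching the formulas):

```python
import math

def texel_expected_score(texel_score):
    # texel's calibration: expected game score from texel's internal score
    return 1.0 / (1.0 + 10.0 ** (-1.13 * texel_score / 400.0))

def giraffe_expected_score(giraffe_score):
    # giraffe's linear calibration on its [-10000, 10000] score scale
    return (10000.0 + giraffe_score) / 20000.0

def giraffe_to_texel(giraffe_score):
    # equate the two expected scores and solve for texel_score
    x = 20000.0 / (10000.0 + giraffe_score) - 1.0
    return -math.log(x) / math.log(10.0) / 1.13 * 400.0
```

As a sanity check, a giraffe score of 0 (a dead-even position, expected score 0.5) maps to a texel score of 0, and for any score in (-10000, 10000) the two expected-score formulas agree after conversion.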
It would be interesting if we could discover patterns of where Giraffe performs well. I would imagine positions with complex tactics to be tough for the slow eval function.

Possibly, but it is also possible that the giraffe eval has learned to recognize some simple tactical patterns. I really have no idea what it does.