Stockfish testing: one question

Jouni
Posts: 3792
Joined: Wed Mar 08, 2006 8:15 pm
Full name: Jouni Uski

Stockfish testing: one question

Post by Jouni »

AFAIK all testing so far has been self-testing. Has there been any serious consideration of testing against foreign engines? There is no shortage of strong and free opponents. Let's use Rybka, Critter, Houdini, Spike, Protector, etc. Of course it would take some days to get a reference score for SF3. After that, might it be more effective at finding real improvements?
Jouni
zamar
Posts: 613
Joined: Sun Jan 18, 2009 7:03 am

Re: Stockfish testing: one question

Post by zamar »

- So far I haven't seen a single example of a patch that does well in self-play but fails against other opponents. If such cases exist at all, they are very rare.

- A gauntlet requires 2x more games, and the error bars are still sqrt(2) times higher. A very bad trade.

- In self-play the Elo change is around 2x larger than in matches against other engines. This is a very good thing for detecting small improvements. To get the same resolution in gauntlets, we would need 4x more games.
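A rough sketch of the arithmetic behind these points, under simplifying assumptions that are not from the post: the engines are near-equal (expected score 0.5), games are independent, and a gauntlet compares two versions by running each one separately against the same outside opponents. The function name `elo_std_error` and the game budget are hypothetical.

```python
import math

def elo_std_error(n_games, draw_ratio=0.0):
    """Approximate 1-sigma Elo error of a match between near-equal engines."""
    # Per-game score variance around an expected score of 0.5:
    # wins and losses deviate by 0.5, draws by 0.
    variance = (1.0 - draw_ratio) * 0.25
    score_error = math.sqrt(variance / n_games)
    # Slope of elo(s) = 400 * log10(s / (1 - s)) at s = 0.5.
    elo_per_score = 400.0 / (math.log(10) * 0.25)
    return elo_per_score * score_error

n = 10_000  # hypothetical game budget, chosen only for illustration

# Self-play: one match of n games, new version against old version.
self_play_error = elo_std_error(n)

# Gauntlet (assumed setup): old and new versions each play n games against
# the outside opponents, so 2n games in total. The Elo difference is the
# difference of two independent measurements, so their variances add.
gauntlet_error = math.sqrt(2) * elo_std_error(n)

print(f"self-play: {n} games, error ~ {self_play_error:.2f} Elo")
print(f"gauntlet:  {2 * n} games, error ~ {gauntlet_error:.2f} Elo")

# Error shrinks with 1/sqrt(games), so halving it takes 4x the games.
# If a patch shows only half the Elo gain in a gauntlet, the error bar
# must be halved to keep the same resolution.
print(f"ratio of error at 4n vs n: {elo_std_error(4 * n) / elo_std_error(n):.2f}")
```

The draw ratio only rescales both error figures by the same factor, so the sqrt(2) and 4x relations hold regardless of how many games are drawn in this model.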
Joona Kiiski
Uri Blass
Posts: 11150
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Stockfish testing: one question

Post by Uri Blass »

Note that I have also seen no example of A>B>C>A in self-testing.

I made some improvement in the mobility evaluation of Stockfish by changing
the mobility array.

Let's call the mobility vector M.
I simply changed that array; let's call the new vector M+d.

I thought of changing it again in the same direction and testing M+3d against M+d, but I got an objection based on the claim that I may fall into the A>B>C>A trap and that they cannot run regression tests for every change.

I would like to know if there is a single case in computer chess where somebody got a significant result of A beating B, B beating C and C beating A.

In theory it can happen, but I do not know of a single case.
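The A>B>C>A situation can be checked mechanically once pairwise results are available. The following is a minimal sketch, assuming each statistically significant match result has already been reduced to a (winner, loser) pair; the function name `find_cycles` and the version labels are hypothetical and do not come from any real test run.

```python
from itertools import permutations

def find_cycles(beats):
    """Return all triples (a, b, c) where a beat b, b beat c and c beat a.

    `beats` is a set of (winner, loser) pairs, one per significant result.
    """
    engines = sorted({engine for pair in beats for engine in pair})
    cycles = []
    for a, b, c in permutations(engines, 3):
        # Require `a` to be the smallest label so that each cycle is
        # reported once instead of once per rotation.
        if a < b and a < c and {(a, b), (b, c), (c, a)} <= beats:
            cycles.append((a, b, c))
    return cycles

# Made-up outcomes for three versions of the mobility vector.
transitive = {("M+3d", "M+d"), ("M+d", "M"), ("M+3d", "M")}
cyclic = {("M+3d", "M+d"), ("M+d", "M"), ("M", "M+3d")}

print(find_cycles(transitive))
print(find_cycles(cyclic))
```

Running such a check requires a regression match for every pair of versions, which is the cost the objection refers to.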
zamar
Posts: 613
Joined: Sun Jan 18, 2009 7:03 am

Re: Stockfish testing: one question

Post by zamar »

Uri Blass wrote:
Note that I have also seen no example of A>B>C>A in self-testing.

I made some improvement in the mobility evaluation of Stockfish by changing
the mobility array.

Let's call the mobility vector M.
I simply changed that array; let's call the new vector M+d.

I thought of changing it again in the same direction and testing M+3d against M+d, but I got an objection based on the claim that I may fall into the A>B>C>A trap and that they cannot run regression tests for every change.

I would like to know if there is a single case in computer chess where somebody got a significant result of A beating B, B beating C and C beating A.

In theory it can happen, but I do not know of a single case.

I think it's completely logical to schedule M+3d against M+d. But I have not objected to this on any occasion.
Joona Kiiski
Uri Blass
Posts: 11150
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Stockfish testing: one question

Post by Uri Blass »

Correct.

It was Marco's opinion, and because Marco has the final word
I decided not even to try it.