stockfish development testing

benstoker
Posts: 342
Joined: Tue Jan 19, 2010 2:05 am

stockfish development testing

Post by benstoker »

I appreciate how daunting the task of chess engine testing is, and that to do it well, developers need a whole lot of CPUs crunching. Does the Stockfish development team have access to clusters like Dr. Hyatt has? How do they accomplish the necessary testing?

What if the developers had available a pool of volunteers willing to devote their CPUs to testing for the Stockfish team? Would that be beneficial to them? The team could assign test tasks to volunteer testers, under specified testing conditions. For instance, Dr. Hyatt recently ran tests on the SE in Stockfish, comparing the engine with it on and with it off. It seems to me that it would be quite handy to be able to "outsource" a test assignment to one or more volunteers to get quicker feedback on the validity of new search and eval techniques.
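
To illustrate why an on/off comparison like this needs so many games, here is a minimal sketch that turns a win/draw/loss count into an Elo difference with an approximate 95% interval. It assumes the usual logistic Elo model and a normal approximation for the score; the function names and the game counts are made up for illustration and are not taken from any Stockfish tooling or from Dr. Hyatt's tests.

```python
import math


def elo_from_score(score):
    """Convert a score fraction in (0, 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def match_elo(wins, draws, losses, z=1.96):
    """Return (elo, low, high): point estimate and approximate 95% interval."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    stderr = math.sqrt(var / n)
    # Clamp so the logistic conversion stays defined for lopsided matches.
    eps = 1e-9
    lo_score = min(max(score - z * stderr, eps), 1.0 - eps)
    hi_score = min(max(score + z * stderr, eps), 1.0 - eps)
    return elo_from_score(score), elo_from_score(lo_score), elo_from_score(hi_score)


if __name__ == "__main__":
    # Hypothetical result: feature ON vs. feature OFF over 1000 games.
    elo, low, high = match_elo(wins=300, draws=420, losses=280)
    print(f"Elo {elo:+.1f}, 95% interval [{low:+.1f}, {high:+.1f}]")
```

With these hypothetical counts the interval still straddles zero, so even 1000 games would not settle whether the change helps. That is the kind of gap a pool of volunteer CPUs could help close.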
zamar
Posts: 613
Joined: Sun Jan 18, 2009 7:03 am

Re: stockfish development testing

Post by zamar »

benstoker wrote: I appreciate how daunting the task of chess engine testing is, and that to do it well, developers need a whole lot of CPUs crunching. Does the Stockfish development team have access to clusters like Dr. Hyatt has? How do they accomplish the necessary testing?
I have a quad, Marco has a quad, Tord has a quad. And we have a few reliable volunteer beta testers we use just before a release. That's it! We'd love to have a cluster, of course! ;-)
benstoker wrote: What if the developers had available a pool of volunteers willing to devote their CPUs to testing for the Stockfish team? Would that be beneficial to them? The team could assign test tasks to volunteer testers, under specified testing conditions. For instance, Dr. Hyatt recently ran tests on the SE in Stockfish, comparing the engine with it on and with it off. It seems to me that it would be quite handy to be able to "outsource" a test assignment to one or more volunteers to get quicker feedback on the validity of new search and eval techniques.
Sounds nice, but:

a) Who is going to organize this?
b) Running tests sounds simple, but it's easy to make mistakes. How do we know who is actually capable of doing reliable testing?
c) Who can we trust? One cheater could easily spoil all testing.
d) Many people would probably sign up and get bored in a week or two.

I've considered your proposition for some time, but there are many practical problems to solve...
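
One conceivable way to approach (b) and (c), offered only as a rough sketch and not something anyone in this thread has implemented: have a developer run the same test assignment on trusted hardware, then flag any volunteer report whose score differs from that reference by more than a few combined standard errors. The function names, the threshold, and all game counts below are hypothetical, and the check assumes the volunteers' test conditions are comparable to the reference run.

```python
import math


def score_and_stderr(wins, draws, losses):
    """Score fraction and its standard error for a win/draw/loss count."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * score ** 2) / n
    return score, math.sqrt(var / n)


def flag_outliers(reference, reports, z=3.0):
    """Return names of volunteers whose score deviates from the trusted reference.

    reference: (wins, draws, losses) from a developer's own run.
    reports:   dict mapping volunteer name -> (wins, draws, losses).
    """
    ref_score, ref_se = score_and_stderr(*reference)
    flagged = []
    for name, wdl in reports.items():
        own_score, own_se = score_and_stderr(*wdl)
        # Standard error of the difference between two independent runs.
        combined_se = math.sqrt(own_se ** 2 + ref_se ** 2)
        if abs(own_score - ref_score) > z * combined_se:
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    # Hypothetical numbers for illustration only.
    reference = (295, 425, 280)
    reports = {
        "volunteer_a": (150, 210, 140),
        "volunteer_b": (145, 215, 140),
        "volunteer_c": (300, 150, 50),  # suspiciously lopsided
    }
    print(flag_outliers(reference, reports))
```

A check like this only catches gross errors or crude cheating. It does nothing for (a) or (d), and a careful cheater who stays inside the noise would still slip through.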
Joona Kiiski
benstoker
Posts: 342
Joined: Tue Jan 19, 2010 2:05 am

Re: stockfish development testing

Post by benstoker »

Maybe the way to do it is for charitable Linux brethren to simply add an account to their box so you can ssh in and run and control your own tests during the dark of night.