Multiple change testing

Discussion of chess software programming and technical issues.

shawn
Posts: 95
Joined: Fri Jun 28, 2024 9:24 am
Full name: Wallace Shawn

Re: Multiple change testing

Post by shawn »

Try SPRT https://www.chessprogramming.org/Sequen ... Ratio_Test. This is the better testing method everyone uses nowadays. The inverted results you got with those changes are just an effect of small sample sizes.
hgm
Posts: 28268
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Multiple change testing

Post by hgm »

Indeed, if you have infinite computer power available, so that you won't have to worry much about efficiency, that is a very reliable method.
Viz
Posts: 223
Joined: Tue Apr 09, 2024 6:24 am
Full name: Michael Chaly

Re: Multiple change testing

Post by Viz »

hgm wrote: Mon Jul 22, 2024 5:55 pm Indeed, if you have infinite computer power available, so that you won't have to worry much about efficiency, that is a very reliable method.
This is THE most efficient method of testing regardless of the amount of computer power you have, period. Everything else is just bad, more or less.
hgm
Posts: 28268
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Multiple change testing

Post by hgm »

But the question is: "for what, and under which conditions?" If you picked patches from a pool in which 75% of the patches increase the strength by 1 Elo and 25% decrease it by 1 Elo, it would be infinitely faster to just pick 100 patches and accept them all without any testing whatsoever than to test each of them with SPRT and accept only those that pass. You would already have +50 Elo before the SPRT methodology even gave you +1 Elo...
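For reference, the +50 figure is just the expected value of accepting all 100 patches. A minimal Python sketch of that arithmetic follows. It assumes that Elo gains from separate patches simply add up (real patches can interact, so this is an idealisation), and `expected_gain` is a hypothetical helper name, not something from any testing framework.

```python
def expected_gain(n_patches, p_gain=0.75, gain=1.0, loss=1.0):
    """Expected total Elo change when every patch is accepted untested.

    Assumes each patch independently gains `gain` Elo with probability
    `p_gain` and loses `loss` Elo otherwise, and that Elo changes add up.
    """
    per_patch = p_gain * gain - (1.0 - p_gain) * loss
    return n_patches * per_patch


print(expected_gain(100))  # 50.0
```

With a 50/50 pool the same function returns 0, which is the case where accepting everything blindly gains nothing on average.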
Viz
Posts: 223
Joined: Tue Apr 09, 2024 6:24 am
Full name: Michael Chaly

Re: Multiple change testing

Post by Viz »

hgm wrote: Mon Jul 22, 2024 8:05 pm But the question is: "for what, and under which conditions?" If you picked patches from a pool in which 75% of the patches increase the strength by 1 Elo and 25% decrease it by 1 Elo, it would be infinitely faster to just pick 100 patches and accept them all without any testing whatsoever than to test each of them with SPRT and accept only those that pass. You would already have +50 Elo before the SPRT methodology even gave you +1 Elo...
And how do you conclude that a patch has a 75% probability of increasing strength by 1 Elo and a 25% probability of decreasing it by 1 Elo? From your astral spirit vibes?
Even this really specific task is better handled with, surprise, SPRT.
Set the Elo bounds to [-1, 1] and stop at an LLR of ln 4: voilà, you have an SPRT that concludes EXACTLY what you describe, and it does so while playing the minimum number of games.
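To make that setup concrete, here is a rough Python sketch of an SPRT with Elo bounds [-1, 1] that stops once the LLR reaches ln 4. Several things in it are assumptions rather than anything stated in this thread: the lower stopping bound is taken as -ln 4 (symmetric), which matches Wald's thresholds ln((1-β)/α) and ln(β/(1-α)) for α = β = 0.2; Elo is converted to expected score with the usual logistic formula; and the LLR uses a simple normal approximation, not necessarily the exact formula any particular testing framework implements. All function names are hypothetical.

```python
import math


def elo_to_score(elo):
    """Expected score under the logistic Elo model (assumed here)."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))


def llr(wins, draws, losses, elo0=-1.0, elo1=1.0):
    """Normal-approximation log-likelihood ratio of H1 (elo1) vs H0 (elo0)."""
    n = wins + draws + losses
    if n == 0:
        return 0.0
    mean = (wins + 0.5 * draws) / n                # observed score per game
    var = (wins + 0.25 * draws) / n - mean * mean  # per-game score variance
    if var <= 0.0:
        return 0.0  # degenerate sample, no information in this approximation
    s0, s1 = elo_to_score(elo0), elo_to_score(elo1)
    return n * (s1 - s0) * (2.0 * mean - s0 - s1) / (2.0 * var)


def sprt_decision(wins, draws, losses, bound=math.log(4.0)):
    """Return 'accept', 'reject' or 'continue' for symmetric bounds +/- bound."""
    value = llr(wins, draws, losses)
    if value >= bound:
        return "accept"
    if value <= -bound:
        return "reject"
    return "continue"
```

In use, `sprt_decision` would be called after each game (or batch of games) with the running win/draw/loss counts, and the match stops as soon as it returns something other than "continue".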
AndrewGrant
Posts: 1934
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Re: Multiple change testing

Post by AndrewGrant »

hgm wrote: Mon Jul 22, 2024 8:05 pm But the question is: "for what, and under which conditions?" If you picked patches from a pool in which 75% of the patches increase the strength by 1 Elo and 25% decrease it by 1 Elo, it would be infinitely faster to just pick 100 patches and accept them all without any testing whatsoever than to test each of them with SPRT and accept only those that pass. You would already have +50 Elo before the SPRT methodology even gave you +1 Elo...
If you can pick patches that are winners at a 3-to-1 rate, then you don't need to test at all...
So it is hardly a counterargument to doing things with meaningful statistical power.
When you can't win an argument, you censor it.
When you can't win an election, you remove your opponents.
Just because you've been doing something for a long time, does not mean you are any good at it.
Ciekce
Posts: 171
Joined: Sun Oct 30, 2022 5:26 pm
Full name: Conor Anstey

Re: Multiple change testing

Post by Ciekce »

if it was not obvious, hgm's opinion here is horribly wrong - another vote for literally the only good option in SPRT
Whiskers
Posts: 231
Joined: Tue Jan 31, 2023 4:34 pm
Full name: Adam Kulju

Re: Multiple change testing

Post by Whiskers »

Ciekce wrote: Mon Jul 22, 2024 10:13 pm if it was not obvious, hgm's opinion here is horribly wrong - another vote for literally the only good option in SPRT
hi ciekce
Viz
Posts: 223
Joined: Tue Apr 09, 2024 6:24 am
Full name: Michael Chaly

Re: Multiple change testing

Post by Viz »

Ciekce wrote: Mon Jul 22, 2024 10:13 pm if it was not obvious, hgm's opinion here is horribly wrong - another vote for literally the only good option in SPRT
There is quite literally no "voting" on this topic.
If any dev claims he can spot gainers with 75% probability, he is either a liar or Jesus.
AndrewGrant
Posts: 1934
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Re: Multiple change testing

Post by AndrewGrant »

Viz wrote: Tue Jul 23, 2024 6:05 am
Ciekce wrote: Mon Jul 22, 2024 10:13 pm if it was not obvious, hgm's opinion here is horribly wrong - another vote for literally the only good option in SPRT
There is quite literally no "voting" on this topic.
If any dev claims he can spot gainers with 75% probability, he is either a liar or Jesus.
75% chance is easy... if you start a new project and know the current state of the art.
Otherwise, impossible.