Multiple change testing
Moderator: Ras
-
- Posts: 95
- Joined: Fri Jun 28, 2024 9:24 am
- Full name: Wallace Shawn
Re: Multiple change testing
Try SPRT https://www.chessprogramming.org/Sequen ... Ratio_Test. This is the better testing method everyone uses nowadays. The inverted results you got with those changes are just an effect of small sample sizes.
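To put a rough number on the small-sample effect, here is a minimal Python sketch of the error bar on an Elo estimate from a fixed-length match. It assumes the usual logistic Elo model and a normal approximation for the mean score; the function name `elo_with_margin` and the game counts are made up for illustration and do not come from any particular testing framework.

```python
import math

def elo_with_margin(wins, draws, losses, z=1.96):
    """Return (elo, low, high): point estimate and an approximate 95% interval."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n                 # mean score per game
    var = (wins + 0.25 * draws) / n - score ** 2     # per-game score variance
    se = math.sqrt(var / n)                          # standard error of the mean

    def to_elo(s):
        # Clamp to avoid log of zero or division by zero at 0% / 100% scores.
        s = min(max(s, 1e-9), 1.0 - 1e-9)
        return -400.0 * math.log10(1.0 / s - 1.0)

    return to_elo(score), to_elo(score - z * se), to_elo(score + z * se)

# Hypothetical results: an even 200-game match vs. an even 20000-game match.
print(elo_with_margin(60, 80, 60))
print(elo_with_margin(6000, 8000, 6000))
```

Under these assumptions the 200-game interval is tens of Elo wide, so two changes worth a few Elo each can easily appear in the wrong order.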
-
- Posts: 28273
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Multiple change testing
Indeed, if you have infinite computing power available, so that you don't have to worry much about efficiency, that is a very reliable method.
-
- Posts: 223
- Joined: Tue Apr 09, 2024 6:24 am
- Full name: Michael Chaly
Re: Multiple change testing
This is THE most efficient method of testing regardless of the amount of computer power you have, period. Everything else is more or less just bad.
-
- Posts: 28273
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Multiple change testing
But the question is: "for what, and under which conditions?". If you would pick patches from a pool that in 75% of the cases would increase the strength by 1 Elo, and in 25% would decrease it by 1 Elo, it would be infinitely faster to just pick 100 patches and accept them all without any testing whatsoever, than to test any of those with SPRT and only accept those that pass. You would have already +50 Elo before the SPRT methodology even gave you +1 Elo...
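The arithmetic behind the +50 Elo figure can be sketched in a few lines of Python. This is only an illustration under the assumptions of the hypothetical pool above, plus one more assumption that is not stated in the post: that the Elo effects of individual patches simply add up. The function name `accept_all_gain` is made up.

```python
import math

def accept_all_gain(n_patches, p_good, gain=1.0, loss=1.0):
    """Expected total Elo and its standard deviation when every patch is accepted.

    Assumes each patch independently gains `gain` Elo with probability
    `p_good`, otherwise loses `loss` Elo, and that effects are additive.
    """
    mean_per_patch = p_good * gain - (1.0 - p_good) * loss
    second_moment = p_good * gain ** 2 + (1.0 - p_good) * loss ** 2
    var_per_patch = second_moment - mean_per_patch ** 2
    return n_patches * mean_per_patch, math.sqrt(n_patches * var_per_patch)

expected, spread = accept_all_gain(100, 0.75)
print(expected, spread)  # expected total is 50.0 Elo; spread is sqrt(75)
```

The spread shows that the total is not exactly +50 in any single run, but under these assumptions it stays well above zero.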
-
- Posts: 223
- Joined: Tue Apr 09, 2024 6:24 am
- Full name: Michael Chaly
Re: Multiple change testing
hgm wrote: ↑Mon Jul 22, 2024 8:05 pm But the question is: "for what, and under which conditions?". If you would pick patches from a pool that in 75% of the cases would increase the strength by 1 Elo, and in 25% would decrease it by 1 Elo, it would be infinitely faster to just pick 100 patches and accept them all without any testing whatsoever, than to test any of those with SPRT and only accept those that pass. You would have already +50 Elo before the SPRT methodology even gave you +1 Elo...
And how do you conclude that a patch has a 75% probability of increasing strength by 1 Elo and a 25% probability of decreasing it by 1 Elo? From your astral spirit vibes?
Even this really specific task is better concluded with, surprise, SPRT.
Set the bounds to [-1; 1] and stop at an LLR of ln 4: voilà, you have an SPRT that concludes EXACTLY what you describe, and it is the way to conclude it while playing the minimum number of games.
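A rough Python sketch of such a test, for readers who have not seen one written out. It is not the code of any existing testing framework: it assumes the logistic Elo model and a normal approximation to the log-likelihood ratio, and all function names are made up. Under Wald's approximate thresholds, stopping at ±ln 4 corresponds to error rates alpha = beta = 0.2.

```python
import math

def elo_to_score(elo):
    """Expected score under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))

def llr_normal(wins, draws, losses, elo0=-1.0, elo1=1.0):
    """Normal-approximation LLR of H1 (elo1) against H0 (elo0)."""
    n = wins + draws + losses
    if n == 0:
        return 0.0
    mean = (wins + 0.5 * draws) / n
    var = (wins + 0.25 * draws) / n - mean ** 2   # per-game score variance
    if var <= 0.0:
        return 0.0                                # not enough information yet
    s0, s1 = elo_to_score(elo0), elo_to_score(elo1)
    return n * (s1 - s0) * (2.0 * mean - s0 - s1) / (2.0 * var)

def sprt_bounds(alpha=0.2, beta=0.2):
    """Wald's approximate stopping thresholds (lower, upper)."""
    return math.log(beta / (1.0 - alpha)), math.log((1.0 - beta) / alpha)

def sprt_decision(llr, alpha=0.2, beta=0.2):
    lower, upper = sprt_bounds(alpha, beta)
    if llr >= upper:
        return "accept H1"
    if llr <= lower:
        return "accept H0"
    return "continue"

# Hypothetical running totals; in practice this is re-evaluated after each game.
print(sprt_decision(llr_normal(wins=5200, draws=10000, losses=4800)))
```

In a real run the LLR would be recomputed as results arrive, and the test stops the first time it crosses either bound.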
-
- Posts: 1935
- Joined: Tue Apr 19, 2016 6:08 am
- Location: U.S.A
- Full name: Andrew Grant
Re: Multiple change testing
hgm wrote: ↑Mon Jul 22, 2024 8:05 pm But the question is: "for what, and under which conditions?". If you would pick patches from a pool that in 75% of the cases would increase the strength by 1 Elo, and in 25% would decrease it by 1 Elo, it would be infinitely faster to just pick 100 patches and accept them all without any testing whatsoever, than to test any of those with SPRT and only accept those that pass. You would have already +50 Elo before the SPRT methodology even gave you +1 Elo...
If you can pick patches that are winners at a 3-to-1 rate, then you don't need to test at all...
So it is hardly a counterargument to doing things with meaningful statistical power.
-
- Posts: 174
- Joined: Sun Oct 30, 2022 5:26 pm
- Full name: Conor Anstey
Re: Multiple change testing
If it was not obvious, hgm's opinion here is horribly wrong. Another vote for literally the only good option: SPRT.
-
- Posts: 234
- Joined: Tue Jan 31, 2023 4:34 pm
- Full name: Adam Kulju
-
- Posts: 223
- Joined: Tue Apr 09, 2024 6:24 am
- Full name: Michael Chaly
Re: Multiple change testing
There is quite literally no "voting" on this topic.
If any dev claims he can spot gainers with 75% probability, he is either a liar or Jesus.
-
- Posts: 1935
- Joined: Tue Apr 19, 2016 6:08 am
- Location: U.S.A
- Full name: Andrew Grant