I expected LLR for ELO bounds 0-4 to be higher than LLR for ELO bounds 0-6. Looking at these values I think I'm misunderstanding the stopping rule. What do the ELO bounds mean?
It's measuring a ratio of goodness of fit for two specific cases. Of the candidate values elodiff = 0, 4, 6, or 8, elodiff = 6 is the best model for your data, and since the comparison model is the same in both cases (elodiff = 0), LLR(0, 6) > LLR(0, 4).
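For what it's worth, here is a minimal sketch of how such an LLR can be computed. It assumes the normal approximation to the per-game score, which may not be the exact formula behind the numbers above. The win/loss/draw counts are made up so that the measured difference is about 6 ELO, and `elo_to_score` and `llr` are illustrative names, not taken from any particular tool.

```python
import math


def elo_to_score(elo):
    """Expected score per game for a given ELO difference (logistic model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))


def llr(wins, losses, draws, elo0, elo1):
    """Approximate log-likelihood ratio of H1 (elo1) versus H0 (elo0).

    Uses a normal approximation: the mean score is treated as Gaussian
    with the per-game variance estimated from the observed results.
    """
    n = wins + losses + draws
    mean = (wins + 0.5 * draws) / n
    var = (wins + 0.25 * draws) / n - mean ** 2  # E[x^2] - E[x]^2
    s0, s1 = elo_to_score(elo0), elo_to_score(elo1)
    return n * (s1 - s0) * (2.0 * mean - s0 - s1) / (2.0 * var)


# Hypothetical counts: 20000 games with a measured edge of roughly 6 ELO.
W, L, D = 5345, 5000, 9655
for elo1 in (4, 6, 8):
    print(f"LLR(0, {elo1}) = {llr(W, L, D, 0, elo1):.2f}")
```

With the same data and the same elo0, the LLR is largest when elo1 sits closest to the measured difference, which is why LLR(0, 6) comes out above LLR(0, 4) here.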
brtzsnr wrote: I expected LLR for ELO bounds 0-4 to be higher than LLR for ELO bounds 0-6
Why?
A 6 ELO patch should require fewer games to fall outside the 0-4 ELO range than a 4 ELO patch. If that's not the case, then the stopping rule is not very efficient.
What? You hoped that changing the resolution from 6 to 4 elo was free? Of course not! In statistics, when you want to make a measurement N times more precise, you need N^2 times more observations. More specifically, if you fix (elo-elo0)/(elo1-elo0), then the expected SPRT stopping time is proportional to (elo1-elo0)^(-2).
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
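A rough sketch of where the inverse-square scaling comes from, under assumptions that are not spelled out in the thread: the per-game score is approximated as normal with variance \sigma^2, s, s_0, s_1 are the expected scores at elo, elo0, elo1, a < 0 < b are the SPRT bounds on the LLR, P is the probability of hitting the upper bound, and overshoot is ignored (Wald's approximation). All of these symbols are introduced here for illustration.

```latex
\begin{align*}
t &= \frac{s - s_0}{s_1 - s_0}, \\
\mu_z &= \frac{(s_1 - s_0)(2s - s_0 - s_1)}{2\sigma^2}
       = \frac{(s_1 - s_0)^2\,(2t - 1)}{2\sigma^2}, \\
\mathbb{E}[N] &\approx \frac{P\,b + (1 - P)\,a}{\mu_z}
       = \frac{2\sigma^2\,\bigl(P\,b + (1 - P)\,a\bigr)}{(2t - 1)\,(s_1 - s_0)^2},
       \qquad t \neq \tfrac{1}{2}.
\end{align*}
```

Here \mu_z is the mean LLR increment per game. In this approximation P depends only on t, a and b, so with t held fixed the expected number of games scales as (s_1 - s_0)^{-2}. For small differences the score is nearly linear in elo, which gives the (elo1-elo0)^(-2) behaviour.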
Yes, but this is not what you compared.
You did not compare different patches; you used the same patch.
I think the stopping time also depends on the value of the patch, so this formula (expected stopping time proportional to (elo1-elo0)^(-2)) is not accurate, but I agree that you need more games for a smaller interval.
If the value of the patch is 0 then this formula is accurate, but if the value is not 0 this formula is not accurate.
Let's take an extreme example.
1)
elo0=0 elo1=400
It is obvious that the difference in the number of games that you need to reach a conclusion between a 0 elo patch and a 1 elo patch is very small.
2)
elo0=0 elo1=2
A 1 elo patch has a 50% probability to pass.
A 0 elo patch has a 95% probability to fail.
The difference between a 0 elo patch and a 1 elo patch in the number of games is high.
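To put rough numbers on the pass probabilities in these two cases, here is a small sketch. It assumes alpha = beta = 0.05 (the thread does not state them) and uses a continuous-time approximation of the SPRT that ignores overshoot, which is crude for bounds as wide as 0-400. `pass_probability` is an illustrative name.

```python
import math


def elo_to_score(elo):
    """Expected score per game for a given ELO difference (logistic model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))


def pass_probability(elo, elo0, elo1, alpha=0.05, beta=0.05):
    """Approximate probability that the SPRT accepts H1 for a patch worth `elo`.

    Assumption: the LLR is modelled as a Brownian motion between the
    bounds `lower` and `upper`; overshoot at the bounds is ignored.
    """
    s, s0, s1 = (elo_to_score(e) for e in (elo, elo0, elo1))
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    # theta = 2 * drift / variance of the per-game LLR increment.
    theta = (2.0 * s - s0 - s1) / (s1 - s0)
    if abs(theta) < 1e-12:  # zero drift: use the limiting value
        return -lower / (upper - lower)
    num = 1.0 - math.exp(-theta * lower)
    den = math.exp(-theta * upper) - math.exp(-theta * lower)
    return num / den


for elo1 in (400, 2):
    for elo in (0, 1, 2):
        p = pass_probability(elo, 0, elo1)
        print(f"elo0=0 elo1={elo1}: a {elo} elo patch passes with p = {p:.3f}")
```

Under these assumptions, with elo0=0 and elo1=2 the pass probability is about 5% for a 0 elo patch, 50% for a 1 elo patch and 95% for a 2 elo patch, while with elo1=400 a 0 elo patch and a 1 elo patch are almost indistinguishable.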
The formula is accurate. Please read it carefully before commenting.
You are right.
The relevant part is "if you fix (elo-elo0)/(elo1-elo0)", and that is not what happened.
In practice, elo and elo0 stayed constant and the person changed only elo1.
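The difference between the two situations can be sketched numerically. This is only an illustration under stated assumptions: alpha = beta = 0.05, a draw ratio of 0.6, a normal approximation to the game score, and Wald's approximation for the expected number of games (overshoot ignored). None of these values come from the thread, and `expected_games` is a made-up name.

```python
import math


def elo_to_score(elo):
    """Expected score per game for a given ELO difference (logistic model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))


def expected_games(elo, elo0, elo1, draw_ratio=0.6, alpha=0.05, beta=0.05):
    """Approximate expected SPRT length for a patch worth `elo`."""
    s, s0, s1 = (elo_to_score(e) for e in (elo, elo0, elo1))
    var = s * (1.0 - s) - draw_ratio / 4.0  # per-game score variance
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    theta = (2.0 * s - s0 - s1) / (s1 - s0)
    if abs(theta) < 1e-12:
        # Zero drift: E[N] = -lower * upper / variance of the LLR increment.
        return -lower * upper * var / (s1 - s0) ** 2
    drift = (s1 - s0) ** 2 * theta / (2.0 * var)  # mean LLR increment per game
    num = 1.0 - math.exp(-theta * lower)
    den = math.exp(-theta * upper) - math.exp(-theta * lower)
    p_pass = num / den
    return (p_pass * upper + (1.0 - p_pass) * lower) / drift


# Same 6 elo patch, only elo1 changes: (elo-elo0)/(elo1-elo0) is NOT fixed.
same_patch = expected_games(6, 0, 4) / expected_games(6, 0, 6)
# Patch value scaled with the bounds: (elo-elo0)/(elo1-elo0) stays at 1.
fixed_ratio = expected_games(4, 0, 4) / expected_games(6, 0, 6)
print(f"same patch, bounds 0-4 vs 0-6: {same_patch:.2f}x games")
print(f"fixed ratio, bounds 0-4 vs 0-6: {fixed_ratio:.2f}x games")
```

In this sketch the fixed-ratio comparison lands close to (6/4)^2 = 2.25, while testing the same 6 elo patch against the narrower bounds still costs more games, just by a smaller factor.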