SPRT LLR question

brtzsnr · Post by **brtzsnr** » Mon Apr 27, 2015 10:46 pm

Hi,

I've implemented the SPRT stopping rule in my testing framework, based on the cutechess-cli implementation.

I found something strange. For this score:

14573 @ 40/60+0.05
5828 - 5567 - 3178
ELO 6.22±4.99

I get these values for LLR:

 Elo0: 0.00 Elo1: 4.00
Alpha: 0.03 Beta: 0.15
LLR: 2.61 [-1.87:+3.34]

 Elo0: 0.00 Elo1: 6.00
Alpha: 0.03 Beta: 0.15
LLR: 2.99 [-1.87:+3.34]

 Elo0: 0.00 Elo1: 8.00
Alpha: 0.03 Beta: 0.15
LLR: 2.75 [-1.87:+3.34]

I expected LLR for ELO bounds 0-4 to be higher than LLR for ELO bounds 0-6. Looking at these values I think I'm misunderstanding the stopping rule. What do the ELO bounds mean?

kbhearn · Post by **kbhearn** » Tue Apr 28, 2015 10:04 am

It's measuring a ratio of goodness of fit for two specific cases. Of ELOdiff = 0, 4, 6, or 8, the value 6 is the best model for your data, and the comparison model is the same in both cases (elodiff = 0), so LLR(0, 6) > LLR(0, 4).
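To make the comparison concrete, here is a minimal Python sketch of the LLR computation. It assumes a BayesElo-style win/draw/loss model with the draw parameter fitted from the observed results, and it reads the score line in the first post as wins - losses - draws. The function names are hypothetical, and cutechess-cli's actual code may differ in detail.

```python
import math

def sprt_bounds(alpha, beta):
    """Wald's lower and upper stopping bounds for the LLR."""
    return math.log(beta / (1.0 - alpha)), math.log((1.0 - beta) / alpha)

def wdl_probs(bayes_elo, draw_elo):
    """Win/draw/loss probabilities under the assumed BayesElo-style model."""
    p_win = 1.0 / (1.0 + 10.0 ** ((draw_elo - bayes_elo) / 400.0))
    p_loss = 1.0 / (1.0 + 10.0 ** ((draw_elo + bayes_elo) / 400.0))
    return p_win, 1.0 - p_win - p_loss, p_loss

def llr(wins, losses, draws, elo0, elo1):
    """Log-likelihood ratio of H1 (elo1) against H0 (elo0) for the given results."""
    n = wins + losses + draws
    pw, pl = wins / n, losses / n
    # Fit the draw parameter to the observed win and loss frequencies.
    draw_elo = 200.0 * math.log10((1.0 - pl) / pl * (1.0 - pw) / pw)
    # Convert the logistic Elo bounds to the model's own scale.
    x = 10.0 ** (-draw_elo / 400.0)
    scale = 4.0 * x / (1.0 + x) ** 2
    w0, d0, l0 = wdl_probs(elo0 / scale, draw_elo)
    w1, d1, l1 = wdl_probs(elo1 / scale, draw_elo)
    return (wins * math.log(w1 / w0)
            + draws * math.log(d1 / d0)
            + losses * math.log(l1 / l0))

if __name__ == "__main__":
    lower, upper = sprt_bounds(0.03, 0.15)
    print(f"bounds: [{lower:.2f}:{upper:+.2f}]")
    for elo1 in (4.0, 6.0, 8.0):
        print(f"Elo1 = {elo1}: LLR = {llr(5828, 5567, 3178, 0.0, elo1):.2f}")
```

Under these assumptions the results should land close to the values in the first post: the data are the same in every run, so the LLR is largest for the elo1 that fits the measured result (about 6.2 Elo) best.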

lucasart · Post by **lucasart** » Tue Apr 28, 2015 12:24 pm

brtzsnr wrote:I expected LLR for ELO bounds 0-4 to be higher than LLR for ELO bounds 0-6

Why ?

brtzsnr · Post by **brtzsnr** » Tue Apr 28, 2015 3:33 pm

lucasart wrote:
brtzsnr wrote:I expected LLR for ELO bounds 0-4 to be higher than LLR for ELO bounds 0-6
Why ?

A 6ELO patch should require fewer games to be outside of 0-4 ELO range than a 4ELO patch. If that's not the case then the stopping rule is not very efficient.

lucasart · Post by **lucasart** » Wed Apr 29, 2015 1:07 am

brtzsnr wrote:
lucasart wrote:
brtzsnr wrote:I expected LLR for ELO bounds 0-4 to be higher than LLR for ELO bounds 0-6
Why ?
A 6ELO patch should require fewer games to be outside of 0-4 ELO range than a 4ELO patch. If that's not the case then the stopping rule is not very efficient.

What? You hoped that changing the resolution from 6 to 4 elo was free? Of course not! In statistics, when you want to make a measure N times more precise, you need N^2 times more observations. More specifically, if you fix (elo-elo0)/(elo1-elo0), then the expected SPRT stopping time is proportional to (elo1-elo0)^(-2).
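One way to see this scaling is the usual normal approximation, which is an assumption here and not something stated in the thread. Treat the per-game score as having true mean $s$ and variance $\sigma^2$, let $s_0$ and $s_1$ be the expected scores corresponding to elo0 and elo1, and let $\bar{x}$ be the observed mean score after $n$ games:

```latex
\mathrm{LLR}_n \approx \frac{n\,(s_1 - s_0)\,(2\bar{x} - s_0 - s_1)}{2\sigma^2},
\qquad
\mathbb{E}[\mathrm{LLR}_n] \approx \frac{n\,(s_1 - s_0)^2\,(2t - 1)}{2\sigma^2},
\quad
t = \frac{s - s_0}{s_1 - s_0}.
```

For fixed $t$ the expected LLR gained per game is proportional to $(s_1 - s_0)^2$, so the number of games needed to reach a fixed bound is proportional to $(s_1 - s_0)^{-2}$; for small differences the expected score is close to linear in elo. For fixed data, the first expression is a quadratic in $s_1$ that peaks at $s_1 = \bar{x}$, which is consistent with the LLR in the first post being largest for elo1 = 6, the bound closest to the measured 6.22.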

Uri Blass · Post by **Uri Blass** » Wed Apr 29, 2015 1:22 am

brtzsnr wrote:
lucasart wrote:
brtzsnr wrote:I expected LLR for ELO bounds 0-4 to be higher than LLR for ELO bounds 0-6
Why ?
A 6ELO patch should require fewer games to be outside of 0-4 ELO range than a 4ELO patch. If that's not the case then the stopping rule is not very efficient.

Yes but this is not what you compare.
You did not compare different patches and you used the same patch.

Uri Blass · Post by **Uri Blass** » Wed Apr 29, 2015 1:35 am

lucasart wrote:
brtzsnr wrote:
lucasart wrote:
brtzsnr wrote:I expected LLR for ELO bounds 0-4 to be higher than LLR for ELO bounds 0-6
Why ?
A 6ELO patch should require fewer games to be outside of 0-4 ELO range than a 4ELO patch. If that's not the case then the stopping rule is not very efficient.
What? You hoped that changing the resolution from 6 to 4 elo was free? Of course not! In statistics, when you want to make a measure N times more precise, you need N^2 times more observations. More specifically, if you fix (elo-elo0)/(elo1-elo0), then the expected SPRT stopping time is proportional to (elo1-elo0)^(-2).

I think it depends on the value of the patch, so this formula is not accurate, but I agree that you need more games for a smaller interval.

If the value of the patch is 0 then the formula is accurate, but if the value is not 0 it is not.

Let's take an extreme example.

1)
elo0=0 elo1=400

Obviously, the number of games needed to reach a conclusion is almost the same for a 0 elo patch and a 1 elo patch.

2)
elo0=0 elo1=2

A 1 elo patch has a 50% probability to pass,
a 0 elo patch has a 95% probability to fail (only 5% to pass),
and the difference in the number of games between a 0 elo patch and a 1 elo patch is large.

lucasart · Post by **lucasart** » Thu Apr 30, 2015 1:59 am

Uri Blass wrote:
lucasart wrote:
brtzsnr wrote:
lucasart wrote:
brtzsnr wrote:I expected LLR for ELO bounds 0-4 to be higher than LLR for ELO bounds 0-6
Why ?
A 6ELO patch should require fewer games to be outside of 0-4 ELO range than a 4ELO patch. If that's not the case then the stopping rule is not very efficient.
What? You hoped that changing the resolution from 6 to 4 elo was free? Of course not! In statistics, when you want to make a measure N times more precise, you need N^2 times more observations. More specifically, if you fix (elo-elo0)/(elo1-elo0), then the expected SPRT stopping time is proportional to (elo1-elo0)^(-2).
I think that it is depended on the value of the patch and this formula is not accurate but I agree that you need more games for smaller interval.

If the value of the patch is 0 then this formula is accurate but if the value is not 0 this formula is not accurate.

Let take an extreme example.

1)
elo0=0 elo1=400

It is obvious that the difference in number of games that you need to get a conclusion between 0 elo patch and 1 elo patch is very small.

2)
elo0=0 elo1=2

1 elo patch has 50% probability to pass
0 elo patch has 95% probability to pass
and the difference between 0 elo patch and 1 elo patch in number of games is high.

The formula is accurate. Please read it carefully before commenting.

Uri Blass · Post by **Uri Blass** » Sat May 02, 2015 5:50 am

lucasart wrote:
Uri Blass wrote:
lucasart wrote:
brtzsnr wrote:
lucasart wrote:
brtzsnr wrote:I expected LLR for ELO bounds 0-4 to be higher than LLR for ELO bounds 0-6
Why ?
A 6ELO patch should require fewer games to be outside of 0-4 ELO range than a 4ELO patch. If that's not the case then the stopping rule is not very efficient.
What? You hoped that changing the resolution from 6 to 4 elo was free? Of course not! In statistics, when you want to make a measure N times more precise, you need N^2 times more observations. More specifically, if you fix (elo-elo0)/(elo1-elo0), then the expected SPRT stopping time is proportional to (elo1-elo0)^(-2).
I think that it is depended on the value of the patch and this formula is not accurate but I agree that you need more games for smaller interval.

If the value of the patch is 0 then this formula is accurate but if the value is not 0 this formula is not accurate.

Let take an extreme example.

1)
elo0=0 elo1=400

It is obvious that the difference in number of games that you need to get a conclusion between 0 elo patch and 1 elo patch is very small.

2)
elo0=0 elo1=2

1 elo patch has 50% probability to pass
0 elo patch has 95% probability to pass
and the difference between 0 elo patch and 1 elo patch in number of games is high.
The formula is accurat. Please read it carefully before commenting.

You are right.
The relevant part is "if you fix (elo-elo0)/(elo1-elo0)", and that is not what happened here.

In practice, elo and elo0 stayed constant and only elo1 was changed.
