SPRT LLR question

brtzsnr
Posts: 433
Joined: Fri Jan 16, 2015 4:02 pm

SPRT LLR question

Post by brtzsnr »

Hi,

I've implemented the SPRT stopping rule in my testing framework, based on the cutechess-cli implementation.

I found something strange. For this score:

14573 @ 40/60+0.05
5828 - 5567 - 3178
ELO 6.22±4.99
I get these values for LLR

 Elo0: 0.00 Elo1: 4.00
Alpha: 0.03 Beta: 0.15
LLR: 2.61 [-1.87:+3.34]

 Elo0: 0.00 Elo1: 6.00
Alpha: 0.03 Beta: 0.15
LLR: 2.99 [-1.87:+3.34]

 Elo0: 0.00 Elo1: 8.00
Alpha: 0.03 Beta: 0.15
LLR: 2.75 [-1.87:+3.34]
I expected LLR for ELO bounds 0-4 to be higher than LLR for ELO bounds 0-6. Looking at these values I think I'm misunderstanding the stopping rule. What do the ELO bounds mean?
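
For readers following along, here is a minimal Python sketch of how the summary line and the stopping thresholds quoted above can be recomputed from the raw counts. It is not brtzsnr's framework or cutechess-cli code. It assumes the triple 5828 - 5567 - 3178 is wins - losses - draws (the reading consistent with the quoted Elo), the logistic Elo formula, a delta-method 95% margin, and Wald's approximate thresholds. The function names are made up for illustration.

```python
import math

def elo_from_score(score):
    """Logistic Elo difference implied by an average score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_estimate(wins, losses, draws, z=1.96):
    """Return (elo, margin): point estimate and approximate 95% half-width."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the score (win = 1, draw = 0.5, loss = 0).
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    stderr = math.sqrt(var / n)
    # Slope d(elo)/d(score) of the logistic curve (delta method).
    slope = 400.0 / (math.log(10.0) * score * (1.0 - score))
    return elo_from_score(score), z * stderr * slope

def sprt_bounds(alpha, beta):
    """Wald's approximate (lower, upper) stopping thresholds for the LLR."""
    return math.log(beta / (1.0 - alpha)), math.log((1.0 - beta) / alpha)

elo, margin = elo_estimate(5828, 5567, 3178)
lower, upper = sprt_bounds(0.03, 0.15)
print(f"ELO {elo:.2f}±{margin:.2f}")
print(f"bounds [{lower:.2f}:{upper:+.2f}]")
```

Under those assumptions the printed values should come out close to the ELO line and the [-1.87:+3.34] interval in the post. Note that the thresholds depend only on alpha and beta, not on Elo0 or Elo1, which is why the interval is identical in all three runs.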
kbhearn
Posts: 411
Joined: Thu Dec 30, 2010 4:48 am

Re: SPRT LLR question

Post by kbhearn »

It's measuring the ratio of goodness of fit between two specific hypotheses. Of ELOdiff = 0, 4, 6, or 8, 6 is the best model for your data (your measured result is about 6 Elo), and the comparison model is the same in every case (ELOdiff = 0), so LLR(0, 6) > LLR(0, 4).
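
The goodness-of-fit point can be checked numerically. The sketch below uses the normal-approximation form of the LLR on the logistic Elo scale, swapped in here because it is short; it is not claimed to be what cutechess-cli or brtzsnr's framework computes (cutechess-cli is understood to work with a BayesElo-style model). The function names are hypothetical and the counts are read as wins - losses - draws.

```python
import math

def expected_score(elo):
    """Logistic expected score for an Elo advantage."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))

def llr_normal_approx(wins, losses, draws, elo0, elo1):
    """Normal-approximation log-likelihood ratio of H1 (elo1) against H0 (elo0)."""
    n = wins + losses + draws
    mean = (wins + 0.5 * draws) / n                 # observed average score
    var = (wins + 0.25 * draws) / n - mean ** 2     # observed per-game variance
    s0, s1 = expected_score(elo0), expected_score(elo1)
    # Quadratic in s1: largest when s1 equals the observed mean score.
    return n * (s1 - s0) * (2.0 * mean - s0 - s1) / (2.0 * var)

for elo1 in (4.0, 6.0, 8.0):
    value = llr_normal_approx(5828, 5567, 3178, 0.0, elo1)
    print(f"Elo0: 0.00 Elo1: {elo1:.2f} LLR: {value:.2f}")
```

On the counts from the first post this should land close to the 2.61 / 2.99 / 2.75 quoted there. With Elo0 fixed, the expression is a downward parabola in the H1 score, so the LLR peaks when Elo1 sits near the measured Elo (about 6.2) and falls off on either side.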
lucasart
Posts: 3243
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: SPRT LLR question

Post by lucasart »

brtzsnr wrote:
I expected LLR for ELO bounds 0-4 to be higher than LLR for ELO bounds 0-6
Why?
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
brtzsnr
Posts: 433
Joined: Fri Jan 16, 2015 4:02 pm

Re: SPRT LLR question

Post by brtzsnr »

lucasart wrote:
brtzsnr wrote:
I expected LLR for ELO bounds 0-4 to be higher than LLR for ELO bounds 0-6
Why?
A 6ELO patch should require fewer games to be outside of 0-4 ELO range than a 4ELO patch. If that's not the case then the stopping rule is not very efficient.
lucasart
Posts: 3243
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: SPRT LLR question

Post by lucasart »

brtzsnr wrote:
A 6ELO patch should require fewer games to be outside of 0-4 ELO range than a 4ELO patch. If that's not the case then the stopping rule is not very efficient.
What? You hoped that changing the resolution from 6 to 4 elo was free? Of course not! In statistics, when you want to make a measurement N times more precise, you need N^2 times more observations. More specifically, if you fix (elo-elo0)/(elo1-elo0), then the expected SPRT stopping time is proportional to (elo1-elo0)^(-2).
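
The (elo1-elo0)^(-2) scaling can be illustrated with a rough calculation. The sketch below uses Wald's approximation for the mean test length with a Brownian-motion approximation for the pass probability, ignoring overshoot. The per-game variance of 0.195 is an assumption (roughly what the counts in the first post give), `t` is the relative position of the true score between the H0 and H1 scores (nearly the same as (elo-elo0)/(elo1-elo0) for small Elo values), and the function names are hypothetical.

```python
import math

def expected_score(elo):
    """Logistic expected score for an Elo advantage."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))

def expected_games(elo0, elo1, t, alpha, beta, var_per_game=0.195):
    """Approximate mean SPRT length when the true score sits a fraction t
    of the way from the H0 score to the H1 score."""
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    delta = expected_score(elo1) - expected_score(elo0)
    info = delta ** 2 / var_per_game      # variance of one game's LLR increment
    h = 2.0 * t - 1.0                     # 2 * drift / variance
    if abs(h) < 1e-12:                    # driftless case: true value mid-way
        return -lower * upper / info
    # Probability of reaching the upper (accept H1) threshold first.
    p_pass = (1.0 - math.exp(-h * lower)) / (math.exp(-h * upper) - math.exp(-h * lower))
    drift = info * (t - 0.5)              # mean LLR increment per game
    return (p_pass * upper + (1.0 - p_pass) * lower) / drift

for elo1 in (8.0, 4.0, 2.0):
    n = expected_games(0.0, elo1, t=1.0, alpha=0.03, beta=0.15)
    print(f"[0, {elo1:g}] with the true value at elo1: about {n:.0f} games")
```

With `t` held fixed, the pass probability does not change and only `delta` shrinks, so each halving of the window multiplies the estimate by about four. The absolute game counts are only as good as the assumed variance and the approximations used.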
Uri Blass
Posts: 11188
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: SPRT LLR question

Post by Uri Blass »

brtzsnr wrote:
A 6ELO patch should require fewer games to be outside of 0-4 ELO range than a 4ELO patch. If that's not the case then the stopping rule is not very efficient.
Yes, but that is not what you compared.
You did not compare different patches; you used the same patch with different bounds.
Uri Blass
Posts: 11188
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: SPRT LLR question

Post by Uri Blass »

lucasart wrote:
What? You hoped that changing the resolution from 6 to 4 elo was free? Of course not! In statistics, when you want to make a measurement N times more precise, you need N^2 times more observations. More specifically, if you fix (elo-elo0)/(elo1-elo0), then the expected SPRT stopping time is proportional to (elo1-elo0)^(-2).
I think it depends on the value of the patch, and this formula is not accurate, but I agree that you need more games for a smaller interval.

If the value of the patch is 0 then this formula is accurate, but if the value is not 0 it is not accurate.

Let's take an extreme example.

1)
elo0=0 elo1=400

It is obvious that the difference in the number of games you need to reach a conclusion between a 0 elo patch and a 1 elo patch is very small.

2)
elo0=0 elo1=2

A 1 elo patch has a 50% probability to pass.
A 0 elo patch has a 95% probability to fail (5% to pass, taking alpha = beta = 0.05),
and the difference in the number of games between a 0 elo patch and a 1 elo patch is large.
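
The pass probabilities in these two examples can be estimated with the usual Brownian-motion approximation to the SPRT. The sketch below assumes alpha = beta = 0.05 (not stated in the post), treats the relative position (elo-elo0)/(elo1-elo0) as linear in Elo, and ignores overshoot, which makes it crude for a window as wide as 0 to 400. The function name is hypothetical.

```python
import math

def pass_probability(elo, elo0, elo1, alpha, beta):
    """Approximate probability that the SPRT accepts H1 for a patch of true
    strength `elo`, given bounds [elo0, elo1]."""
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    t = (elo - elo0) / (elo1 - elo0)      # relative position inside the window
    h = 2.0 * t - 1.0                     # 2 * drift / variance of the LLR walk
    if abs(h) < 1e-12:                    # driftless case: true value mid-way
        return -lower / (upper - lower)
    return (1.0 - math.exp(-h * lower)) / (math.exp(-h * upper) - math.exp(-h * lower))

for elo1 in (400.0, 2.0):
    for elo in (0.0, 1.0):
        p = pass_probability(elo, 0.0, elo1, alpha=0.05, beta=0.05)
        print(f"bounds [0, {elo1:g}], true elo {elo:g}: pass probability about {p:.3f}")
```

With bounds [0, 2] the 0 elo patch passes about 5% of the time and the 1 elo patch about 50%, while with bounds [0, 400] the two are almost indistinguishable. That is the contrast the two cases are meant to show.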
lucasart
Posts: 3243
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: SPRT LLR question

Post by lucasart »

Uri Blass wrote:
I think it depends on the value of the patch, and this formula is not accurate, but I agree that you need more games for a smaller interval.

If the value of the patch is 0 then this formula is accurate, but if the value is not 0 it is not accurate.

Let's take an extreme example.

1)
elo0=0 elo1=400

It is obvious that the difference in the number of games you need to reach a conclusion between a 0 elo patch and a 1 elo patch is very small.

2)
elo0=0 elo1=2

A 1 elo patch has a 50% probability to pass.
A 0 elo patch has a 95% probability to fail (5% to pass, taking alpha = beta = 0.05),
and the difference in the number of games between a 0 elo patch and a 1 elo patch is large.
The formula is accurate. Please read it carefully before commenting.
Uri Blass
Posts: 11188
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: SPRT LLR question

Post by Uri Blass »

lucasart wrote:
The formula is accurate. Please read it carefully before commenting.
You are right.
The relevant part is "if you fix (elo-elo0)/(elo1-elo0)", and that is not what happened here.

In practice elo and elo0 stayed constant and only elo1 was changed.