Usage sprt / cutechess-cli
-
- Posts: 36
- Joined: Fri Oct 03, 2008 3:16 pm
Re: Usage sprt / cutechess-cli.
It seems strange that your elo0 param is greater than elo1 in the first setup.
-
- Posts: 879
- Joined: Mon Dec 15, 2008 11:45 am
Re: Usage sprt / cutechess-cli.
Sery wrote: It seems strange that your elo0 param is greater than elo1 in first setup.
I already thought about that in my previous post (just before my last one).
But in the end H0 and H1 are independent criteria, so it should not matter.
Even if it does matter, the second setup should have continued too.
Thanks for the reply, but I still think there is something wrong.
-
- Posts: 4840
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: Usage sprt / cutechess-cli.
Desperado wrote: Setup2:
=====
games: 35000 (max)
Set eng=%eng% -sprt elo0=0 elo1=10 alpha=0.01 beta=0.01
Code: Select all
Score of Omen0003 vs Omen0002: 1477 - 1351 - 2942 [0.511] 5770
ELO difference: 8
SPRT: llr -4.71, lbound -4.6, ubound 4.6 - H0 was accepted
Finished match
Again, the test should continue, because "not stronger by at least (>=) 10" is still possible.
Summary:
=======
Maybe there is something mixed up or incorrect in the description.
But even without understanding the maths I do understand "is at least stronger than" + "not stronger than at least by"
(with respect to the given uncertainties), and further that these are the requirements to stop the test!
So, this is simply wrong.
Come on, please tell me I am missing something essential, and please do not tell me that everybody is just happy about a "randomly" shortened test.
Although I use cutechess-cli, I use sf sprt to verify Elo improvement. If I input those stats into sf sprt, I get this:
enter alpha? 0.01
enter beta? 0.01
enter elo0? 0
enter elo1? 10
enter losses? 1351
enter draws? 2942
enter wins? 1477
elo: 8, err: +/-6, drawelo: 195.6, LOS: 0.99125
llr: 2.81, [-4.60, 4.60]
status: unclear
The status "unclear" indicates that the test should be continued.
There is probably something wrong with the cutechess-cli sprt.
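The [-4.60, 4.60] interval and the "unclear" status above can be reproduced with a few lines. This is only a minimal sketch, assuming both tools use the standard Wald thresholds log(beta / (1 - alpha)) and log((1 - beta) / alpha); the names SprtBounds, sprtBounds and sprtStatus are made up for illustration and are not taken from cutechess-cli or sf sprt.

```cpp
#include <cassert>
#include <cmath>
#include <string>

// Stopping thresholds of a sequential probability ratio test.
// Hypothetical helper type, not part of any existing tool.
struct SprtBounds {
    double lower;  // H0 is accepted once llr <= lower
    double upper;  // H1 is accepted once llr >= upper
};

// Assumption: standard Wald thresholds for error rates alpha and beta.
SprtBounds sprtBounds(double alpha, double beta) {
    assert(alpha > 0.0 && alpha < 1.0);
    assert(beta > 0.0 && beta < 1.0);
    SprtBounds b;
    b.lower = std::log(beta / (1.0 - alpha));
    b.upper = std::log((1.0 - beta) / alpha);
    return b;
}

// Classify a log-likelihood ratio against the thresholds.
std::string sprtStatus(double llr, const SprtBounds& b) {
    if (llr >= b.upper) return "H1 accepted";
    if (llr <= b.lower) return "H0 accepted";
    return "unclear";  // keep playing games
}
```

With alpha = beta = 0.01 the thresholds are roughly -4.595 and 4.595. An llr of 2.81 lies between them, so the test goes on, while the llr of -4.71 printed by cutechess-cli lies below the lower threshold, which is why that run stopped with H0 accepted. The open question in this thread is why the two llr values differ, not how they are compared against the bounds.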
-
- Posts: 4840
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: Usage sprt / cutechess-cli.
Desperado wrote: Setup2:
=====
games: 35000 (max)
Set eng=%eng% -sprt elo0=0 elo1=10 alpha=0.01 beta=0.01
Code: Select all
Score of Omen0003 vs Omen0002: 1477 - 1351 - 2942 [0.511] 5770
ELO difference: 8
SPRT: llr -4.71, lbound -4.6, ubound 4.6 - H0 was accepted
Finished match
Convert your Elo to Bayes Elo before applying the sf sprt.
enter alpha? 0.01
enter beta? 0.01
enter logistic elo0? 0
enter logistic elo1? 10
enter losses? 1351
enter draws? 2942
enter wins? 1477
bayes elo0: 0.0
bayes elo1: 13.5
sf sprt
elo: 8, err: +/-6, drawelo: 195.6, LOS: 0.99125
llr: 2.53, [-4.60, 4.60]
state:
"Not rejected and not accepted"
-
- Posts: 879
- Joined: Mon Dec 15, 2008 11:45 am
Re: Usage sprt / cutechess-cli.
Ferdy wrote: Convert your elo to bayes elo before applying the sf sprt.
enter alpha? 0.01
enter beta? 0.01
enter logistic elo0? 0
enter logistic elo1? 10
enter losses? 1351
enter draws? 2942
enter wins? 1477
bayes elo0: 0.0
bayes elo1: 13.5
sf sprt
elo: 8, err: +/-6, drawelo: 195.6, LOS: 0.99125
llr: 2.53, [-4.60, 4.60]
state:
"Not rejected and not accepted"
Hi Ferdy,
so "not rejected and not accepted" means that the test should have been continued, but it was stopped, which is wrong.
Can you confirm that H0 and H1 are independent tests, and that it is irrelevant how elo0 and elo1 are ordered in the sprt setup?
I conclude this because either H0 or H1 needs to be accepted.
I mean it really should not matter whether elo0 is greater or less than elo1.
According to the description I expect for H1: stop if (P1 >= P2 + elo0)
According to the description I expect for H0: stop if (P1 <= P2 + elo1)
Many thanks for your replies.
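For readers wondering where "drawelo: 195.6" and "bayes elo1: 13.5" in the quoted output come from, here is a rough sketch of the logistic-to-Bayes conversion. It assumes the usual BayesElo win/draw/loss model and a slope correction of 4x / (1 + x)^2 with x = 10^(-drawelo / 400); the function names are invented for illustration and the real sf sprt script may be organised differently.

```cpp
#include <cassert>
#include <cmath>

// Estimate the BayesElo draw parameter from raw game counts.
// Assumption: BayesElo model, where
//   P(win)  = 1 / (1 + 10^((-elo + drawelo) / 400))
//   P(loss) = 1 / (1 + 10^(( elo + drawelo) / 400))
double estimateDrawElo(int wins, int losses, int draws) {
    assert(wins > 0 && losses > 0 && draws >= 0);
    const double n = static_cast<double>(wins) + losses + draws;
    const double w = wins / n;
    const double l = losses / n;
    return 200.0 * std::log10((1.0 - l) / l * (1.0 - w) / w);
}

// Ratio between logistic Elo and Bayes Elo near equal strength.
double eloScale(double drawelo) {
    const double x = std::pow(10.0, -drawelo / 400.0);
    return 4.0 * x / ((1.0 + x) * (1.0 + x));
}

// Convert a logistic Elo difference to a Bayes Elo difference.
double logisticToBayesElo(double elo, double drawelo) {
    return elo / eloScale(drawelo);
}
```

Fed with 1477 wins, 1351 losses and 2942 draws, estimateDrawElo should come out close to the 195.6 shown above, and logisticToBayesElo(10, ...) close to 13.5. A logistic elo0 of 0 stays 0 under this conversion.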
-
- Posts: 4840
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: Usage sprt / cutechess-cli.
Desperado wrote: so, "not rejected and not accepted" means that the test should have been continued, but it was stopped, which is wrong.
Yes, the test should have been continued.
Desperado wrote: Can you confirm that H0 and H1 are independent tests, and that it is irrelevant how elo0 and elo1 are ordered in the sprt setup?
I conclude this because either H0 or H1 needs to be accepted.
I mean it really should not matter whether elo0 is greater or less than elo1.
According to the description I expect for H1: stop if (P1 >= P2 + elo0)
According to the description I expect for H0: stop if (P1 <= P2 + elo1)
Many thanks for your replies.
According to this source,
https://en.wikipedia.org/wiki/Sequentia ... ratio_test
the value of elo1 is greater than the value of elo0.
Code: Select all
The hypotheses are simply H_0: \theta=\theta_0 and H_1: \theta=\theta_1, with \theta_1>\theta_0
Code: Select all
// Log-Likelihood Ratio
status.llr = m_wins * std::log(p1.pWin() / p0.pWin()) +
             m_losses * std::log(p1.pLoss() / p0.pLoss()) +
             m_draws * std::log(p1.pDraw() / p0.pDraw());
But what I am trying to understand is why, in your setup2, the cutechess sprt does not continue while the sf sprt would continue the test.
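To make the snippet above checkable outside cutechess, here is a standalone sketch of the same log-likelihood ratio. It assumes that pWin(), pLoss() and pDraw() follow the BayesElo model; the Wdl struct and the function names are hypothetical stand-ins, not the cutechess classes.

```cpp
#include <cassert>
#include <cmath>

// Win/draw/loss probabilities (hypothetical stand-in for the
// p0/p1 objects in the snippet above).
struct Wdl {
    double win;
    double draw;
    double loss;
};

// Assumption: BayesElo model with a draw parameter drawelo.
Wdl bayesEloToWdl(double bayesElo, double drawelo) {
    Wdl p;
    p.win  = 1.0 / (1.0 + std::pow(10.0, (-bayesElo + drawelo) / 400.0));
    p.loss = 1.0 / (1.0 + std::pow(10.0, ( bayesElo + drawelo) / 400.0));
    p.draw = 1.0 - p.win - p.loss;
    return p;
}

// Log-likelihood ratio of H1 (bayesElo1) against H0 (bayesElo0).
double logLikelihoodRatio(int wins, int losses, int draws,
                          double bayesElo0, double bayesElo1,
                          double drawelo) {
    assert(drawelo > 0.0);  // keeps the draw probability positive
    const Wdl p0 = bayesEloToWdl(bayesElo0, drawelo);
    const Wdl p1 = bayesEloToWdl(bayesElo1, drawelo);
    return wins   * std::log(p1.win  / p0.win) +
           losses * std::log(p1.loss / p0.loss) +
           draws  * std::log(p1.draw / p0.draw);
}
```

Calling logLikelihoodRatio(1477, 1351, 2942, 0.0, 13.52, 195.61) should land near the 2.5 reported in this thread, well inside [-4.60, 4.60]. It also answers the ordering question: swapping elo0 and elo1 swaps p0 and p1 in every logarithm, so the llr changes sign and the meaning of the two bounds is exchanged. The order of elo0 and elo1 therefore does matter.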
-
- Posts: 4840
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: Usage sprt / cutechess-cli.
Desperado wrote: Setup2:
=====
games: 35000 (max)
Set eng=%eng% -sprt elo0=0 elo1=10 alpha=0.01 beta=0.01
Code: Select all
Score of Omen0003 vs Omen0002: 1477 - 1351 - 2942 [0.511] 5770
ELO difference: 8
SPRT: llr -4.71, lbound -4.6, ubound 4.6 - H0 was accepted
Finished match
I took the cutechess 0.7.1 sprt code, fed it the values above, and I get:
Code: Select all
llr: 2.52 [-4.60, 4.60]
And that is similar to what I get from sf sprt.
Did the data in setup2 and the sprt results come from cutechess-cli 0.7.1?
-
- Posts: 879
- Joined: Mon Dec 15, 2008 11:45 am
Re: Usage sprt / cutechess-cli.
Ferdy wrote: I took cutechess 0.7.1 sprt code and tried to input those values above and I get.
Code: Select all
llr: 2.52 [-4.60, 4.60]
And that is similar to what I get from sf sprt.
Did that data in setup2 and sprt results coming from cutechess-cli 0.7.1?
I did use cutechess-cli 0.7.1.
-
- Posts: 750
- Joined: Mon Mar 27, 2006 7:45 pm
- Location: Finland
Re: Usage sprt / cutechess-cli.
Desperado wrote: I did use cutechess-cli 0.7.1.
I can confirm Ferdy's result: cutechess-cli's SPRT does give an llr of 2.52509 when using the parameters and results that you got.
BUT: that doesn't mean that there's not a bug somewhere else in cutechess-cli. I'll try to debug it...
-
- Posts: 750
- Joined: Mon Mar 27, 2006 7:45 pm
- Location: Finland
Re: Usage sprt / cutechess-cli.
I couldn't reproduce the issue after running some tests with both the Windows and Linux versions.
Michael: Could you please use the "-ratinginterval 10" parameter in your next run so I could see how the llr value progresses throughout the run? And please try cutechess-cli 0.7.2 just in case: https://github.com/cutechess/cutechess