Usage sprt / cutechess-cli

Sery
Posts: 31
Joined: Fri Oct 03, 2008 1:16 pm

Re: Usage sprt / cutechess-cli.

Post by Sery » Fri Sep 04, 2015 6:01 pm

It seems strange that your elo0 parameter is greater than elo1 in the first setup.

Desperado
Posts: 638
Joined: Mon Dec 15, 2008 10:45 am

Re: Usage sprt / cutechess-cli.

Post by Desperado » Fri Sep 04, 2015 6:43 pm

Sery wrote: It seems strange that your elo0 parameter is greater than elo1 in the first setup.
I already thought about that in my previous post (the one just before my last).
But in the end H0 and H1 are independent criteria, so it should not matter.
Even if it does matter, the second setup should have continued too.

Thanks for the reply, but I still think something is wrong.

Ferdy
Posts: 4113
Joined: Sun Aug 10, 2008 1:15 pm
Location: Philippines

Re: Usage sprt / cutechess-cli.

Post by Ferdy » Fri Sep 04, 2015 8:38 pm

Desperado wrote: Setup2:
=====

games: 35000 (max)
Set eng=%eng% -sprt elo0=0 elo1=10 alpha=0.01 beta=0.01

Code:

Score of Omen0003 vs Omen0002: 1477 - 1351 - 2942  [0.511] 5770
ELO difference: 8
SPRT: llr -4.71, lbound -4.6, ubound 4.6 - H0 was accepted
Finished match
Again, the test should continue, because a difference of 10 Elo or more (>= 10) is still possible.

Summary:
=======

Maybe something is mixed up or incorrect in the description.
But even without understanding the maths, I do understand "is stronger than ... by at least" and "is not stronger than ... by at least"
(with respect to the given uncertainties), and that these are the conditions for stopping the test!

So this is simply wrong. :!:

Come on, please tell me I am missing something essential, and please do not tell me that everybody is just happy with a "randomly" shortened test. :shock:
Set eng=%eng% -sprt elo0=0 elo1=10 alpha=0.01 beta=0.01
Although I use cutechess-cli, I use sf sprt to verify Elo improvements. If I enter those stats into sf sprt, I get this:
enter alpha? 0.01
enter beta? 0.01

enter elo0? 0
enter elo1? 10

enter losses? 1351
enter draws? 2942
enter wins? 1477

elo: 8, err: +/-6, drawelo: 195.6, LOS: 0.99125
llr: 2.81, [-4.60, 4.60]
status: unclear
The status unclear indicates that the test should be continued.
There is probably something wrong with cutechess-cli sprt.
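
As an aside, the stopping rule behind the "status" line can be sketched as follows. This is a minimal sketch assuming Wald's standard SPRT bounds, which agree with the [-4.60, 4.60] interval shown above for alpha = beta = 0.01. The names `SprtBounds`, `sprtBounds` and `sprtDecision` are hypothetical and are not taken from cutechess-cli or sf sprt.

```cpp
#include <cmath>
#include <string>

// Decision thresholds on the log-likelihood ratio (LLR).
struct SprtBounds {
    double lower;  // at or below this, H0 is accepted
    double upper;  // at or above this, H1 is accepted
};

// Wald's approximate bounds for the given error levels.
// alpha: tolerated probability of accepting H1 when H0 is true
// beta:  tolerated probability of accepting H0 when H1 is true
SprtBounds sprtBounds(double alpha, double beta)
{
    return { std::log(beta / (1.0 - alpha)),
             std::log((1.0 - beta) / alpha) };
}

// Hypothetical helper; the returned labels are illustrative only.
std::string sprtDecision(double llr, const SprtBounds& bounds)
{
    if (llr > bounds.upper)
        return "H1 accepted";
    if (llr < bounds.lower)
        return "H0 accepted";
    return "continue";
}
```

With alpha = beta = 0.01 the bounds come out at about -4.60 and 4.60, so an llr of 2.81 lies between them and the decision is to continue, whereas the -4.71 reported by cutechess-cli lies below the lower bound.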

Ferdy
Posts: 4113
Joined: Sun Aug 10, 2008 1:15 pm
Location: Philippines

Re: Usage sprt / cutechess-cli.

Post by Ferdy » Sat Sep 05, 2015 6:21 am

Desperado wrote: Setup2:
=====

games: 35000 (max)
Set eng=%eng% -sprt elo0=0 elo1=10 alpha=0.01 beta=0.01

Code:

Score of Omen0003 vs Omen0002: 1477 - 1351 - 2942  [0.511] 5770
ELO difference: 8
SPRT: llr -4.71, lbound -4.6, ubound 4.6 - H0 was accepted
Finished match
Convert your elo to bayes elo before applying the sf sprt.
enter alpha? 0.01
enter beta? 0.01

enter logistic elo0? 0
enter logistic elo1? 10

enter losses? 1351
enter draws? 2942
enter wins? 1477

bayes elo0: 0.0
bayes elo1: 13.5

sf sprt
elo: 8, err: +/-6, drawelo: 195.6, LOS: 0.99125
llr: 2.53, [-4.60, 4.60]
state: "Not rejected and not accepted"

Desperado
Posts: 638
Joined: Mon Dec 15, 2008 10:45 am

Re: Usage sprt / cutechess-cli.

Post by Desperado » Sat Sep 05, 2015 8:02 am

Ferdy wrote:
Desperado wrote: Setup2:
=====

games: 35000 (max)
Set eng=%eng% -sprt elo0=0 elo1=10 alpha=0.01 beta=0.01

Code:

Score of Omen0003 vs Omen0002: 1477 - 1351 - 2942  [0.511] 5770
ELO difference: 8
SPRT: llr -4.71, lbound -4.6, ubound 4.6 - H0 was accepted
Finished match
Convert your elo to bayes elo before applying the sf sprt.
enter alpha? 0.01
enter beta? 0.01

enter logistic elo0? 0
enter logistic elo1? 10

enter losses? 1351
enter draws? 2942
enter wins? 1477

bayes elo0: 0.0
bayes elo1: 13.5

sf sprt
elo: 8, err: +/-6, drawelo: 195.6, LOS: 0.99125
llr: 2.53, [-4.60, 4.60]
state: "Not rejected and not accepted"
Hi Ferdy,

So "not rejected and not accepted" means that the test should have continued, but it was stopped, which is wrong.

Can you confirm that H0 and H1 are independent tests, and that the order of elo0 and elo1 in the SPRT setup is irrelevant?
I conclude this because either H0 or H1 needs to be accepted.
I mean, it really should not matter whether elo0 is greater than (">") or less than ("<") elo1.

According to the description, I expect for H1: stop if (P1 >= P2 + elo0)
According to the description, I expect for H0: stop if (P1 <= P2 + elo1)

Many thanks for your replies.

Ferdy
Posts: 4113
Joined: Sun Aug 10, 2008 1:15 pm
Location: Philippines

Re: Usage sprt / cutechess-cli.

Post by Ferdy » Sat Sep 05, 2015 10:24 am

Desperado wrote:
Ferdy wrote:
Desperado wrote: Setup2:
=====

games: 35000 (max)
Set eng=%eng% -sprt elo0=0 elo1=10 alpha=0.01 beta=0.01

Code:

Score of Omen0003 vs Omen0002: 1477 - 1351 - 2942  [0.511] 5770
ELO difference: 8
SPRT: llr -4.71, lbound -4.6, ubound 4.6 - H0 was accepted
Finished match
Convert your elo to bayes elo before applying the sf sprt.
enter alpha? 0.01
enter beta? 0.01

enter logistic elo0? 0
enter logistic elo1? 10

enter losses? 1351
enter draws? 2942
enter wins? 1477

bayes elo0: 0.0
bayes elo1: 13.5

sf sprt
elo: 8, err: +/-6, drawelo: 195.6, LOS: 0.99125
llr: 2.53, [-4.60, 4.60]
state: "Not rejected and not accepted"
Hi Ferdy,

So "not rejected and not accepted" means that the test should have continued, but it was stopped, which is wrong.
Yes, the test should have continued.
Desperado wrote: Can you confirm that H0 and H1 are independent tests, and that the order of elo0 and elo1 in the SPRT setup is irrelevant?
I conclude this because either H0 or H1 needs to be accepted.
I mean, it really should not matter whether elo0 is greater than (">") or less than ("<") elo1.

According to the description, I expect for H1: stop if (P1 >= P2 + elo0)
According to the description, I expect for H0: stop if (P1 <= P2 + elo1)

Many thanks for your replies.
According to this source,
https://en.wikipedia.org/wiki/Sequentia ... ratio_test
the value of elo1 should be greater than the value of elo0.

Code:

The hypotheses are simply H_0: \theta=\theta_0 and H_1: \theta=\theta_1, with \theta_1>\theta_0
In cutechess, LLR is calculated like this,

Code:

// Log-likelihood ratio
	status.llr = m_wins * std::log(p1.pWin() / p0.pWin()) +
		     m_losses * std::log(p1.pLoss() / p0.pLoss()) +
		     m_draws * std::log(p1.pDraw() / p0.pDraw());
This is similar to sf sprt: both take the ratio P1/P0. The stopping rule depends on this llr value.
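
To make the calculation concrete, here is a minimal, self-contained sketch that goes from the raw win/loss/draw counts and the logistic elo0/elo1 values to an llr. It assumes the BayesElo draw model, with drawelo estimated from the sample and elo0/elo1 rescaled to bayes elo, which is consistent with the "drawelo: 195.6" and "bayes elo1: 13.5" figures quoted in this thread. The names `Wdl`, `wdlFromBayesElo` and `sprtLlr` are hypothetical; the code is a reconstruction under those assumptions, not a copy of the cutechess-cli or sf sprt source.

```cpp
#include <cmath>

// Win/draw/loss probabilities under the BayesElo model (assumed here).
struct Wdl {
    double win, draw, loss;
};

Wdl wdlFromBayesElo(double bayesElo, double drawElo)
{
    Wdl p;
    p.win  = 1.0 / (1.0 + std::pow(10.0, (drawElo - bayesElo) / 400.0));
    p.loss = 1.0 / (1.0 + std::pow(10.0, (drawElo + bayesElo) / 400.0));
    p.draw = 1.0 - p.win - p.loss;
    return p;
}

// LLR of H1 (elo1) against H0 (elo0), both given as logistic Elo.
// Assumes wins, losses and draws are all greater than zero.
double sprtLlr(int wins, int losses, int draws, double elo0, double elo1)
{
    const double n = static_cast<double>(wins) + losses + draws;
    const double pWin = wins / n;
    const double pLoss = losses / n;

    // Estimate drawelo from the observed results.
    const double drawElo = 200.0 * std::log10((1.0 - pLoss) / pLoss *
                                              (1.0 - pWin) / pWin);

    // Scale factor between logistic Elo and bayes elo for this drawelo.
    const double x = std::pow(10.0, -drawElo / 400.0);
    const double scale = 4.0 * x / ((1.0 + x) * (1.0 + x));

    // Probability laws under H0 and H1.
    const Wdl p0 = wdlFromBayesElo(elo0 / scale, drawElo);
    const Wdl p1 = wdlFromBayesElo(elo1 / scale, drawElo);

    return wins * std::log(p1.win / p0.win) +
           losses * std::log(p1.loss / p0.loss) +
           draws * std::log(p1.draw / p0.draw);
}
```

Calling `sprtLlr(1477, 1351, 2942, 0.0, 10.0)` should give roughly 2.5, in line with the sf sprt value of 2.53. Because the result is a sum of logs of P1/P0, swapping elo0 and elo1 flips its sign, so in this sketch the order of the two parameters is not irrelevant.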

But what I am trying to understand is why, in your setup2, cutechess sprt stops the test while sf sprt would continue it.

Ferdy
Posts: 4113
Joined: Sun Aug 10, 2008 1:15 pm
Location: Philippines

Re: Usage sprt / cutechess-cli.

Post by Ferdy » Sat Sep 05, 2015 2:04 pm

Desperado wrote: Setup2:
=====

games: 35000 (max)
Set eng=%eng% -sprt elo0=0 elo1=10 alpha=0.01 beta=0.01

Code:

Score of Omen0003 vs Omen0002: 1477 - 1351 - 2942  [0.511] 5770
ELO difference: 8
SPRT: llr -4.71, lbound -4.6, ubound 4.6 - H0 was accepted
Finished match
I took the cutechess 0.7.1 sprt code, entered the values above, and got:

Code:

llr: 2.52 [-4.60, 4.60]
And that is similar to what I get from sf sprt.

Did the data in setup2 and the SPRT results come from cutechess-cli 0.7.1?

Desperado
Posts: 638
Joined: Mon Dec 15, 2008 10:45 am

Re: Usage sprt / cutechess-cli.

Post by Desperado » Sat Sep 05, 2015 3:32 pm

Ferdy wrote:
Desperado wrote: Setup2:
=====

games: 35000 (max)
Set eng=%eng% -sprt elo0=0 elo1=10 alpha=0.01 beta=0.01

Code:

Score of Omen0003 vs Omen0002: 1477 - 1351 - 2942  [0.511] 5770
ELO difference: 8
SPRT: llr -4.71, lbound -4.6, ubound 4.6 - H0 was accepted
Finished match
I took the cutechess 0.7.1 sprt code, entered the values above, and got:

Code:

llr: 2.52 [-4.60, 4.60]
And that is similar to what I get from sf sprt.

Did the data in setup2 and the SPRT results come from cutechess-cli 0.7.1?
I did use cutechess-cli 0.7.1.

ilari
Posts: 750
Joined: Mon Mar 27, 2006 5:45 pm
Location: Finland

Re: Usage sprt / cutechess-cli.

Post by ilari » Sat Sep 05, 2015 5:20 pm

Desperado wrote:
Ferdy wrote:
Desperado wrote: Setup2:
=====

games: 35000 (max)
Set eng=%eng% -sprt elo0=0 elo1=10 alpha=0.01 beta=0.01

Code:

Score of Omen0003 vs Omen0002: 1477 - 1351 - 2942  [0.511] 5770
ELO difference: 8
SPRT: llr -4.71, lbound -4.6, ubound 4.6 - H0 was accepted
Finished match
I took the cutechess 0.7.1 sprt code, entered the values above, and got:

Code:

llr: 2.52 [-4.60, 4.60]
And that is similar to what I get from sf sprt.

Did the data in setup2 and the SPRT results come from cutechess-cli 0.7.1?
I did use cutechess-cli 0.7.1.
I can confirm Ferdy's result - cutechess-cli's SPRT does give an llr of 2.52509 when using the parameters and results that you got.
BUT: that doesn't mean that there's not a bug somewhere else in cutechess-cli. I'll try to debug it...

ilari
Posts: 750
Joined: Mon Mar 27, 2006 5:45 pm
Location: Finland

Re: Usage sprt / cutechess-cli.

Post by ilari » Sat Sep 05, 2015 6:20 pm

I couldn't reproduce the issue after running some tests with both the Windows and Linux versions.

Michael: Could you please use the "-ratinginterval 10" parameter in your next run so I could see how the llr value progresses throughout the run? And please try cutechess-cli 0.7.2 just in case: https://github.com/cutechess/cutechess
