At the Halfway Mark- Houdini & Rainbow UNLtd.

Discussion of computer chess matches and engine tournaments.

Moderator: Ras

geots
Posts: 4790
Joined: Sat Mar 11, 2006 12:42 am

At the Halfway Mark- Houdini & Rainbow UNLtd.

Post by geots »

Houdini 2.0c x64 vs Rainbow UNLimited


At a time when he had to slow Houdini down- or else, he faces "or else!" As far as I can tell, if you slow Houdini down, it is extremely temporary. When you have to face Houdini- I see a lot more questions than answers, and awfully tough now after Houdini stretched the lead out to 20 games. We shall see.


Intel i5 w/4TCs
Fritz 13 gui
1CPU/64bit
128MB hash
Bases=NONE
Ponder_Learning=OFF
Perfect 12.32 book w/12-move limit

5'+5"
Match=500 games


[thru game 250]



Houdini 2.0c x64     +28    +77/-57/=116   54.00%   135.0/250
Rainbow UNLimited    -28    +57/-77/=116   46.00%   115.0/250


Well, I'm out of answers- if I ever had any- and really can't remember the questions. I suppose maybe if one of the top programs were to run against Houdini- giving them both 12 or 16 cores, and set a control of 40 moves in the first year and a half- then the rest within a limit of a couple weeks- and the planets were in the right alignment, it might get lucky. Who knows. Maybe wouldn't hurt to bring a few voodoo dolls just in case.



Tomorrow-

george
Ajedrecista
Posts: 2126
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Re: At the Halfway Mark - Houdini & Rainbow UNLtd.

Post by Ajedrecista »

Hello:
Although the match is only at its halfway point, it seems that Rainbow will again fail to beat Houdini. Here are my results from Rainbow's POV:


LOS_and_Elo_uncertainties_calculator, ® 2012.

----------------------------------------------------------------
Calculation of Elo uncertainties in a match between two engines:
----------------------------------------------------------------

(The input and output data is referred to the first engine).

Please write down non-negative integers.

Write down the number of wins (up to 1825361100):

57

Write down the number of loses (up to 1825361100):

77

Write down the number of draws (up to 2147483646):

116

 Write down the confidence level (in percentage) between 65% and 99.9% (it will be rounded up to 0.01%):

95

Write down the clock rate of the CPU (in GHz), only for timing the elapsed time of the calculations:

3

---------------------------------------
Elo interval for 95.00 % confidence:

Elo rating difference:    -27.85 Elo

Lower rating difference:  -59.72 Elo
Upper rating difference:    3.55 Elo

Lower bound uncertainty:  -31.86 Elo
Upper bound uncertainty:   31.40 Elo
Average error:        +/-  31.63 Elo

K = (average error)*[sqrt(n)] =  500.16

Elo interval: ] -59.72,    3.55[
---------------------------------------

Number of games of the match:       250
Score: 46.00 %
Elo rating difference:    -27.85 Elo
Draw ratio: 46.40 %

*********************************************************
Standard deviation:  4.5105 % of the points of the match.
*********************************************************

 Error bars were calculated with two-sided tests; values are rounded up to 0.01 Elo, or 0.01 in the case of K.

-------------------------------------------------------------------
Calculation of likelihood of superiority (LOS) in a one-sided test:
-------------------------------------------------------------------

LOS (taking into account draws) is always calculated, if possible.

LOS (not taking into account draws) is only calculated if wins + loses < 16001.

LOS (average value) is calculated only when LOS (not taking into account draws) is calculated.
______________________________________________

LOS:   4.11 % (taking into account draws).
LOS:   4.24 % (not taking into account draws).
LOS:   4.17 % (average value).
______________________________________________

These values of LOS are rounded up to 0.01%

End of the calculations. Approximated elapsed time:   55 ms.

Thanks for using LOS_and_Elo_uncertainties_calculator. Press Enter to exit.
More or less -28 ± 32 Elo after 250 games, with 95% confidence. LOS is below 4.3%, so in theory Rainbow's chance of being the stronger engine is less than 1 in 23, given the data of this match. It looks like Rainbow is somewhat stuck, although at this level that is completely normal. I'll stay tuned.
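For anyone who wants to check figures like these by hand, here is a minimal Python sketch. It assumes the usual logistic Elo formula and a normal approximation for the match score; it is not the calculator's actual source, and the names elo_from_score and match_stats are made up for illustration. Under those assumptions it lands very close to the Elo interval and the draw-aware LOS printed above.

```python
from math import log10, sqrt
from statistics import NormalDist


def elo_from_score(score):
    """Convert a score fraction (0 < score < 1) to an Elo difference
    using the standard logistic Elo model."""
    return -400.0 * log10(1.0 / score - 1.0)


def match_stats(wins, losses, draws, confidence=0.95):
    """Elo difference, two-sided interval and LOS for the first engine.

    Assumption: the mean score per game is treated as normally
    distributed, with the variance estimated from the W/L/D counts.
    """
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n

    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (wins * (1.0 - score) ** 2
                + losses * (0.0 - score) ** 2
                + draws * (0.5 - score) ** 2) / n
    sigma = sqrt(variance / n)  # standard error of the mean score

    # Two-sided z value for the requested confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)

    return {
        "score": score,
        "elo": elo_from_score(score),
        "elo_lower": elo_from_score(score - z * sigma),
        "elo_upper": elo_from_score(score + z * sigma),
        # One-sided probability that the true score is above 50%.
        "los": NormalDist().cdf((score - 0.5) / sigma),
    }


# Rainbow's point of view after 250 games: +57 -77 =116
stats = match_stats(57, 77, 116)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

The LOS "not taking into account draws" in the calculator output is not reproduced here; how that value is computed is not stated in the output.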

Regards from Spain.

Ajedrecista.
geots
Posts: 4790
Joined: Sat Mar 11, 2006 12:42 am

Re: At the Halfway Mark - Houdini & Rainbow UNLtd.

Post by geots »




Another update coming shortly. Maybe it will have a few surprises.

george