Houdini vs Rainbow Limited- AN EPIC STRUGGLE!

geots · Post by **geots** » Tue Jul 03, 2012 10:32 am

Houdini 2.0c x64 vs Rainbow Limited- beta 2

There is no way I would stop this match on purpose, and since it is being run in the Fritz 13 GUI, that means I have no access to the database of games at this time. This control of 5'+5" obviously goes quite a bit faster, and a while after posting the update last night with Houdini having a 2-game lead, I checked the match again. Houdini had gone on a mini-run of sorts and upped the lead from 2 games to 10 games! And almost 24 hours later, it still remains at 10 games. As in the other match, Houdini takes a lead and it seems to hold there. I would love to see the PGNs of the games where Houdini went from a 2-game lead to a 10-game lead, but that is not possible at this time. At least not that I know of.

Intel i5 w/4TCs
Fritz 13 gui
1CPU/64bit
128MB hash
Bases=NONE
Ponder_Learning=OFF
Perfect 12.32 book w/12-move limit
5'+5"
Match=500 games

[thru game 175]

Houdini 2.0c x64           +20    +51/-41/=83   52.86%   92.5/175                                 
Rainbow Limited- beta 2    -20    +41/-51/=83   47.14%   82.5/175

When I run matches, I don't have favorites. All I want to see is an exciting match, which means the closer the better. So naturally I am pulling for Rainbow to even this thing up. On the one hand, I think he is quite capable of it. But OTOH, Houdini is just so damn strong that it is really difficult to deal with him in OTB play. But we have 325 games remaining, and a lot of things can happen in that many games.

Again- we shall see,

george

Ajedrecista · Post by **Ajedrecista** » Tue Jul 03, 2012 12:35 pm

Hi George!

Very interesting match, just like the other one that you are running in parallel. Congratulations.

I have been holding back my LOS and error-bar results until you had a decent number of games. Of course I could wait until the end of the match, but I cannot wait any longer!

LOS_and_Elo_uncertainties_calculator, ® 2012.

----------------------------------------------------------------
Calculation of Elo uncertainties in a match between two engines:
----------------------------------------------------------------

(The input and output data is referred to the first engine).

Please write down non-negative integers.

Write down the number of wins:

51

Write down the number of loses:

41

Write down the number of draws:

83

Write down the clock rate of the CPU (in GHz), only for timing the elapsed time of the calculations:

3

(Only 1, 2 and 3-sigma confidence error bars are calculated, if possible).

***************************************
1-sigma confidence ~ 68.27% confidence.
2-sigma confidence ~ 95.45% confidence.
3-sigma confidence ~ 99.73% confidence.
***************************************

---------------------------------------

Elo interval for 1-sigma confidence:

Elo rating difference:     19.88 Elo

Lower rating difference:    0.87 Elo
Upper rating difference:   39.00 Elo

Lower bound uncertainty:  -19.01 Elo
Upper bound uncertainty:   19.12 Elo
Average error:        +/-  19.07 Elo

K = (average error)*[sqrt(n)] =  252.21

Elo interval: ]   0.87,   39.00[
---------------------------------------

Elo interval for 2-sigma confidence:

Elo rating difference:     19.88 Elo

Lower rating difference:  -18.13 Elo
Upper rating difference:   58.36 Elo

Lower bound uncertainty:  -38.01 Elo
Upper bound uncertainty:   38.49 Elo
Average error:        +/-  38.25 Elo

K = (average error)*[sqrt(n)] =  505.96

Elo interval: ] -18.13,   58.36[
---------------------------------------

Elo interval for 3-sigma confidence:

Elo rating difference:     19.88 Elo

Lower rating difference:  -37.24 Elo
Upper rating difference:   78.09 Elo

Lower bound uncertainty:  -57.11 Elo
Upper bound uncertainty:   58.22 Elo
Average error:        +/-  57.67 Elo

K = (average error)*[sqrt(n)] =  762.85

Elo interval: ] -37.24,   78.09[
---------------------------------------

Number of games of the match:                175
Score: 52.86 %
Elo rating difference:   19.88 Elo
Draw ratio: 47.43 %

**********************************************
1 sigma:  2.7320 % of the points of the match.
2 sigma:  5.4639 % of the points of the match.
3 sigma:  8.1959 % of the points of the match.
**********************************************

Error bars were calculated with two-sided tests; values are rounded up to 0.01 Elo, or 0.01 in the case of K.

-------------------------------------------------------------------
Calculation of likelihood of superiority (LOS) in a one-sided test:
-------------------------------------------------------------------

LOS:  85.22 %

This value of LOS is rounded up to 0.01%

End of the calculations. Approximated elapsed time:  46 ms.

Thanks for using LOS_and_Elo_uncertainties_calculator. Press Enter to exit.

I slightly improved this programme (mainly cosmetic changes in extreme cases) with some 'go to' commands to skip parts of the code when necessary, especially with a very low number of games, where LOS was not calculated... although with so few games my result must not be taken seriously. With these 175 games, my LOS value is more or less correct (I guess an absolute error of 0.2% at most). So, around -20 ± 38 Elo for Rainbow with 2-sigma confidence: Rainbow is in the fight! Also, a LOS value of ~85% is not conclusive.
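The figures in the output above can be reproduced, to the printed precision, from the win/loss/draw counts alone. What follows is a minimal Python sketch, not the programme's actual source (which is not shown in this thread). It assumes a normal approximation to the mean score, with the per-game variance estimated from the observed results; the names `elo_from_score` and `match_stats` are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def elo_from_score(score):
    """Convert an expected score (0 < score < 1) to an Elo difference."""
    return 400.0 * math.log10(score / (1.0 - score))


def match_stats(wins, losses, draws, sigmas=2.0):
    """Return score, 1-sigma of the score, Elo estimate with bounds, and LOS.

    Assumption: the mean score is treated as normally distributed, with the
    per-game variance estimated from the observed results (win = 1 point,
    draw = 0.5, loss = 0).
    """
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    variance = (wins * (1.0 - score) ** 2
                + losses * (0.0 - score) ** 2
                + draws * (0.5 - score) ** 2) / n
    sigma = math.sqrt(variance / n)  # standard error of the mean score
    return {
        "score": score,
        "sigma": sigma,
        "elo": elo_from_score(score),
        "elo_low": elo_from_score(score - sigmas * sigma),
        "elo_high": elo_from_score(score + sigmas * sigma),
        # One-sided probability that the true score exceeds 50%.
        "los": NormalDist().cdf((score - 0.5) / sigma),
    }


stats = match_stats(wins=51, losses=41, draws=83)
for name, value in stats.items():
    print(f"{name:8s} {value:8.4f}")
```

With 51 wins, 41 losses and 83 draws, this sketch agrees with the 2-sigma figures printed above (about 19.88 Elo, an interval of roughly -18.13 to 58.36 Elo, and a LOS near 85.2%). It does not guard against degenerate inputs such as a match with no decisive variance or a bound that reaches 0% or 100%.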

Regarding my other programme, I have made huge improvements over the past days, mainly by optimizing internal loops with the initial conditions... now, for the same internal accuracy (or even a little more, though in no case is it noticeable in the output), it takes around 25 times less time! Here is an example with this match:

Minimum_score_for_no_regression, ® 2012.

Calculation of the minimum score for no regression (i.e. negative Elo gain) in a match between two engines:

Write down the number of games of the match (it must be a positive integer, up to 1073741823):

175

Write down the draw ratio (in percentage):

47.4285714285714286

Write down the confidence level (in percentage) between 75% and 99.9%:

95

Write down the clock rate of the CPU (in GHz), only for timing the elapsed time of the calculations:

3

Theoretical minimum score for no regression: 54.4732 %
Theoretical standard deviation in this case:  4.4732 %

Minimum number of won points for the engine in this match:        95.5 points.

Minimum Elo advantage, which is also the negative part of the error bar:
 31.8545 Elo

End of the calculations. Approximated elapsed time:  19 ms.

Thanks for using Minimum_score_for_no_regression. Press Enter to exit.

Typical times are around 20 ms on my computer, whereas they were around 515 ms in the past! The development of these two programmes should be finished now. Houdini would need 95.5 points out of 175 to ensure a LOS value of 95% or slightly more... it is a little far, but only a little! There is still a fight.
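The threshold in the output above is consistent with a simple self-consistency condition: find the smallest score s such that s minus z standard errors is still at least 50%, with the draw ratio d held fixed. The Python sketch below is only one reading that happens to match the printed numbers; the closed form and the name `min_score_no_regression` are assumptions made for illustration, not the programme's code. It uses a one-sided normal quantile z and a per-game variance of s - d/4 - s², which with x = s - 0.5 gives x² = (z²/n)·(1 - d)/4 / (1 + z²/n).

```python
import math
from statistics import NormalDist


def min_score_no_regression(games, draw_ratio, confidence=0.95):
    """Smallest score whose one-sided lower bound is still 50%.

    Assumptions: normal approximation, fixed draw ratio, and a per-game
    variance evaluated at the threshold score itself.
    """
    z = NormalDist().inv_cdf(confidence)  # one-sided quantile, ~1.645 at 95%
    k = z * z / games
    margin = math.sqrt(k * (1.0 - draw_ratio) / 4.0 / (1.0 + k))
    min_score = 0.5 + margin
    # Scores come in half-point steps, so round the point total up.
    min_points = math.ceil(2.0 * games * min_score) / 2.0
    p = min_points / games
    min_elo = 400.0 * math.log10(p / (1.0 - p))
    return min_score, margin, min_points, min_elo


result = min_score_no_regression(games=175, draw_ratio=83 / 175)
print(result)
```

For 175 games and a draw ratio of 83/175, this gives a minimum score of about 54.47%, rounded up to 95.5 points, which corresponds to roughly 31.85 Elo. Under this reading, the 4.4732 % that the output labels as a standard deviation is the full margin z·σ rather than a single σ.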

Regarding Rainbow, is it a new branch of 40x engine? Please keep up the good work.

Regards from Spain.

Ajedrecista.

geots · Post by **geots** » Tue Jul 03, 2012 4:45 pm

Your figures are way over my head, but that doesn't matter. What I love is looking at your results. I don't have to know anything about the methods you use to enjoy them, and I look forward to them in every match like this I run. I have been hoping you would respond soon. As to your question about Rainbow possibly being a branch of this or that: when I say I think you know who the author is, I'm fairly sure I am answering your question.

Thanks Jesus,

george

carldaman · Post by **carldaman** » Tue Jul 03, 2012 7:50 pm

Hi George,

Will you make the games available for download after the match is over?

Thanks,
Carl

geots · Post by **geots** » Wed Jul 04, 2012 4:14 am

Hi Carl. This is strictly up to the programmer, not me, and I would say it is not likely. But there is precedent. If I am wrong, I feel quite sure Ingo will correct me, but in his IPON testing I don't think he EVER makes the PGNs available to the public. That is his choice, and I don't have a problem with it. As for me, I always make them available unless the author requests that I don't.

Best,

george

Robert Flesher · Post by **Robert Flesher** » Wed Jul 04, 2012 10:41 pm

Hello George, this is a very interesting match. In your opinion, why would the author of a chess engine not want the PGNs of the games shown? This seems absurd to me and raises the question: what does the author have to hide?
