SAT. UPDATE- 40x(2) v Houdini 2.0c

geots · Post by **geots** » Sun Jun 17, 2012 5:05 am

Houdini 2.0c x64 v Engine 40x(2) - Winding Down!

Another 162 games have been added in this update, taking us through game 880. That leaves 120 games to go, and they are running as we speak. In the last update, I believe Houdini's lead had dropped by 10 games. In this update the lead is back up by 10 games, to a 49-game lead. Actually, that is not a hell of a lot of games, considering 880 have been played. After a slow start, 40x(2) has held its own through most of the match.

Intel i5 w/4TCs
Fritz 11 gui
1CPU/64bit
128MB hash
Bases=NONE
Ponder_Learning=OFF
Perfect 12.32 book w/12-move limit
10'+10"
Match=1000 games

Code:

Engine              Elo    +W/-L/=D         Score    Points
Houdini 2.0c x64    +21    +253/-204/=423   53.00%   464.5/880
Engine 40x(2)       -21    +204/-253/=423   47.00%   415.5/880

If not tomorrow, by Monday for sure I should be posting the conclusion of this match. Stay tuned.

george

Ajedrecista · Post by **Ajedrecista** » Sun Jun 17, 2012 12:45 pm

Hello:

88% of the match is now completed and error bars are narrowing slowly. This is what I get:

Code:

Elo_uncertainties_calculator, ® 2012.

Calculation of Elo uncertainties in a match between two engines:
----------------------------------------------------------------

(The input and output data is referred to the first engine).

Please write down non-negative integers.

Write down the number of wins:

253

Write down the number of loses:

204

Write down the number of draws:

423

***************************************
1-sigma confidence ~ 68.27% confidence.
2-sigma confidence ~ 95.45% confidence.
3-sigma confidence ~ 99.73% confidence.
***************************************

---------------------------------------

Elo interval for 1-sigma confidence:

Elo rating difference:     19.37 Elo

Lower rating difference:   10.93 Elo
Upper rating difference:   27.82 Elo

Lower bound uncertainty:   -8.43 Elo
Upper bound uncertainty:    8.45 Elo
Average error:        +/-   8.44 Elo

K = (average error)*[sqrt(n)] =  250.45

Elo interval: ]  10.93,   27.82[
---------------------------------------

Elo interval for 2-sigma confidence:

Elo rating difference:     19.37 Elo

Lower rating difference:    2.52 Elo
Upper rating difference:   36.31 Elo

Lower bound uncertainty:  -16.85 Elo
Upper bound uncertainty:   16.94 Elo
Average error:        +/-  16.90 Elo

K = (average error)*[sqrt(n)] =  501.20

Elo interval: ]   2.52,   36.31[
---------------------------------------

Elo interval for 3-sigma confidence:

Elo rating difference:     19.37 Elo

Lower rating difference:   -5.90 Elo
Upper rating difference:   44.84 Elo

Lower bound uncertainty:  -25.27 Elo
Upper bound uncertainty:   25.47 Elo
Average error:        +/-  25.37 Elo

K = (average error)*[sqrt(n)] =  752.56

Elo interval: ]  -5.90,   44.84[
---------------------------------------

Number of games of the match:                880
Score: 52.78 %
Elo rating difference:   19.37 Elo
Draw ratio: 48.07 %

**********************************************
1 sigma:  1.2110 % of the points of the match.
2 sigma:  2.4220 % of the points of the match.
3 sigma:  3.6330 % of the points of the match.
**********************************************

End of the calculations. Approximated elapsed time:  31 ms.

Thanks for using Elo_uncertainties_calculator. Press Enter to exit.

The Elo difference is ~19 rather than 21 over the 880 games; you may have been thinking of the Elo difference in the last 162 games, where 400·log10(86/76) ≈ 21.47. For 2-sigma confidence (~95.45%), the error bars are now roughly ± 17 Elo. As you can see, I have learnt to measure the elapsed time of my programme's calculations and, so far, it is always 31 ms or 47 ms on my PC... so fast! However, I am almost sure that this time moves in steps of 15 or 16 ms (I mean that the subroutine I use, CLOCK@, cannot report 25 ms, for instance, because it advances in steps of 15 ms or 16 ms)... so the elapsed time is only approximate.
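For readers who want to check the listing by hand, here is a minimal sketch of one way to get such figures. It assumes a normal approximation on the per-game score (win = 1, draw = 0.5, loss = 0) and the usual logistic Elo formula; the function names are made up for this example, and the programme itself may work differently, although with these assumptions the 880-game totals should land very close to the numbers printed above.

```python
import math

def elo_from_score(p):
    """Logistic Elo difference for a score fraction p (0 < p < 1)."""
    return 400.0 * math.log10(p / (1.0 - p))

def elo_interval(wins, losses, draws, k_sigma=1.0):
    """Return (Elo difference, lower Elo, upper Elo, standard error of the score)."""
    n = wins + losses + draws
    mu = (wins + 0.5 * draws) / n
    # Variance of a single game result around the mean score (divisor n, an assumption).
    var = (wins * (1.0 - mu) ** 2 + draws * (0.5 - mu) ** 2 + losses * mu ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    lower = elo_from_score(mu - k_sigma * se)
    upper = elo_from_score(mu + k_sigma * se)
    return elo_from_score(mu), lower, upper, se

for k in (1, 2, 3):
    diff, lower, upper, se = elo_interval(253, 204, 423, k)
    print(f"{k}-sigma: {diff:.2f} Elo, interval ]{lower:.2f}, {upper:.2f}[, "
          f"{100 * k * se:.4f} % of the points")
```

The interval is not symmetric in Elo because the score bounds are symmetric and the logistic mapping is not linear, which is why the listing reports separate lower and upper uncertainties.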

The other programme is clearly slower because of the internal calculations needed to find the correct parameter of the confidence interval... anyway, it takes less than 0.6 seconds on my PC. Here are the minimum scores Houdini needs (with 880 games played) to remain ahead of Engine 40x(2), including error bars:

Code:

95%   confidence: 461 points (approximated elapsed time: 515 ms).
97.5% confidence: 464 points (approximated elapsed time: 532 ms).
98%   confidence: 465 points (approximated elapsed time: 547 ms).

So, Houdini is better (with the results of this match up to 880 games) at between 97.5% and 98% confidence, using my model, which may be similar to EloSTAT. I notice that the draw ratio has risen above 48% for the first time in the match, if I am not wrong. I will stay tuned for the end of this 1000-game match. Thank you very much, George.
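A rough sketch of how a minimum score of this kind can be estimated, under assumptions of my own rather than the programme's actual method: both engines are taken as equally strong, the observed draw ratio is kept fixed, and the total score is treated as normally distributed. The function name and the `two_sided` flag are hypothetical. Reading the confidence as a two-sided level gives 461, 464 and 465 points for 95%, 97.5% and 98%.

```python
import math
from statistics import NormalDist

def min_points_to_stay_ahead(games, draws, confidence, two_sided=True):
    """Smallest whole-point score whose lower bound stays above 50 %.

    Assumes equal strength, so each decisive game adds a variance of 1/4
    to the total score and each draw adds none.
    """
    tail = (1.0 - confidence) / 2.0 if two_sided else (1.0 - confidence)
    z = NormalDist().inv_cdf(1.0 - tail)
    sigma_points = 0.5 * math.sqrt(games - draws)
    return math.ceil(games / 2.0 + z * sigma_points)

for c in (0.95, 0.975, 0.98):
    print(f"{100 * c:g} % confidence: {min_points_to_stay_ahead(880, 423, c)} points")
```

Passing `two_sided=False` switches to a one-sided test, which raises the confidence that a given score corresponds to.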

Regards from Spain.

Ajedrecista.

geots · Post by **geots** » Sun Jun 17, 2012 7:38 pm

Thanks Jesus. Can you believe Huggins?! A derivative of Houdini! I wonder if he wants ketchup or mustard to put on his hat.

The end should come tonight or tomorrow. I am fast approaching the time where I will have to close out 1 of the 3 guis running the match as it gets closer to the end. Less than 50 games left.

Best,

george

Ajedrecista · Post by **Ajedrecista** » Sun Jun 17, 2012 8:45 pm

Hello:

Well, it could be the case, although it is not.

I must say that I was doing things slightly wrong with the confidence intervals I computed (no surprise that I went wrong). The results I gave in all your updates were right in the minimum number of points but wrong in the confidence level, because I was computing two-sided tests where one-sided tests were the correct thing (I am very careless with those things). To explain a little more: where I wrote 95% confidence, it is really 97.5% confidence; where I wrote 98% confidence, it is really 99% confidence, and so on. In general, where I wrote C% confidence, it is really (50 + C/2)% confidence. So the version of Minimum_score_for_no_regression that I uploaded is not correct... it is almost correct, and people who downloaded it (I counted at least six downloads, which is a total success for me) can correct the results with the trick of replacing C by (50 + C/2). Sorry for the inconvenience.
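The rule of C and (50 + C/2) can be checked with a short derivation, assuming a symmetric distribution such as the normal one: a two-sided interval at C% confidence leaves (100 - C)/2 % in each tail, and a one-sided test only excludes one of those tails.

```latex
C_{\text{one-sided}} = 100 - \frac{100 - C}{2} = 50 + \frac{C}{2}
\qquad\text{e.g.}\qquad
C = 95 \;\Rightarrow\; 97.5, \qquad C = 98 \;\Rightarrow\; 99 .
```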

Some time ago I looked here and immediately noticed the issue. Today, a little thinking led me to the reason for the failure. Now my results match perfectly the ones found in CPW.

I also solved the timing issue: CLOCK@ seems to have a resolution of 1/64 of a second, that is, 15.625 ms. What I do now is the following: count the number of CPU clock cycles between the start and the end (using the intrinsic routine CPU_CLOCK@() of Fortran 95), divide by the clock rate of the CPU (which must now be entered by the user; in my case, 3 GHz), then round to milliseconds. It is an ugly method, but it seems to work fine.
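The arithmetic behind both timing methods can be sketched in a few lines. This is only an illustration in Python, not the Fortran code: the function names and the cycle count are invented for the example, and rounding to the nearest millisecond is an assumption. It shows why a 1/64 s clock would report values such as 31 ms and 47 ms (2 and 3 ticks).

```python
def clock_ticks_to_ms(ticks, ticks_per_second=64):
    """Elapsed time for a clock that advances in steps of 1/64 s."""
    return 1000.0 * ticks / ticks_per_second

def cpu_cycles_to_ms(cycles, clock_hz=3.0e9):
    """Elapsed time from a CPU cycle count and a user-supplied clock rate."""
    return round(1000.0 * cycles / clock_hz)

# Two and three ticks of the coarse clock: 31.25 ms and 46.875 ms.
print([round(clock_ticks_to_ms(k)) for k in (2, 3)])  # [31, 47]

# Hypothetical cycle count, chosen only to illustrate the conversion.
print(cpu_cycles_to_ms(1_542_000_000), "ms")
```

The cycle-count method is only as good as the clock rate that is typed in, so it will drift if the CPU changes frequency during the run.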

So, running Minimum_scores_for_no_regression again:

Code:

98% confidence: 462 points (approximated elapsed time: 514 ms).
99% confidence: 465 points (approximated elapsed time: 525 ms).

So, Houdini is better than 40x(2) with a confidence between 98% and 99% (after those 880 games), which should now agree with LOS tables.
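As a cross-check against LOS tables, here is a small sketch of the commonly quoted likelihood-of-superiority formula, which uses only wins and losses under a normal approximation. The function name is chosen for this example; for +253/-204 the result falls between 98% and 99%, in line with the figures above.

```python
import math

def likelihood_of_superiority(wins, losses):
    """LOS = 0.5 * (1 + erf((W - L) / sqrt(2 * (W + L)))); draws do not enter."""
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))

print(f"LOS = {100 * likelihood_of_superiority(253, 204):.2f} %")
```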

Regards from Spain.

Ajedrecista.

geots · Post by **geots** » Sun Jun 17, 2012 9:11 pm

And thank you again for your interest, time and effort. You are greatly appreciated. Stay close around as we move on to bigger and better things. Plus we still have this one to close out.

george
