TUESDAY UPDATE- 40x(2) v Houdini 2.0c!

 
George Speight



Joined: 10 Mar 2006
Posts: 4636

Posted: Wed Jun 13, 2012 3:10 am    Post subject: TUESDAY UPDATE- 40x(2) v Houdini 2.0c!

Houdini 2.0c x64 v Engine 40x(2) - UPDATE 6


This update takes us through game 606, meaning 83 games have been added since yesterday's update. Houdini increased his lead by 5 games, to +50 games.

Since I could not access the databases, and it would have been 3 crosstables anyway, I went ahead and computed the Elo difference here myself.



Code:
Houdini 2.0c x64    +29   +188/-138/=280   54.13%   328.0/606 
Engine 40x(2)       -29   +138/-188/=280   45.87%   278.0/606




Interesting that before I started this match, 606 games and counting ago, I told a couple of people what the Elo difference between these 2 engines would be, with an error factor of plus or minus 5 Elo. With a change of a couple of Elo in a particular direction over the next 394 games, I could easily be dead-on the mark. But that is certainly nothing to brag about; it isn't like this is brain surgery.



Until tomorrow-

george
Jesús Muñoz



Joined: 13 Jul 2011
Posts: 707
Location: Madrid, Spain.

Posted: Wed Jun 13, 2012 9:51 am    Post subject: Re: TUESDAY UPDATE - 40x(2) vs. Houdini 2.0c!

Hi George:

I guess that you expected Houdini to be ahead of 40x(2) by (+25 ± 5) Elo, that is, between +20 Elo and +30 Elo. If I am right, those Elo advantages correspond to final scores between 529 - 471 and 543 - 457 (Houdini wins, of course). In other words, Houdini should score between 201/394 (~ 51.02%) and 215/394 (~ 54.57%) in the remaining games to be between +20 Elo and +30 Elo ahead after 1000 games... who knows?
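
The arithmetic behind those targets can be sketched in Python (not the Fortran programme used in this thread). This is only a minimal sketch, assuming the usual logistic Elo formula and rounding the final score to whole points; the function names are made up for illustration.

```python
def expected_score(elo_diff):
    """Expected score fraction for an Elo advantage (logistic model, assumed)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def needed_points(elo_diff, total_games=1000, points_so_far=328.0):
    """Final points for the given Elo lead and points still needed to get there."""
    final_points = round(expected_score(elo_diff) * total_games)
    return final_points, final_points - points_so_far


games_left = 1000 - 606  # 394 games still to play
for elo in (20, 30):
    final_points, needed = needed_points(elo)
    pct = 100.0 * needed / games_left
    print(f"+{elo} Elo: {final_points} - {1000 - final_points}, "
          f"needs {needed:g}/{games_left} (~ {pct:.2f} %)")
```

The defaults (1000 games, 328 points after 606 games) are taken from this update; other matches would need other values.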

Here are my error bars for this match update:

Code:

Elo_uncertainties_calculator, ® 2012.

Calculation of Elo uncertainties in a match between two engines:
----------------------------------------------------------------

(The input and output data is referred to the first engine).

Please write down non-negative integers.

Write down the number of wins:

188

Write down the number of loses:

138

Write down the number of draws:

280

***************************************
1-sigma confidence ~ 68.27% confidence.
2-sigma confidence ~ 95.45% confidence.
3-sigma confidence ~ 99.73% confidence.
***************************************

---------------------------------------

Elo interval for 1-sigma confidence:

Elo rating difference:     28.73 Elo

Lower rating difference:   18.40 Elo
Upper rating difference:   39.12 Elo

Lower bound uncertainty:  -10.33 Elo
Upper bound uncertainty:   10.39 Elo
Average error:        +/-  10.36 Elo

K = (average error)*[sqrt(n)] =  255.02

Elo interval: ]  18.40,   39.12[
---------------------------------------

Elo interval for 2-sigma confidence:

Elo rating difference:     28.73 Elo

Lower rating difference:    8.10 Elo
Upper rating difference:   49.57 Elo

Lower bound uncertainty:  -20.64 Elo
Upper bound uncertainty:   20.84 Elo
Average error:        +/-  20.74 Elo

K = (average error)*[sqrt(n)] =  510.51

Elo interval: ]   8.10,   49.57[
---------------------------------------

Elo interval for 3-sigma confidence:

Elo rating difference:     28.73 Elo

Lower rating difference:   -2.19 Elo
Upper rating difference:   60.12 Elo

Lower bound uncertainty:  -30.92 Elo
Upper bound uncertainty:   31.39 Elo
Average error:        +/-  31.15 Elo

K = (average error)*[sqrt(n)] =  766.93

Elo interval: ]  -2.19,   60.12[
---------------------------------------

Number of games of the match:                606
Score: 54.13 %
Elo rating difference:   28.73 Elo
Draw ratio: 46.20 %

**********************************************
1 sigma:  1.4803 % of the points of the match.
2 sigma:  2.9605 % of the points of the match.
3 sigma:  4.4408 % of the points of the match.
**********************************************

End of the calculations.

Thanks for using Elo_uncertainties_calculator. Press Enter to exit.
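
For readers who want to check the error bars, here is a minimal Python sketch (again, not the Fortran programme itself). It assumes that sigma is the standard deviation of the mean score, computed from the win/loss/draw counts with division by n, and that the Elo bounds come from converting score ± k*sigma with the logistic formula. Under these assumptions it agrees with the output above to about 0.01 Elo; the function names are hypothetical.

```python
import math


def elo_from_score(score):
    """Elo difference for a score fraction (logistic model, assumed)."""
    return 400.0 * math.log10(score / (1.0 - score))


def elo_interval(wins, losses, draws, k=1.0):
    """Return (Elo difference, lower bound, upper bound, sigma) for k sigmas."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0 or 0.5) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + losses * score ** 2
           + draws * (0.5 - score) ** 2) / n
    sigma = math.sqrt(var / n)  # standard deviation of the mean score
    return (elo_from_score(score),
            elo_from_score(score - k * sigma),
            elo_from_score(score + k * sigma),
            sigma)


for k in (1, 2, 3):
    diff, low, high, sigma = elo_interval(188, 138, 280, k)
    print(f"{k}-sigma: {diff:6.2f} Elo in ]{low:6.2f}, {high:6.2f}[ "
          f"(sigma = {100.0 * sigma:.4f} %)")
```

The interval is not symmetric because the score-to-Elo conversion is not linear, which is why the lower and upper uncertainties differ slightly.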


I have just refined this programme in the last part of the code (2 sigma ~ 2.9605%, etc.) because I realized that:

Code:
2d-4*nint(1d6*sigma,KIND=3)
3d-4*nint(1d6*sigma,KIND=3)


Is not the same as:

Code:
1d-4*nint(2d6*sigma,KIND=3)
1d-4*nint(3d6*sigma,KIND=3)


The correct version is the second code box (which is now in my programme); the first one rounds badly (one more rounding error!), which explains some small oddities I had noticed, but I think that everything is finally OK.
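
The difference between the two code boxes can be reproduced with a short Python sketch. The value of sigma below is an assumption (roughly the 1-sigma figure of this update, about 1.4803 %), and nint here is a hypothetical helper that imitates Fortran's NINT for illustration only.

```python
import math


def nint(x):
    """Nearest integer, halves away from zero (imitates Fortran's NINT)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


sigma = 0.0148027  # assumed 1-sigma, as a fraction of the points

for k in (2, 3):
    # First code box: round 1 sigma, then multiply (rounding error grows k times).
    round_then_scale = k * nint(1e6 * sigma)
    # Second code box: multiply first, then round once.
    scale_then_round = nint(k * 1e6 * sigma)
    print(f"{k} sigma: round-then-scale = {round_then_scale / 1e4:.4f} %, "
          f"scale-then-round = {scale_then_round / 1e4:.4f} %")
```

Rounding once at the end keeps the error below half a unit in the last printed digit, while scaling an already rounded value can be off by up to k times that.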

Regarding the minimum score needed to rule out a negative Elo difference at a given confidence level, this is what I get for this update (using my imperfect model):

Code:
90%   confidence: 318   points for Houdini.
95%   confidence: 321   points for Houdini.
98%   confidence: 324   points for Houdini.
99%   confidence: 326.5 points for Houdini.
99.5% confidence: 328.5 points for Houdini.


So, Houdini is better with more than 99% confidence and less than 99.5% confidence after these 606 games! I guess that Houdini is too much Houdini...
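
As a rough cross-check of that statement, here is a Python sketch of a simple two-sided normal test. It is an assumption and not the model behind the table above: it supposes two equal engines with the observed draw ratio, so the per-game variance is (1 - draw ratio)/4, and uses a normal approximation for the mean score.

```python
import math

wins, losses, draws = 188, 138, 280
n = wins + losses + draws
score = (wins + 0.5 * draws) / n

# Assumed null hypothesis: equal strength, same draw ratio as observed.
draw_ratio = draws / n
sigma_null = math.sqrt((1.0 - draw_ratio) / 4.0 / n)

z = (score - 0.5) / sigma_null
confidence = math.erf(z / math.sqrt(2.0))  # two-sided confidence for |Z| < z

print(f"z = {z:.3f}, two-sided confidence = {100.0 * confidence:.2f} %")
```

With these assumptions the confidence also falls between 99% and 99.5%, although the exact point thresholds can differ by half a point from a more refined model.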

In another update I posted the following info:

Code:
Write down the confidence level (in percentage) between 75% and 99.9%:

95

Calculating...

Theoretical minimum score for no regression: 53.5564 %
Theoretical standard deviation in this case:  1.8145 %


This standard deviation is just one standard deviation (roundings included); as 95% confidence is more or less 1.96-sigma confidence, you can see that (53.5564 - 50)/1.8145 ~ 1.96... but I should print 3.5564% directly instead of 1.8145% to avoid confusion, just as I do in Elo_uncertainties_calculator, where I found my last rounding error (my bad), so I go for a very short fix: only two characters (add k*)!

Code:
1d-4*nint(1d6*sigma(5),KIND=3)  ! The one that can bring confusion.

1d-4*nint(1d6*k*sigma(5),KIND=3)  ! The best choice.
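
The factor k can be illustrated with a small Python sketch. It assumes that k is the two-sided quantile of the standard normal distribution for the chosen confidence level; sigma_multiplier is a hypothetical name, and 1.8145 % is the standard deviation quoted above.

```python
from statistics import NormalDist


def sigma_multiplier(confidence):
    """Two-sided normal quantile, e.g. 0.95 gives about 1.96 (assumed model)."""
    return NormalDist().inv_cdf(0.5 + confidence / 2.0)


k = sigma_multiplier(0.95)
sigma_pct = 1.8145  # one standard deviation, in % of the points

print(f"k = {k:.4f}")
print(f"k * sigma = {k * sigma_pct:.4f} %")
print(f"minimum score = {50.0 + k * sigma_pct:.4f} %")
```

Printing k*sigma next to the minimum score makes it obvious that the margin above 50% is exactly k standard deviations.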


Thank you very much for this match. I will stay tuned for the next update!

Regards from Spain.

Ajedrecista.
_________________
Six Fortran 95 tools.

Chess will never be solved.
George Speight



Joined: 10 Mar 2006
Posts: 4636

Posted: Wed Jun 13, 2012 9:45 pm    Post subject: Re: TUESDAY UPDATE - 40x(2) vs. Houdini 2.0c!

Thank you for your work. I won't pretend I understand your methods, but knowing your ability I am quite sure they are accurate. I really do appreciate your interest. All I really said before the match was that, from what I had seen of 40x, I felt that 40x(2), its upgrade, would be 32 Elo weaker than Houdini, with an error bar of plus or minus 5 Elo. From my point of view, the biggest question was exactly how much of an improvement 40x(2) would be over 40x. If correct, my observations could easily have been just a lucky guess.

I am not sure if there will be a Wed. update here or not. Windows wanted to do a restart to install its updates, and even though those security updates can be very important at times, I put it off about as long as I could. I finally clicked on "hold off for 4 more hours" and went to bed. Naturally, when I got back no games were playing. I haven't yet checked to see how many have been played since the 606-game mark, but quite likely not enough for an update. I shall see. And again, thank you.


Best,

george