40x(2) v Houdini 2.0- NEW UPDATE!!

Discussion of computer chess matches and engine tournaments.

Moderator: Ras

geots
Posts: 4790
Joined: Sat Mar 11, 2006 12:42 am

Post by geots »

Engine 40x(2) vs Houdini 2.0c x64 - 2nd UPDATE

This update comes at an odd number of games because the computer restarted itself, which also gave me the opportunity to grab the games played so far from the database. At a time control of 10'+10", time losses are more frequent than with repeating controls, but I checked all the games and they were all in order.


Intel i5 w/4TCs
Fritz 11 gui
1CPU/64bit
128MB hash
Bases=NONE
Ponder_Learning=OFF
Perfect 12.32 book w/12-move limit

10'+10"
Match=1000 games


1   Houdini 2.0c x64    +48/-30/=63   56.38%   79.5/141
2   Engine 40x(2)       +30/-48/=63   43.62%   61.5/141


I guess I may never find out why I received 40x and then an update to it, 40x(2), and I will very likely never get a chance to see their "big daddy", the main engine.

But this match is not over quite yet.


george
gleperlier
Posts: 1033
Joined: Sat Feb 04, 2012 10:03 pm

Re: 40x(2) v Houdini 2.0- NEW UPDATE!!

Post by gleperlier »

Thanks! :wink:
Uri Blass
Posts: 10889
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: 40x(2) v Houdini 2.0- NEW UPDATE!!

Post by Uri Blass »

The result seems convincing.

The score is 48-30, not counting draws.
If I flip a coin 78 times, the standard deviation is sqrt(19) < 4.5,
so Houdini's 48 wins are more than 2 standard deviations above the 50% mark of 39.

I am almost sure that Houdini is stronger.

Uri
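
The coin-flip check above can be sketched in a few lines of Python. This is only an illustration under the same assumption Uri makes (each decisive game is a fair coin toss, draws are ignored); the function name `sign_test` is made up for this sketch and does not come from the thread. The variance of 78 fair flips is 78 × 0.5 × 0.5 = 19.5.

```python
from math import comb, sqrt

def sign_test(wins, losses, p=0.5):
    """Treat decisive games as coin flips and measure how unusual `wins` is."""
    n = wins + losses              # decisive games only, draws are ignored
    mean = n * p                   # expected wins if both engines are equal
    sd = sqrt(n * p * (1 - p))     # binomial standard deviation
    z = (wins - mean) / sd         # distance from the mean, in sd units
    # Exact one-sided tail: probability of at least `wins` successes in n flips.
    tail = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(wins, n + 1))
    return n, mean, sd, z, tail

n, mean, sd, z, tail = sign_test(48, 30)
print(f"n={n} mean={mean:.1f} sd={sd:.4f} z={z:.4f} one-sided tail={tail:.4f}")
```

The z value comes out slightly above 2, in line with the "more than 2 standard deviations" estimate; the exact tail is included because the normal approximation is only rough at 78 games.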
ThatsIt
Posts: 992
Joined: Thu Mar 09, 2006 2:11 pm

Re: 40x(2) v Houdini 2.0- NEW UPDATE!!

Post by ThatsIt »

Ajedrecista
Posts: 2123
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Re: 40x(2) vs. Houdini 2.0 - NEW UPDATE!

Post by Ajedrecista »

Hello Uri:
I agree with you, although I think the standard deviation is sqrt(19.5) rather than sqrt(19), since 78 × 0.5 × 0.5 = 19.5; the difference is small:

(48 - 39)/sqrt(19.5) ~ 2.0381
That is a little more than two standard deviations in the coin model. For this match, the model I use also takes draws into account, and the result I get is very curious:

Minimum_score_for_no_regression, ® 2012.

Calculation of the minimum score for no regression in a match between two engines:

 Write down the number of games of the match (it must be a positive integer, up to 1073741823):

141

Write down the draw ratio (in percentage):

44.68085106

Write down k (for making confidence intervals of (mu) +/- (k*sigma) in a normal distribution); k must be positive:

1.96

Theoretical minimum score for no regression: 56.0564 %

Minimum number of won points for the engine in this match:        79.5 points.

Minimum Elo advantage, which is also the negative part of the error bar:
 44.5968 Elo

End of the calculations.

Thanks for using Minimum_score_for_no_regression. Press Enter to exit.
I get a minimum of 79.5 points at 1.96-sigma confidence (~95% confidence) to reach any conclusion, exactly the number of points that Houdini has scored! What a coincidence... life is full of them.
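
For readers who want to retrace those numbers: the source of Minimum_score_for_no_regression is not shown in this thread, so the following Python sketch is only a guess at the underlying model. It assumes a per-game variance of mu*(1 - mu) - D/4 (mu = score, D = draw ratio), solves mu - 0.5 = k*sigma for mu, rounds the required points up to the next half point, and converts that score to Elo with the usual logistic formula. All function names here are hypothetical.

```python
from math import ceil, log10, sqrt

def min_score_no_regression(games, draw_ratio, k):
    """Smallest score fraction that lies k sigma above 50% (assumed model)."""
    # Assumed per-game variance: mu*(1 - mu) - draw_ratio/4.
    # Solving mu - 0.5 = k*sqrt(variance/games) for mu gives this closed form.
    x = k * sqrt((1 - draw_ratio) / (4 * (games + k * k)))
    return 0.5 + x

def elo_from_score(score):
    """Standard logistic conversion from score fraction to Elo difference."""
    return 400 * log10(score / (1 - score))

games = 141
draw_ratio = 63 / games                    # 44.68085106 % draws
mu = min_score_no_regression(games, draw_ratio, 1.96)
min_points = ceil(2 * mu * games) / 2      # round up to the next half point
elo = elo_from_score(min_points / games)   # Elo at the rounded point total

print(f"minimum score:  {100 * mu:.4f} %")
print(f"minimum points: {min_points}")
print(f"minimum Elo:    {elo:.4f}")
```

Under these assumptions the sketch gives figures that agree with the program output quoted above, but that agreement does not prove the program works this way.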

@George: thank you very much for this great test. This Engine 40x(2) is a mystery: whether it is single-core or multi-core (I know the test is running on a single core), whether it can use EGTB and/or bitbases (I know the engines do not use bases in this match), the typical depths and speed that it reaches on your computer... What a pity it would be if Engine 40x(2) remained private in the end. Please keep up the good work!

Regards from Spain.

Ajedrecista.
geots
Posts: 4790
Joined: Sat Mar 11, 2006 12:42 am

Re: 40x(2) vs. Houdini 2.0 - NEW UPDATE!

Post by geots »

Thanks to each of you for your comments. I am glad that this match is of some interest. Even I will be in unfamiliar terrain once I get past 500 games at this longer time control (compared to 40/3).


Best,

george