nTCEC - Season 1 - Superfinal | Houdini vs Stockfish | 16CPU

Kohflote
Posts: 240
Joined: Wed Sep 19, 2007 11:07 am
Location: Singapore

Re: nTCEC - Season 1 - Superfinal | Houdini vs Stockfish | 1

Post by Kohflote »

Thank you, Martin, for running such a superb tournament.

Best wishes,
Koh, Kah Huat
Graham Banks
Posts: 45925
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Re: nTCEC - Season 1 - Superfinal | Houdini vs Stockfish | 1

Post by Graham Banks »

Martin Thoresen wrote:Houdini wins the Superfinal 25-23 and is therefore the nTCEC Grand Champion. Congratulations to Robert Houdart and thanks to everyone following nTCEC. Season 2 will begin in the autumn, about 4 months from now.
An epic match in the Superfinal, reaching new heights in engine v engine play.
Well done Martin and congratulations to Robert.
gbanksnz at gmail.com
S.Taylor
Posts: 8514
Joined: Thu Mar 09, 2006 3:25 am
Location: Jerusalem Israel

Re: nTCEC - Season 1 - Superfinal | Houdini vs Stockfish | 1

Post by S.Taylor »

I know this doesn't interest everyone here,
but if we look at the Superfinal from game 14 through game 48, we see that SF wins AND was always in the lead, not because of some big boost, but because Houdini simply could not overcome the monster that SF was.
It is only by a strange twist of luck that Houdini won 3 games and was undefeated in games 1-13. And Houdini failed to ever get into the lead, because of this alone.
[I know that most people here are number-crunching statisticians, but there are other statistical patterns which I think can sometimes show things even better (though this requires a depth of thinking that many people seem unwilling to get into, for some reason). There can be major differences if this were studied: e.g. a 50% result from 100% draws says something very different about a program than 50% wins and 50% losses, etc.]
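To make the last point concrete, here is a minimal Python sketch (an illustration, not anything posted in the thread) comparing two hypothetical 48-game matches that both end at 50%: one with only draws, one with only decisive games. The function name `score_stats` and both example matches are assumptions chosen for illustration.

```python
from math import sqrt

def score_stats(wins, draws, losses):
    """Return (mean score, per-game standard deviation) for one engine.

    Each game is scored 1 for a win, 0.5 for a draw, 0 for a loss.
    """
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    # Population variance of the per-game score around the mean.
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    return mean, sqrt(var)

# Hypothetical matches, both finishing 24-24:
print(score_stats(0, 48, 0))    # all draws: (0.5, 0.0)
print(score_stats(24, 0, 24))   # no draws:  (0.5, 0.5)
```

Both matches have the same mean score, but the spread of individual results is zero in one case and maximal in the other, which is the kind of difference a bare percentage hides.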
Modern Times
Posts: 3871
Joined: Thu Jun 07, 2012 11:02 pm

Re: nTCEC - Season 1 - Superfinal | Houdini vs Stockfish | 1

Post by Modern Times »

In the 48 game final, score was + 6 = 38 - 4

So a 79% draw rate. It seems that this extremely high level of chess on 16 cores and with long time control brings the engines closer together in terms of match results. Very different from fast time control on one core....

Thanks to Martin for running this great tournament.
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: nTCEC - Season 1 - Superfinal | Houdini vs Stockfish | 1

Post by mcostalba »

Martin Thoresen wrote:Houdini wins the Superfinal 25-23 and is therefore the nTCEC Grand Champion. Congratulations to Robert Houdart and thanks to everyone following nTCEC. Season 2 will begin in the autumn, about 4 months from now.
Thanks Martin for running this wonderful tournament. It was really fun to watch.

SF's result in the Superfinal is well above its current level, and it was really nice to see SF holding on that way vs H3.

Perhaps it is only statistics, but I think that at such a long TC SF has better time management, because it always reached the endgame having spent one hour more than its opponent. One more hour of thinking time is a huge advantage. It is true that it was in time trouble in the endgame, but this almost never compromised SF's ability to hold the game.

The TM is from Joona; it is very advanced and shows all its power at very long TC. My guess is that H3 and friends will adopt the same TM as SF for future tournaments :-)
Ajedrecista
Posts: 2246
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

nTCEC - Season 1 - Superfinal: Houdini vs Stockfish | 16 CPU

Post by Ajedrecista »

Hello Ernest:
ernest wrote:
Houdini wrote:in other words +25 ± 60 Elo. Maybe someone can compute more accurate values.
With my rule of thumb, I get (95% error bar): +24 ± 47 Elo
I agree with your confidence interval:

LOS_and_Elo_uncertainties_calculator, ® 2012-2013.

----------------------------------------------------------------
Calculation of Elo uncertainties in a match between two engines:
----------------------------------------------------------------

(The input and output data is referred to the first engine).

Please write down non-negative integers.

Maximum number of games supported: 2147483647.

Write down the number of wins (up to 1825361100):

10

Write down the number of loses (up to 1825361100):

6

Write down the number of draws (up to 2147483631):

43

 Write down the confidence level (in percentage) between 65% and 99.9% (it will
be rounded up to 0.01%):

95

Write down the clock rate of the CPU (in GHz), only for timing the elapsed time
of the calculations:

2.8

---------------------------------------
Elo interval for 95.00 % confidence:

Elo rating difference:     23.59 Elo

Lower rating difference:  -22.64 Elo
Upper rating difference:   70.68 Elo

Lower bound uncertainty:  -46.24 Elo
Upper bound uncertainty:   47.09 Elo
Average error:        +/-  46.66 Elo

K = (average error)*[sqrt(n)] =  358.42

Elo interval: ] -22.64,   70.68[
---------------------------------------

Number of games of the match:        59
Score: 53.39 %
Elo rating difference:   23.59 Elo
Draw ratio: 72.88 %

************************************************************************
        Sample standard deviation:  3.3898 % of the points of the match.
1.9600 sample standard deviations:  6.6439 % of the points of the match.

                 (Corresponding to 95.00 % confidence).
************************************************************************

 Error bars were calculated with two-sided tests; values are rounded up to 0.01
Elo, or 0.01 in the case of K.

-------------------------------------------------------------------
Calculation of likelihood of superiority (LOS) in a one-sided test:
-------------------------------------------------------------------

LOS (taking into account draws) is always calculated, if possible.

LOS (not taking into account draws) is only calculated if wins + loses < 16001.

LOS (average value) is calculated only when LOS (not taking into account draws)
is calculated.
______________________________________________

LOS:  84.13 % (taking into account draws).
LOS:  83.39 % (not taking into account draws).
LOS:  83.76 % (average value).
______________________________________________

These values of LOS are rounded up to 0.01%

End of the calculations. Approximated elapsed time:   42 ms.

Thanks for using LOS_and_Elo_uncertainties_calculator. Press Enter to exit.
Of course 59 games are too few, but it is better than nothing. Considering the long TC and the resources (a lot of cores for each engine), the effort is titanic.

Congratulations to Robert, SF team, Martin et al! Thank you very much.

Regards from Spain.

Ajedrecista.
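
For readers who want to check the figures in the calculator output above, here is a minimal Python sketch. It is not the calculator's actual code. It assumes the standard logistic Elo formula, a normal approximation using the sample standard deviation (n - 1 in the denominator), and a LOS taken as the normal CDF of the score's distance from 50% measured in standard errors. The names `elo_from_score` and `match_summary` are illustrative only.

```python
from math import log10, sqrt
from statistics import NormalDist

def elo_from_score(score):
    """Logistic Elo difference implied by a score fraction in (0, 1)."""
    return 400.0 * log10(score / (1.0 - score))

def match_summary(wins, losses, draws, confidence=0.95):
    """Elo estimate, two-sided Elo interval and LOS for the first engine.

    Assumption: normal approximation; breaks down if every game has the
    same result (zero standard error) or the bounds leave (0, 1).
    """
    n = wins + losses + draws
    mean = (wins + 0.5 * draws) / n
    # Sum of squared deviations of per-game scores (1, 0.5, 0) from the mean.
    ss = (wins * (1.0 - mean) ** 2
          + draws * (0.5 - mean) ** 2
          + losses * (0.0 - mean) ** 2)
    std_err = sqrt(ss / (n - 1) / n)          # standard error of the mean score
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return {
        "score": mean,
        "std_err": std_err,
        "elo": elo_from_score(mean),
        "elo_lower": elo_from_score(mean - z * std_err),
        "elo_upper": elo_from_score(mean + z * std_err),
        "los": NormalDist().cdf((mean - 0.5) / std_err),
    }

if __name__ == "__main__":
    result = match_summary(wins=10, losses=6, draws=43)
    for key, value in result.items():
        print(f"{key:>10}: {value:.4f}")
```

For +10 -6 =43 this should land close to the calculator's values: roughly 23.6 Elo, an interval of about -22.6 to 70.7 Elo, and a LOS of about 84.1% (the "taking into account draws" figure). The "not taking into account draws" LOS is computed differently by the calculator and is not reproduced here.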
Modern Times
Posts: 3871
Joined: Thu Jun 07, 2012 11:02 pm

Re: nTCEC - Season 1 - Superfinal | Houdini vs Stockfish | 1

Post by Modern Times »

mcostalba wrote: Perhaps it is only statistics, but I think that at such a long TC SF has better time management, because it always reached the endgame having spent one hour more than its opponent. One more hour of thinking time is a huge advantage. It is true that it was in time trouble in the endgame, but this almost never compromised SF's ability to hold the game.
Yes I agree, the extra time spent earlier in the game makes perfect sense to me. Perhaps slightly too aggressive, but I think it is exactly the right approach in principle.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: nTCEC - Season 1 - Superfinal | Houdini vs Stockfish | 1

Post by Laskos »

mcostalba wrote:
Martin Thoresen wrote:Houdini wins the Superfinal 25-23 and is therefore the nTCEC Grand Champion. Congratulations to Robert Houdart and thanks to everyone following nTCEC. Season 2 will begin in the autumn, about 4 months from now.
Thanks Martin for running this wonderful tournament. It was really fun to watch.

SF's result in the Superfinal is well above its current level, and it was really nice to see SF holding on that way vs H3.

Perhaps it is only statistics, but I think that at such a long TC SF has better time management, because it always reached the endgame having spent one hour more than its opponent. One more hour of thinking time is a huge advantage. It is true that it was in time trouble in the endgame, but this almost never compromised SF's ability to hold the game.

The TM is from Joona; it is very advanced and shows all its power at very long TC. My guess is that H3 and friends will adopt the same TM as SF for future tournaments :-)
Robert wrote something to the effect that he wants Houdini to spend more time in the endgame, and that it's just a question of preference. But I would agree that 80+% of the games are decided in the middlegame, so Houdini's TM at X+Y could be improved. I had expected that H3 would often dominate the middlegame, but it didn't happen, maybe because of TM. Generally, excluding the draws, I expected a 2/1 or even higher win/loss ratio for H3; it turned out to be 6/4.
Gusev
Posts: 1476
Joined: Mon Jan 28, 2013 2:51 pm

Re: nTCEC - Season 1 - Superfinal | Houdini vs Stockfish | 1

Post by Gusev »

Modern Times wrote:In the 48 game final, score was + 6 = 38 - 4

So a 79% draw rate. It seems that this extremely high level of chess on 16 cores and with long time control brings the engines closer together in terms of match results. Very different from fast time control on one core....

Thanks to Martin for running this great tournament.
nTCEC was a rare event of historic proportions. For the first time since 1984, the two strongest chess players on the planet played a 48-game championship match with a long time control. 29 years later, the top two players are no longer human, but their authors are, so emotions were still running high. And this time we have an undisputed winner, according to both nTCEC rules and the old 1984 rules (Houdini scored 6 victories, something that Karpov and Kasparov failed to achieve in their first match). Thanks and congratulations to Martin, Robert, Marco, Gary and the rest of the Stockfish team, who kept everyone on edge until the last game and demonstrated the impressive strength of open-source development!
geots
Posts: 4790
Joined: Sat Mar 11, 2006 12:42 am

Re: nTCEC - Season 1 - Superfinal | Houdini vs Stockfish | 1

Post by geots »

Houdini wrote:
JuLieN wrote:Congrats, both to Martin for the tournament and to Robert for his win. :)

The result was surprisingly narrow: if I'm not mistaken, this final's result places Stockfish only 14 Elo points behind Houdini! Stockfish has made big progress!
Stockfish played extremely well and made this a very close match.

Over the full tournament the direct confrontation between the 2 engines was +10 -6 =43 or 31.5-27.5 which translates to about +25 Elo with a confidence interval of about 60 Elo (both values rough estimates), in other words +25 ± 60 Elo. Maybe someone can compute more accurate values.

Cheers,
Robert



Maybe you think it best not to say it, so let me. This result is important to me, and to anyone else who is thinking logically, "if and only if" Stockfish can play reasonably close to Houdini in 1, 4 and 6-core testing. If not, this result will mean absolutely nothing in anyone's or any group's rating list.

Of course that is the way it is with all tournaments- you have to differentiate between "interesting" and "important". There is no doubt this was a VERY interesting and enjoyable tournament and final match.

But like Vas said with the match in Mexico- Zappa had about a 20% chance of winning, and in the end it fell within that 20%. Taking nothing away from Zappa, but it helped Cozzie "zero" ultimately in any rating lists.

Still, very interesting and congratulations to Marco, Tord and Gary for some very nice play by Stockfish- and of course it goes without saying- congrats to you as well.



All the best,

george