New Houdini

tpoppins · Sun Oct 16, 2016 4:48 pm

Lyudmil Tsvetkov wrote: Poor Drawfish just in 3rd, way behind...

One of the reasons I would sometimes hate using SF as an analysis engine is that it sees too many draws, even in completely unbalanced positions.

I can confirm this after months of seeing hundreds of random positions (as opposed to positions taken from the same game) analyzed on Let's Check. Since it lists three lines for every position, you often get to see a second opinion (or two) from other engines, and Stockfish's 0.00 sometimes sticks out like a sore thumb.

It's like they hired Anish Giri to hand-tune SF's eval. ;)

Nay Lin Tun · Mon Oct 17, 2016 8:27 am

This Houdini has scored 90 percent, 48/53 so far. So, Houdini is nearly 400 Elo above the average Elo. Impressive...
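As a rough check of the "nearly 400 Elo" figure, the standard logistic Elo formula can be applied to the 48/53 score. This is a minimal sketch, assuming the usual 400-point logistic scale; the function name `elo_diff_from_score` is made up for illustration.

```python
import math


def elo_diff_from_score(score: float) -> float:
    """Elo difference implied by a score fraction under the logistic model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


points, games = 48.0, 53
score = points / games
print(f"score = {score:.3f}, implied Elo edge = {elo_diff_from_score(score):.0f}")
```

For 48/53 this comes out a little above 390 Elo, in line with "nearly 400".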

Laskos · Tue Oct 18, 2016 10:47 am

After 54-55 rounds out of 62 in the TCEC Rapid:

   # PLAYER                  : RATING  ERROR    POINTS  PLAYED     (%)   CFS(next)
   1 Houdini 200716          : 3341.7  121.5      50.0      55    90.9      88
   2 Stockfish 030916        : 3247.0  104.6      47.0      55    85.5      53
   3 Komodo 1692.19          : 3241.6  106.0      47.5      55    86.4      94
   4 Fire 5                  : 3127.7   95.0      42.0      54    77.8      71
   5 Jonny 8                 : 3094.0   89.8      40.0      54    74.1      52
   6 Ginkgo 1.9h             : 3091.3   85.5      41.5      55    75.5      76
   7 Gull 3                  : 3048.6   87.1      38.0      54    70.4      68
   8 Rybka 4.1               : 3021.0   80.1      37.5      55    68.2      53
   9 Andscacs 0.872b         : 3016.4   84.3      36.5      54    67.6      54
  10 Nirvana 010916          : 3010.2   85.3      37.0      55    67.3      55
  11 Chiron 030916           : 3002.3   82.4      35.5      54    65.7      59
  12 Protector 1.9           : 2989.6   82.1      35.5      55    64.5      71
  13 Texel 1.07a6            : 2958.1   83.2      34.0      54    63.0      70
  14 Naum 4.6                : 2928.5   79.4      31.0      55    56.4      53
  15 Hannibal 1.7            : 2925.0   78.7      32.0      55    58.2      52
  16 Fizbo 1.8               : 2922.8   80.5      32.5      55    59.1      53
  17 Critter 1.6a            : 2918.5   81.8      31.5      54    58.3      90
  18 Bobcat 070916           : 2845.8   81.0      26.0      55    47.3      80
  19 Raptor 2.3              : 2797.4   82.2      26.5      55    48.2      79
  20 Vajolet2 2.2.15         : 2751.6   83.4      21.0      55    38.2      75
  21 Fruit 070916            : 2710.5   87.2      20.5      55    37.3      56
  22 Laser 280816            : 2700.8   87.7      20.5      55    37.3      66
  23 Arasan 19.1             : 2675.1   91.6      17.5      55    31.8      59
  24 Gaviota 1.01            : 2659.8   87.9      18.5      55    33.6      60
  25 The Baron 3.40b         : 2643.7   91.4      18.5      54    34.3      64
  26 DisasterArea 1.63       : 2619.7   97.4      16.0      55    29.1      65
  27 Hakkapeliitta 210416    : 2592.0   96.1      16.5      55    30.0     100
  28 Jellyfish 1.1           : 2354.1  131.5       9.5      55    17.3      91
  29 Myrddin 0.87            : 2226.1  158.3       6.0      55    10.9      85
  30 Delphil 3.3b2           : 2106.3  193.7       4.0      55     7.3      55
  31 Firefly 2.7.0           : 2091.2  194.0       4.0      55     7.3      87
  32 Fridolin 2              : 1941.5  238.0       2.0      55     3.6     ---

White advantage = 32.74 +/- 11.28
Draw rate (equal opponents) = 54.94 % +/- 3.17

Houdini performs about 100 Elo points above SF and Komodo, and SF is a bit above K although it has fewer points. Houdini is 88% likely to be stronger than SF in the Rapid. That is not very high confidence, but I predict that the Superfinal won't be a walkover for SF against H.
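The 88% figure is the CFS(next) entry for Houdini in the table above. A back-of-the-envelope version can be sketched from the RATING and ERROR columns alone. This is only an approximation under loudly stated assumptions: that ERROR is roughly a 95% interval (about 1.96 standard deviations), that the two errors are independent, and that the rating difference is normally distributed. Ordo itself, as far as is known here, derives CFS from simulations, so `approx_cfs` is a hypothetical helper and not Ordo's method.

```python
from statistics import NormalDist


def approx_cfs(rating_a, err_a, rating_b, err_b, z_of_error=1.96):
    """Rough P(A is stronger than B), treating both errors as independent.

    z_of_error is an assumption: how many standard deviations the
    ERROR column represents (1.96 for a 95% interval).
    """
    sd_a = err_a / z_of_error
    sd_b = err_b / z_of_error
    sd_diff = (sd_a ** 2 + sd_b ** 2) ** 0.5
    return NormalDist().cdf((rating_a - rating_b) / sd_diff)


# Houdini vs Stockfish, numbers taken from the Rapid table above
p = approx_cfs(3341.7, 121.5, 3247.0, 104.6)
print(f"P(Houdini > Stockfish) ~ {p:.2f}")
```

With these inputs the estimate lands close to the 88 shown in the table, which suggests the assumptions are at least not far off for this pair.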

shrapnel · Tue Oct 18, 2016 3:36 pm

Yup ! Houdini will win. Stockfish has no chance against the Magician !
GO HOUDINI !!!

APassionForCriminalJustic · Tue Oct 18, 2016 7:02 pm

Laskos wrote: After 54-55 rounds out of 62 in the TCEC Rapid:

Houdini performs about 100 Elo points above SF and Komodo, and SF is a bit above K although it has fewer points. Houdini is 88% likely to be stronger than SF in the Rapid. That is not very high confidence, but I predict that the Superfinal won't be a walkover for SF against H.

I'm not sure what you are trying to prove here. The Rapids mean absolutely nothing. The top engines have barely played each other. So what exactly can you possibly extract from the Rapids? Did you watch Stage Three, where Stockfish dominated? Once again, Stockfish has always had a higher draw rate, and Houdini has always done well against weaker opposition. The Rapids are nothing more than "who can pop the most bubbles in the air". The shortest match in the Rapids thus far actually comes from Stockfish. Does that count for anything? I guess not.

So your prediction is that it will not be a "walk over"? Give a result out of 100 games then. Mine is what I have stated all along; the superfinal will be a comfortable win for Stockfish at, let us say, a score of 20-5-75. Have you already forgotten about the tournament where Stockfish effectively gutted Komodo 10.1? Here is the link: http://talkchess.com/forum/viewtopic.php?t=61484

This means that EVEN if Houdini's development somehow reached Komodo 10.1's level (unlikely, but who really knows), that would not be even close to enough to make the superfinal realistically competitive.
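To put a prediction like 20-5-75 on the same scale as the rating tables, it can be turned into a score fraction and an Elo difference. The sketch below assumes the three numbers mean wins-losses-draws for Stockfish over 100 games, and it uses the standard logistic Elo formula. `match_summary` is a hypothetical helper for illustration, not anything from Ordo or TCEC.

```python
import math


def match_summary(wins: int, losses: int, draws: int):
    """Score fraction, its standard error, and the implied Elo difference."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * 1.0 + draws * 0.25) / games - score ** 2
    std_err = math.sqrt(var / games)
    elo = -400.0 * math.log10(1.0 / score - 1.0)
    return score, std_err, elo


# Assumed reading of the prediction: 20 wins, 5 losses, 75 draws
score, std_err, elo = match_summary(wins=20, losses=5, draws=75)
print(f"score {score:.3f} +/- {std_err:.3f}, about {elo:.0f} Elo")
```

Under those assumptions a 20-5-75 result is a 57.5% score, which corresponds to a gap of roughly 50 Elo.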

APassionForCriminalJustic · Tue Oct 18, 2016 7:06 pm

shrapnel wrote:Yup ! Houdini will win. Stockfish has no chance against the Magician !
GO HOUDINI !!!

Anil, you are going to be one very depressed individual 50 games into the final. Stockfish is a much stronger engine. There is absolutely no question about that. Guess who did not lose a single match in Stage Three? And that DOES mean something. Contrary to some public opinion(s), Stockfish gains significant strength at VLTC.

Laskos · Tue Oct 18, 2016 7:15 pm

APassionForCriminalJustic wrote:
Laskos wrote: After 54-55 rounds out of 62 in the TCEC Rapid:
Houdini performs about 100 Elo points above SF and Komodo, and SF is a bit above K although it has fewer points. Houdini is 88% likely to be stronger than SF in the Rapid. That is not very high confidence, but I predict that the Superfinal won't be a walkover for SF against H.
I'm not sure what you are trying to prove here. The Rapids mean absolutely nothing. The top engines have barely played each other. So what exactly can you possibly extract from the Rapids? Did you watch Stage Three, where Stockfish dominated? Once again, Stockfish has always had a higher draw rate, and Houdini has always done well against weaker opposition. The Rapids are nothing more than "who can pop the most bubbles in the air". The shortest match in the Rapids thus far actually comes from Stockfish. Does that count for anything? I guess not.

So your prediction is that it will not be a "walk over"? Give a result out of 100 games then. Mine is what I have stated all along; the superfinal will be a comfortable win for Stockfish at, let us say, a score of 20-5-75. Have you already forgotten about the tournament where Stockfish effectively gutted Komodo 10.1? Here is the link: http://talkchess.com/forum/viewtopic.php?t=61484

This means that EVEN if Houdini's development somehow reached Komodo 10.1's level (unlikely, but who really knows), that would not be even close to enough to make the superfinal realistically competitive.

I already posted earlier in this thread for you the Ordo ratings in Stage 3:

   # PLAYER              : RATING  ERROR    POINTS  PLAYED     (%)   CFS(next)
   1 Stockfish 160716    : 3332.9   47.8      39.0      56    69.6      89
   2 Houdini 200716      : 3291.0   43.1      36.5      57    64.0      75
   3 Komodo 10.1         : 3269.2   41.5      34.5      57    60.5      99
   4 Fire 5              : 3199.3   37.6      28.5      56    50.9      98
   5 Andscacs 0.872b     : 3137.9   40.0      23.5      56    42.0      72
   6 Jonny 8             : 3119.1   41.4      22.0      56    39.3      73
   7 Gull 3              : 3100.0   42.4      20.5      56    36.6      50
   8 Rybka 4.1           : 3100.0   45.6      20.5      56    36.6     ---

White advantage = 85.92 +/- 13.10
Draw rate (equal opponents) = 92.63 % +/- 4.81

Stockfish and Houdini are basically within 2 SD error margins here. Taken together with the Rapid results, one can speculate that 50 games are not enough to separate Stockfish and Houdini. That's why I suspect the Superfinal won't be too skewed, although in 100 games it could happen that one scores significantly higher than the other.

Milos · Tue Oct 18, 2016 8:34 pm

Laskos wrote:I already posted earlier in this thread for you the Ordo ratings in Stage 3:
Stockfish and Houdini are basically within 2 SD error margins here. Taken together with the Rapid results, one can speculate that 50 games are not enough to separate Stockfish and Houdini. That's why I suspect the Superfinal won't be too skewed, although in 100 games it could happen that one scores significantly higher than the other.

Trying to reason with an extreme kind of fanboy is nothing but a total waste of time. No amount of statistics or scientific data would change their minds.

Laskos · Thu Oct 20, 2016 5:32 pm

With 2 games still to play in round 58 of 62 of the TCEC Rapid, the Ordo logistic standings are:

   # PLAYER                  : RATING  ERROR    POINTS  PLAYED     (%)   CFS(next)
   1 Houdini 200716          : 3341.8  117.4      52.5      58    90.5      90
   2 Komodo 1692.19          : 3244.4  102.1      49.5      58    85.3      55
   3 Stockfish 030916        : 3234.8  100.9      49.0      58    84.5      93
   4 Fire 5                  : 3135.6   92.6      46.0      58    79.3      76
   5 Jonny 8                 : 3091.7   85.2      43.0      58    74.1      64
   6 Ginkgo 1.9h             : 3069.7   84.2      42.0      58    72.4      75
   7 Gull 3                  : 3030.3   82.8      40.0      58    69.0      60
   8 Andscacs 0.872b         : 3015.0   82.8      39.0      58    67.2      54
   9 Rybka 4.1               : 3009.7   76.0      39.0      58    67.2      60
  10 Nirvana 010916          : 2996.3   79.1      38.0      58    65.5      63
  11 Protector 1.9           : 2977.6   78.7      37.5      58    64.7      56
  12 Chiron 030916           : 2969.9   78.3      36.5      58    62.9      64
  13 Texel 1.07a6            : 2950.7   76.9      36.0      58    62.1      62
  14 Naum 4.6                : 2934.3   76.3      33.5      58    57.8      53
  15 Critter 1.6a            : 2930.2   77.0      34.5      58    59.5      57
  16 Hannibal 1.7            : 2920.1   79.1      34.5      58    59.5      65
  17 Fizbo 1.8               : 2900.0   76.0      33.5      58    57.8      89
  18 Bobcat 070916           : 2832.4   78.4      28.0      58    48.3      79
  19 Raptor 2.3              : 2788.3   77.9      27.5      58    47.4      80
  20 Vajolet2 2.2.15         : 2739.8   83.2      22.5      57    39.5      76
  21 Fruit 070916            : 2698.5   83.6      22.0      57    38.6      53
  22 Laser 280816            : 2693.9   86.3      22.0      58    37.9      57
  23 Arasan 19.1             : 2683.2   86.9      20.0      58    34.5      73
  24 Gaviota 1.01            : 2647.7   81.2      19.0      58    32.8      64
  25 The Baron 3.40b         : 2626.0   87.1      19.5      58    33.6      51
  26 DisasterArea 1.63       : 2625.0   91.1      17.5      57    30.7      74
  27 Hakkapeliitta 210416    : 2585.0   90.3      17.5      58    30.2     100
  28 Jellyfish 1.1           : 2340.7  123.1       9.5      58    16.4      95
  29 Myrddin 0.87            : 2180.9  156.4       6.0      58    10.3      69
  30 Delphil 3.3b2           : 2130.2  170.0       5.0      58     8.6      69
  31 Firefly 2.7.0           : 2075.6  182.8       4.0      57     7.0      88
  32 Fridolin 2              : 1928.6  232.0       2.0      58     3.4     ---

White advantage = 35.28 +/- 11.48
Draw rate (equal opponents) = 55.17 % +/- 2.99

Houdini is about 100 Elo points above Komodo and Stockfish. Also, the confidence of superiority of Houdini over Komodo is 90% (and higher over Stockfish) in this Rapid.

The most striking peculiarity is that Houdini keeps more Queens on the board till the Win adjudication than Stockfish and Komodo. Average number of Queens at the Win adjudication:

Houdini: 1.06
Stockfish: 0.70
Komodo: 0.68

That is about 2 standard deviations, or roughly 97.5% confidence, that it's not a statistical fluke.
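For reference, the link between "2 standard deviations" and "97.5%" holds for a one-sided test under a normal approximation. A small check, assuming that is the reading intended here (the post does not say how the figure was computed):

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# One-sided confidence that an effect z standard deviations above zero is real
for z in (1.96, 2.0):
    print(f"z = {z}: one-sided confidence = {std_normal.cdf(z):.4f}")
```

Strictly, 97.5% corresponds to about 1.96 standard deviations, and a full 2 standard deviations is slightly higher, so the two figures agree to within rounding.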

clumma · Thu Oct 20, 2016 8:59 pm

Amazing. How can someone leave the field for years and come back with this kind of performance straight away?
