Houdini 3

Discussion of anything and everything relating to chess playing software and machines.

Moderator: Ras

marijan
Posts: 56
Joined: Mon Jan 16, 2012 1:16 am

Re: Houdini 3

Post by marijan »

Dr.Wael Deeb wrote:
So according to your testing there is a gap of 93 Elo between Houdini 1.5a & Houdini 2.0c :?: :?: :?: :?:
Well, according to this test, yes. The test is still running...
But this is my overall rating list (from 250+ games per engine at long time controls: 60 min per engine, and 30/50 with the rest in 20 min per engine). This one is more accurate...

Image

All engines are 32-bit.


Other engines did not reach 250+ games, so I did not show them in this list...


Regards...
Sedat Canbaz
Posts: 3018
Joined: Thu Mar 09, 2006 11:58 am
Location: Antalya/Turkey

Re: Houdini 3

Post by Sedat Canbaz »

Dr.Wael Deeb wrote:
If my memory still serves me well, I think that Robert committed the same fatal mistake that the author of Ruffian made in the past....

Ruffian's author released Ruffian 1.05, extremely strong for its time, and then went commercial.... The commercial version was an improvement after all, but not by that much....

In Houdini's case, the difference between 1.5a & 2.0c is 10 Elo at best....

So Mr. Houdini should have released Houdini 1.5a as a commercial version and improved the next version further....
I give Robert an F in computer chess marketing. Regards,
Dr.D
Dear Wael,

I agree with you that the Elo difference between Ruffian 1.0.5 and Ruffian 2.0.1 was around 20-30 Elo
But anyway, I see things slightly differently... and as far as I remember, in those years Ruffian 2.0.1 was not leading by a clear Elo margin over its strong opponents (Fritz, Shredder, Chess Tiger, Hiarcs...)

Actually, we should really congratulate those engine authors who release top chess programs that are 50-100 Elo stronger than their opponents

As you know, Rybka (the leader for almost 5 years), Shredder, Fritz, Hiarcs... also had similarly superior records in the past

For example, please check the top 20 standings below: Houdini is more than 60 Elo stronger than second place

Image

Best,
Sedat
Dr.Wael Deeb
Posts: 9773
Joined: Wed Mar 08, 2006 8:44 pm
Location: Amman,Jordan

Re: Houdini 3

Post by Dr.Wael Deeb »

marijan wrote:
Dr.Wael Deeb wrote:
So according to your testing there is a gap of 93 Elo between Houdini 1.5a & Houdini 2.0c :?: :?: :?: :?:
Well, according to this test, yes. The test is still running...
But this is my overall rating list (from 250+ games per engine at long time controls: 60 min per engine, and 30/50 with the rest in 20 min per engine). This one is more accurate...

Image

All engines are 32-bit.


Other engines did not reach 250+ games, so I did not show them in this list...


Regards...
Thanks for the update....
Dr.D
No one can hit as hard as life. But it ain’t about how hard you can hit. It’s about how hard you can get hit and keep moving forward. How much you can take and keep moving forward…
Dr.Wael Deeb
Posts: 9773
Joined: Wed Mar 08, 2006 8:44 pm
Location: Amman,Jordan

Re: Houdini 3

Post by Dr.Wael Deeb »

Sedat Canbaz wrote:
Dr.Wael Deeb wrote:
If my memory still serves me well, I think that Robert committed the same fatal mistake that the author of Ruffian made in the past....

Ruffian's author released Ruffian 1.05, extremely strong for its time, and then went commercial.... The commercial version was an improvement after all, but not by that much....

In Houdini's case, the difference between 1.5a & 2.0c is 10 Elo at best....

So Mr. Houdini should have released Houdini 1.5a as a commercial version and improved the next version further....
I give Robert an F in computer chess marketing. Regards,
Dr.D
Dear Wael,

I agree with you that the Elo difference between Ruffian 1.0.5 and Ruffian 2.0.1 was around 20-30 Elo
But anyway, I see things slightly differently... and as far as I remember, in those years Ruffian 2.0.1 was not leading by a clear Elo margin over its strong opponents (Fritz, Shredder, Chess Tiger, Hiarcs...)

Actually, we should really congratulate those engine authors who release top chess programs that are 50-100 Elo stronger than their opponents

As you know, Rybka (the leader for almost 5 years), Shredder, Fritz, Hiarcs... also had similarly superior records in the past

For example, please check the top 20 standings below: Houdini is more than 60 Elo stronger than second place

Image

Best,
Sedat
I'm glad that you remember the Ruffian saga :wink:

Yes, there is no doubt that Houdini is a monster, and Robert achieved something superior with it....
Dr.D
jmartus
Posts: 256
Joined: Sun May 16, 2010 2:50 am

Re: Houdini 3

Post by jmartus »

You said the same thing about Rybka and how it was on another planet years ago. Now look what happened to it.
Dr.Wael Deeb
Posts: 9773
Joined: Wed Mar 08, 2006 8:44 pm
Location: Amman,Jordan

Re: Houdini 3

Post by Dr.Wael Deeb »

jmartus wrote:You said the same thing about Rybka and how it was on another planet years ago. Now look what happened to it.
What happened to Rybka :!: :?:

It's still a hell of a chess playing entity and it's a few Elo points behind Houdini 2.0c....

At longer time controls I don't know whether Houdini or Rybka is superior....
Dr.D
lkaufman
Posts: 6229
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Houdini 3

Post by lkaufman »

MM wrote:
lkaufman wrote:
Mike S. wrote:[quote="mclane & Stockfish.

Maybe my explanation is even more simple but true: Rybka, Ippo & Co are simply better. :mrgreen: The weaker engines just benefit from the bigger draw rates at big depths/long time controls.
This might be a factor for Houdini, but certainly Ivanhoe (Ippo), Rybka, and all other Ippo-related programs are not stronger than Komodo and not measurably stronger than SF except at bullet chess, so this cannot be the explanation here.

There seem to be two theories to explain the observed scaling behavior:
1. Komodo (and perhaps also SF) are more intelligent but slower, and that this tradeoff usually (but not always) favors the fast programs in blitz and the smart programs at long time controls.
2. For whatever reason, the search in Komodo (and perhaps SF) scales better than the search in the Rybka/Ippo family.

Both could be true. If the second is true, can anyone suggest WHY SF might scale better in search than Ivanhoe et al?
I don't think ''intelligence'' is the right word. There are two different approaches to the search in my view.

Ippo family engines have a search with which they find tactical moves in a very short time.
I think it is a question of search (and evaluation). Like humans. Some humans search mainly for tactical moves.
I'm not a programmer, but I think it depends on the evaluation that an engine gives to each move it analyses.
In this way it could happen that all the moves that don't produce a ''break'' in the evaluation are discarded or analysed for less time (and perhaps they are good positional moves).
If this is true, it is logical that these engines find tactical moves more often and more quickly.

On the other hand, Komodo plays mainly positionally.
Komodo often maneuvers its pieces and pawns for many moves without playing any tactical move.

This probably comes from the fact that Komodo gives a special evaluation to some moves that lead to certain positional patterns.

I mean, during the search, Komodo probably does the opposite of the Ippo family engines.

So Komodo analyzes the ''positional moves'' more (probably because its evaluation considers them better than other moves) and gives less time to, or discards, some tactical moves.

In fact, Komodo sometimes overlooks tactical moves, and this depends on the search or sometimes on the evaluation.

If all I wrote is true, it is logical that Ippo family engines excel at very fast time controls.

With more time to think, a positional engine like Komodo has 2 advantages:

1. What Komodo overlooks or evaluates badly in a few seconds can be seen well with more time available.

2. The weight of Komodo's positional play increases a lot and becomes more important than its (relative) weakness in tactics.

Generally speaking, I think that the strength of Komodo is that it plays nice positional and logical moves. It seems that it uses the search just to verify that everything is OK.

On the other hand, the strength of the Ippo family engines is the approach of the search (I think built mainly for tactical moves) and the evaluation, which allows them to sometimes find apparently hidden tactical moves.

At long time controls, especially very long ones, tactical ability is almost useless because the opponent (if it has a good search and evaluation) has all the time needed to see the tactics.

So what matters more is positional/strategic sensibility.


Best Regards
I am coming to the conclusion that the relative weakness of both Komodo and Stockfish at bullet speeds vs. slower tests (which is obvious if you compare any bullet list with say the IPON list) is not mainly a question of slowness due to more or slower evaluation. I say this because I noticed that Rybka 4.1 shares the same behavior as Ivanhoe and all the other top programs (except Komodo and Stockfish) in that it is also relatively stronger at bullet speeds than at normal speeds. If we make the assumption that Rybka 4.1 is at least not substantially "dumber" than Rybka 3, I can say that the evaluation I did for Rybka 3 was also rather "smart" and "slow", rather like Komodo, yet Rybka 4.1 shares the behavior of Ippo and family. So I can only conclude that the behavior is due to some aspect of the Ippo and Rybka search that is not present in either Komodo or Stockfish. Since we have at least tried almost every search idea in Ippo (and rejected many of them), I can't guess what that aspect could be. If we can solve this mystery perhaps we can make Komodo as competitive with Houdini at bullet and blitz chess as it already appears to be at longer time controls.
Lion
Posts: 539
Joined: Fri Mar 31, 2006 1:26 pm
Location: Switzerland

Re: Houdini 3

Post by Lion »

Or should you just wait until everybody uses a 16-core machine?
Wouldn't that compensate already?
Sedat Canbaz
Posts: 3018
Joined: Thu Mar 09, 2006 11:58 am
Location: Antalya/Turkey

Re: Houdini 3

Post by Sedat Canbaz »

MM wrote:
Sedat Canbaz wrote:
MM wrote:
Sedat Canbaz wrote:
MM wrote:
jpqy wrote:To get the right strength difference you have to test the different Split Depths.
In my results Houdini 2.0 is clearly stronger.
If you look at these lists and they show such a small difference in Elo, or none, it means they just use default settings.
Houdini's default Split Depth is 10, which is tuned for a system like a Q6600.
When you play with SD=10 on a slower system, Houdini will play weaker, even if it remains the strongest engine!
Lower the SD and you get a better result.
On my i7 970 I have tested from SD=10 to SD=14, and SD=12 is clearly stronger than the default.
When you have found the right Split Depth you will get better results against other engines.
On my little laptop I have to use SD=9 to get the best results.
These lists are nice but don't show the real strength of these engines.
People who say they get only a 5 Elo difference: well, you are not using the best Split Depth for your system! :wink:

JP.
I tested with the correct split depth. I read the manual of Mr Houdart and made the test.

Best Regards
Dear Maurizio,

I don't understand what your point is exactly...

If you really don't see the real Houdini Elo difference, then it seems there is something wrong in your test

Maybe I am wrong... so can you please tell me your hardware and test conditions?

Just my 2 cents on this issue,
Honestly, I am very happy to be a Houdini customer
And there is no doubt that Houdini is the world's strongest engine

Another interesting note is that some people have argued that Rybka seems to be a clone
But then I really wonder:
-if it's a clone, then why is Rybka, even without a pawn, much stronger than almost all original engines?

By the way, I am also happy that we have Critter, Stockfish, Strelka, Ivanhoe, Komodo, Fire...

Otherwise I am afraid that Houdini's price would go up :)

Greetings,
Sedat
Hello Sedat, sorry but i think there is a misunderstanding.

I was the one who asked Mr Houdart to specify (please) the conditions under which he tested Houdini 2.0, because he has claimed many times that 2.0 is about 25 Elo stronger than 1.5a.

CEGT reports different results.

That's why i asked.

I am interested in:

exact time control (s)
number of games
opening book (s)
contempt
tablebases
hardware
number of cores used

Ipon Chess by Ingo Bauer shows a difference (in favor of 2.0) of 7 Elo after 4000 games (under Rating List), so CEGT is not the only rating list that sees 2.0 and 1.5a so close.

http://www.inwoba.de/

I don't claim that my results are correct, but I see that they are almost identical to those of Ipon Chess.


Best Regards

Hello dear Maurizio,

Thank you for your reply...

Oh... yes, now it's clearer

In my opinion, the Elo difference between Houdini 2.0c and Houdini 1.5a is around 20-25 Elo

SWCR (40/10): 21 Elo difference


           NAME / version of engine        ELO    +    -   GAM    SC    OP    DR
  ------------------------------------------------------------------------------
   1     1 Houdini 2.0c x64               3019   18   18  1400   79%  2777   28%
   -     2 Houdini 1.5 x64                2999   14   14  2320   78%  2772   29%
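
To judge whether a gap like this one is really separated from the noise, the listed ratings can be compared against their error bars. The following is only a rough sketch: it assumes the two "+/-" columns are independent and symmetric (so they combine in quadrature), and it assumes they are roughly 95% bounds, which the table itself does not state. The function name is made up for illustration.

```python
import math

def elo_gap_with_margin(elo_a, margin_a, elo_b, margin_b):
    """Return (gap, combined_margin) for two rating-list entries.

    Assumption: the two error bars are independent, so they combine
    in quadrature. Engines rated against a shared pool are not fully
    independent, so treat the result as a rough guide only.
    """
    gap = elo_a - elo_b
    combined = math.sqrt(margin_a ** 2 + margin_b ** 2)
    return gap, combined

# Values taken from the SWCR table above.
gap, combined = elo_gap_with_margin(3019, 18, 2999, 14)
print(f"gap = {gap} Elo, combined margin = +/-{combined:.1f} Elo")
print("clearly separated" if abs(gap) > combined else "within the error bars")
```

With the listed numbers the gap (20 Elo) is smaller than the combined margin (about 22.8 Elo), so under these assumptions the two versions are not clearly separated by this list alone.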


SCCT Auto232(4m+2s):expecting to be around 20-25 Elo difference too


Rank  Name                        Elo    +    -   games score oppo. draws
   1 Houdini 2.0c Pro x64 6c      3441   21   21   632   61%  3379   53%
   3 Houdini 2.0b Pro x64 6c      3425   17   17  1049   68%  3312   46%
   8 Houdini 2.0b Pro x64 4c      3355   17   17  1030   51%  3352   47%
  10 Houdini 1.5a x64 4c          3352   17   17  1086   62%  3280   48% 
About IPON's 4000 games per player:
That sounds perfect, but can you prove it?
Plus, I would be very happy if I could access IPON's openings and games too (with annotations if possible)

About CEGT:
I have great respect for their work, but I think it's quite normal to see such an Elo difference under adapted time controls with many different openings and different hardware


Best Wishes,
Sedat
Hi Sedat,

to check ipon chess you should go here:

http://www.inwoba.de/index.html

then click on ''rating list'' and scroll down to ''full list''.

As regards the games and other info that doesn't appear on the site, you should ask Mr Bauer.

As regards CEGT, I think that 1200 games are not enough to be 100% sure about a result, but the fact that 1.5a scores +10 Elo versus 2.0 on a single core impresses me a little and, together with Ipon Chess and my own results, this is the reason why I asked Mr Houdart for some more information about his testing conditions.


Best Regards
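
As a rough illustration of why 1200 games leave more doubt than 4000, the sketch below estimates the 95% margin (in Elo) on a head-to-head result as a function of the number of games. Everything here is an assumption for illustration: a simple win/draw/loss model with independent games, a 30% draw rate, an even score, and a hypothetical function name. Rating lists that pool games against many opponents compute their margins differently.

```python
import math

def elo_margin(games, draw_ratio=0.3, score=0.5, z=1.96):
    """Approximate margin of error, in Elo, after `games` games.

    Assumptions: games are independent, results are win/draw/loss with a
    fixed draw ratio, and the Elo curve is linearised around `score`.
    """
    win = score - draw_ratio / 2
    loss = 1 - score - draw_ratio / 2
    # Per-game variance of the result (1, 0.5 or 0 points).
    variance = (win * (1 - score) ** 2
                + draw_ratio * (0.5 - score) ** 2
                + loss * score ** 2)
    std_error = math.sqrt(variance / games)
    # Slope of Elo = 400 * log10(s / (1 - s)) with respect to s.
    slope = 400 / (math.log(10) * score * (1 - score))
    return z * std_error * slope

m1200 = elo_margin(1200)
m4000 = elo_margin(4000)
print(f"1200 games: +/-{m1200:.1f} Elo")
print(f"4000 games: +/-{m4000:.1f} Elo")
```

Under these assumptions the margin is about 16 Elo after 1200 games and about 9 Elo after 4000, so differences of 7 to 10 Elo sit at or inside the noise in both cases.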

Mr Maurizio Maglio,

Sorry for the late reply

About your recommendations,
No thanks, I don't care about results that are based on standings without games

About Herr Ingo Bauer,
I don't want to spend my time on this issue again; for more details:
http://www.talkchess.com/forum/viewtopi ... ight=sedat

Best,
Sedat
lkaufman
Posts: 6229
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Houdini 3

Post by lkaufman »

Lion wrote:Or should you just wait until everybody uses a 16-core machine?
Wouldn't that compensate already?
I don't think we need to wait for that. My tests indicate the break-even point between latest Komodo and Houdini 2.0 on one core is around 15 minutes plus 15 seconds with ponder on (Komodo leads by two games after 95 at this level) which is about like 20 + 20 with ponder off. If we can roughly match Houdini's speedup on 4 cores this would put the break-even time limit around 7 or 8 minutes with as many seconds. So I think we have every chance to top Houdini 2 on 4 cores on the next CCRL and CEGT 40/40 and 40/20 lists following our MP release.
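
For a sense of how little a two-game lead after 95 games pins down, here is a small sketch that converts the score to an Elo difference and computes a likelihood of superiority (LOS). The Elo conversion is the standard logistic formula; the LOS uses the common normal approximation based on wins and losses only. The post does not give the number of draws, so the 26 wins / 24 losses / 45 draws split below is purely hypothetical and chosen only to match "+2 after 95".

```python
import math

def elo_from_score(score):
    """Elo difference implied by a fractional score (0 < score < 1)."""
    return 400 * math.log10(score / (1 - score))

def likelihood_of_superiority(wins, losses):
    """Normal-approximation LOS; draws do not enter the formula."""
    return 0.5 * (1 + math.erf((wins - losses) / math.sqrt(2 * (wins + losses))))

# Hypothetical split consistent with a two-game lead after 95 games.
wins, losses, draws = 26, 24, 45
games = wins + losses + draws
score = (wins + 0.5 * draws) / games

elo = elo_from_score(score)
los = likelihood_of_superiority(wins, losses)
print(f"score = {score:.3f}, Elo difference = {elo:+.1f}, LOS = {los:.0%}")
```

With this assumed split the lead is worth about 7 Elo with an LOS of roughly 61%, which fits the description of this time control as a break-even point and not a proven advantage.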