Komodo 4 on long time control

mwyoung · Post by **mwyoung** » Sat Dec 03, 2011 10:55 pm

tomgdrums wrote:
mwyoung wrote:
Houdini wrote:
lkaufman wrote: I think you are wrong on this last point. I'm combining the blitz results of ccrl with the blitz results of cegt to get a larger sample, and then comparing the slow results of the two organizations to get a larger sample. So the margin of error should be divided by roughly the square root of two, bringing it back down to your original 20 elo estimate. Twenty is less than 25, so even if I accept your twenty value the chance that Houdini 1.5 scales better than 2.0 is about 99% based on this data.
No, you miss the point that you're using 4 ratings to compute the relative scaling. With individual rating errors of 20 Elo one simply cannot make any statistically sound conclusions about micro-differences of plus or minus 10 Elo. And that even ignores the fact that you're cherry-picking the rating lists to base your conclusions on...
From all the test results at my disposal (including some private rating list results with 6 CPU), there is no evidence that Houdini 2 scales any differently than Houdini 1.5.

Robert
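
As a rough illustration of the statistical disagreement quoted above, here is a minimal sketch of how rating errors combine. It assumes the errors are independent and roughly normal, which is a simplification (ratings inside one list are correlated); the function names are made up for this example.

```python
import math

def combined_sigma(*sigmas):
    """Standard error of a sum or difference of independent estimates."""
    return math.sqrt(sum(s * s for s in sigmas))

def pooled_sigma(sigma, n_lists):
    """Standard error after averaging n independent lists of equal precision."""
    return sigma / math.sqrt(n_lists)

# A scaling comparison uses four ratings:
# (A_slow - A_blitz) - (B_slow - B_blitz)
per_rating_error = 20.0  # Elo, the per-rating figure quoted in the thread

print(combined_sigma(*[per_rating_error] * 4))  # error of the four-rating difference
print(pooled_sigma(per_rating_error, 2))        # error after pooling two lists
```

Under these assumptions both effects are real: pooling two lists shrinks each rating's error by the square root of two, while a difference built from four ratings has twice the error of a single rating.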
This is easy to settle...

Don and Larry need to meet one of their deadlines and release Komodo 4 for testing.

All they are waiting for is a version of Komodo 4 that can overtake Houdini in the rating lists. With all the trash talk from them about Houdini's problems, why the delay?
+10

Don and Larry do indeed talk a lot of smack in the forum.

They seemed to have become obsessed with overtaking Houdini!! It is like watching the computer chess version of "Moby Dick"!!

He piled upon Houdini the sum of all the general rage and hate felt by his whole race from Adam down; and then, as if his chest had been a mortar, he burst his hot heart's shell upon it.

lkaufman · Post by **lkaufman** » Sat Dec 03, 2011 11:20 pm

rvida wrote:
lkaufman wrote:
rvida wrote:
MM wrote: I see that Komodo and Rybka 4.1 are the engines that improve their level of play with longer TC (it's what I have been saying for ages about Komodo).
Or it might be the other way around - their level of play decreases with faster TC. There are rumors (on the Rybka forum) that R4.1 has particularly bad time management.
I'm not sure what the difference is. Either way, the point is that results at blitz underrate the performance at long time limits of the better scaling engine, whatever the reason.
My point was that maybe Komodo does not "scale better", but instead something is holding it back at very fast TC. For example, if you spend some 10 ms setting up the search, initializing tables or whatever, it might be a serious handicap at ultra-fast TC but is negligible at long TC. Another reason might be overstepping the allocated time by a few ms at each move and running into time trouble. Engines that check more often for the stop condition generally perform better at fast TC. After how many nodes do you check the time?

I don't remember, but we did spend a little time on the question so I'm sure this is not a problem for us. At short time controls Komodo is not bad, it's fine compared to SF and Rybka, but comparing ANY of these programs to ANY Ippo-related program (even including yours despite differences between Ippo and Critter) shows the same thing: Ippos crush everyone else in bullet chess, even if those programs are stronger at normal time controls. So it's something about the Ippo algorithms that makes them super-strong at very fast time controls relative to their strength at longer ones. I have a hunch that it has something to do with relative emphasis on the PV; this benefits fast play much more than slow play. Perhaps someone could compare the percentage of PV nodes vs. non-PV nodes on one-second searches of Critter (or Robo or Ivanhoe) vs. Stockfish to see if they are very different. Anyway, it's just one theory.
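
A rough sketch of the point about polling the clock every N nodes. This is a toy simulation, not how Komodo or any other engine actually does it; the per-node cost, the time budget and the check intervals are all assumed numbers chosen for illustration.

```python
def simulated_overshoot_us(budget_us, node_cost_us, check_interval):
    """Microseconds spent past the budget when the clock is polled
    only once every `check_interval` nodes (fixed cost per node)."""
    elapsed_us = 0
    nodes = 0
    while True:
        nodes += 1
        elapsed_us += node_cost_us  # hypothetical fixed cost of one node
        # The clock is only consulted on every check_interval-th node.
        if nodes % check_interval == 0 and elapsed_us >= budget_us:
            return elapsed_us - budget_us

# Assumed numbers: 10 ms per move, 1 microsecond per node (1M nodes/s).
for interval in (1_000, 4_096, 30_000):
    late = simulated_overshoot_us(10_000, 1, interval)
    print(f"check every {interval:>6} nodes -> {late} us over budget")
```

With these assumed numbers, polling every 30,000 nodes overshoots a 10 ms budget by 20 ms, which matters at bullet speeds and is irrelevant when each move gets minutes.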

M ANSARI · Post by **M ANSARI** » Sat Dec 03, 2011 11:31 pm

rvida wrote:
lkaufman wrote:
rvida wrote:
MM wrote: I see that Komodo and Rybka 4.1 are the engines that improve their level of play with longer TC (it's what I have been saying for ages about Komodo).
Or it might be the other way around - their level of play decreases with faster TC. There are rumors (on the Rybka forum) that R4.1 has particularly bad time management.
I'm not sure what the difference is. Either way, the point is that results at blitz underrate the performance at long time limits of the better scaling engine, whatever the reason.
My point was that maybe Komodo does not "scale better", but instead something is holding it back at very fast TC. For example, if you spend some 10 ms setting up the search, initializing tables or whatever, it might be a serious handicap in ultra-fast games but is negligible at long TC. Another reason might be overstepping the allocated time by a few ms at each move and running into time trouble. Engines that check more often for the stop condition generally perform better at fast TC. After how many nodes do you check the time?

I think that this happens with Rybka, and I believe that in Rybka's case it has to do with too much latency when initializing the engine. I have a feeling that it is related to Rybka using processes rather than threads, so it suffers an inherent extra time hit that is probably tiny for a single move but hurts far more than it should at extremely quick time controls. You can actually see this effect if you spend some time watching games played at very fast time controls ... Rybka simply takes more time than other engines to generate a move. While many blame Rybka's poor time management for poor results, and it is true that changing Rybka's time management can improve scores in super-fast TC games ... the problem is that "fixing" the time management is really just a band-aid: you are simply reserving extra time for the endgame so that the engine doesn't blunder away a position there against an engine that doesn't suffer from initialization lag. Vas has been pretty stubborn about not making a threaded Rybka because a process-based engine seems to be much better in a multi-core cluster configuration. One thing is for sure, though: Rybka most certainly suffers more than other engines when run at extremely fast TCs. I would bet that there are a lot of Elo points to be gained by Rybka at the TCs the testing groups use if Rybka had a threaded port.

MM · Post by MM » Sat Dec 03, 2011 11:35 pm

lkaufman wrote:
rvida wrote:
lkaufman wrote:
rvida wrote:
MM wrote: I see that Komodo and Rybka 4.1 are the engines that improve their level of play with longer TC (it's what I have been saying for ages about Komodo).
Or it might be the other way around - their level of play decreases with faster TC. There are rumors (on the Rybka forum) that R4.1 has particularly bad time management.
I'm not sure what the difference is. Either way, the point is that results at blitz underrate the performance at long time limits of the better scaling engine, whatever the reason.
My point was that maybe Komodo does not "scale better", but instead something is holding it back at very fast TC. For example, if you spend some 10 ms setting up the search, initializing tables or whatever, it might be a serious handicap at ultra-fast TC but is negligible at long TC. Another reason might be overstepping the allocated time by a few ms at each move and running into time trouble. Engines that check more often for the stop condition generally perform better at fast TC. After how many nodes do you check the time?
I don't remember, but we did spend a little time on the question so I'm sure this is not a problem for us. At short time controls Komodo is not bad, it's fine compared to SF and Rybka, but comparing ANY of these programs to ANY Ippo-related program (even including yours despite differences between Ippo and Critter) shows the same thing: Ippos crush everyone else in bullet chess, even if those programs are stronger at normal time controls. So it's something about the Ippo algorithms that makes them super-strong at very fast time controls relative to their strength at longer ones. I have a hunch that it has something to do with relative emphasis on the PV; this benefits fast play much more than slow play. Perhaps someone could compare the percentage of PV nodes vs. non-PV nodes on one-second searches of Critter (or Robo or Ivanhoe) vs. Stockfish to see if they are very different. Anyway, it's just one theory.

It's curious (not really) that in human chess there is the same situation:

Relatively weak players crush strong players at bullet or even blitz, and these are FIDE Masters (or even untitled players) playing against Grandmasters (Playchess is full of examples). Even I beat an IM at 2-minute bullet...

Regards

mwyoung · Post by **mwyoung** » Sat Dec 03, 2011 11:37 pm

lkaufman wrote:
rvida wrote:
lkaufman wrote:
rvida wrote:
MM wrote: I see that Komodo and Rybka 4.1 are the engines that improve their level of play with longer TC (it's what I have been saying for ages about Komodo).
Or it might be the other way around - their level of play decreases with faster TC. There are rumors (on the Rybka forum) that R4.1 has particularly bad time management.
I'm not sure what the difference is. Either way, the point is that results at blitz underrate the performance at long time limits of the better scaling engine, whatever the reason.
My point was that maybe Komodo does not "scale better", but instead something is holding it back at very fast TC. For example, if you spend some 10 ms setting up the search, initializing tables or whatever, it might be a serious handicap at ultra-fast TC but is negligible at long TC. Another reason might be overstepping the allocated time by a few ms at each move and running into time trouble. Engines that check more often for the stop condition generally perform better at fast TC. After how many nodes do you check the time?
I don't remember, but we did spend a little time on the question so I'm sure this is not a problem for us. At short time controls Komodo is not bad, it's fine compared to SF and Rybka, but comparing ANY of these programs to ANY Ippo-related program (even including yours despite differences between Ippo and Critter) shows the same thing: Ippos crush everyone else in bullet chess, even if those programs are stronger at normal time controls. So it's something about the Ippo algorithms that makes them super-strong at very fast time controls relative to their strength at longer ones. I have a hunch that it has something to do with relative emphasis on the PV; this benefits fast play much more than slow play. Perhaps someone could compare the percentage of PV nodes vs. non-PV nodes on one-second searches of Critter (or Robo or Ivanhoe) vs. Stockfish to see if they are very different. Anyway, it's just one theory.

Here are some data points for you and Richard to look at....

CCRL Data 4cpu, 64bit.

Data can be found under complete list tab.

Houdini 2 scores 53.0% against Critter 1.2 at 40/4

Houdini 2 scores 64.0% against Stockfish 2.1.1 at 40/4

At 40/40 the data show an increase in winning percentage over the 40/4 results, not a decrease as your theory predicts.

Houdini 2 scores 63.7% against Critter 1.2 at 40/40

Houdini 2 scores 65.4% against Stockfish 2.1.1 at 40/40
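
For comparison, these percentages can be turned into approximate Elo differences with the standard logistic Elo formula. This is only a sketch: it treats each percentage as an exact expected score and says nothing about the number of games behind it.

```python
import math

def elo_diff(score):
    """Elo difference implied by an expected score in (0, 1),
    using the standard logistic Elo model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))

# Houdini 2 scores from the CCRL figures quoted in this post.
results = [
    ("Critter 1.2", "40/4", 0.530),
    ("Stockfish 2.1.1", "40/4", 0.640),
    ("Critter 1.2", "40/40", 0.637),
    ("Stockfish 2.1.1", "40/40", 0.654),
]

for opponent, tc, score in results:
    print(f"vs {opponent:<16} at {tc:<5}: {elo_diff(score):+.0f} Elo")
```

On this scale the Critter result moves from roughly 21 Elo to roughly 98 Elo between the two time controls, while the Stockfish result barely changes.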

lkaufman · Post by **lkaufman** » Sat Dec 03, 2011 11:50 pm

tomgdrums wrote:
Don and Larry do indeed talk a lot of smack in the forum.

They seemed to have become obsessed with overtaking Houdini!! It is like watching the computer chess version of "Moby Dick"!!

I like that analogy! Of course, we have a strong commercial incentive to overtake Houdini, but you are right, it goes beyond that. My main gripe about Houdart is that he has refused to acknowledge the well-established fact that the early Houdini versions were just some Ippolit version with minor modifications. Apparently, he could admit this and still sell legally if the program he cloned didn't have any conditions listed on using it, but for some reason he won't admit the obvious. He deserves credit as someone who improved a strong program considerably, but he is not a program author in any meaningful sense.
I don't have the same bad feelings about Ippo (or other Ippo-relatives) anymore. While Ippo clearly was modelled after Rybka, it was written as an independent program and is sufficiently different to be (probably) legally ok, even if it could not enter ICGA events. If for any reason we fail to pass Houdini soon, I'll still be happy to see Critter or Rybka or even Ivanhoe do so. At least Richard Vida wrote his own program, even if it was based on Ippo, and he has been reasonably forthcoming about it.

MM · Post by MM » Sat Dec 03, 2011 11:52 pm

mwyoung wrote:
lkaufman wrote:
rvida wrote:
lkaufman wrote:
rvida wrote:
MM wrote: I see that Komodo and Rybka 4.1 are the engines that improve their level of play with longer TC (it's what I have been saying for ages about Komodo).
Or it might be the other way around - their level of play decreases with faster TC. There are rumors (on the Rybka forum) that R4.1 has particularly bad time management.
I'm not sure what the difference is. Either way, the point is that results at blitz underrate the performance at long time limits of the better scaling engine, whatever the reason.
My point was that maybe Komodo does not "scale better", but instead something is holding it back at very fast TC. For example, if you spend some 10 ms setting up the search, initializing tables or whatever, it might be a serious handicap at ultra-fast TC but is negligible at long TC. Another reason might be overstepping the allocated time by a few ms at each move and running into time trouble. Engines that check more often for the stop condition generally perform better at fast TC. After how many nodes do you check the time?
I don't remember, but we did spend a little time on the question so I'm sure this is not a problem for us. At short time controls Komodo is not bad, it's fine compared to SF and Rybka, but comparing ANY of these programs to ANY Ippo-related program (even including yours despite differences between Ippo and Critter) shows the same thing: Ippos crush everyone else in bullet chess, even if those programs are stronger at normal time controls. So it's something about the Ippo algorithms that makes them super-strong at very fast time controls relative to their strength at longer ones. I have a hunch that it has something to do with relative emphasis on the PV; this benefits fast play much more than slow play. Perhaps someone could compare the percentage of PV nodes vs. non-PV nodes on one-second searches of Critter (or Robo or Ivanhoe) vs. Stockfish to see if they are very different. Anyway, it's just one theory.
Here are some data points for you and Richard to look at....

CCRL Data 4cpu, 64bit.

Data can be found under complete list tab.

Houdini 2 scores 53.0% against Critter 1.2 at 40/4

Houdini 2 scores 64.0% against Stockfish 2.1.1 at 40/4

At 40/40 the data show an increase in winning percentage over the 40/4 results, not a decrease as your theory predicts.

Houdini 2 scores 63.7% against Critter 1.2 at 40/40

Houdini 2 scores 65.4% against Stockfish 2.1.1 at 40/40

Hi Mark,

consider that Houdini 1.5a x64 4 CPU scores 46.9% against Critter 1.2 (32 games) and 63.5% against Stockfish 2.1.1 (63 games).

Then I wonder how the score of 2.0 can be slightly better against Stockfish and hugely better against Critter.

I think that many more games are needed.

Best regards
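
To put "more games are needed" into numbers, here is a rough sketch of the 95% margin on a match score. It assumes independent games and ignores draws by using the plain binomial formula, so it somewhat overstates the real margin; the function name is made up for this example.

```python
import math

def score_margin(score, games, z=1.96):
    """Approximate 95% margin (as a fraction of the score) for a match
    result, using the binomial standard error and ignoring draws."""
    if games <= 0:
        raise ValueError("games must be positive")
    return z * math.sqrt(score * (1.0 - score) / games)

# Houdini 1.5a figures quoted above.
samples = [
    ("Critter 1.2", 0.469, 32),
    ("Stockfish 2.1.1", 0.635, 63),
]

for opponent, score, games in samples:
    margin = score_margin(score, games)
    print(f"vs {opponent:<16}: {score:.1%} +/- {margin:.1%} over {games} games")
```

Under these assumptions, 46.9% over 32 games carries a margin of about 17 percentage points either way, so a jump to 53% or even 63.7% is not surprising on samples this small.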

mwyoung · Post by **mwyoung** » Sun Dec 04, 2011 12:18 am

lkaufman wrote:
tomgdrums wrote:
Don and Larry do indeed talk a lot of smack in the forum.

They seemed to have become obsessed with overtaking Houdini!! It is like watching the computer chess version of "Moby Dick"!!

I like that analogy! Of course, we have a strong commercial incentive to overtake Houdini, but you are right, it goes beyond that. My main gripe about Houdart is that he has refused to acknowledge the well-established fact that the early Houdini versions were just some Ippolit version with minor modifications. Apparently, he could admit this and still sell legally if the program he cloned didn't have any conditions listed on using it, but for some reason he won't admit the obvious. He deserves credit as someone who improved a strong program considerably, but he is not a program author in any meaningful sense.
I don't have the same bad feelings about Ippo (or other Ippo-relatives) anymore. While Ippo clearly was modelled after Rybka, it was written as an independent program and is sufficiently different to be (probably) legally ok, even if it could not enter ICGA events. If for any reason we fail to pass Houdini soon, I'll still be happy to see Critter or Rybka or even Ivanhoe do so. At least Richard Vida wrote his own program, even if it was based on Ippo, and he has been reasonably forthcoming about it.

Larry, the problem is that no one cares about that issue except maybe 20 people on CCC. You are a GM and must know GM players, and I have been around many GM players as a member of the St. Louis Chess Club. So you must know what I know.

The chess pros love Houdini, and I have talked with them about the Rybka and Ippo issue. And they think you guys are a joke. One GM made the comment, "Who do they think they are, the computer FIDE or something?"

I have never seen GMs promote a program like Houdini. GM Yasser Seirawan talked about Houdini all the time at the club. I thought Robert was paying him. GM Yasser Seirawan even promotes Houdini on ICC broadcasts.

Larry, it is hard to beat good word of mouth.

Albert Silver · Post by **Albert Silver** » Sun Dec 04, 2011 12:52 am

mwyoung wrote: I have never seen GMs promote a program like Houdini. GM Yasser Seirawan talked about Houdini all the time at the club. I thought Robert was paying him. GM Yasser Seirawan even promotes Houdini on ICC broadcasts.

You must have missed the Rybka craze. For years, all the GMs ever talked about was Rybka. It was Rybka this and Rybka that. In an interview by Gustafsson in New In Chess, when asked who was the player he admired the most, he replied "Rybka".

Every top engine has had this. Bareev in the 90s described his openings work as take a position, run Fritz, which was king-of-the-hill at the time, and wait for it to come up with something new.

mwyoung · Post by **mwyoung** » Sun Dec 04, 2011 2:22 am

Albert Silver wrote:
mwyoung wrote: I have never seen GMs promote a program like Houdini. GM Yasser Seirawan talked about Houdini all the time at the club. I thought Robert was paying him. GM Yasser Seirawan even promotes Houdini on ICC broadcasts.
You must have missed the Rybka craze. For years, all the GMs ever talked about was Rybka. It was Rybka this and Rybka that. In an interview by Gustafsson in New In Chess, when asked who was the player he admired the most, he replied "Rybka".

Every top engine has had this. Bareev in the 90s described his openings work as take a position, run Fritz, which was king-of-the-hill at the time, and wait for it to come up with something new.

I don't agree. I have been around computer chess since the early 80s, and I have never seen an explosion like Houdini. Back in the 90s many players still had a hatred for computer chess products. They thought they were a joke.

Here is an easy way to understand why no program has made an impact like Houdini on chess players and the marketplace.

Never in the history of computer chess has a chess program been so strong (#1 by a wide margin), so easy to get, and free. Houdini hit like a Cat. 5 hurricane.

Albert, I am not interested in your ChessBase bias.
