Komodo 4 on long time control

MM
Posts: 766
Joined: Sun Oct 16, 2011 11:25 am

Re: Komodo 4 on long time control

Post by MM »

tano-urayoan wrote:
MM wrote: i didnt speculate, why should i
Are you serious? Almost all of your threads are speculation. Look at the title of this thread: Komodo 4 does not yet exist, and yet you are writing about it.
Have you an idea of what ''speculation'' means? Did you really read the title of this thread? Are you accusing me of speculating?

I bought a copy of Houdini 2.0, and I mailed a large amount of information and suggestions to Mr Houdart about his engine. I just made a prediction. Speculating is another thing, which has nothing to do with my words, and honestly I don't see how making a prediction that could be right or wrong counts as speculating.

As regards Komodo 4, if you have read this forum lately you will have seen that Komodo's authors have announced the release of the new Komodo for the first half of this month. It could have another name (Komodo 5, Komodo 3.5 or any other), but the most logical name is Komodo 4. Where is the speculation in calling the new version ''Komodo 4''?
MM
MM
Posts: 766
Joined: Sun Oct 16, 2011 11:25 am

Re: Komodo 4 on long time control

Post by MM »

Frank Quisinsky wrote:Hi Maurizio,

that would be great but Komodo 4 x64 must be 60 ELO stronger as Houdini 2.0c x64. I added the results from the ended round robin and games on my webpage, and TalkChess tournament / matches selection.

60 ELO is a lot for the Komodo team.

Best
Frank
Hi Frank, I know your professionalism and I don't doubt what you say. But please consider that I started this thread by saying that I think Komodo is going to overtake Houdini at ''tournament time control''. I don't know if or when this will really happen, but it is my opinion, based on the fact that in all the charts I have looked at, the longer the TC, the closer Komodo is to Houdini. Moreover, the authors talk about a significant improvement in Komodo, and they have said many times that they feel very close to Houdini at long TC, or even stronger. Honestly, I feel free to give them credit.

Best regards
MM
MM
Posts: 766
Joined: Sun Oct 16, 2011 11:25 am

Re: Komodo 4 on long time control

Post by MM »

Don wrote:
mwyoung wrote:
Kingghidorah wrote:
MM wrote:I think that Komodo 4 on tournament time control is going to overtake Houdini. That is my opinion, looking at the different ratings at different time controls (http://www.amateurschach.de/)

Regards
It'll be close

Lonnie

I don't think close will cut it. If Komodo 4 is only close to Houdini 2, then Komodo 4 is nothing more than an also-ran, and Komodo's sales will suffer.

I read from the Komodo team that Houdini does not scale well at long time controls. I have not seen this in my testing, and I now think that was wishful thinking, since they said at one point that Komodo was shooting for an October or November release.

This tells me Komodo is having trouble overtaking Houdini. If Komodo 4 does not make the release for the Christmas buying season, it will be most telling.

And I find it funny for people to assume that Houdini is somehow a static target....
The versions of Komodo we have now are the same strength as, or stronger than, Houdini. If you look at CCRL 40/40 you will see that Komodo is in second place, 27 ELO behind Houdini. We have gained over 30 ELO, so we have caught Houdini 2.0 and probably even the slightly stronger Houdini 1.5. This is for 64-bit single-CPU engines. Rybka 4 is only slightly behind, and in general there are 4 programs in a close pack for second place, with Stockfish lagging a bit behind, but after that there is an almost 100 ELO gap.

The primary wait is that we want to get the SMP version working well and we would like to come out with a program that is significantly stronger than Houdini, not just slightly stronger. But we are not going to delay a long time just to do this as we can do it with a later release. Houdini and Rybka will no doubt respond, but we will follow up with something substantially stronger. Komodo is ripe for many optimizations that are already in other programs and we have never focused as much on that as we have been able to achieve ELO by smarter algorithms.
Well, reading these words from one of the authors of Komodo, I cannot avoid giving him credit. If I were Don, I would never say these things if they were not true.

Regards
MM
tano-urayoan
Posts: 638
Joined: Thu Aug 30, 2007 8:23 pm
Location: San Juan, Puerto Rico

Re: Komodo 4 on long time control

Post by tano-urayoan »

MM wrote: Have you an idea of what ''speculation'' means?
spec·u·la·tion /ˌspɛkjəˈleɪʃən/ noun
plural spec·u·la·tions
1 : ideas or guesses about something that is not known


Do you have data about Komodo 4 on long time control?

No, so you are speculating, or guessing or whatever word you want to use.
lkaufman
Posts: 6279
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Komodo 4 on long time control

Post by lkaufman »

Albert Silver wrote:I tend to agree in all honesty. I do think that there is a difference in hyperspeed games, and not, but believe this difference disappears quite quickly unless one of the engines has serious time management issues.

Vincent used to make claims that Diep would be the top program given enough time due to its sheer superiority in knowledge over others. Ultimately it did not work that way. The diminishing returns he expected to kick in with greater ply-depth, that his engine's knowledge would outweigh, consistently failed to kick in.

In other words, if an engine was killing Diep because it was outsearching it by X Plies, it continued to kill him due to those plies even with greater time spent.

I do not believe there is an engine that will be weaker than Houdini at 5 minute games, but stronger at 2 hour games, unless they are already neck and neck (+ or -10 Elo) from the start.
While I agree that differences between blitz results and tournament level results will not be enormous (as in 100 or more elo), I think your estimate of a limit of 10 is way too low. Without even comparing unrelated programs, just compare Houdini 1.5a and Houdini 2.0. All blitz tests on single core (CCRL, CEGT, IPON, and others) agree that Houdini 2.0 is stronger (ccrl by 12 elo, cegt by 4 elo, others a bit higher, so average a bit over 10 elo). This is based on thousands of games. At 40/20 or 40/40 we find the opposite, H1.5 is stronger by 12 elo on CCRL and by 13 on CEGT, all with decent size samples. There is no slow data on 4 cores for 2.0 on either CCRL or CEGT that meets the minimum number of games to avoid being greyed out. So we have a net swing just going from blitz to an average of 40/30 of about 25 elo, just for two successive versions of the same program!! I know that some of this 25 could be sample error, but even if half of it is bogus it would indicate that going from blitz to 40/2 could swing relative ratings by 25 elo just in this one case. Surely with unrelated programs the swing could be much greater. I think that roughly 50 elo is the maximum likely swing in relative ratings going from blitz to 40/2.
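As a rough sketch of the arithmetic above, the swing can be computed per rating list from the figures quoted in this post. Only the CCRL and CEGT numbers are given explicitly, so only those are used here; the "about 25" estimate also folds in other blitz lists whose exact values are not quoted. The sign convention (Houdini 2.0 minus Houdini 1.5a) and the function name are illustrative choices, not anything taken from the rating lists themselves.

```python
# Elo differences quoted in the post, expressed as
# (Houdini 2.0 rating) minus (Houdini 1.5a rating).
blitz_diff = {"CCRL": 12, "CEGT": 4}     # 2.0 is ahead at blitz
slow_diff = {"CCRL": -12, "CEGT": -13}   # 1.5a is ahead at 40/20 or 40/40


def swing(blitz, slow):
    """Change in relative rating when moving from blitz to slow games."""
    return blitz - slow


# Per-list swing, then a plain average over the two lists.
swings = {name: swing(blitz_diff[name], slow_diff[name]) for name in blitz_diff}
mean_swing = sum(swings.values()) / len(swings)

print(swings)
print(mean_swing)
```

On these two lists alone the swing comes out somewhat below 25, which fits the caveat that part of the estimate rests on the other blitz lists and on sampling error.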

In the present instance the trend is really marked. Someone has a 1 minute rating list that shows Houdini to be something like a hundred elo above everyone else. It's been very clear to us for a long time that all of the Ippolit-related programs, even including Critter which is related but does not copy code directly from Ippolit, are incredibly strong at bullet relative to all other programs (basically only Rybka, Komodo and Stockfish are strong enough to even compare to the Ippos), much less so but still superior in blitz, and start to be weaker somewhere around 40/20 or so. Houdini is stronger than the other Ippos, but clearly follows the same pattern of a marked decrease in relative strength with more time. This will only become clear though once a non-ippo program catches Houdini at 40/20 or 40/40.

I would be interested in any thoughtful comments as to why Stockfish gains on the ippos as we go from bullet to blitz to 40/20 to 40/40. I say Stockfish rather than Komodo because Stockfish is open-source so people need not guess as to what they are doing, they can actually compare SF code to Ippo code. I strongly suspect that whatever the answer is, it will also apply to Komodo, as our search has much more in common with SF than with Ippos.
MM
Posts: 766
Joined: Sun Oct 16, 2011 11:25 am

Re: Komodo 4 on long time control

Post by MM »

tano-urayoan wrote:
MM wrote: Have you an idea of what ''speculation'' means?
spec·u·la·tion /ˌspɛkjəˈleɪʃən/ noun
plural spec·u·la·tions
1 : ideas or guesses about something that is not known


Do you have data about Komodo 4 on long time control?

No, so you are speculating, or guessing or whatever word you want to use.
I don't want to continue this totally useless discussion. If you read what I said you will see I just expressed an opinion, a thought based on several factors:

1. The greater strength of Komodo at long time controls (see the charts, like CEGT and CCRL, and see how the gap diminishes the longer the TC is).
2. I talked about ''tournament time control'', which is much longer than 40/10 or 40/40.
3. The authors of Komodo clearly said that they feel they are very close to, level with, or even slightly ahead of Houdini at long time controls.
4. Every strategical/positional engine (and human too) improves its strength with more time available, because it is very well known that tactical engines (and humans) have better results at short time controls. Komodo has a ''mainly'' strategical/positional style.
5. I know Larry Kaufman and Don Dailey, and I cannot believe that they would claim to be so close to Houdini without having solid proof of it.

If you are still convinced that i am speculating keep on believing it.

Best Regards
MM
Houdini
Posts: 1471
Joined: Tue Mar 16, 2010 12:00 am

Re: Komodo 4 on long time control

Post by Houdini »

lkaufman wrote:While I agree that differences between blitz results and tournament level results will not be enormous (as in 100 or more elo), I think your estimate of a limit of 10 is way too low. Without even comparing unrelated programs, just compare Houdini 1.5a and Houdini 2.0. All blitz tests on single core (CCRL, CEGT, IPON, and others) agree that Houdini 2.0 is stronger (ccrl by 12 elo, cegt by 4 elo, others a bit higher, so average a bit over 10 elo). This is based on thousands of games. At 40/20 or 40/40 we find the opposite, H1.5 is stronger by 12 elo on CCRL and by 13 on CEGT, all with decent size samples. There is no slow data on 4 cores for 2.0 on either CCRL or CEGT that meets the minimum number of games to avoid being greyed out. So we have a net swing just going from blitz to an average of 40/30 of about 25 elo, just for two successive versions of the same program!! I know that some of this 25 could be sample error, but even if half of it is bogus it would indicate that going from blitz to 40/2 could swing relative ratings by 25 elo just in this one case. Surely with unrelated programs the swing could be much greater. I think that roughly 50 elo is the maximum likely swing in relative ratings going from blitz to 40/2.
Your whole "scaling" story is statistically unsound. Individual errors on heterogeneous rating lists are easily 20 Elo points. Comparing two engines implies that the error on the comparison will easily be 30 points (1.4 * 20). Here you're actually comparing 4 ratings results, the uncertainty of this is easily 40 Elo (2 * 20).
Making claims about other engines based on rating differences that clearly lie below the level of uncertainty is technically *very* dubious.
From all information that is available to me I see no reason to assume that the scaling of Houdini 2 is any different from the scaling of Houdini 1.5.
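The factors 1.4 and 2 used above match what one gets by adding independent errors of equal size in quadrature, i.e. the combined error grows with the square root of the number of ratings involved. The sketch below works under that assumption (independence and equal per-rating error are not stated explicitly in the post); the function name is illustrative.

```python
import math


def combined_error(per_rating_error, n_ratings):
    """Error of a sum or difference of n independent ratings,
    each with the same error, added in quadrature."""
    return per_rating_error * math.sqrt(n_ratings)


per_rating = 20                                 # Elo, per-list error quoted above
two_ratings = combined_error(per_rating, 2)     # comparing two engines
four_ratings = combined_error(per_rating, 4)    # comparing two differences

claimed_swing = 25                              # Elo, from the quoted post
print(round(two_ratings, 1), four_ratings)
print(claimed_swing < four_ratings)
```

Under these assumptions the claimed swing of about 25 Elo sits below the uncertainty of a four-rating comparison, which is the point being made.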

As a side note, it's not your finest display of ethics to spread invalid claims about other engines to promote your own commercial release.

Robert
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Komodo 4 on long time control

Post by Adam Hair »

Houdini wrote:
lkaufman wrote:While I agree that differences between blitz results and tournament level results will not be enormous (as in 100 or more elo), I think your estimate of a limit of 10 is way too low. Without even comparing unrelated programs, just compare Houdini 1.5a and Houdini 2.0. All blitz tests on single core (CCRL, CEGT, IPON, and others) agree that Houdini 2.0 is stronger (ccrl by 12 elo, cegt by 4 elo, others a bit higher, so average a bit over 10 elo). This is based on thousands of games. At 40/20 or 40/40 we find the opposite, H1.5 is stronger by 12 elo on CCRL and by 13 on CEGT, all with decent size samples. There is no slow data on 4 cores for 2.0 on either CCRL or CEGT that meets the minimum number of games to avoid being greyed out. So we have a net swing just going from blitz to an average of 40/30 of about 25 elo, just for two successive versions of the same program!! I know that some of this 25 could be sample error, but even if half of it is bogus it would indicate that going from blitz to 40/2 could swing relative ratings by 25 elo just in this one case. Surely with unrelated programs the swing could be much greater. I think that roughly 50 elo is the maximum likely swing in relative ratings going from blitz to 40/2.
Your whole "scaling" story is statistically unsound. Individual errors on heterogeneous rating lists are easily 20 Elo points. Comparing two engines implies that the error on the comparison will easily be 30 points (1.4 * 20). Here you're actually comparing 4 ratings results, the uncertainty of this is easily 40 Elo (2 * 20).
Making claims about other engines based on rating differences that clearly lie below the level of uncertainty is technically *very* dubious.
From all information that is available to me I see no reason to assume that the scaling of Houdini 2 is any different from the scaling of Houdini 1.5.

As a side note, it's not your finest display of ethics to spread invalid claims about other engines to promote your own commercial release.

Robert
It definitely is statistically unsound. Whether or not the claims are invalid remains to be seen.
Uri Blass
Posts: 11107
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Komodo 4 on long time control

Post by Uri Blass »

Here is the list of the top programs; I see only a little difference between CCRL 40/4 and CCRL 40/40.

Programs are ranked here by their average rating across CCRL 40/40 and CCRL 40/4 (the last number in every line is the standard rating advantage of the program, and usually the good programs have a negative standard rating advantage).

1)Houdini 1.5a 64-bit 4CPU 3332.5(-63)
2)Critter 1.2 64-bit 4CPU 3282.5(-47)
3)Rybka 4.1 64-bit 4CPU 3281(-34)
4)Rybka 4 64-bit 4CPU 3274.5(-33)
5)Stockfish 2.1.1 64-bit 4CPU 3263.5(-65)
6)Houdini 1.5a 64-bit 3260(-8)
7)Stockfish 2.0.1 64-bit 4CPU 3251.5(-21)
8)Rybka 3 64-bit 4CPU 3245.5(-33)
9)Stockfish 1.9.1 64-bit 4CPU 3234.5(-27)
10)Stockfish 1.7.1 64-bit 4CPU 3224(-24)
11)Stockfish 1.8 64-bit 4CPU 3223.5(-19)
12)Rybka 4 64-bit 2CPU 3216(-2)
13)Critter 0.90 64-bit 4CPU 3209.5(-29)
14)Rybka 4.1 64-bit 3208.5(+7)
15)Critter 1.2 64-bit 3205.5(-5)
16)Komodo 3 64-bit 3202.5(+13)
17)Rybka 4 64-bit 3198(-2)
18)Rybka 3 64-bit 2CPU 3195(-46)
19)Naum 4.2 64-bit 4CPU 3187(-14)
20)Stockfish 2.1.1 64-bit 3184.5(-9)
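A minimal sketch of how each line above can be read, assuming the number in parentheses is the CCRL 40/40 rating minus the CCRL 40/4 rating (that reading is an assumption; the list itself only calls it the "standard rating advantage"). Given the average and that difference, the two underlying ratings follow directly. The recovered values are only what the listed numbers imply under this assumption, not figures copied from CCRL, and the function name is hypothetical.

```python
def split_ratings(average, advantage):
    """Recover (rating at 40/40, rating at 40/4) from their average
    and from advantage = rating at 40/40 minus rating at 40/4."""
    slow = average + advantage / 2
    blitz = average - advantage / 2
    return slow, blitz


# Two entries from the list above: (average, standard rating advantage).
houdini_4cpu = split_ratings(3332.5, -63)   # Houdini 1.5a 64-bit 4CPU
komodo_1cpu = split_ratings(3202.5, 13)     # Komodo 3 64-bit

print(houdini_4cpu)
print(komodo_1cpu)
```

Read this way, a negative number means the program is rated lower at 40/40 than at 40/4, and a positive number (as for Komodo 3) means the opposite.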
Last edited by Uri Blass on Sat Dec 03, 2011 5:16 pm, edited 1 time in total.
MM
Posts: 766
Joined: Sun Oct 16, 2011 11:25 am

Re: Komodo 4 on long time control

Post by MM »

SzG wrote:This thread has now grown to 6 pages when it should not exist at all. Why all this speculation when we are going to know everything in two weeks' time?
Why should it not exist at all?

It's like saying that if one has a thought or an opinion about something that has yet to happen, he must keep silent.
MM