Sedat Canbaz wrote: You are claiming that you are always right, and that anyone who holds a different view is wrong.
Not at all, I invite everyone to a rational discussion with rational arguments.
If your best argument is that I didn't post on Talkchess before March 2010, we're definitely heading towards a non-rational discussion.
Robert
Dear Robert,
It's OK... I am starting to believe that you have been in computer chess since 1985.
And I am sorry if you are offended; I really do respect you as a good programmer.
Sedat Canbaz wrote: Timo changed only the openings, and now we see completely different results from Komodo 5:
Not at all, we see exactly the same kind of results. It's just two engines close in strength playing a short match. The result can be 4-2 or 2-4, it does not mean anything.
With all respect, but just like Larry you often jump to conclusions or make claims that ignore the basic variability of chess engine matches or rating list results.
Both of you often seek (usually incorrect) explanations for things that don't require explanations other than random variability.
For both of you the "+" and "-" error margin columns in rating lists appear to be some remote theoretical concept without any practical meaning.
It makes rational discussion very difficult.
Robert
Hello dear Robert,
First of all,
I don't claim that Komodo 5 is stronger than Houdini 2.0c (under any conditions).
And Timo's current test does not prove that Komodo 5 is stronger than Houdini 2.0c.
Second,
I still have not purchased Komodo 5, but I can't wait for Komodo 5 MP.
I really wonder what the results of Komodo MP in SCCT would be.
Third,
Relax... take it easy, please. You are right that anything can happen in such a short test.
*To be 100% sure, we need more games at slow time controls.
You testers must learn a bit of statistics. LOS (likelihood of superiority) is never 100%, and sometimes even a few games are relevant for a given confidence. If you only want to know which engine is better, a 10-game match ending 9-1 gives a confidence of >95% that the first engine is better, though one has to check the set-up. If you need to check for a 2 Elo point difference, 50,000 games may not be enough for 95% confidence.
Robert is right to correct you, but guess what: although he says Houdini and Komodo are fairly matched and statistical flukes are everything now, he feels pretty confident that Komodo will not surpass Houdini in Timo's 240 Noomen suite games. Me too (>90%).
Kai
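To put a number on the 9-1 example above, here is a minimal Python sketch. It assumes the 10 games were 9 wins and 1 loss with no draws (the post does not say), and it uses two common approaches: an exact one-sided sign test and the normal-approximation LOS formula that ignores draws. The function names are made up for illustration.

```python
from math import comb, erf, sqrt


def sign_test_p_value(wins: int, losses: int) -> float:
    """Chance of at least `wins` wins in wins+losses decisive games
    if both engines were equally strong (one-sided, draws ignored)."""
    n = wins + losses
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n


def los(wins: int, losses: int) -> float:
    """Likelihood of superiority, normal approximation, draws ignored."""
    return 0.5 * (1.0 + erf((wins - losses) / sqrt(2.0 * (wins + losses))))


if __name__ == "__main__":
    # Assumption: 9-1 means 9 wins, 1 loss, 0 draws.
    print(f"9-1: p-value = {sign_test_p_value(9, 1):.4f}, LOS = {los(9, 1):.4f}")
    # A short 4-2 result (again assuming no draws), for comparison.
    print(f"4-2: p-value = {sign_test_p_value(4, 2):.4f}, LOS = {los(4, 2):.4f}")
```

Under these assumptions the p-value for 9-1 is 11/1024, about 0.011, so it clears the 95% level, while 4-2 gives 22/64, about 0.34, and does not.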
And last,
I am a BIG fan of your Houdini engine, but nevertheless my BIG congratulations to Larry and Don.
Finally, we now see Komodo as a serious opponent to Houdini!
Uri Blass wrote:The fact that random variability is a possible explanation for something does not mean that it is the only possible explanation and I see nothing wrong with trying to check possible different explanations.
Based on looking at the games, it seems that Komodo's time management is a possible explanation for its relatively bad results (and something seems illogical in the time that Komodo used for its moves).
It may explain only a 10 Elo difference relative to better time management, but I see nothing illogical in discussing it.
It is also possible that Komodo has problems in the first positions of the Noomen suite, and I see no problem in suggesting this possibility.
A lot of things are possible, but there's no way to make any rational claim if the random part is an order of magnitude larger than any other influence.
I do not think that rational claims are only about things that you are sure of.
For example,
I find nothing irrational in Larry Kaufman's earlier words:
"I have evidence that our compile, while fine for Intel machines, was unsuitable for AMD machines. If this is confirmed it would explain our disappointing IPON results, and we may have to make another compile for AMD users."
Larry did not claim to be sure that the reason he suggested was the cause of Komodo's disappointing performance at that time, so he did not ignore the possibility of statistical noise,
and I find his words clearly logical because he knew that the Komodo team optimized Komodo for Intel machines and not for AMD machines.
Sedat Canbaz wrote: Timo changed only the openings, and now we see completely different results from Komodo 5:
Not at all, we see exactly the same kind of results. It's just two engines close in strength playing a short match. The result can be 4-2 or 2-4, it does not mean anything.
With all respect, but just like Larry you often jump to conclusions or make claims that ignore the basic variability of chess engine matches or rating list results.
Both of you often seek (usually incorrect) explanations for things that don't require explanations other than random variability.
For both of you the "+" and "-" error margin columns in rating lists appear to be some remote theoretical concept without any practical meaning.
It makes rational discussion very difficult.
Robert
Hello dear Robert,
First of all,
I don't claim that Komodo 5 is stronger than Houdini 2.0c (under any conditions).
And Timo's current test does not prove that Komodo 5 is stronger than Houdini 2.0c.
Second,
I still have not purchased Komodo 5, but I can't wait for Komodo 5 MP.
I really wonder what the results of Komodo MP in SCCT would be.
Third,
Relax... take it easy, please. You are right that anything can happen in such a short test.
*To be 100% sure, we need more games at slow time controls.
You testers must learn a bit of statistics. LOS (likelihood of superiority) is never 100%, and sometimes even a few games are relevant for a given confidence. If you only want to know which engine is better, a 10-game match ending 9-1 gives a confidence of >95% that the first engine is better, though one has to check the set-up. If you need to check for a 2 Elo point difference, 50,000 games may not be enough for 95% confidence.
Robert is right to correct you, but guess what: although he says Houdini and Komodo are fairly matched and statistical flukes are everything now, he feels pretty confident that Komodo will not surpass Houdini in Timo's 240 Noomen suite games. Me too (>90%).
Kai
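The 2 Elo figure in Kai's post can be checked roughly with the sketch below. It is only a back-of-the-envelope estimate under stated assumptions: the standard logistic Elo curve, two nearly equal engines, a fixed draw rate (a parameter I introduce, not something from the thread), and "enough games" taken to mean that the one-sided margin of error on the score has shrunk to the score edge that the Elo gap implies.

```python
from statistics import NormalDist


def expected_score(elo_diff: float) -> float:
    """Expected score per game on the logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def games_needed(elo_diff: float, draw_rate: float = 0.0,
                 confidence: float = 0.95) -> float:
    """Games after which the one-sided margin of error on the score
    equals the score edge implied by elo_diff (rough estimate)."""
    edge = expected_score(elo_diff) - 0.5
    # Per-game standard deviation of the score for near-equal engines:
    # win = 1, draw = 0.5, loss = 0 gives variance (1 - draw_rate) / 4.
    sigma = 0.5 * (1.0 - draw_rate) ** 0.5
    z = NormalDist().inv_cdf(confidence)
    return (z * sigma / edge) ** 2


if __name__ == "__main__":
    for draws in (0.0, 0.5):
        n = games_needed(2.0, draw_rate=draws)
        print(f"draw rate {draws:.0%}: about {n:,.0f} games for 2 Elo")
```

With these assumptions the estimate is a bit over 80,000 games with no draws and roughly 40,000 with half the games drawn, which is in line with the remark that 50,000 games may not be enough.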
And last,
I am a BIG fan of your Houdini engine, but nevertheless my BIG congratulations to Larry and Don.
Finally, we now see Komodo as a serious opponent to Houdini!
Best,
Sedat
Just my two cents on this issue:
Oh... I am tired of repeating the same words.
This thread is starting to get boring...
I never said that 10 games are enough data; even 300-500 games are not enough data to show the engines' real strength.
And I strongly believe this:
- A minimum of 1,000 games per player is required for a reliable rating.
OK, as I mentioned before,
let's wait and see what the final results will be...
Btw, thanks for your comments... but I would prefer to just see the games, especially under the very interesting conditions of the test Timo is currently running.
One more thing (for those who do not agree with me):
I just wonder, what is wrong with my statement below???
We need more games..., but as I expected, Komodo is starting to show its real power.
I really don't want to lose more time on this issue...
Uri Blass wrote: I do not think that rational claims are only about things that you are sure of.
For example,
I find nothing irrational in Larry Kaufman's earlier words:
"I have evidence that our compile, while fine for Intel machines, was unsuitable for AMD machines. If this is confirmed it would explain our disappointing IPON results, and we may have to make another compile for AMD users."
Larry did not claim to be sure that the reason he suggested was the cause of Komodo's disappointing performance at that time, so he did not ignore the possibility of statistical noise,
and I find his words clearly logical because he knew that the Komodo team optimized Komodo for Intel machines and not for AMD machines.
Larry's statement was just a panicky reaction to very partial IPON results, interpreted without proper consideration of the error margins, and resulting in a flawed hypothesis that turned out to be unfounded.
It's a perfect example of what I said above.
Sedat Canbaz wrote: One more thing (for those who do not agree with me):
I just wonder, what is wrong with my statement below???
We need more games..., but as I expected, Komodo is starting to show its real power.
I really don't want to lose more time on this issue...
Best,
Sedat
You said that based on the score 3.5:2.5 between two engines separated by no more than 50 Elo points.
I never said that 10 games are enough data; even 300-500 games are not enough data to show the engines' real strength.
And I strongly believe this:
- A minimum of 1,000 games per player is required for a reliable rating.
After I gave you examples of what to do with these numbers, you are coming back with some pretty silly statements.
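As a small illustration of why 3.5:2.5 says so little, the sketch below compares the expected score over six games for a 0 Elo gap and for a 50 Elo gap. It assumes the standard logistic Elo curve; the six-game match length comes from the score quoted above.

```python
def expected_score(elo_diff: float) -> float:
    """Expected score per game on the logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


GAMES = 6  # 3.5 + 2.5 points were played for

if __name__ == "__main__":
    for diff in (0, 50):
        points = GAMES * expected_score(diff)
        print(f"{diff:>2} Elo gap: expected {points:.2f} of {GAMES}")
```

The two expectations come out at 3.00 and about 3.43 points, so they differ by less than half a point, which is the smallest step a single game result can produce.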
Uri Blass wrote: I do not think that rational claims are only about things that you are sure of.
For example,
I find nothing irrational in Larry Kaufman's earlier words:
"I have evidence that our compile, while fine for Intel machines, was unsuitable for AMD machines. If this is confirmed it would explain our disappointing IPON results, and we may have to make another compile for AMD users."
Larry did not claim to be sure that the reason he suggested was the cause of Komodo's disappointing performance at that time, so he did not ignore the possibility of statistical noise,
and I find his words clearly logical because he knew that the Komodo team optimized Komodo for Intel machines and not for AMD machines.
Larry's statement was just a panicky reaction to very partial IPON results, interpreted without proper consideration of the error margins, and resulting in a flawed hypothesis that turned out to be unfounded.
It's a perfect example of what I said above.
Robert
I do not see it as a panicky reaction.
When you see disappointing results, statistical noise is only one of the possible explanations, and there is nothing wrong in checking for other possible explanations.
Even without the partial IPON results, it was logical to think that a program optimized for Intel machines might be about 10 Elo weaker on AMD machines, so checking this conjecture is clearly logical,
especially when the partial IPON results supported this possibility.
I know that there is significant noise, but even noisy results can clearly change my opinion, and it may be logical to assign a probability of 60% to something to which I earlier assigned only 40%.
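The 40% to 60% shift can be read as a Bayesian update in odds form. The sketch below is only an illustration of that reading, not something stated in the post: it computes how strongly the evidence would have to favour the hypothesis (the likelihood ratio) to move a 40% prior to a 60% posterior. The function names are hypothetical.

```python
def update(prior: float, likelihood_ratio: float) -> float:
    """Posterior probability after multiplying the prior odds
    by the likelihood ratio of the evidence."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


def ratio_needed(prior: float, posterior: float) -> float:
    """Likelihood ratio that moves `prior` to `posterior`."""
    return (posterior / (1.0 - posterior)) / (prior / (1.0 - prior))


if __name__ == "__main__":
    lr = ratio_needed(0.40, 0.60)
    print(f"likelihood ratio needed: {lr:.2f}")
    print(f"check: posterior = {update(0.40, lr):.2f}")
```

The ratio works out to 2.25, meaning the observed results only need to be a little more than twice as likely under the hypothesis as without it, which noisy data can plausibly deliver.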
You can see that Komodo used only 33 seconds for 31...Qxb2 (which Houdini did not expect) and reached only depth 18, whereas it reached depth 21 at move 30.
It seems that 31...Qxb2 may be a losing mistake (I am not sure whether Komodo could have avoided it by using more time), but
I think that a drop in score is a hint for the program to use more time, and at least Houdini 1.5's score drops here from +2.08 at depth 14 to 0.00 at depth 16-23.
[D]5bk1/5ppp/1N6/n1p1p3/P3P3/1pP1B2P/1Pq2PP1/5QK1 b - - 5 1