Stockfish 090613 beating all Houdini versions @ 40/40

Discussion of computer chess matches and engine tournaments.

Moderator: Ras

mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Stockfish 090613 beating all Houdini versions @ 40/40

Post by mwyoung »

geots wrote:
mwyoung wrote:
michiguel wrote:
mwyoung wrote:
S.Taylor wrote:
mwyoung wrote: A development version of Stockfish is currently topping all Houdini versions at 50 game mark, 300 to be played. Error bar is now +63 / -62



Stockfish Test 2013

Stockfish 090613 64 SSE4.2 - Deep Rybka 4.1 SSE42 x64 5.5 - 3.5 +3/=5/-1 61.11%
Stockfish 090613 64 SSE4.2 - Houdini 3 Pro x64 4.5 - 3.5 +2/=5/-1 56.25%
Stockfish 090613 64 SSE4.2 - Houdini 2.0c Pro x64 4.5 - 3.5 +3/=3/-2 56.25%
Stockfish 090613 64 SSE4.2 - Houdini 1.5a x64 5.0 - 3.0 +2/=6/-0 62.50%
Stockfish 090613 64 SSE4.2 - Stockfish 3 JA 64bit SSE4.2 4.5 - 3.5 +1/=7/-0 56.25%
Stockfish 090613 64 SSE4.2 - Critter 1.6a 64-bit 5.0 - 3.0 +3/=4/-1 62.50%
It's nice to know there is movement!
It will be interesting to see how it continues, and even more interesting to see how it does against the next Houdini and other upcoming engine upgrades.
Stockfish is moving so fast that I cannot even keep up. The version I am testing is already many versions behind the current one, and it is only 3 days old.

I love the fact that the Stockfish team is giving the public every development version. It will keep me busy for some time.

As for Houdini, I hear it will not be out with a new version before next year.

If this is the case, it may be likely that Houdini will no longer be the top chess engine.

It is possible, going by the current results, that Stockfish has already passed Houdini as the top chess engine, but it will take more test games to know for sure. Still, current results are looking good for Stockfish.
You are making this conclusion with only 8 games
(against H3). The error is gigantic.

Miguel
As always, confused by your asinine comments... What conclusion are you talking about? I made no conclusions regarding Houdini 3 with only 8 games. If it were possible I would try typing slower for you, but I think that only works when you are talking.

You may need to look up the meaning of "possible", "if", "current", or the phrase "but it will take more test games to know for sure".




Try this asinine remark. You say it is beating all versions of Houdini @ 40/40. To say that with only the few games you have played tells me either you don't know jack about testing, or you are just delusional.
Here is what I said in full, not just the title, which is limited in length but still accurate to the results I posted.

It is clear, simple, and accurate to the results.

A development version of Stockfish is currently topping all Houdini versions at 50 game mark, 300 to be played. Error bar is now +63 / -62.


Thanks.
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
Daniel Shawul
Posts: 4186
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Stockfish 090613 beating all Houdini versions @ 40/40

Post by Daniel Shawul »

Your concern for others is touching, Miguel; the funny thing is that you only hold my post to this standard. I did not hear this concern at any time, past or present, until now. Here is a recent post by Larry Kaufman that gives partial results and speculation. And Larry is trying to sell Komodo 5.1 on CCC to CCC members, Mr. Moderator. Where is your concern here, Miguel?
You are absolutely right about that. It is as if the 'commercial exhortations' note of the CCC charter doesn't exist when it comes to their beloved engine.
geots
Posts: 4790
Joined: Sat Mar 11, 2006 12:42 am

Re: Stockfish 090613 beating all Houdini versions @ 40/40

Post by geots »

mwyoung wrote:
geots wrote:
mwyoung wrote:
michiguel wrote:
mwyoung wrote:
S.Taylor wrote:
mwyoung wrote: A development version of Stockfish is currently topping all Houdini versions at 50 game mark, 300 to be played. Error bar is now +63 / -62



Stockfish Test 2013

Stockfish 090613 64 SSE4.2 - Deep Rybka 4.1 SSE42 x64 5.5 - 3.5 +3/=5/-1 61.11%
Stockfish 090613 64 SSE4.2 - Houdini 3 Pro x64 4.5 - 3.5 +2/=5/-1 56.25%
Stockfish 090613 64 SSE4.2 - Houdini 2.0c Pro x64 4.5 - 3.5 +3/=3/-2 56.25%
Stockfish 090613 64 SSE4.2 - Houdini 1.5a x64 5.0 - 3.0 +2/=6/-0 62.50%
Stockfish 090613 64 SSE4.2 - Stockfish 3 JA 64bit SSE4.2 4.5 - 3.5 +1/=7/-0 56.25%
Stockfish 090613 64 SSE4.2 - Critter 1.6a 64-bit 5.0 - 3.0 +3/=4/-1 62.50%
It's nice to know there is movement!
It will be interesting to see how it continues, and even more interesting to see how it does against the next Houdini and other upcoming engine upgrades.
Stockfish is moving so fast that I cannot even keep up. The version I am testing is already many versions behind the current one, and it is only 3 days old.

I love the fact that the Stockfish team is giving the public every development version. It will keep me busy for some time.

As for Houdini, I hear it will not be out with a new version before next year.

If this is the case, it may be likely that Houdini will no longer be the top chess engine.

It is possible, going by the current results, that Stockfish has already passed Houdini as the top chess engine, but it will take more test games to know for sure. Still, current results are looking good for Stockfish.
You are making this conclusion with only 8 games
(against H3). The error is gigantic.

Miguel
As always, confused by your asinine comments... What conclusion are you talking about? I made no conclusions regarding Houdini 3 with only 8 games. If it were possible I would try typing slower for you, but I think that only works when you are talking.

You may need to look up the meaning of "possible", "if", "current", or the phrase "but it will take more test games to know for sure".




Try this asinine remark. You say it is beating all versions of Houdini @ 40/40. To say that with only the few games you have played tells me either you don't know jack about testing, or you are just delusional.
Here is what I said in full, not just the title, which is limited in length but still accurate to the results I posted.

It is clear, simple, and accurate to the results.

A development version of Stockfish is currently topping all Houdini versions at 50 game mark, 300 to be played. Error bar is now +63 / -62.


Thanks.



Let me be very clear. If you really want to see the difference between Houdini 3 and a Stockfish development version, run them against each other in a head-to-head match of 100 games or more, as I have done with the Komodo MP beta that will be released tomorrow against the 2 strongest Stockfish dev versions, 010613 and 090613, both of which Komodo MP beat. Don't confuse the issue with a gauntlet if "Stockfish - Houdini" is going to be your headline. Strange things happen to an engine when all it gets is a steady diet of Houdini game after game after game after game... never getting a breather, just more Houdini on top of Houdini.
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Stockfish 090613 beating all Houdini versions @ 40/40

Post by mwyoung »

Daniel Shawul wrote:
Your concern for others is touching, Miguel; the funny thing is that you only hold my post to this standard. I did not hear this concern at any time, past or present, until now. Here is a recent post by Larry Kaufman that gives partial results and speculation. And Larry is trying to sell Komodo 5.1 on CCC to CCC members, Mr. Moderator. Where is your concern here, Miguel?
You are absolutely right about that. It is as if the 'commercial exhortations' note of the CCC charter doesn't exist when it comes to their beloved engine.
Yes, I have seen that standard change also depending on the program and programmer IMO. And I have been here at CCC since the beginning.

But I don't have a problem with Larry's or Don's posts. I think all information is good, and I think people are smart enough to accept or reject that information, or take what they need from it, unlike Miguel.

I was just pointing out the obvious hypocrisy, and false outrage by Miguel and others.
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Stockfish 090613 beating all Houdini versions @ 40/40

Post by mwyoung »

geots wrote:
mwyoung wrote:
geots wrote:
mwyoung wrote:
michiguel wrote:
mwyoung wrote:
S.Taylor wrote:
mwyoung wrote: A development version of Stockfish is currently topping all Houdini versions at 50 game mark, 300 to be played. Error bar is now +63 / -62



Stockfish Test 2013

Stockfish 090613 64 SSE4.2 - Deep Rybka 4.1 SSE42 x64 5.5 - 3.5 +3/=5/-1 61.11%
Stockfish 090613 64 SSE4.2 - Houdini 3 Pro x64 4.5 - 3.5 +2/=5/-1 56.25%
Stockfish 090613 64 SSE4.2 - Houdini 2.0c Pro x64 4.5 - 3.5 +3/=3/-2 56.25%
Stockfish 090613 64 SSE4.2 - Houdini 1.5a x64 5.0 - 3.0 +2/=6/-0 62.50%
Stockfish 090613 64 SSE4.2 - Stockfish 3 JA 64bit SSE4.2 4.5 - 3.5 +1/=7/-0 56.25%
Stockfish 090613 64 SSE4.2 - Critter 1.6a 64-bit 5.0 - 3.0 +3/=4/-1 62.50%
It's nice to know there is movement!
It will be interesting to see how it continues, and even more interesting to see how it does against the next Houdini and other upcoming engine upgrades.
Stockfish is moving so fast that I cannot even keep up. The version I am testing is already many versions behind the current one, and it is only 3 days old.

I love the fact that the Stockfish team is giving the public every development version. It will keep me busy for some time.

As for Houdini, I hear it will not be out with a new version before next year.

If this is the case, it may be likely that Houdini will no longer be the top chess engine.

It is possible, going by the current results, that Stockfish has already passed Houdini as the top chess engine, but it will take more test games to know for sure. Still, current results are looking good for Stockfish.
You are making this conclusion with only 8 games
(against H3). The error is gigantic.

Miguel
As always, confused by your asinine comments... What conclusion are you talking about? I made no conclusions regarding Houdini 3 with only 8 games. If it were possible I would try typing slower for you, but I think that only works when you are talking.

You may need to look up the meaning of "possible", "if", "current", or the phrase "but it will take more test games to know for sure".




Try this asinine remark. You say it is beating all versions of Houdini @ 40/40. To say that with only the few games you have played tells me either you don't know jack about testing, or you are just delusional.
Here is what I said in full, not just the title, which is limited in length but still accurate to the results I posted.

It is clear, simple, and accurate to the results.

A development version of Stockfish is currently topping all Houdini versions at 50 game mark, 300 to be played. Error bar is now +63 / -62.


Thanks.



Let me be very clear. If you really want to see the difference between Houdini 3 and a Stockfish development version, run them against each other in a head-to-head match of 100 games or more, as I have done with the Komodo MP beta that will be released tomorrow against the 2 strongest Stockfish dev versions, 010613 and 090613, both of which Komodo MP beat. Don't confuse the issue with a gauntlet if "Stockfish - Houdini" is going to be your headline. Strange things happen to an engine when all it gets is a steady diet of Houdini game after game after game after game... never getting a breather, just more Houdini on top of Houdini.
I have already done this against Houdini3 and gave Larry my results.

Some of the games were at 40/2 hours, and they are posted on Open Chess. I only posted the 40/2 games.

But over 100 games were played in total.

http://www.open-chess.org/viewtopic.php?f=4&t=2334

That is why I am doing a real ratings test against all the best engines to see if Stockfish still holds up.

http://www.talkchess.com/forum/viewtopi ... c279c2a3ef

It will be interesting to see what this Stockfish version's rating will be after 300 games at 40/40.

Current results at this time:

Stockfish test 2013

Stockfish 090613 64 SSE4.2 - Deep Rybka 4.1 SSE42 x64 5.5 - 3.5 +3/=5/-1 61.11%
Stockfish 090613 64 SSE4.2 - Houdini 3 Pro x64 5.0 - 4.0 +2/=6/-1 55.56%
Stockfish 090613 64 SSE4.2 - Houdini 2.0c Pro x64 5.5 - 3.5 +4/=3/-2 61.11%
Stockfish 090613 64 SSE4.2 - Houdini 1.5a x64 5.5 - 3.5 +2/=7/-0 61.11%
Stockfish 090613 64 SSE4.2 - Stockfish 3 JA 64bit SSE4.2 5.0 - 4.0 +1/=8/-0 55.56%
Stockfish 090613 64 SSE4.2 - Critter 1.6a 64-bit 5.5 - 3.5 +3/=5/-1 61.11%
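
For anyone who wants to total the lines above, here is a minimal Python sketch (not part of the original test setup). It pulls the +W/=D/-L field out of each result line and pools the counts. The regex and the name `pool_results` are assumptions about the line format shown in the table, and pooling ignores differences in opponent strength, so the output is only an overall score, not a rating.

```python
import re

# Result lines copied from the table above.
RESULTS = """\
Stockfish 090613 64 SSE4.2 - Deep Rybka 4.1 SSE42 x64 5.5 - 3.5 +3/=5/-1 61.11%
Stockfish 090613 64 SSE4.2 - Houdini 3 Pro x64 5.0 - 4.0 +2/=6/-1 55.56%
Stockfish 090613 64 SSE4.2 - Houdini 2.0c Pro x64 5.5 - 3.5 +4/=3/-2 61.11%
Stockfish 090613 64 SSE4.2 - Houdini 1.5a x64 5.5 - 3.5 +2/=7/-0 61.11%
Stockfish 090613 64 SSE4.2 - Stockfish 3 JA 64bit SSE4.2 5.0 - 4.0 +1/=8/-0 55.56%
Stockfish 090613 64 SSE4.2 - Critter 1.6a 64-bit 5.5 - 3.5 +3/=5/-1 61.11%
"""

# Assumed field format: "+W/=D/-L" (wins, draws, losses for the first engine).
WDL = re.compile(r"\+(\d+)/=(\d+)/-(\d+)")

def pool_results(text):
    """Sum wins, draws, and losses over every line that has a +W/=D/-L field."""
    wins = draws = losses = 0
    for line in text.splitlines():
        match = WDL.search(line)
        if match:
            w, d, l = map(int, match.groups())
            wins, draws, losses = wins + w, draws + d, losses + l
    return wins, draws, losses

wins, draws, losses = pool_results(RESULTS)
games = wins + draws + losses
points = wins + 0.5 * draws
print(f"+{wins}/={draws}/-{losses}: {points} / {games} = {points / games:.2%}")
```

On these six lines the pooled total is +15/=34/-5, that is 32 points from 54 games.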
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Stockfish 090613 beating all Houdini versions @ 40/40

Post by michiguel »

mwyoung wrote:
michiguel wrote:
mwyoung wrote:
michiguel wrote:
mwyoung wrote:
S.Taylor wrote:
mwyoung wrote: A development version of Stockfish is currently topping all Houdini versions at 50 game mark, 300 to be played. Error bar is now +63 / -62



Stockfish Test 2013

Stockfish 090613 64 SSE4.2 - Deep Rybka 4.1 SSE42 x64 5.5 - 3.5 +3/=5/-1 61.11%
Stockfish 090613 64 SSE4.2 - Houdini 3 Pro x64 4.5 - 3.5 +2/=5/-1 56.25%
Stockfish 090613 64 SSE4.2 - Houdini 2.0c Pro x64 4.5 - 3.5 +3/=3/-2 56.25%
Stockfish 090613 64 SSE4.2 - Houdini 1.5a x64 5.0 - 3.0 +2/=6/-0 62.50%
Stockfish 090613 64 SSE4.2 - Stockfish 3 JA 64bit SSE4.2 4.5 - 3.5 +1/=7/-0 56.25%
Stockfish 090613 64 SSE4.2 - Critter 1.6a 64-bit 5.0 - 3.0 +3/=4/-1 62.50%
It's nice to know there is movement!
It will be interesting to see how it continues, and even more interesting to see how it does against the next Houdini and other upcoming engine upgrades.
Stockfish is moving so fast that I cannot even keep up. The version I am testing is already many versions behind the current one, and it is only 3 days old.

I love the fact that the Stockfish team is giving the public every development version. It will keep me busy for some time.

As for Houdini, I hear it will not be out with a new version before next year.

If this is the case, it may be likely that Houdini will no longer be the top chess engine.

It is possible, going by the current results, that Stockfish has already passed Houdini as the top chess engine, but it will take more test games to know for sure. Still, current results are looking good for Stockfish.
You are making this conclusion with only 8 games
(against H3). The error is gigantic.

Miguel
As always, confused by your asinine comments... What conclusion are you talking about? I made no conclusions regarding Houdini 3 with only 8 games. If it were possible I would try typing slower for you, but I think that only works when you are talking.

You may need to look up the meaning of "possible", "if", "current", or the phrase "but it will take more test games to know for sure".
No idea where the rudeness comes from.

Anyway, based on your current results you are stating or saying (do not call it concluding if you do not want to) that "it is possible... etc." Yes, anything is possible. My point is that your data cannot support any statement or even speculation about the relative strength between SF and H. Who knows, it is even possible that this version plays worse than the previous one too. Nothing against reporting partial results, but you are making it sound (particularly with the title, as Kai mentioned) more exciting than it is. I was excited to see the title and then disappointed when I saw the data.

I just pointed out that the critical match to determine the relative strength with Houdini has only 8 games. Otherwise, people may not grasp the significance of the data.

Miguel
To be clear, you claimed I made a conclusion regarding Stockfish and Houdini 3, which I did not.

And I could determine a rating for Stockfish without ever playing Houdini 3. To be clear again, as posted, I am playing 6 of the strongest programs over a total of 300 games to determine a rating for Stockfish 090613. This is not a head-to-head match with Houdini 3. Stockfish could still tie or lose against Houdini 3 in this testing and still have a higher rating.


--------------------------------------------------------------

Miguel A. Ballicora - "My point is that your data cannot support any statement or even speculation".... "Otherwise, people may not grasp the significance of the data."


Your concern for others is touching, Miguel; the funny thing is that you only hold my post to this standard. I did not hear this concern at any time, past or present, until now. Here is a recent post by Larry Kaufman that gives partial results and speculation. And Larry is trying to sell Komodo 5.1 on CCC to CCC members, Mr. Moderator. Where is your concern here, Miguel?
Mr. Moderator? What on earth does my being a moderator have to do with this? What I said was a fact, and actually, I do not even see you disagree with me. The problem is that, for some reason I cannot understand, you are very touchy about it.

Is there any other thread by somebody using few games in their report? I am sure there is; I do not read all threads. Apparently the one you may be referring to has a very uninteresting title. No wonder I did not pay attention to it. Yours was flashy, about SF, and caught my attention. So? I cannot make a comment here because I did not make a comment in every single thread that may have had a similar situation?

I have no intention to get into a silly fight. My point was simple and that was all I wanted to say: you say there is a possibility that SF is already stronger than H. (that needs to be confirmed, etc.) and I wanted to stress that the error is gigantic. Do you disagree with this? If you don't, what is the big deal?

Miguel
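
To put a rough number on "gigantic", here is a small Python sketch for the 8-game Houdini 3 line quoted above (+2/=5/-1). It uses a plain normal approximation on the per-game scores and the standard logistic Elo formula. That is an assumption made for illustration, not the Bayesian-Elo model behind the +63 / -62 figure, and the function names are made up here. With only 8 games the normal approximation is itself crude, so treat the output as an order of magnitude.

```python
import math

def elo_from_score(p):
    """Logistic Elo difference implied by an expected score p (0 < p < 1)."""
    return 400.0 * math.log10(p / (1.0 - p))

def score_interval(wins, draws, losses, z=1.96):
    """Mean score and a normal-approximation interval (z=1.96 is about 95%)."""
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    # Per-game variance of outcomes scored 1, 0.5, and 0.
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    half = z * math.sqrt(var / n)
    return mean, mean - half, mean + half

mean, lo, hi = score_interval(2, 5, 1)
print(f"score {mean:.2%}, interval {lo:.2%} to {hi:.2%}")
print(f"Elo {elo_from_score(mean):+.0f}, "
      f"interval {elo_from_score(lo):+.0f} to {elo_from_score(hi):+.0f}")
```

Under these assumptions the 56.25% score maps to roughly +44 Elo, but the interval runs from about 100 Elo below zero to about 210 above it, so 8 games cannot separate the two engines.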

Larry Kaufman - "So samples are still quite small, but it appears likely that Komodo 5.1 will be the number 2 MP engine. Updates later"

Post by Larry Kaufman
Here are some early results for Komodo 5.1 MP. All tests on Windows sse4 machines using Fritz 11 interface, equal number of cores for Komodo and opposing engines.
First, it should be noted that the underlying SP engine is slightly stronger than Komodo 5 but clearly weaker than Komodo CCT. This has been verified by multiple tests including independent testing.
Against Critter 1.6a, at 3' + 2" on 4 cores I got a tied score after 69 games, but at 5'+3" Komodo currently leads by six games out of 45. On 12 cores at 5' + 3" I got a 25 to 25 tie.
Against Stockfish 3.0 at 3'+ 2" on 12 cores Komodo won by 27.5 to 22.5.
Against Deep Rybka 4 on 12 cores at 5' + 3" so far Komodo leads 5 to 2.
Against Houdini 3 on 4 cores at 5' + 3" Komodo is 5 games down after 52 games (-33 elo).

So samples are still quite small, but it appears likely that Komodo 5.1 will be the number 2 MP engine. Updates later
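
The "-33 elo" figure in the Houdini 3 line can be checked with the standard logistic Elo formula. A minimal Python sketch, assuming "5 games down after 52 games" means Komodo's point total trails Houdini's by 5 over 52 games (that reading is an assumption, and the function name is hypothetical):

```python
import math

def elo_from_margin(margin, games):
    """Elo difference implied by a point margin (own points minus opponent's)."""
    # Both point totals sum to `games`, so own points are (games + margin) / 2.
    points = (games + margin) / 2.0
    score = points / games
    return 400.0 * math.log10(score / (1.0 - score))

elo = elo_from_margin(-5, 52)
print(f"{elo:.1f}")
```

This gives a value between -34 and -33, in line with the figure quoted in the post.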

--------------------------------------------

For future reference I don't make conclusions on who has the stronger chess program in my testing. I rely on another person to make those types of determinations.

If you have a problem with his conclusions and the error bars I posted contact him.

http://remi.coulom.free.fr/Bayesian-Elo/

Bye
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Stockfish 090613 beating all Houdini versions @ 40/40

Post by mwyoung »

PS. I found that the development versions of Stockfish are also weaker at short time controls, but getting better. My games are at longer time controls, where the development versions of Stockfish were doing much better than Stockfish 3 against Houdini 3. That is what got me started testing development versions of Stockfish at long time controls; I was curious.
Last edited by mwyoung on Mon Jun 17, 2013 5:27 am, edited 1 time in total.
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Stockfish 090613 beating all Houdini versions @ 40/40

Post by mwyoung »

"I have no intention to get into a silly fight."

Then don't start one, because that is what you are doing.

There is nothing wrong with my post, move on.
geots
Posts: 4790
Joined: Sat Mar 11, 2006 12:42 am

Re: Stockfish 090613 beating all Houdini versions @ 40/40

Post by geots »

mwyoung wrote: PS. I found that the development versions of Stockfish are also weaker at short time controls, but getting better. My games are at longer time controls, where the development versions of Stockfish were doing much better than Stockfish 3 against Houdini 3. That is what got me started testing development versions of Stockfish at long time controls; I was curious.


They are better, and they are strong, but they are no match for Houdini 3. I am sorry, but such is life. In fact, I will bet the farm that Komodo MP will easily come in at number 2 in the world. I have run too many hundreds of games with it not to know. You can put a stamp on that and mail it.
Last edited by geots on Mon Jun 17, 2013 5:36 am, edited 1 time in total.
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Stockfish 090613 beating all Houdini versions @ 40/40

Post by Adam Hair »

Daniel Shawul wrote:
Your concern for others is touching, Miguel; the funny thing is that you only hold my post to this standard. I did not hear this concern at any time, past or present, until now. Here is a recent post by Larry Kaufman that gives partial results and speculation. And Larry is trying to sell Komodo 5.1 on CCC to CCC members, Mr. Moderator. Where is your concern here, Miguel?
You are absolutely right about that. It is as if the 'commercial exhortations' note of the CCC charter doesn't exist when it comes to their beloved engine.
From the CCC charter:
Once a member gains access to the message board, he may read all messages and post new or response messages with the proviso that these new or response messages:

1. Are, within reason, on the topic of computer chess
2. Are not abusive in nature
3. Do not contain personal and/or libelous attacks on others
4. Are not flagrant commercial exhortations
5. Are not of questionable legal status.
Believe it or not, we take #4 quite seriously, no matter which engine/author is involved. There has been a history of commercial authors and their customers interacting on CCC, much of it initiated by the customers. It is tolerated to some degree due to the fact that those customers form a significant proportion of the active members of CCC. But we monitor every thread that could be claimed to be a commercial ad, and we debate whether or not #4 has been violated.

I will share one observation I have made. The likelihood that a person claims #4 is being broken is much higher when the claimant has shown past or present hostility towards the accused. I have seen much more flagrant examples of someone advertising their product without a single complaint being lodged. The difference? Those people have not been involved in CCC discussions and have not made statements that pissed off other members.