Stockfish 090613 beating all Houdini versions @ 40/40

Discussion of computer chess matches and engine tournaments.

Moderator: Ras

geots
Posts: 4790
Joined: Sat Mar 11, 2006 12:42 am

Re: Stockfish 090613 beating all Houdini versions @ 40/40

Post by geots »

mwyoung wrote:
geots wrote:
mwyoung wrote:
geots wrote:
mwyoung wrote: PS. I found the development versions of Stockfish are also weaker at short time controls, but getting better. My games are at longer time controls, where the development versions of Stockfish were doing much better than Stockfish 3 against Houdini 3. That is what got me started testing development versions of Stockfish at long time controls; I was curious.


They are better, and they are strong- but they are no match for Houdini 3. I am sorry- but such is life.
I don't know if you are correct or not, since no one that I know of has tested them yet at longer time controls. I will let the results speak for themselves.

That is why, as chess engine testers, we test and share our results. I would love to see more data points on this other than my own. But most likely that will not happen until an official release of Stockfish 4 or 3.5, or whatever the Stockfish team calls the next official version.


Let me explain something to you. Anytime someone says such and such is better or worse at this control or that control- that is another way of saying the engine in question needs a lot of work. Because the "studs" don't give a shit. They will play you at midnight in a cornfield with extension cords. At any control. To the best- all that is irrelevant bullshit.
You know me here, I question everything when it comes to chess assumptions. It is my time to waste.

Let's see what the results tell us. The conclusions will be made by Bayeselo, not by me.



Look, this is my last thread, Mark. I have tried to help you a bit- but you are not interested. So I got better shit to do. Just let me close by saying no seasoned tester would ever post results the way you did. Because unless you list all the conditions- time controls, GUI, opening positions or generic book- if book, what is the move limit- TB or not- how many cores, what hash- all your results are totally useless bullshit. Your problem is you are learning but want to be treated like a pro. And that will turn people off quick.


Bye
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Stockfish 090613 beating all Houdini versions @ 40/40

Post by mwyoung »

geots wrote:
Adam Hair wrote:
mwyoung wrote: Stockfish is moving so fast, I cannot even keep up. The version I am testing is already many versions behind the current version, and I am testing one that is only 3 days old.

I love the fact that the Stockfish team is giving the public every development version. It will keep me busy for some time.

As for Houdini, I hear it will not be out with a new version before next year.

If this is the case, it may be likely that Houdini will no longer be the top chess engine.

It is possible, going by the current results, that Stockfish has already passed Houdini as the top chess engine, but it will take more test games to know for sure. Still, current results are looking good for Stockfish.
Stockfish is making quite a bit of progress. However, not every developmental version that Marco releases actually contains a functional change. In fact, there is no functional difference between the version with the timestamp 1371145832, released on June 13, and the version timestamped 1371381670, which was released on June 16.




Adam, it cannot beat Houdini 3. But IF it could, it still would not be Number 1 now- because I will bet any amount of money I can get my hands on there is not a development version right now that can beat Komodo MP. I have tested it too much. That is a promise from me to you.

As for 2 weeks from now, or next month- the way it is spitting out Stockfish versions- who knows. And Marco, besides being a friend, is no dummy. All I can tell you is what I see now. No Ouija board at my house.



Best-

george
I guess other testers can make statements with no games and Miguel is just fine with it. You just need to be a Friend of Miguel and the standards seem to change.

Miguel, where is your outrage here? Man, George is making Komodo MP sound great. A must buy.... Do you also have a problem with George making such speculations with no game data at all? :roll:

or with Larry's speculations.....

It is starting to dawn on me that there is more going on here than meets the eye. It made no sense why some would get so mad at me and the Stockfish results I posted. Why make such a big deal out of nothing?

But they are silent when other testers and programmers do the same and more.

This thread is getting curiouser and curiouser.
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Stockfish 090613 beating all Houdini versions @ 40/40

Post by mwyoung »

geots wrote:
Look, this is my last thread, Mark. I have tried to help you a bit- but you are not interested. So I got better shit to do. Just let me close by saying no seasoned tester would ever post results the way you did. Because unless you list all the conditions- time controls, GUI, opening positions or generic book- if book, what is the move limit- TB or not- how many cores, what hash- all your results are totally useless bullshit. Your problem is you are learning but want to be treated like a pro. And that will turn people off quick.


Bye
Thanks for your help, and for the advice on not testing Stockfish development versions when you seem to have an interest in Komodo doing better. I will just have to keep learning, like I have been since the 1980s, testing computer chess programs.

But the picture is becoming clearer on many fronts. This thread has been an eye opener for me.

Like I said, it is my time to waste, so I will keep testing Stockfish 090613 and see what the results tell us.

Current results on Stockfish Test version 090613


Stockfish test 2013

Stockfish 090613 64 SSE4.2 - Deep Rybka 4.1 SSE42 x64 6.0 - 4.0 +3/=6/-1 60.00%
Stockfish 090613 64 SSE4.2 - Houdini 3 Pro x64 5.0 - 4.0 +2/=6/-1 55.56%
Stockfish 090613 64 SSE4.2 - Houdini 2.0c Pro x64 5.5 - 3.5 +4/=3/-2 61.11%
Stockfish 090613 64 SSE4.2 - Houdini 1.5a x64 5.5 - 3.5 +2/=7/-0 61.11%
Stockfish 090613 64 SSE4.2 - Stockfish 3 JA 64bit SSE4.2 5.0 - 4.0 +1/=8/-0 55.56%
Stockfish 090613 64 SSE4.2 - Critter 1.6a 64-bit 5.5 - 3.5 +3/=5/-1 61.11%
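For anyone who wants to check the percentages in the result lines above, here is a minimal Python sketch. It assumes the usual reading of the "+W/=D/-L" column (wins, draws, losses from the first engine's side) and uses the standard logistic Elo formula for the point estimate. The function names are made up for illustration; this is not what the GUI or Bayeselo does internally, and it says nothing about error bars.

```python
import math
import re


def parse_wdl(text):
    """Parse a '+W/=D/-L' string into a (wins, draws, losses) tuple."""
    match = re.fullmatch(r"\+(\d+)/=(\d+)/-(\d+)", text.strip())
    if match is None:
        raise ValueError(f"not a +W/=D/-L string: {text!r}")
    return tuple(int(group) for group in match.groups())


def score_fraction(wins, draws, losses):
    """Fraction of points scored, counting a draw as half a point."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games


def elo_diff(score):
    """Logistic Elo difference implied by a score strictly between 0 and 1."""
    return 400.0 * math.log10(score / (1.0 - score))


# The Houdini 3 line from the table above: +2/=6/-1
wins, draws, losses = parse_wdl("+2/=6/-1")
score = score_fraction(wins, draws, losses)
print(f"{score:.2%} score, {elo_diff(score):+.1f} Elo (point estimate only)")
```

With only 9 games behind it, that point estimate should be read with a very wide margin around it.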
M ANSARI
Posts: 3734
Joined: Thu Mar 16, 2006 7:10 pm

Re: Stockfish 090613 beating all Houdini versions @ 40/40

Post by M ANSARI »

Can you do a quick test of, say, 100 games at 3_1 or 5_2 against H3 Pro? No need to include the other engines, as so far H3 is top dog (at least for now).
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Stockfish 090613 beating all Houdini versions @ 40/40

Post by mwyoung »

M ANSARI wrote: Can you do a quick test of, say, 100 games at 3_1 or 5_2 against H3 Pro? No need to include the other engines, as so far H3 is top dog (at least for now).
I already did a fast engine match with this version of Stockfish. I can't get the exact data because I am running the test now and it is in the middle of a test game.

From memory, I know that Houdini 3 was around 50 to 60 Elo better. My test was run at 5 +3 on an i7. But it did better than Stockfish 3 against Houdini 3. Again from memory, about 20 Elo better; I don't remember the error bars.

Houdini 3 was better at fast time controls in my test run. I don't know about slow games with this version, but it seems to be doing well at this time.

I did not do my normal fast-game test setup with this version, Stockfish 090613. I knew beforehand that I wanted to do slow games, because of earlier good results with a different development version of Stockfish.

Here are only the 40/2 hours games of that version, in a head-to-head matchup with Houdini 3. Over 100 games were played head to head with Houdini 3 at slower time controls. The match ended in a tie with that version. This is why I started a full ratings test at 40/40 for this latest version. I normally would never do this with a development version.

Computer info in the game data.

http://www.open-chess.org/viewtopic.php?f=4&t=2334
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Stockfish 090613 beating all Houdini versions @ 40/40

Post by mwyoung »

I took this off what I posted on open chess about the earlier Stockfish development match. This also includes the games played at 40/2 hours that I posted on open chess. The other games were played at 40/20 mins.

--------------------------------------

From open chess:

I ran the match through the Bayesian Elo Rating calculator so you can see the error bars for the match.

After 106 games, Stockfish 270513 has an error bar of +/- 25 Elo points, with an Elo rating equal to Houdini 3.

It seems highly likely that Stockfish 270513 is on par with Houdini 3.

Hope this helps.

Rank Name Elo + - games score oppo. draws
1 Houdini 3 Pro x64 0 25 25 106 50% 0 63%
2 Stockfish 270513 64 SSE4.2 0 25 25 106 50% 0 63%
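As a rough cross-check on error bars like the ones in the table above, here is a small Python sketch that estimates a 95% interval for the Elo difference from the number of games, the score, and the draw ratio. It is a plain normal approximation on the per-game score, not Bayeselo's method; Bayeselo fits a different (Bayesian) model, so its figures need not agree with this. The function names and the z = 1.96 default are my own choices for illustration.

```python
import math


def elo_diff(score):
    """Logistic Elo difference implied by a score strictly between 0 and 1."""
    return 400.0 * math.log10(score / (1.0 - score))


def elo_interval(games, score, draw_ratio, z=1.96):
    """Approximate Elo interval from a normal approximation of the score.

    games: number of games played
    score: fraction of points scored (0..1)
    draw_ratio: fraction of games drawn (0..1)
    """
    win = score - draw_ratio / 2.0
    loss = 1.0 - win - draw_ratio
    # Per-game variance of the result, where a result is 1, 0.5 or 0.
    variance = (win * (1.0 - score) ** 2
                + draw_ratio * (0.5 - score) ** 2
                + loss * score ** 2)
    std_error = math.sqrt(variance / games)
    return elo_diff(score - z * std_error), elo_diff(score + z * std_error)


# Figures from the table above: 106 games, 50% score, 63% draws.
low, high = elo_interval(106, 0.50, 0.63)
print(f"Elo difference roughly between {low:+.0f} and {high:+.0f}")
```

The same function can be fed the 8- or 9-game matches from the 40/40 test to see how much wider the interval gets with so few games.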
Uri Blass
Posts: 11153
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Stockfish 090613 beating all Houdini versions @ 40/40

Post by Uri Blass »

michiguel wrote:
mwyoung wrote:
michiguel wrote:
mwyoung wrote:
S.Taylor wrote:
mwyoung wrote: A development version of Stockfish is currently topping all Houdini versions at the 50-game mark, with 300 to be played. The error bar is now +63 / -62.



Stockfish Test 2013

Stockfish 090613 64 SSE4.2 - Deep Rybka 4.1 SSE42 x64 5.5 - 3.5 +3/=5/-1 61.11%
Stockfish 090613 64 SSE4.2 - Houdini 3 Pro x64 4.5 - 3.5 +2/=5/-1 56.25%
Stockfish 090613 64 SSE4.2 - Houdini 2.0c Pro x64 4.5 - 3.5 +3/=3/-2 56.25%
Stockfish 090613 64 SSE4.2 - Houdini 1.5a x64 5.0 - 3.0 +2/=6/-0 62.50%
Stockfish 090613 64 SSE4.2 - Stockfish 3 JA 64bit SSE4.2 4.5 - 3.5 +1/=7/-0 56.25%
Stockfish 090613 64 SSE4.2 - Critter 1.6a 64-bit 5.0 - 3.0 +3/=4/-1 62.50%
It's nice to know there is movement!
It will be interesting to see how it continues, and even more interesting to see how it will do vs the next Houdini and other upcoming engine upgrades.
Stockfish is moving so fast, I cannot even keep up. The version I am testing is already many versions behind the current version, and I am testing one that is only 3 days old.

I love the fact that the Stockfish team is giving the public every development version. It will keep me busy for some time.

As for Houdini, I hear it will not be out with a new version before next year.

If this is the case, it may be likely that Houdini will no longer be the top chess engine.

It is possible, going by the current results, that Stockfish has already passed Houdini as the top chess engine, but it will take more test games to know for sure. Still, current results are looking good for Stockfish.
You are making this conclusion with only 8 games (against H3). The error is gigantic.

Miguel
As always, confused by your asinine comments... What conclusion are you talking about? I made no conclusions regarding Houdini 3 with only 8 games. If it were possible I would try typing slower for you, but I think that only works when you are talking.

You may need to look up the meaning of "possible", "if", "current", or the phrase "but it will take more test games to know for sure".
No idea where the rudeness comes from.

Anyway, based on your current results you are stating, saying (do not call it concluding if you do not want), that "it is possible...etc". Yes, anything is possible. My point is that your data cannot support any statement or even speculation about the relative strength between SF and H. Who knows, it is even possible that this version plays worse than the previous one too. Nothing against reporting partial results, but you are making it sound (particularly with the title, as Kai mentioned) more exciting than it is. I was excited to see the title and then disappointed when I saw the data.

I just pointed out that the critical match to determine the relative strength with Houdini has 8 games. Otherwise, people may not grasp the significance of the data.

Miguel
Note that, looking at the time control and the Stockfish date in the title, there is no reason to expect enough games for a significant result, so I support Mark Young here.

The only relevant claim against him is that he did not give the exact conditions (like the hardware, hash and books that he used), or maybe he did in some post and I missed it.
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Stockfish 090613 beating all Houdini versions @ 40/40

Post by mwyoung »

Uri Blass wrote:
Note that, looking at the time control and the Stockfish date in the title, there is no reason to expect enough games for a significant result, so I support Mark Young here.

The only relevant claim against him is that he did not give the exact conditions (like the hardware, hash and books that he used), or maybe he did in some post and I missed it.
My info is posted in the PGN game records that I post. I was just giving raw data and a quick update; note that it was not my first post on Stockfish 090613. And neither Miguel nor anyone else had a problem then.

You know I have been here for a long time. I have never had a problem with answering someone's questions about testing.

I love talking about it....

But this attack started by Miguel was obviously not about that.... Just an observation. I think something more is motivating this.
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Stockfish 090613 beating all Houdini versions @ 40/40

Post by Adam Hair »

Uri Blass wrote:
Note that, looking at the time control and the Stockfish date in the title, there is no reason to expect enough games for a significant result, so I support Mark Young here.

The only relevant claim against him is that he did not give the exact conditions (like the hardware, hash and books that he used), or maybe he did in some post and I missed it.
Depending on his hardware and how/if games are adjudicated, and assuming he was just running Stockfish vs Houdini matches, it is possible to believe he might have had a significant result. Especially if the first thread concerning this test was missed.
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Stockfish 090613 beating all Houdini versions @ 40/40

Post by Adam Hair »

@Mark

I am interested in the manner in which you were attacked by Miguel and any speculation as to the reason for it. I must admit that I cannot detect the attack in the public exchanges between you and Miguel.