Stockfish has dethroned Houdini 3. Nice work Stockfish.

mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Stockfish has dethroned Houdini 3. Nice work Stockfish.

Post by mwyoung »

jundery wrote:
mwyoung wrote:I have enough games now to show that the latest Stockfish development versions have surpassed Houdini 3 at TC 40/5. I cannot say anything regarding Komodo vs Stockfish. I can now say Stockfish is stronger than Houdini 3 at my tested time control, played on an i7 840 4-core.

Rank  Engine                      Elo
1     Stockfish 020913 64 SSE4.2  3313
2     Houdini 3 Pro x64           3282
3     Stockfish 4 64 SSE4.2       3272

As much as I'd like to believe this (as a very minor contributor to Stockfish), can you post results or, even better, PGNs? My guess is that this is a statistical abnormality, but I'd REALLY love to be proved wrong.
It may be a statistical abnormality. Testing is believing; have computer, will travel.

The best way for you to find out if my results are correct is for you to test.

I give the link below. Enjoy

http://abrok.eu/stockfish/
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
PaulieD
Posts: 242
Joined: Tue Jun 25, 2013 8:19 pm

Re: Stockfish has dethroned Houdini 3. Nice work Stockfish.

Post by PaulieD »

When a clear declaration of strength is made, as Mark did here, it is necessary to show the number of games, the PGN, etc. People will just see it as spam without these things, because there is no proof to justify the BOLD declaration. :)
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Stockfish has dethroned Houdini 3. Nice work Stockfish.

Post by mwyoung »

PaulieD wrote:When a clear declaration of strength is made, as Mark did here, it is necessary to show the number of games, the PGN, etc. People will just see it as spam without these things, because there is no proof to justify the BOLD declaration. :)
I am a private tester. If you want to reject my results, that is fine. I don't have a problem with that. But the PGN files are mine to publish. I was only sharing results. If you think my results are wrong, then you can confirm or dispute them in a very short time with your own data. This was a fast time control test of 40/5, not 40/120. I always review other data points.
Dirt
Posts: 2851
Joined: Wed Mar 08, 2006 10:01 pm
Location: Irvine, CA, USA

Re: Stockfish has dethroned Houdini 3. Nice work Stockfish.

Post by Dirt »

mwyoung wrote:I have enough games now...
How many? It's meaningless without the number of games played.
jundery
Posts: 18
Joined: Thu Mar 14, 2013 5:57 am

Re: Stockfish has dethroned Houdini 3. Nice work Stockfish.

Post by jundery »

mwyoung wrote:
jundery wrote: As much as I'd like to believe this (as a very minor contributor to Stockfish), can you post results or, even better, PGNs? My guess is that this is a statistical abnormality, but I'd REALLY love to be proved wrong.
It may be a statistical abnormality. Testing is believing; have computer, will travel.

The best way for you to find out if my results are correct is for you to test.

I give the link below. Enjoy

http://abrok.eu/stockfish/
By contributor, I mean I've modified, compiled and tested code and then had the code changes accepted after passing meaningful statistical tests (i.e. I don't need a link to a download). Posting Elo results without context is meaningless.

I've no doubt your test results are true for your test; I do, however, doubt they are statistically relevant. If you don't give the W/D/L statistics, no one can draw any conclusions beyond the fact that in one test run by one tester, Stockfish won.
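
To illustrate why the W/D/L counts matter, here is a minimal sketch of how an Elo difference and a rough 95% interval could be derived from them. The function name `elo_from_wdl` and the sample records are hypothetical and are not data from this thread. The sketch assumes the usual logistic Elo model and a normal approximation over independent games, so treat the interval as indicative only.

```python
import math

def elo_from_wdl(wins, draws, losses, z=1.96):
    """Estimate (elo, low, high) from one engine's W/D/L record.

    Assumes the logistic Elo model and independent games (an assumption,
    not something the thread specifies).
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score

    def to_elo(p):
        # Clamp so that 0% or 100% scores do not divide by zero.
        p = min(max(p, 1e-9), 1.0 - 1e-9)
        return -400.0 * math.log10(1.0 / p - 1.0)

    return to_elo(score), to_elo(score - z * se), to_elo(score + z * se)

# Hypothetical records with the same proportions but different sample sizes.
for record in [(40, 40, 20), (400, 400, 200)]:
    elo, low, high = elo_from_wdl(*record)
    print(record, f"Elo {elo:+.1f}  (95% interval {low:+.1f} to {high:+.1f})")
```

The same score percentage gives the same Elo estimate in both cases, but the interval shrinks as the number of games grows, which is why a bare rating without the game count says little.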
syzygy
Posts: 5843
Joined: Tue Feb 28, 2012 11:56 pm

Re: Stockfish has dethroned Houdini 3. Nice work Stockfish.

Post by syzygy »

jundery wrote:I've no doubt your test results are true for your test; I do, however, doubt they are statistically relevant. If you don't give the W/D/L statistics, no one can draw any conclusions beyond the fact that in one test run by one tester, Stockfish won.
And I would add that there is somewhat of a discrepancy between loudly announcing that Stockfish has dethroned Houdini 3 and being secretive about the basis for this announcement. If it was just a private test only meant to satisfy your own curiosity and not meant to convince anyone else of anything, then there certainly was no need for any big headlines. But we've been here before (except that the previous time that Stockfish was leading all its opponents the number of games was mentioned and it was tiny).
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Stockfish has dethroned Houdini 3. Nice work Stockfish.

Post by Adam Hair »

[MODERATION]

An insult was removed from this thread. While it is completely permissible to express opinions concerning the content of this thread, it is not allowed to curse the OP.
lucasart
Posts: 3243
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: Stockfish has dethroned Houdini 3. Nice work Stockfish.

Post by lucasart »

jundery wrote:
mwyoung wrote:
jundery wrote: As much as I'd like to believe this (as a very minor contributor to Stockfish), can you post results or, even better, PGNs? My guess is that this is a statistical abnormality, but I'd REALLY love to be proved wrong.
It may be a statistical abnormality. Testing is believing; have computer, will travel.

The best way for you to find out if my results are correct is for you to test.

I give the link below. Enjoy

http://abrok.eu/stockfish/
By contributor, I mean I've modified, compiled and tested code and then had the code changes accepted after passing meaningful statistical tests (i.e. I don't need a link to a download). Posting Elo results without context is meaningless.

I've no doubt your test results are true for your test; I do, however, doubt they are statistically relevant. If you don't give the W/D/L statistics, no one can draw any conclusions beyond the fact that in one test run by one tester, Stockfish won.
Indeed.

Stockfish is not stronger than Houdini:
* the latest regression test shows that SF 20130907 is +27 Elo above Stockfish 4 at 1min games (expect it to be less at long TC).
* according to all rating lists, Houdini 3 is more than 27 Elo ahead of SF 4.
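
The arithmetic behind this comparison can be sketched as follows. The ratings are copied from the list in the opening post; the helper names `expected_score` and `gap` are hypothetical, and the conversion from rating gap to expected score assumes the standard logistic Elo formula.

```python
def expected_score(elo_diff: float) -> float:
    """Expected score (0..1) for the side rated elo_diff points higher,
    under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

# Ratings as posted in the opening post of this thread.
ratings = {
    "Stockfish 020913": 3313,
    "Houdini 3 Pro x64": 3282,
    "Stockfish 4": 3272,
}

def gap(a: str, b: str) -> int:
    """Rating of engine a minus rating of engine b."""
    return ratings[a] - ratings[b]

dev_vs_sf4 = gap("Stockfish 020913", "Stockfish 4")       # dev build over SF 4
h3_vs_sf4 = gap("Houdini 3 Pro x64", "Stockfish 4")       # Houdini 3 over SF 4
dev_vs_h3 = gap("Stockfish 020913", "Houdini 3 Pro x64")  # dev build over Houdini 3

print("dev build vs SF 4:     ", dev_vs_sf4)
print("Houdini 3 vs SF 4:     ", h3_vs_sf4)
print("dev build vs Houdini 3:", dev_vs_h3)
print("expected score for the dev build vs Houdini 3:",
      round(expected_score(dev_vs_h3), 3))
```

In the posted list the development build sits 41 points above Stockfish 4 and Houdini 3 only 10 points above it, whereas the two bullets above put those gaps at +27 and more than 27. The disagreement lies in those two gaps, and only the underlying game counts can show whether it is noise.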

That being said, it won't be long before SF becomes the strongest engine in the world. The fact that it is developed by an open source community, powered by monstrous testing resources, makes it inevitable. The only question is when. No one can stand in the way of the Stockfish steam-roller, at least not for long :-)

Recently I got bored of developing DiscoCheck on my own, with my lame testing resources, so I joined the SF team again. There are some politics, which is inevitable in team work (as opposed to developing on your own), but overall it's a more interesting and enriching experience.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Stockfish has dethroned Houdini 3. Nice work Stockfish.

Post by Laskos »

lucasart wrote:
Stockfish is not stronger than Houdini:
* the latest regression test shows that SF 20130907 is +27 Elo above Stockfish 4 at 1min games (expect it to be less at long TC).
* according to all rating lists, Houdini 3 is more than 27 Elo ahead of SF 4.
It is not clear at 40/40' and longer TC on 4 cores or more. The newest Komodo enters the scene too in these conditions, as it also scales well. We will see the nTCEC results.
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Stockfish has dethroned Houdini 3. Nice work Stockfish.

Post by mwyoung »

syzygy wrote:
jundery wrote:I've no doubt your test results are true for your test; I do, however, doubt they are statistically relevant. If you don't give the W/D/L statistics, no one can draw any conclusions beyond the fact that in one test run by one tester, Stockfish won.
And I would add that there is somewhat of a discrepancy between loudly announcing that Stockfish has dethroned Houdini 3 and being secretive about the basis for this announcement. If it was just a private test only meant to satisfy your own curiosity and not meant to convince anyone else of anything, then there certainly was no need for any big headlines. But we've been here before (except that the previous time that Stockfish was leading all its opponents the number of games was mentioned and it was tiny).
I have those results. I have many hundreds of games. I said clearly that I can say Stockfish is now stronger than Houdini 3 at my tested time control, played on an i7.

What I won't do is get into a war about statistics. I have been a tester since the 1980's. So I have some knowledge in this area. I have been posting results on CCC since CCC first came online.

I will report my results, and if the statistics on my results say Stockfish is now most likely stronger, I will say so.

As I suggested, don't take my test as your only data point. Stockfish is a free program. Test it for yourself...