Stockfish has dethroned Houdini 3. Nice work Stockfish.

mwyoung · Post by **mwyoung** » Fri Sep 06, 2013 1:17 pm

jundery wrote:
mwyoung wrote: I have enough games now to show that the latest Stockfish development versions have surpassed Houdini 3 at TC 40/5. I cannot say anything in regard to Komodo vs Stockfish. I can now say Stockfish is stronger than Houdini 3 at my tested time control, played on an i7 840 4-core.
Rank  Engine                      Elo
1     Stockfish 020913 64 SSE4.2  3313
2     Houdini 3 Pro x64           3282
3     Stockfish 4 64 SSE4.2       3272
As much as I'd like to believe this (as a very minor contributor to Stockfish), can you post results or, even better, PGNs? My guess is that this is a statistical anomaly, but I'd REALLY love to be proved wrong.

It may be a statistical anomaly. Testing is believing; have computer, will travel.

The best way for you to find out if my results are correct is for you to test.

The link is below. Enjoy.

http://abrok.eu/stockfish/

PaulieD · Post by **PaulieD** » Fri Sep 06, 2013 6:58 pm

When a clear declaration of strength is made, as Mark did here, it is necessary to show the number of games, the PGN, etc. Without these things, people will just see it as spam, because there is no proof to justify the BOLD declaration.

mwyoung · Post by **mwyoung** » Fri Sep 06, 2013 7:43 pm

PaulieD wrote: When a clear declaration of strength is made, as Mark did here, it is necessary to show the number of games, the PGN, etc. Without these things, people will just see it as spam, because there is no proof to justify the BOLD declaration.

I am a private tester. If you want to reject my results, that is fine. I don't have a problem with that. But the PGN files are mine to publish. I was only sharing results. If you think my results are wrong, then you can confirm or dispute them in a very short time with your own data. This was a fast time control test of 40/5, not 40/120. I always review other data points.

Dirt · Post by **Dirt** » Sat Sep 07, 2013 10:16 pm

mwyoung wrote:I have enough games now...

How many? It's meaningless without the number of games played.

jundery · Post by **jundery** » Sun Sep 08, 2013 10:57 am

mwyoung wrote:
jundery wrote: As much as I'd like to believe this (as a very minor contributor to Stockfish), can you post results or, even better, PGNs? My guess is that this is a statistical anomaly, but I'd REALLY love to be proved wrong.
It may be a statistical anomaly. Testing is believing; have computer, will travel.

The best way for you to find out if my results are correct is for you to test.

The link is below. Enjoy.

http://abrok.eu/stockfish/

By contributor, I mean I've modified, compiled and tested code and then had the code changes accepted after passing meaningful statistical tests (i.e. I don't need a link to a download). Posting Elo results without context is meaningless.

I've no doubt your test results are true for your test; I do, however, doubt they are statistically relevant. If you don't give the W/D/L statistics, no one can draw any conclusion beyond the fact that, in one test run by one tester, Stockfish won.
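
As a rough illustration of why the W/D/L counts matter, the sketch below turns a match result into an Elo difference with an approximate 95% error margin. It assumes the usual logistic Elo model and a normal approximation to the match score. The function name `elo_from_wdl` and the example counts are hypothetical and are not figures from this thread.

```python
import math


def elo_from_wdl(wins, draws, losses, z=1.96):
    """Return (elo_diff, lower, upper) from the first engine's point of view.

    Assumes a logistic Elo model and a normal approximation to the score;
    z=1.96 gives an approximate 95% interval.
    """
    n = wins + draws + losses
    if n == 0:
        raise ValueError("no games played")

    # Mean score per game (win = 1, draw = 0.5, loss = 0).
    score = (wins + 0.5 * draws) / n

    # Per-game variance of the score, then the standard error of the mean.
    var = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / n
    se = math.sqrt(var / n)

    def to_elo(s):
        # Clamp to avoid division by zero or log of zero at 0% / 100% scores.
        s = min(max(s, 1e-9), 1.0 - 1e-9)
        return -400.0 * math.log10(1.0 / s - 1.0)

    return to_elo(score), to_elo(score - z * se), to_elo(score + z * se)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the effect of sample size.
    for w, d, l in [(30, 45, 25), (300, 450, 250), (3000, 4500, 2500)]:
        diff, lo, hi = elo_from_wdl(w, d, l)
        print(f"+{w} ={d} -{l}: {diff:+.1f} Elo, 95% interval [{lo:+.1f}, {hi:+.1f}]")
```

At the same score percentage, the interval narrows as games are added, which is why a rating posted without a game count or W/D/L breakdown is hard to interpret.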

syzygy · Post by **syzygy** » Sun Sep 08, 2013 12:09 pm

jundery wrote: I've no doubt your test results are true for your test; I do, however, doubt they are statistically relevant. If you don't give the W/D/L statistics, no one can draw any conclusion beyond the fact that, in one test run by one tester, Stockfish won.

And I would add that there is something of a discrepancy between loudly announcing that Stockfish has dethroned Houdini 3 and being secretive about the basis for this announcement. If it was just a private test, meant only to satisfy your own curiosity and not to convince anyone else of anything, then there certainly was no need for any big headlines. But we've been here before (except that the previous time Stockfish was leading all its opponents, the number of games was mentioned and it was tiny).

Adam Hair · Post by **Adam Hair** » Sun Sep 08, 2013 1:15 pm

[MODERATION]

An insult was removed from this thread. While it is completely permissible to express opinions concerning the content of this thread, it is not allowed to curse the OP.

lucasart · Post by **lucasart** » Sun Sep 08, 2013 2:39 pm

jundery wrote:
mwyoung wrote:
jundery wrote: As much as I'd like to believe this (as a very minor contributor to Stockfish), can you post results or, even better, PGNs? My guess is that this is a statistical anomaly, but I'd REALLY love to be proved wrong.
It may be a statistical anomaly. Testing is believing; have computer, will travel.

The best way for you to find out if my results are correct is for you to test.

The link is below. Enjoy.

http://abrok.eu/stockfish/
By contributor, I mean I've modified, compiled and tested code and then had the code changes accepted after passing meaningful statistical tests (i.e. I don't need a link to a download). Posting Elo results without context is meaningless.

I've no doubt your test results are true for your test; I do, however, doubt they are statistically relevant. If you don't give the W/D/L statistics, no one can draw any conclusion beyond the fact that, in one test run by one tester, Stockfish won.

Indeed.

Stockfish is not stronger than Houdini:
* the latest regression test shows that SF 20130907 is +27 Elo above Stockfish 4 in 1-minute games (expect it to be less at long TC).
* according to all rating lists, Houdini 3 is more than 27 Elo ahead of SF 4.

That being said, it won't be long before SF becomes the strongest engine in the world. The fact that it is developed by an open source community and powered by monstrous testing resources makes it inevitable. The only question is when. No one can stand in the way of the Stockfish steamroller, at least not for long.

Recently I got bored of developing DiscoCheck on my own, with my lame testing resources, so I joined the SF team again. There are some politics, which is inevitable in teamwork (as opposed to developing on your own), but overall it's a more interesting and enriching experience.

Laskos · Post by **Laskos** » Sun Sep 08, 2013 3:06 pm

lucasart wrote:
Stockfish is not stronger than Houdini:
* the latest regression test shows that SF 20130907 is +27 Elo above Stockfish 4 in 1-minute games (expect it to be less at long TC).
* according to all rating lists, Houdini 3 is more than 27 Elo ahead of SF 4.

It is not clear at 40/40' and longer TC on 4 cores or more. The newest Komodo enters the scene too in these conditions, as it also scales well. We will see the nTCEC results.

mwyoung · Post by **mwyoung** » Sun Sep 08, 2013 3:27 pm

syzygy wrote:
jundery wrote: I've no doubt your test results are true for your test; I do, however, doubt they are statistically relevant. If you don't give the W/D/L statistics, no one can draw any conclusion beyond the fact that, in one test run by one tester, Stockfish won.
And I would add that there is something of a discrepancy between loudly announcing that Stockfish has dethroned Houdini 3 and being secretive about the basis for this announcement. If it was just a private test, meant only to satisfy your own curiosity and not to convince anyone else of anything, then there certainly was no need for any big headlines. But we've been here before (except that the previous time Stockfish was leading all its opponents, the number of games was mentioned and it was tiny).

I have those results. I have many hundreds of games. I said clearly that I can say Stockfish is now stronger than Houdini 3 at my tested time control, played on an i7.

What I won't do is get into a war about statistics. I have been a tester since the 1980s, so I have some knowledge in this area. I have been posting results on CCC since CCC first came online.

I will report my results, and if the statistics on my results say Stockfish is most likely stronger, I will say so.

As I suggested, don't take my test as your only data point. Stockfish is a free program. Test it for yourself...
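
For readers who want to check a "most likely stronger" claim against their own games, one measure commonly used by engine testers is the likelihood of superiority (LOS). The sketch below uses the standard normal approximation, in which only decisive games count. The function name and the example counts are hypothetical and are not taken from this thread.

```python
import math


def likelihood_of_superiority(wins, losses):
    """Approximate probability that the first engine is the stronger one.

    Normal approximation: draws carry no information about which side
    is stronger, so only wins and losses enter the formula.
    """
    decisive = wins + losses
    if decisive == 0:
        return 0.5  # no decisive games, no evidence either way
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * decisive)))


if __name__ == "__main__":
    # Hypothetical counts: the same win ratio at two sample sizes.
    for w, l in [(12, 8), (120, 80)]:
        print(f"+{w} -{l}: LOS = {likelihood_of_superiority(w, l):.3f}")
```

A high LOS only says which engine is probably ahead under the tested conditions; it says nothing about the size of the gap or about other time controls and hardware.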
