LS-ratinglist (news & comments)

Discussion of computer chess matches and engine tournaments.

Moderator: Ras

pohl4711
Posts: 2901
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: LS-ratinglist (news & comments)

Post by pohl4711 »

gladius wrote:
pohl4711 wrote:The test of Stockfish 130724 is running. I will finish that test whether there is progress or not, because the last finished test (10000 games) was Stockfish 130623 (a version one month older).
Result, if all works correctly, on Sunday. Then we will see whether there is progress again or a regression.

Stefan
Thanks Stefan! Looking forward to seeing the results. There should be a few more changes coming in today that will increase strength as well. The battle never ends :).
I hope so!!

But I think a complete LS-testrun per month is really enough...But if there is some unused PC-power on my desk, I will perhaps do another Stockfish-test in between...We will see.
Stay tuned.

Stefan
gladius
Posts: 568
Joined: Tue Dec 12, 2006 10:10 am
Full name: Gary Linscott

Re: LS-ratinglist (news & comments)

Post by gladius »

pohl4711 wrote:
gladius wrote:
pohl4711 wrote:The test of Stockfish 130724 is running. I will finish that test whether there is progress or not, because the last finished test (10000 games) was Stockfish 130623 (a version one month older).
Result, if all works correctly, on Sunday. Then we will see whether there is progress again or a regression.

Stefan
Thanks Stefan! Looking forward to seeing the results. There should be a few more changes coming in today that will increase strength as well. The battle never ends :).
I hope so!!

But I think a complete LS-testrun per month is really enough...But if there is some unused PC-power on my desk, I will perhaps do another Stockfish-test in between...We will see.
Stay tuned.

Stefan
A run per month is definitely more than enough :). It's great to see the self-testing results validated against other opponents. And it really helps to see if there could be a regression in there as well!
pohl4711
Posts: 2901
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: LS-ratinglist (news & comments)

Post by pohl4711 »

The result of Komodo 5.1r2 is now online

http://ls-ratinglist.beepworld.de/

(Perhaps you have to clear your browser cache or reload the website)


Stefan
Uri Blass
Posts: 11153
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: LS-ratinglist (news & comments)

Post by Uri Blass »

pohl4711 wrote:
gladius wrote:
pohl4711 wrote:The test of Stockfish 130724 is running. I will finish that test whether there is progress or not, because the last finished test (10000 games) was Stockfish 130623 (a version one month older).
Result, if all works correctly, on Sunday. Then we will see whether there is progress again or a regression.

Stefan
Thanks Stefan! Looking forward to seeing the results. There should be a few more changes coming in today that will increase strength as well. The battle never ends :).
I hope so!!

But I think a complete LS-testrun per month is really enough...But if there is some unused PC-power on my desk, I will perhaps do another Stockfish-test in between...We will see.
Stay tuned.

Stefan
Note that the latest Stockfish version (26.07) is the best so far, based on Stockfish-vs-Stockfish tests:

http://tests.stockfishchess.org/tests

ELO: 48.92 +-3.6 (95%) LOS: 100.0%
Total: 16408 W: 4924 L: 2629 D: 8855

The result of the previous test, from 19.7.2013, was:

ELO: 39.83 +-3.1 (95%) LOS: 100.0%
Total: 20000 W: 5350 L: 3067 D: 11583
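
For readers who want to check such figures from the raw win/loss/draw counts, here is a minimal sketch. It assumes the standard logistic Elo model and a normal approximation for the 95% interval and for LOS (likelihood of superiority); whether the Stockfish test site computes its numbers in exactly this way is an assumption, and the function names are made up for illustration. With the counts quoted above it comes out very close to the posted values.

```python
import math


def elo_from_score(score):
    """Logistic Elo difference for an average score strictly between 0 and 1."""
    return 400.0 * math.log10(score / (1.0 - score))


def elo_summary(wins, losses, draws, z=1.96):
    """Return (elo, margin, los) for a match result.

    Hypothetical helper, not taken from any test framework.
    z=1.96 corresponds to a two-sided 95% interval.
    """
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games

    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    variance = (
        wins * (1.0 - score) ** 2
        + losses * (0.0 - score) ** 2
        + draws * (0.5 - score) ** 2
    ) / games
    stderr = math.sqrt(variance / games)

    elo = elo_from_score(score)
    # Half-width of the interval, converted from score units to Elo.
    upper = elo_from_score(score + z * stderr)
    lower = elo_from_score(score - z * stderr)
    margin = (upper - lower) / 2.0

    # LOS: probability that the first engine is the stronger one (draws ignored).
    los = 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))
    return elo, margin, los


if __name__ == "__main__":
    for label, result in [("26.07", (4924, 2629, 8855)), ("19.07", (5350, 3067, 11583))]:
        elo, margin, los = elo_summary(*result)
        print(f"{label}: ELO {elo:.2f} +-{margin:.1f} (95%) LOS {100 * los:.1f}%")
```

The sketch breaks down for a 0% or 100% score (division by zero or log of zero), which does not matter for matches of this size.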
pohl4711
Posts: 2901
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: LS-ratinglist (news & comments)

Post by pohl4711 »

Uri Blass wrote:
Note that the latest Stockfish version (26.07) is the best so far, based on Stockfish-vs-Stockfish tests:

http://tests.stockfishchess.org/tests

ELO: 48.92 +-3.6 (95%) LOS: 100.0%
Total: 16408 W: 4924 L: 2629 D: 8855

The result of the previous test, from 19.7.2013, was:

ELO: 39.83 +-3.1 (95%) LOS: 100.0%
Total: 20000 W: 5350 L: 3067 D: 11583

The test of Stockfish 130724 was already running when this new version came out.
The next test of a Stockfish development version for the LS-ratinglist will come soon, but it is impossible for me to test all versions...

Stefan
Eelco de Groot
Posts: 4697
Joined: Sun Mar 12, 2006 2:40 am
Full name: Eelco de Groot

Re: LS-ratinglist (news & comments)

Post by Eelco de Groot »

pohl4711 wrote:
Uri Blass wrote:
Note that the latest Stockfish version (26.07) is the best so far, based on Stockfish-vs-Stockfish tests:

http://tests.stockfishchess.org/tests

ELO: 48.92 +-3.6 (95%) LOS: 100.0%
Total: 16408 W: 4924 L: 2629 D: 8855

The result of the previous test, from 19.7.2013, was:

ELO: 39.83 +-3.1 (95%) LOS: 100.0%
Total: 20000 W: 5350 L: 3067 D: 11583

The test of Stockfish 130724 was already running when this new version came out.
The next test of a Stockfish development version for the LS-ratinglist will come soon, but it is impossible for me to test all versions...

Stefan
If you can, please test Stockfish 130724, Stefan! The Stockfish team would very much like to know whether there really was a 13 Elo regression somewhere. We did not find it in self-testing, but against Houdini we may have gotten worse somewhere, although nobody found anything as large as 13 Elo. Well, at least I am very curious about this result!

Thank you,
Eelco
Debugging is twice as hard as writing the code in the first
place. Therefore, if you write the code as cleverly as possible, you
are, by definition, not smart enough to debug it.
-- Brian W. Kernighan
pohl4711
Posts: 2901
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: LS-ratinglist (news & comments)

Post by pohl4711 »

Eelco de Groot wrote:
pohl4711 wrote:
Uri Blass wrote:
Note that the latest Stockfish version (26.07) is the best so far, based on Stockfish-vs-Stockfish tests:

http://tests.stockfishchess.org/tests

ELO: 48.92 +-3.6 (95%) LOS: 100.0%
Total: 16408 W: 4924 L: 2629 D: 8855

The result of the previous test, from 19.7.2013, was:

ELO: 39.83 +-3.1 (95%) LOS: 100.0%
Total: 20000 W: 5350 L: 3067 D: 11583

The test of Stockfish 130724 was already running when this new version came out.
The next test of a Stockfish development version for the LS-ratinglist will come soon, but it is impossible for me to test all versions...

Stefan
If you can, please test Stockfish 130724, Stefan! The Stockfish team would very much like to know whether there really was a 13 Elo regression somewhere. We did not find it in self-testing, but against Houdini we may have gotten worse somewhere, although nobody found anything as large as 13 Elo. Well, at least I am very curious about this result!

Thank you,
Eelco
I wrote that the test of Stockfish 130724 is running...and I will finish it. That's why I didn't want to restart with Stockfish 130726.
Final result of Stockfish 130724 on Sunday on my LS website, if all works correctly (it's very, very hot in Berlin at the moment, I hope my PCs don't crash...). What I can say now is that this version is not a regression, but a (small) improvement over Stockfish 130623, which was the latest full testrun of a Stockfish in the LS-ratinglist...but there are still 2000 games to play...

And the regression of 13 Elo that I found in Stockfish 130721 is not certain. I aborted that testrun after 2600 games, so the error bar of that result (Stockfish 130721) is +/-10 Elo. So perhaps the regression is only -3 Elo?!?

Best - Stefan
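
The error-bar reasoning above can be sketched in a few lines. This is only an illustration under stated assumptions: the helper names are hypothetical, and the scaling to 10000 games assumes the per-game variance stays the same (same opponents, same draw rate), so the margin shrinks with the square root of the number of games. The source gives only the +/-10 Elo figure at 2600 games; the scaled value is an estimate, not a reported result.

```python
import math


def interval(estimate, margin):
    """Range of true values compatible with a measured estimate +/- margin."""
    return estimate - margin, estimate + margin


def scaled_margin(margin, games, target_games):
    """Rescale an error margin from `games` to `target_games`.

    Assumes the margin is proportional to 1/sqrt(number of games),
    i.e. the per-game variance does not change between runs.
    """
    return margin * math.sqrt(games / target_games)


if __name__ == "__main__":
    # Measured -13 Elo with a +/-10 Elo error bar after 2600 games.
    low, high = interval(-13.0, 10.0)
    print(f"compatible range: {low:+.0f} to {high:+.0f} Elo")

    # Rough margin one would expect for a full 10000-game run.
    full_run = scaled_margin(10.0, 2600, 10000)
    print(f"estimated margin at 10000 games: +/-{full_run:.1f} Elo")
```

So a true regression of only -3 Elo sits at the edge of the interval, which is why the aborted run cannot settle the question on its own.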
pohl4711
Posts: 2901
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: LS-ratinglist (news & comments)

Post by pohl4711 »

The result of Stockfish 130724 is now online. Next test: Stockfish 130727. Let's see if the super-patch of Tom Vijlbrief is really worth a +10 Elo increase (against the top 10 engines of computer chess, not only against Stockfish 3)...Stay tuned!

http://ls-ratinglist.beepworld.de

(Perhaps you have to clear your browser cache or reload the website)


Stefan
pohl4711
Posts: 2901
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

Re: LS-ratinglist (news & comments)

Post by pohl4711 »

pohl4711 wrote:The result of Stockfish 130724 is now online. Next test: Stockfish 130727. Let's see if the super-patch of Tom Vijlbrief is really worth a +10 Elo increase (against the top 10 engines of computer chess, not only against Stockfish 3)...Stay tuned!
Intermediate result of Stockfish 130727: 2100 games played, +12 Elo over Stockfish 130724(!). But after 2100 games Stockfish 130724 was +6 Elo over Stockfish 130623, and at the end (after 10000 games) it was less than +1 Elo...So I believe that Stockfish 130727 will end up around 5-7 Elo stronger than Stockfish 130724, which would still be a great result for one patch. But we will see (on Wednesday, if all works correctly).

Stefan
Eelco de Groot
Posts: 4697
Joined: Sun Mar 12, 2006 2:40 am
Full name: Eelco de Groot

Re: LS-ratinglist (news & comments)

Post by Eelco de Groot »

Great! Thanks Stefan! Some of the credit for this version should go to Ryan Takker, who rewrote and reduced the impact of some of the piece-square tables while improving Elo. But it also shows that king safety can still be improved and that not everybody is immune to king attacks yet :) I hope your test computer room is not like a sauna!

Eelco