LS-ratinglist (news & comments)

pohl4711 · Post by **pohl4711** » Thu Jul 25, 2013 6:11 pm

gladius wrote:
pohl4711 wrote: The test of Stockfish 130724 is running. I will finish that test whether there is progress or not, because the last finished test (10000 games) was Stockfish 130623 (a version one month older).
Result, if all goes well, on Sunday. Then we will see whether there is progress again, or a regression.

Stefan
Thanks Stefan! Looking forward to seeing the results. There should be a few more changes coming in today that will increase strength as well. The battle never ends.

I hope so!!

But I think one complete LS test run per month is really enough... If there is some unused PC power on my desk, I will perhaps do another Stockfish test in between. We will see.
Stay tuned.

Stefan

gladius · Post by **gladius** » Thu Jul 25, 2013 10:55 pm

A run per month is definitely more than enough. It's great to see the self-testing results validated against other opponents. And it really helps to see if there could be a regression in there as well!

pohl4711 · Post by **pohl4711** » Fri Jul 26, 2013 7:34 am

The result of Komodo 5.1r2 is now online

http://ls-ratinglist.beepworld.de/

(You may have to clear your browser cache or reload the website.)

Stefan

Uri Blass · Post by **Uri Blass** » Sat Jul 27, 2013 3:06 am

Note that the latest Stockfish version (26.07) is the best so far, based on Stockfish-vs-Stockfish tests:

http://tests.stockfishchess.org/tests

ELO: 48.92 +-3.6 (95%) LOS: 100.0%
Total: 16408 W: 4924 L: 2629 D: 8855

The result of the previous test, from 19.7.2013, was:

ELO: 39.83 +-3.1 (95%) LOS: 100.0%
Total: 20000 W: 5350 L: 3067 D: 11583
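
For readers who want to check such figures themselves, here is a minimal sketch of how an Elo difference, its 95% margin and the LOS can be derived from a W/L/D line. It assumes the usual logistic Elo model and a normal approximation of the score; whether the test site computes its numbers in exactly this way is an assumption, and the function names are made up for this illustration.

```python
import math


def elo_from_score(score):
    """Logistic Elo difference implied by an expected score in (0, 1)."""
    return 400.0 * math.log10(score / (1.0 - score))


def match_stats(wins, losses, draws, z=1.96):
    """Return (elo, margin, los) for one match result.

    z=1.96 gives a two-sided 95% interval under a normal approximation.
    """
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the score (win = 1, draw = 0.5, loss = 0).
    variance = (wins * (1.0 - score) ** 2
                + draws * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2) / games
    stderr = math.sqrt(variance / games)
    elo = elo_from_score(score)
    # Convert both ends of the score interval to Elo, report the half-width.
    margin = (elo_from_score(score + z * stderr)
              - elo_from_score(score - z * stderr)) / 2.0
    # Likelihood of superiority; draws cancel out in this approximation.
    los = 0.5 * (1.0 + math.erf((wins - losses)
                                / math.sqrt(2.0 * (wins + losses))))
    return elo, margin, los


# The two results quoted in the post above (wins, losses, draws).
results = {"26.07": (4924, 2629, 8855), "19.07": (5350, 3067, 11583)}
for label, (w, l, d) in results.items():
    elo, margin, los = match_stats(w, l, d)
    print(f"{label}: ELO {elo:.2f} +-{margin:.1f} (95%) LOS {los:.1%}")
```

With these inputs the sketch should reproduce the quoted Elo values and margins to within rounding. Note that both results are measured against the same fixed base version, not against each other.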

pohl4711 · Post by **pohl4711** » Sat Jul 27, 2013 7:50 am

The test of Stockfish 130724 was already running when this new version came out.
The next test of a Stockfish development version for the LS-ratinglist will come soon, but it is impossible for me to test all versions...

Stefan

Eelco de Groot · Post by **Eelco de Groot** » Sat Jul 27, 2013 1:03 pm

If you can, please test Stockfish 130724, Stefan! The Stockfish team would very much like to know if there really was a 13 Elo regression somewhere. We did not find it in self-testing, but against Houdini we may have gotten worse somewhere, although nobody found anything as large as 13 Elo. Well, at least I am very curious about this result!

Thank you,
Eelco

pohl4711 · Post by **pohl4711** » Sat Jul 27, 2013 1:39 pm

I wrote that the test of Stockfish 130724 is running, and I will finish it. That's why I didn't want to restart with Stockfish 130726.
Final result of Stockfish 130724 on Sunday on my LS website, if all goes well (it's very, very hot in Berlin at the moment, I hope my PCs don't crash...). What I can say now is that this version is not a regression but a (small) improvement over Stockfish 130623, which was the latest full test run of a Stockfish in the LS-ratinglist. But there are still 2000 games to play...

And the regression of 13 Elo that I found in Stockfish 130721 is not certain. I aborted that test run after 2600 games, so the error bar of that result (Stockfish 130721) is +/-10 Elo. So perhaps the regression is only -3 Elo?!?

Best - Stefan
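
The error-bar argument in this post can be put into numbers with a small sketch. It takes the +/-10 Elo at 2600 games stated above as the reference point and assumes that the error bar shrinks with the square root of the number of games (that is, the draw rate and score level stay roughly the same). The function names are hypothetical and only serve this illustration.

```python
import math


def scaled_margin(ref_margin, ref_games, games):
    """Scale an Elo error bar to another game count, assuming ~1/sqrt(N)."""
    return ref_margin * math.sqrt(ref_games / games)


def games_for_margin(ref_margin, ref_games, target_margin):
    """Games needed to reach target_margin under the same assumption."""
    return math.ceil(ref_games * (ref_margin / target_margin) ** 2)


# Reference values taken from the post: +/-10 Elo after 2600 games.
REF_MARGIN, REF_GAMES = 10.0, 2600

# The aborted run measured -13 Elo, so the plausible range is:
measured = -13.0
low, high = measured - REF_MARGIN, measured + REF_MARGIN
print(f"Range after {REF_GAMES} games: {low:+.0f} to {high:+.0f} Elo")

# Error bar a full 10000-game run would have under the same assumption.
full_run = scaled_margin(REF_MARGIN, REF_GAMES, 10000)
print(f"Error bar after 10000 games: +/-{full_run:.1f} Elo")

# Games needed to narrow the error bar to +/-5 Elo.
print(f"Games for +/-5 Elo: {games_for_margin(REF_MARGIN, REF_GAMES, 5.0)}")
```

The upper end of that range is the "-3 Elo" mentioned above. Halving the error bar takes roughly four times as many games, which is why a full run is so much more reliable than an aborted one.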

pohl4711 · Post by **pohl4711** » Sun Jul 28, 2013 9:06 am

The result of Stockfish 130724 is now online. Next test: Stockfish 130727. Let's see if the super-patch of Tom Vijlbrief is really worth a +10 Elo increase (against the top 10 engines in computer chess, not only against Stockfish 3)... Stay tuned!

http://ls-ratinglist.beepworld.de

(You may have to clear your browser cache or reload the website.)

Stefan

pohl4711 · Post by **pohl4711** » Sun Jul 28, 2013 1:19 pm

Intermediate result of Stockfish 130727: 2100 games played, +12 Elo over Stockfish 130724(!). But after 2100 games Stockfish 130724 was +6 Elo over Stockfish 130623, and at the end (after 10000 games) it was less than +1 Elo. So I believe that Stockfish 130727 will end up around 5-7 Elo stronger than Stockfish 130724; that would still be a great result for one patch. But we will see (on Wednesday, if all goes well).

Stefan

Eelco de Groot · Post by **Eelco de Groot** » Sun Jul 28, 2013 2:49 pm

Great! Thanks Stefan! Some of the credit for this version should go to Ryan Takker, who rewrote some of the piece-square tables and reduced their impact while improving Elo. But it also shows that king safety could still be improved, and that not everybody is immune to king attacks yet.

I hope your testcomputer room is not like a sauna!

Eelco
