Developer tests of Stockfish need Stockfish 8 instead of Stockfish 7

sovaz1997
Posts: 217
Joined: Sun Nov 13, 2016 9:37 am

Re: Developer tests of Stockfish need Stockfish 8 instead of Stockfish 7

Post by sovaz1997 » Tue Jun 12, 2018 12:04 pm

corres wrote:
Tue Jun 12, 2018 11:40 am
AndrewGrant wrote:
Mon Jun 11, 2018 11:35 pm
No comment...
Edgy closers don't replace statistics.
If you are interested in statistics please, make those statistics.
The data are public.
To me it is obvious that losing 110 Elo over 44 games is not a statistical issue.
Switch on mathematical statistics and switch off feelings. I think the topic can be closed. Why should we read this nonsense?

And yes: SF 9 is used in the regression tests, not SF 8/SF 7, lol
Zevra chess engine
Binary, source and description here: https://gitlab.com/sovaz1997/zevra2/tags

corres
Posts: 1398
Joined: Wed Nov 18, 2015 10:41 am
Location: hungary

Re: Developer tests of Stockfish need Stockfish 8 instead of Stockfish 7

Post by corres » Tue Jun 12, 2018 12:55 pm

sovaz1997 wrote:
Tue Jun 12, 2018 12:04 pm

And yes: SF 9 is used in the regression tests, not SF 8/SF 7, lol
If the Stockfish developers used only regression tests, that would be a serious mistake.
If you are so clever, please explain to us where the 110 Elo loss comes from.

sovaz1997
Posts: 217
Joined: Sun Nov 13, 2016 9:37 am

Re: Developer tests of Stockfish need Stockfish 8 instead of Stockfish 7

Post by sovaz1997 » Tue Jun 12, 2018 1:30 pm

corres wrote:
Tue Jun 12, 2018 12:55 pm
sovaz1997 wrote:
Tue Jun 12, 2018 12:04 pm

And yes: SF 9 is used in the regression tests, not SF 8/SF 7, lol
If the Stockfish developers used only regression tests, that would be a serious mistake.
If you are so clever, please explain to us where the 110 Elo loss comes from.
A small number of games. Large error in measurements.
I advise you to stop thinking and start calculating.

I do not think you are smarter than the Stockfish developers. Write to them on Fishcooking.

corres
Posts: 1398
Joined: Wed Nov 18, 2015 10:41 am
Location: hungary

Re: Developer tests of Stockfish need Stockfish 8 instead of Stockfish 7

Post by corres » Tue Jun 12, 2018 2:19 pm

sovaz1997 wrote:
Tue Jun 12, 2018 1:30 pm

A small number of games. Large error in measurements.
I advise you to stop thinking and start calculating.

I do not think you are smarter than the Stockfish developers. Write to them on Fishcooking.
If you know how to do anything other than provoke people, demonstrate your knowledge with constructive actions.
I made my statements before you did, so if you disagree with me, refute them with calculated data instead of accusations.
I am very curious to see your work.

sovaz1997
Posts: 217
Joined: Sun Nov 13, 2016 9:37 am

Re: Developer tests of Stockfish need Stockfish 8 instead of Stockfish 7

Post by sovaz1997 » Tue Jun 12, 2018 2:48 pm

corres wrote:
Tue Jun 12, 2018 2:19 pm
sovaz1997 wrote:
Tue Jun 12, 2018 1:30 pm

A small number of games. Large error in measurements.
I advise you to stop thinking and start calculating.

I do not think you are smarter than the Stockfish developers. Write to them on Fishcooking.
If you know how to do anything other than provoke people, demonstrate your knowledge with constructive actions.
I made my statements before you did, so if you disagree with me, refute them with calculated data instead of accusations.
I am very curious to see your work.
OK. Look at the ERROR column: it is more than 100 Elo in both directions (plus and minus):

ordo-win64.exe -a 3400 -A "Komodo 12" -W -s1000 games.pgn

   # PLAYER              : RATING  ERROR   POINTS  PLAYED    (%)
   1 Komodo 12           : 3400.0   ----     29.5      46   64.1%
   2 Stockfish 160518    : 3394.4  103.7     29.5      46   64.1%
   3 Houdini 6.03        : 3376.1  104.3     27.5      45   61.1%
   4 Fire 7              : 3272.1  103.1     21.0      45   46.7%
   5 Andscacs 0.93070    : 3250.9  103.4     19.5      46   42.4%
   6 Ginkgo 2.014        : 3219.4  101.8     17.0      45   37.8%
   7 Jonny 8.1           : 3191.8  105.9     15.0      45   33.3%
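
As a rough cross-check of the ERROR column, the sketch below estimates a 95% interval for an Elo performance from a score of 29.5 points in 46 games. It is not how ordo computes its errors (the `-s1000` run above is simulation-based and relative to the anchor). The function names are hypothetical, and the code assumes independent games and ignores draws, which overstates the variance somewhat.

```python
import math


def elo_from_score(p: float) -> float:
    """Elo difference implied by an expected score fraction p (logistic model)."""
    return -400.0 * math.log10(1.0 / p - 1.0)


def elo_interval(points: float, games: int, z: float = 1.96):
    """Approximate (low, mid, high) Elo performance interval.

    Assumption: each game is treated as a win or a loss, so the per-game
    variance is p * (1 - p). Draws lower the variance, so this is an upper
    bound on the error, not a reproduction of ordo's simulated error.
    """
    p = points / games
    se = math.sqrt(p * (1.0 - p) / games)
    # Clamp so the logistic formula stays defined at the extremes.
    lo = max(p - z * se, 1e-9)
    hi = min(p + z * se, 1.0 - 1e-9)
    return elo_from_score(lo), elo_from_score(p), elo_from_score(hi)


low, mid, high = elo_interval(29.5, 46)
print(f"performance {mid:+.0f} Elo, 95% interval {low:+.0f} .. {high:+.0f}")
```

Under these assumptions the half-width of the interval comes out on the order of 100 Elo, the same magnitude as the ERROR column.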

corres
Posts: 1398
Joined: Wed Nov 18, 2015 10:41 am
Location: hungary

Re: Developer tests of Stockfish need Stockfish 8 instead of Stockfish 7

Post by corres » Tue Jun 12, 2018 3:55 pm

sovaz1997 wrote:
Tue Jun 12, 2018 2:48 pm

OK. Look at the ERROR column: it is more than 100 Elo in both directions (plus and minus):

ordo-win64.exe -a 3400 -A "Komodo 12" -W -s1000 games.pgn

   # PLAYER              : RATING  ERROR   POINTS  PLAYED    (%)
   1 Komodo 12           : 3400.0   ----     29.5      46   64.1%
   2 Stockfish 160518    : 3394.4  103.7     29.5      46   64.1%
   3 Houdini 6.03        : 3376.1  104.3     27.5      45   61.1%
   4 Fire 7              : 3272.1  103.1     21.0      45   46.7%
   5 Andscacs 0.93070    : 3250.9  103.4     19.5      46   42.4%
   6 Ginkgo 2.014        : 3219.4  101.8     17.0      45   37.8%
   7 Jonny 8.1           : 3191.8  105.9     15.0      45   33.3%

After 46 games Stockfish stands at -111 Elo.
That is more than the error you calculated.
If you had run tests yourself, you would know from practice that by the end of the Division P games Stockfish's Elo loss will be similar.
Only the error will decrease.
So?

sovaz1997
Posts: 217
Joined: Sun Nov 13, 2016 9:37 am

Re: Developer tests of Stockfish need Stockfish 8 instead of Stockfish 7

Post by sovaz1997 » Tue Jun 12, 2018 4:04 pm

If we play a lot of games, we will see the real ratings. Let me remind you: Stockfish got such a high rating because it played brilliantly in the finals. And I ask you: do not judge an engine's playing strength by TCEC.

The table shows that the ratings have an error of about 100 Elo points. This means you cannot draw conclusions about whether the engine has become weaker or stronger. One thing I can say for sure: it has become stronger, as shown by a large number of test games with an error of at most 5 Elo points.
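
To put the gap between an error of 100 Elo and an error of 5 Elo into numbers, here is a minimal sketch of how many games a given 95% error margin requires. The function name and the `draw_ratio` parameter are hypothetical, and the sketch assumes two engines of nearly equal strength (expected score close to 50%) and independent games.

```python
import math


def games_needed(target_margin_elo: float, draw_ratio: float = 0.0,
                 z: float = 1.96) -> int:
    """Games needed to reach a given 95% Elo error margin.

    Assumptions (illustrative only): expected score near 50%, independent
    games, and a fixed draw ratio.
    """
    # Slope of the logistic Elo curve at p = 0.5, in Elo per unit of score.
    elo_per_score = 1600.0 / math.log(10)
    # Per-game score variance at p = 0.5 when a fraction of games is drawn.
    variance = 0.25 * (1.0 - draw_ratio)
    # Standard error of the score fraction that corresponds to the target.
    se_needed = target_margin_elo / (z * elo_per_score)
    return math.ceil(variance / se_needed ** 2)


for margin in (100, 5):
    print(margin, "Elo margin needs about", games_needed(margin), "games")
```

Because the error shrinks only with the square root of the number of games, a margin 20 times smaller needs about 400 times as many games. Under these assumptions a 5 Elo margin needs roughly 18,500 games without draws, and fewer when many games are drawn.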

corres
Posts: 1398
Joined: Wed Nov 18, 2015 10:41 am
Location: hungary

Re: Developer tests of Stockfish need Stockfish 8 instead of Stockfish 7

Post by corres » Tue Jun 12, 2018 4:44 pm

sovaz1997 wrote:
Tue Jun 12, 2018 4:04 pm
If we play a lot of games, we will see the real ratings. Let me remind you: Stockfish got such a high rating because it played brilliantly in the finals. And I ask you: do not judge an engine's playing strength by TCEC.
The table shows that the ratings have an error of about 100 Elo points. This means you cannot draw conclusions about whether the engine has become weaker or stronger. One thing I can say for sure: it has become stronger, as shown by a large number of test games with an error of at most 5 Elo points.
Please read my reply to Mr. Blass.
I wrote about relative weakening.
To calculate absolute Elo differences we do need a lot of games.
But it is not for you to dictate my opinions.
You would do better to learn some patience and politeness.

sovaz1997
Posts: 217
Joined: Sun Nov 13, 2016 9:37 am

Re: Developer tests of Stockfish need Stockfish 8 instead of Stockfish 7

Post by sovaz1997 » Tue Jun 12, 2018 4:54 pm

corres wrote:
Tue Jun 12, 2018 4:44 pm
sovaz1997 wrote:
Tue Jun 12, 2018 4:04 pm
If we play a lot of games, we will see the real ratings. Let me remind you: Stockfish got such a high rating because it played brilliantly in the finals. And I ask you: do not judge an engine's playing strength by TCEC.
The table shows that the ratings have an error of about 100 Elo points. This means you cannot draw conclusions about whether the engine has become weaker or stronger. One thing I can say for sure: it has become stronger, as shown by a large number of test games with an error of at most 5 Elo points.
Please read my reply to Mr. Blass.
I wrote about relative weakening.
To calculate absolute Elo differences we do need a lot of games.
But it is not for you to dictate my opinions.
You would do better to learn some patience and politeness.
I do not know English well enough to judge how polite my writing is.

Just do not look at the TCEC ratings: they are not accurate, and they are always computed under different conditions. There are very few games. Last time SF was simply a little luckier than now, and that is the difference.

corres
Posts: 1398
Joined: Wed Nov 18, 2015 10:41 am
Location: hungary

Re: Developer tests of Stockfish need Stockfish 8 instead of Stockfish 7

Post by corres » Tue Jun 12, 2018 5:19 pm

sovaz1997 wrote:
Tue Jun 12, 2018 4:54 pm

I do not know English well enough to judge how polite my writing is.
Patience and politeness depend not on knowledge of English but on one's attitude.
