Strongest Fruit - what's the verdict?

mclane
Posts: 18753
Joined: Thu Mar 09, 2006 6:40 pm
Location: US of Europe, germany
Full name: Thorsten Czub

Re: Strongest Fruit - what's the verdict?

Post by mclane »

what do you count in your statistics when you judge chess games ?

you count

1-0
1/2-1/2
0-1

and from 1 chess game your statistician gets 1 piece of information out of 3 possible ones. of course if you want to judge anything with this method you need at least 1000 chess games / experiments to get a clue what is going on.

when a human being watches a computer chess game
between 2 engines A and B, he gets hundreds of pieces of information EACH move: which main line, which score, when A saw it, when B did.
so 100 pieces of information each move makes not 3 states but 100x80 states.

you want to reduce life to a number. in the case of chess you reduce these 100x80 pieces of information to the 3 states above.

you reduce the quality of information, therefore you need a quantity of at least 1000 events to get a clue.

a human watching a chess game and seeing 100x80 different pieces of information in this 1 game gets a lot more information, of a higher quality.

therefore getting information from watching an event is far superior to reducing the information and trying to find out from the reduced information.

if you meet a human being and try to reduce him to 3 criteria to judge him by, it is a very flat view of the human being you met.
and in the same way judging chess games by 3 criteria is a very flat view of the topic.

bean counter is you do.
menniepals
Posts: 265
Joined: Wed Mar 08, 2006 8:31 pm
Location: Houston, Texas

Re: Strongest Fruit - what's the verdict?

Post by menniepals »

Durian is also very tasty once you get over with the smell. It is healthy and delicious. It is one of the expensive fruits from South East Asia.
nczempin

Re: Strongest Fruit - what's the verdict?

Post by nczempin »

mclane wrote:what do you count in your statistics when you judge chess games ?

you count

1-0
1/2-1/2
0-1

and from 1 chess game your statistician gets 1 piece of information out of 3 possible ones. of course if you want to judge anything with this method you need at least 1000 chess games / experiments to get a clue what is going on.

when a human being watches a computer chess game
between 2 engines A and B, he gets hundreds of pieces of information EACH move: which main line, which score, when A saw it, when B did.
so 100 pieces of information each move makes not 3 states but 100x80 states.

you want to reduce life to a number. in the case of chess you reduce these 100x80 pieces of information to the 3 states above.

you reduce the quality of information, therefore you need a quantity of at least 1000 events to get a clue.

a human watching a chess game and seeing 100x80 different pieces of information in this 1 game gets a lot more information, of a higher quality.

therefore getting information from watching an event is far superior to reducing the information and trying to find out from the reduced information.

if you meet a human being and try to reduce him to 3 criteria to judge him by, it is a very flat view of the human being you met.
and in the same way judging chess games by 3 criteria is a very flat view of the topic.

bean counter is you do.
Where did you get that figure, 1000?

If one version plays 0-10, and the other version plays 10-0, there is considerable evidence that this difference is not likely to be random. There is no magic number, like 1000 or anything. But there are methods to determine when a result is significant for a given confidence level.
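One such method, as a rough sketch: treat every decisive game as a coin flip under the assumption that both versions are equally strong, and ask how likely a split at least this lopsided would be. The function name `sign_test_p_value` is made up for this post, and ignoring draws is a simplifying assumption, not something a serious rating tool would do.

```python
from math import comb


def sign_test_p_value(wins: int, losses: int) -> float:
    """Two-sided exact binomial p-value using decisive games only.

    Null hypothesis: each decisive game is a fair coin flip.
    Draws are ignored (a simplifying assumption for this sketch).
    """
    n = wins + losses
    if n == 0:
        return 1.0
    k = max(wins, losses)
    # Probability of a split at least this lopsided in one direction.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # Double it for the two-sided test; cap at 1.0 for near-even splits.
    return min(1.0, 2 * tail)


print(sign_test_p_value(10, 0))  # 0.001953125
print(sign_test_p_value(6, 4))   # 0.75390625
```

A 10-0 sweep would happen by chance about twice in a thousand matches between equal engines, while 6-4 says next to nothing, and neither needs 1000 games to tell you so.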

And for the difference you saw in your experiment/tournament, there is intuitively (and I think this would be confirmed by statistical analysis) not sufficient evidence to conclude that the higher-scoring engine is stronger than the other.

There are many issues that contribute to variance in such tournaments (especially at the relatively high level that all the Fruit versions play at), the opening book being one of them. I am not sure whether you have really taken care to reduce the variance enough that the result could be considered anything other than random.
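To put a rough number on that variance, here is a sketch that turns a win/draw/loss record into an Elo difference with an approximate 95% interval. It assumes the usual logistic Elo model and a normal approximation, the function names are made up for this post, and the 30/45/25 record is purely hypothetical, not taken from the tournament discussed here.

```python
from math import log10, sqrt


def elo_from_score(score: float) -> float:
    """Elo difference implied by an expected score (logistic model)."""
    # Clamp so that a 0% or 100% score does not divide by zero.
    s = min(max(score, 1e-6), 1 - 1e-6)
    return -400 * log10(1 / s - 1)


def elo_with_interval(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (estimate, low, high) in Elo, using a normal approximation."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0 points).
    var = (wins * (1 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0 - score) ** 2) / n
    margin = z * sqrt(var / n)
    return (elo_from_score(score),
            elo_from_score(score - margin),
            elo_from_score(score + margin))


# Hypothetical 100-game match: 30 wins, 45 draws, 25 losses.
est, low, high = elo_with_interval(30, 45, 25)
print(f"{est:+.0f} Elo, 95% interval {low:+.0f} to {high:+.0f}")
```

With a record like that the interval comfortably includes zero, so the higher score alone would not show which side is stronger.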
nczempin

Re: Strongest Fruit - what's the verdict?

Post by nczempin »

mclane wrote:what do you count in your statistics when you judge chess games ?

you count

1-0
1/2-1/2
0-1

and from 1 chess game your statistician gets 1 information out of 3 possible informations. of course if you want to judge anything with this method you need at least 1000 chess games / experiments if you want to get a clue what is going on.

when a human beeing watches a computer chess game
between 2 engines A and B, he gets EACH move hundres of informations, which main line, which score, when did A saw it, when B.
so 100 of information each move makes not 3 stages but 100X80 stages.

you want to reduce life into a number. in the case of chess you reduce these many information 100X80 into 3 stages above.

you reduce the quality of information, therefore you need at least a quantity of 1000 events to get a clue.

a human watching a chess game and seeing 100x80 different informations in this 1 game gets a lot more of information, a higher quality.

therefore getting information from watching an event is far superior than reducing the information and trying to find out from the reduced information.

if you see a human beeing and try to reduce it to 3 criteria how to judge the human beeing, it is a very flat point of view from the human beeing you met.
and in the same way judging chess games from 3 criteria is a very flat view about the topic.

bean counter is you do.
Strong the force is in you.


You did not mention any parts of games that let you conclude that the one version is stronger than the other; you only pointed us at your tournament results.

The very flat view of the strength of engines is that only the results count to determine if one engine is stronger than another, and not how they were achieved.

I don't see how anything is wrong with that.
nczempin

Re: Strongest Fruit - what's the verdict?

Post by nczempin »

George Tsavdaris wrote:
mclane wrote:statistic is the effort to measure something others know from experience.
LOL! :D

Statistics beat experience many many times.....
If you have a feeling about something, you don't know if it's true. Statistics will tell you if it is.....
Statistics cannot tell you if anything is true, only whether there is sufficient evidence at a given confidence level (meaning how many times out of 100 you are willing to be wrong).
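To make "how many times out of 100" concrete, this sketch computes how often two exactly equal engines would still produce a "significant" result at a chosen level. It assumes decisive games only, each a fair coin flip, and a two-sided exact binomial test; `false_alarm_rate` is a name made up for this post.

```python
from math import comb


def false_alarm_rate(games: int, alpha: float = 0.05) -> float:
    """Chance that two equal engines produce a 'significant' result.

    Assumes decisive games only, each a fair coin flip.
    """
    # Probability of each possible number of wins for one side.
    pmf = [comb(games, k) / 2 ** games for k in range(games + 1)]
    rate = 0.0
    for k in range(games + 1):
        extreme = max(k, games - k)
        # Two-sided p-value for the split k : games - k.
        p = min(1.0, 2 * sum(pmf[extreme:]))
        if p <= alpha:
            rate += pmf[k]
    return rate


print(false_alarm_rate(10))   # 0.021484375
print(false_alarm_rate(100))
```

The rate never exceeds the chosen level, which is all the confidence level promises: a bound on how often you will be fooled, not a proof that any single result is true.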
mclane

Re: Strongest Fruit - what's the verdict?

Post by mclane »

but that's your major problem, that you make statements without knowing. therefore you need statistics :-)

why do you think at a computer chess event the programmers look at the screen in front of them, and don't rely on your 1-0 / 1/2 / 0-1 numbers instead ? why do they put the 2 screens together in such a way that they can relate engine A to engine B in ONE view ?

if your method of statistics were accurate, the programmers could let the autoplayer or GUI run the tournament, and sleep or go shopping or sightseeing and come back AFTER the game and YOU would tell them

1-0
0-1 or 1/2-1/2.

if no information got lost, it would be AS good, wouldn't it ?
mclane

Re: Strongest Fruit - what's the verdict?

Post by mclane »

You did not mention any parts of games that let you conclude that the one version is stronger than the other; you only pointed us at your tournament results.
why do you complain. for you the result is enough, isn't it ?

here it is: 56 !

now conclude from that :-)
nczempin

Re: Strongest Fruit - what's the verdict?

Post by nczempin »

mclane wrote:but that's your major problem, that you make statements without knowing. therefore you need statistics :-)

why do you think at a computer chess event the programmers look at the screen in front of them, and don't rely on your 1-0 / 1/2 / 0-1 numbers instead ? why do they put the 2 screens together in such a way that they can relate engine A to engine B in ONE view ?

if your method of statistics were accurate, the programmers could let the autoplayer or GUI run the tournament, and sleep or go shopping or sightseeing and come back AFTER the game and YOU would tell them

1-0
0-1 or 1/2-1/2.

if no information got lost, it would be AS good, wouldn't it ?
I am an engine programmer, perhaps you are unaware of that. I am pretty sure that I know better than you what we look for in games and in results.

Let me explain:

At events they/we look at individual games to look for specific pointers on what parts to improve. But one game can easily tell you that you are weak or strong in one particular area, but not how changing it will affect the other parts overall.

Just because we look at individual games doesn't mean we can't also look at results.

And you can be sure that all the successful programmers look at results, plus some of the less successful ones. (or, let's say successful==coder of a strong engine).

It is not my method of statistics, it is the universally valid method of statistics taught at schools and universities.

There are many tournaments in which autoplayer works fine, viz. Olivier Deville's ChessWar and OpenWar tourneys.

Looking at an individual game will help you find bugs as well as holes in the opening book. In particular at high-level events, coders often try out things, and if they find out that they don't seem to work, they will take them out for the next round.

The only way to "know" _objectively_ anything that is not deterministic, and engine-engine matches at the highest levels certainly aren't, is to use statistical methods.

You obviously don't have even a basic grasp of scientific methods, and your knowledge about chess engines seems cursory at best. At least the way you insist on being right would indicate this.
nczempin

Re: Strongest Fruit - what's the verdict?

Post by nczempin »

mclane wrote:
You did not mention any parts of games that let you conclude that the one version is stronger than the other; you only pointed us at your tournament results.
why do you complain. for you the result is enough, isn't it ?

here it is: 56 !

now conclude from that :-)
The result you quoted led to the conclusion that it is not possible, just based on those results, to say which version of Fruit is the strongest.

You complained that I wasn't taking into account the individual games, but since you didn't give them, how did you expect anybody to conclude anything without them?
nczempin

Re: Strongest Fruit - what's the verdict?

Post by nczempin »

mclane wrote:there have been 1512 games.


enough to draw conclusions about the software. and - if you have the chance to watch the games when they are played - enough to draw conclusions about the software.

if you do not believe this is possible it is imo your problem.

computerchess is imo nothing where you work with statistics.
statistics happens when the products are finished and sold, then people come and do statistics.
but when you develop a program you need more than statistics.

more than bean counting.
What is your conclusion?

BTW out of those 1512, the large majority are completely irrelevant to answering the question of which is the strongest Fruit version.