Stockfish 8 no test on inwoba.de

Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Stockfish 8 no test on inwoba.de

Post by Milos »

Results are there. Considering Ingo is using Phenom and Piledriver, and that of all official SF builds only the slowest one works on them, around 10 Elo should be added to SF in the rating list to get a real number.
Dan Cooper
Posts: 184
Joined: Sun Nov 01, 2015 3:15 am

Re: Stockfish 8 no test on inwoba.de

Post by Dan Cooper »

SF8 results just went up on fastGM site today. Why is no one angry at him and accusing him of a perception of bias?
Damir
Posts: 2801
Joined: Mon Feb 11, 2008 3:53 pm
Location: Denmark
Full name: Damir Desevac

Re: Stockfish 8 no test on inwoba.de

Post by Damir »

Milos,

I did not see any Stockfish 8 results on Ingo's website against other engines, particularly against Komodo 10.2. Why keep them hidden?
IWB
Posts: 1539
Joined: Thu Mar 09, 2006 2:02 pm

Re: Stockfish 8 no test on inwoba.de

Post by IWB »

Milos wrote:Results are there. Considering Ingo is using Phenom and Piledriver, and that of all official SF builds only the slowest one works on them, around 10 Elo should be added to SF in the rating list to get a real number.
You are really a strange guy - and wrong ... is anyone surprised?

I would not have answered, but to avoid any confusion:
This is the modern compile with POPCNT, so not the slowest one.
But even if it were the slowest one, the other engines could not use anything faster either, so in relative terms nothing changes. It is completely irrelevant for the rating whether ALL engines use POPCNT, BMI, or nothing; the ratio remains the same.
Dan Cooper
Posts: 184
Joined: Sun Nov 01, 2015 3:15 am

Re: Stockfish 8 no test on inwoba.de

Post by Dan Cooper »

Damir wrote:Milos,

I did not see any Stockfish 8 results on Ingo's website against other engines, particularly against Komodo 10.2. Why keep them hidden?
Hidden in plain sight.

Head to head statistics:

  1) Stockfish 8               3299 :   3300 (+2105,=1149,-46),  81.2 %

     vs.                             :  games (    +,    =,  -),   (%) :   Diff,   SD, CFS (%)
     Komodo 10.2                     :    220 (   61,  142, 17),  60.0 :    +30,    8,  100.0
     Shredder 13                     :    220 (  102,  114,  4),  72.3 :   +170,    7,  100.0
     Houdini 4                       :    220 (  153,   63,  4),  83.9 :   +178,    6,  100.0
     Gull 3                          :    220 (  127,   90,  3),  78.2 :   +233,    6,  100.0
     Ginkgo 1.8                      :    220 (  139,   80,  1),  81.4 :   +254,    7,  100.0
     Jonny 8.00                      :    220 (  139,   77,  4),  80.7 :   +268,    7,  100.0
     Equinox 3.30                    :    220 (  145,   72,  3),  82.3 :   +296,    6,  100.0
     Fizbo 1.8                       :    220 (  152,   66,  2),  84.1 :   +302,    7,  100.0
     Fritz 15                        :    220 (  157,   60,  3),  85.0 :   +302,    7,  100.0
     Critter 1.6a                    :    220 (  145,   73,  2),  82.5 :   +306,    6,  100.0
     Hannibal 1.7                    :    220 (  147,   73,  0),  83.4 :   +328,    7,  100.0
     Andscacs 0.88                   :    220 (  160,   60,  0),  86.4 :   +332,    7,  100.0
     Chiron 3.01                     :    220 (  166,   53,  1),  87.5 :   +360,    7,  100.0
     Protector 1.9.0                 :    220 (  165,   55,  0),  87.5 :   +363,    6,  100.0
     Nirvanachess 2.3                :    220 (  147,   71,  2),  83.0 :   +367,    7,  100.0
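
For anyone who wants to check the percentage column: each score is (wins + draws/2) / games. Below is a minimal Python sketch of that arithmetic. The function names are made up for illustration, and the conversion from score to Elo assumes the usual logistic Elo model, which may not be what the rating tool itself uses. Note that the Diff column above does not match the head-to-head figure (it is presumably the difference between the list ratings computed over all games).

```python
import math

def score_percent(wins, draws, losses):
    """Score in percent: a win counts 1, a draw counts 0.5."""
    games = wins + draws + losses
    return 100.0 * (wins + 0.5 * draws) / games

def implied_elo_diff(score_pct):
    """Elo difference implied by a score, assuming the logistic Elo model."""
    s = score_pct / 100.0
    return -400.0 * math.log10(1.0 / s - 1.0)

# Stockfish 8 vs. Komodo 10.2, taken from the table above.
komodo = score_percent(61, 142, 17)
print(round(komodo, 1))                 # 60.0
print(round(implied_elo_diff(komodo)))  # 70
```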
styx
Posts: 338
Joined: Tue Mar 13, 2012 9:59 pm
Location: Germany

Re: Stockfish 8 no test on inwoba.de

Post by styx »

Biased? Ingo published the test results today, and Stockfish 8 did a bit too well there for me to believe that this rating list is biased.

And my god, it's an engine rating list, not a credit rating agency... conspiracy theories are getting a bit out of hand here
mehmet karaman
Posts: 142
Joined: Tue Jan 28, 2014 8:37 am
Location: TURKEY

Re: Stockfish 8 no test on inwoba.de

Post by mehmet karaman »

Thanks Ingo for your wonderful rating list. I will wait for Houdini 5

http://www.inwoba.de/
Frank Quisinsky
Posts: 6808
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: Stockfish 8 no test on inwoba.de

Post by Frank Quisinsky »

Hi Damir,

Testing an engine with more than one core (for a rating list) is complete nonsense. Producing thousands of games is a waste of time, because everything that is extra ...

more than 1 core
more time
ponder on
5-piece tablebases

and so on can be calculated on your own.

Example:
How much stronger an engine is with 2, 4, 8 cores ... is a question of how big the factor is.

1.8 for 2 cores
3.3 for 4 cores

No more, no less.
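
A rough sketch of how such a factor could be turned into a rating estimate, assuming every doubling of effective speed is worth a fixed number of Elo. The 50 Elo per doubling used below is a purely hypothetical placeholder, not a figure from this thread, and the function name is invented for illustration.

```python
import math

# Effective speed-up factors quoted in the post above.
SPEEDUP = {2: 1.8, 4: 3.3}

def elo_gain(speedup, elo_per_doubling):
    """Estimated Elo gain, assuming a fixed gain per doubling of speed."""
    return elo_per_doubling * math.log2(speedup)

for cores, factor in SPEEDUP.items():
    # 50.0 is a hypothetical value; it has to be measured per engine and time control.
    print(cores, "cores:", round(elo_gain(factor, 50.0), 1), "Elo")
```

The real value per doubling depends on the engine and the time control, so it would have to be measured before the estimate means anything.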

TCEC is great for everyone who wants to watch live games. But it has nothing to do with getting statistical information ... how strong an engine is. TCEC can't produce many games.

Sure, some engines have better support for more than 1 core than others, but again ... you can find that out on your own with 4-8 test positions. There is no reason for a rating list to test it. In that time it is much more interesting to play more games against more different engines for a better and much more exact rating.

Best
Frank
Frank Quisinsky
Posts: 6808
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: **ALL** what we have here is not good enough!

Post by Frank Quisinsky »

Hi Norman,

IPON is a list with a very fast time control on older AMD processors. Games are not available. During the list the hardware was changed for around 3-4 years.

The same goes for CCRL or CEGT!
CEGT blitz ratings use 1-2 seconds per move, and 40 in 20 is around 1/3 of the time I am using on 4 GHz Intel i7 hardware.

But after all ...
Much more important ...

Consistency is the topic! Ingo can deliver results very quickly for a handful of engines (three handfuls of engines).

Here the work by Ingo is good and strong!!
Clear conditions all the time, same for CEGT, CCRL or the work by others, like the Stockfish results by Stefan Pohl or other systems ... very strong is the work by Arnaud LOHÉAC and Andreas Strangmüller.

Honestly ...

What is important for me:

A clean database (because I can use the database for opening books and statistics) ... the same hardware ... consistent work and a good time control. Only CCRL with 40 in 40 has longer time controls from which we can create good stats. CCRL's 40 in 40 is in reality 40 in 12 if I compare it with my conditions on i7 4.0 GHz hardware.

CCRL:
It is a farce to tell others 40 in 40 (for what is a 40 in 12 list on modern hardware today).

CEGT:
It is a farce to tell others 40 in 20 (for what is a 40 in 5 list on modern hardware today).
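
The conversions above amount to dividing the nominal time by a hardware speed ratio. A minimal sketch of that arithmetic follows; the ratios are simply back-calculated from the figures in this post (40/12 and 20/5), not measured, and the function names are invented for illustration.

```python
def implied_speed_ratio(nominal_minutes, equivalent_minutes):
    """Speed ratio between the two machines implied by a stated conversion."""
    return nominal_minutes / equivalent_minutes

def equivalent_minutes(nominal_minutes, speed_ratio):
    """Minutes on the faster machine matching nominal_minutes on the reference one."""
    return nominal_minutes / speed_ratio

ccrl_ratio = implied_speed_ratio(40, 12)  # about 3.3
cegt_ratio = implied_speed_ratio(20, 5)   # 4.0
print(round(ccrl_ratio, 2), cegt_ratio)
```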

But I can do nothing with all of this work because ...

- I don't like CEGT: too many book losses in the databases, too many quick draws in the database, and so on.

- I don't like IPON because the games are not available and they are very fast games on AMD hardware.

- I don't like CCRL because of too many clones and, in most cases, not enough games.

- I don't like that none of the rating systems has enough opponents.
- I don't like that many opening systems are never played.

CEGT 5 + 3 is interesting and my favorite of everything that is available, if I am not looking at my own work.

But that is my personal opinion, and I try to make it better. What we have isn't good enough for me. I am being honest here, because if I think something is not good enough for me, then it is very simple: make it better or shut up.

And if I look at my own ...
Nothing is perfect!

Today I don't like my own either.

My own also has various problems!

- 4-piece tablebases are bad ...
Testing without endgame databases is better!

- I have to check every game because my book is only 95% perfect with the way I try to optimize it.

---

Many people have worked very hard on CEGT and CCRL for some years.
This is good and bad.

An old list with old engines and older hardware cannot be good.

It is better to create a new one every 3-4 years with more modern hardware.
It is better if all of the people work on it together and form one single group.

Furthermore, testing privately available engines is very bad because nobody else can get them. We can do nothing with such information. But often the group of testers wants to have this one and that one ... more for the sake of having it than for the sake of testing it.

In the past so many people were interested in helping here, but once they had the private engines ... no more interest in working on the rating systems. I can give you more than 10 examples from the Winboard days. The same with requests like ... I want to be an administrator of a forum. OK, I gave out admin rights on the Arena forum, with the result ... most people caused more problems than they helped with development.

In any case ...
Testing privately available engines is really very bad for everyone ... it is only interesting for the programmers of those engines.

As long as we have different groups working on rating systems, nothing will be perfect. And the groups always do the same things and always produce the same problems.

We need clear conditions and many people, and then we can create much better things than we have today. What we have is good ... not good enough ... but after so many years of computer chess we ought to have much better things by now.

The main problem in computer chess ...
The teamwork could be much better.

Arena in the past with its official beta test forum ... Stockfish today ... and we can build the same for a better rating system.

A solo effort by Ingo or a solo effort by Frank ... such things show at most ... that most of us want to have things but have no interest in doing a bit for the group.

We expect this and that, but only 10-15% of computer chess people are able to contribute a bit. And of those 10-15%, most are lone fighters.

After all these years of computer chess ...
I know that and can live with it.

Best
Frank
Frank Quisinsky
Posts: 6808
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: Frustrating!

Post by Frank Quisinsky »

And all the different conditions ...

List A ... no Fire for an example I like to test Houdini (the clone of the predecessor of Fire).
CEGT, IPON

List B ... 200 Elo more than the others have.
CCRL

To bake buns ...
Not more not less!

He thinks he knows it all from half-information read here and there. Sure, I also have a lot of such half-information myself.

...

More problems of my own ...

- I have only the TOP-50 in my list, and there are more than 50 engines available.
Again: with more testers, many more things are possible. Everything would be much better with good organisation.

- My list is now three years old ... too old!

- I changed the testing method 3 times. OK, I try to make it better and better, but this too is bad.

and much more ...
Again, nothing is perfect!

And testing engines is really a lot of work. With everything that is available in the rating systems ... I can read a rating and nothing more. Today we can create interesting stats ... CCRL has a few, I have a few ... statisticians could do so many things with the games we have produced.

We need ...
One group of testers ... and people who create stats from the games for us. This is much more interesting than reading boring Elo figures only.

We also need a good and strong book for testing.
An additional group could work on a strong and balanced book.

Doing everything alone at home ... it cannot be better than what a group of people can do. I try my best, but I know every day ... the work is good but not perfect.

Again and yet again ...
That the work is good but not perfect is pure frustration!