GGT2

Discussion of computer chess matches and engine tournaments.

Moderator: Ras

rainhaus
Posts: 187
Joined: Sun Feb 01, 2009 7:26 pm
Location: Germany
Full name: Rainer Neuhäusler

Re: GGT2/7-12/ Stockfish wins 2 gambits/ Firebird still lead

Post by rainhaus »

Hello Frank,

Back to community

The relapse rate among us fellow sufferers of computer chess is high! Only a few have stayed clean for good. On the German side, H.J. Schumacher, M. Gurevich and M. Kästner come to mind, programmers and business people excluded. Could it be that "Mike" Scheidl and Thorsten Czub are just trying it too? Still, I think that a return to computer chess isn't a lack of endurance but often arises from a deep desire to find the truth. What kind of truth? Finding the best engine, of course!

Nobody could ignore that a well-known and deserving guy has returned to computer chess again. With full power, lengthy and noisy as ever, with a sharp eye for everybody's good ideas for the sake of his own applications, and with that certain sense and apodictic language of an important mission.
Of course I regularly keep myself informed through Frank's very interesting SCHACHWELT column. Frank, you really have a lot of work with it, and I hope private obligations leave you in peace. It's quite a while since we talked in the ChessBits forum, which was suddenly abandoned after the tragic death of its caring moderator "Elvis", and I remember your nightly talks in the CSS forum too. Sometimes I would like to rummage through the old threads and posts; however, the "CSS-Archive" project seems to be buried for good :( Let's put an end to nostalgia, we are international here.

About your recommendations

Don't worry, my computer and I are cooperating very well with the available resources; the tournament hasn't just been running since yesterday. 6 GB are sufficient by far, that's for sure! Supported by Task Manager, Sidebar, ASUS-Suite and the node and hash counters of the GUI, I'm always informed about what's going on. With your specifications and your knowledge of multi-core processing it would crash daily, indeed :) How do you arrive at the adventurous 4 GB hash? 4 cores doesn't mean 4x 1.2 GB hash and 4x GUI and 4x EGTB! No, only 1.2 GB hash is used in total, no matter how many threads you activate. It doesn't matter either whether you are running a process-based engine like Rybka or a thread-based engine like all the others. Look at the memory mini-tool in the Sidebar, change the number of processors in the parameters, and you will see a constant percentage of memory usage in the display. Imagine a waggon being filled by 4 workers, something like that... Perhaps this misunderstanding is one of the reasons why there are still so many single users :) By the way, I'm using only 3 cores for the tournament.
You seem to be a little pressured by a self-imposed mission to propagate norms and reference values. Why the limit of 760 MB hash? How fast the hash tables fill up depends on the CPU speed, and in this case also on the number of activated threads and on the architecture of the engine. For example, Zappa II fills 1.2 GB hash within 40 seconds under GGT conditions! In other words, the program could use even more.

But now, back to my tournament, the engines play one round after another and I should have written a report long ago.

Greetings to Trier
Rainer
Frank Quisinsky
Posts: 7044
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: GGT2/7-12/ Stockfish wins 2 gambits/ Firebird still lead

Post by Frank Quisinsky »

Hi Rainer,

Oh, a very nice answer from you.
Yes, I remember the old times and ELVIS. I often phoned with him at night for many hours, several times a week. ELVIS and I were working at night in those days and we had a lot of fun. I miss him.

OK, I wrote all this to the wrong person, sorry for that. I follow your tournament; it is very interesting.

Thanks for your time and work and your friendly answer.

My regards
Best
Frank
rainhaus
Posts: 187
Joined: Sun Feb 01, 2009 7:26 pm
Location: Germany
Full name: Rainer Neuhäusler

GGT2/ rnd 1-24/12 Gambits/Total Scores

Post by rainhaus »

GGT2 Total Scores Round 1-24; 12 Gambits; Opening 21-32; Eco C02-C40: Nimzo,Danish,Calabrese,Lewis,Wing,Urusov,Falkbeer,Charousek,WildMuzio,Hanstein,Allgaier,Elephant

Code: Select all

                Fire       Stock      DRybka     Naum       DFritz     DShred     ZapMex     Points
                ----------------------------------------------------------------------------------
FireBird 1.2     *****     12.5-11.5  13.0-11.0  15.0- 9.0  16.5- 7.5  17.5-6.5   17.0-7.0   91.5
Stockfish1.7.1  11.5-12.5    *****    11.0-13.0  12.5-11.5  14.0-10.0  15.5-8.5   19.0-5.0   83.5
DeepRybka 3     11.0-13.0  13.0-11.0    *****    12.0-12.0  15.0- 9.0  13.0-11.0  15.5-8.5   79.5
Naum 4.2         9.0-15.0  11.5-12.5  12.0-12.0    *****    16.5- 7.5  15.0-9.0   15.0-9.0   79.0
DeepFritz 12     7.5-16.5  10.0-14.0   9.0-15.0   7.5-16.5    *****    13.0-11.0  13.5-10.5  60.5
DeepShredder12   6.5-17.5   8.5-15.5  11.0-13.0   9.0-15.0  11.0-13.0   *****     14.0-10.0  60.0
ZappaMexico II   7.0-17.0   5.0-19.0   8.5-15.5   9.0-15.0  10.5-13.5  10.0-14.0   *****     50.0
-------------------------------------------------------------------------------------------------
504 games
Trends
(Previous comments in italics)
So far FireBird controls all the other engines with the exception of Stockfish.
Stockfish is no longer FireBird's feared opponent. To say it martially, the FIRE-finished Ippo/Robbo/Igor-blade seems to be steeled against every engine (clang..spark..ouch..) :shock: With the exception of Stockfish, FireBird's superiority over the rest of the best is significant at the 95% level (see Elo table below).
New Naum is considerably stronger than its predecessor but seems to be very afraid of "Fire".
Yes, this tendency continues. I've said it before: a few more Ippo/Robbo/Igor genes wouldn't be bad!
Rybka 3 is finally dethroned...
Yes, as far as this tournament is concerned! Of course, Rybka 3 will keep its leadership in the common lists because of FireBird's absence. Rybka 3 will be replaced by Rybka 4, and you'll get an even more Rybka-overcrowded competition and a purified top class of engines. Allowing for hell, you are better advised to read the Great Gambit Tournament :twisted:
Within a few months Shredder lost contact with the engines in front, but it still belongs to the Best of Five.
Now archrival Fritz is also wrangling for this rank. If the moves and positions are favourable, both engines even succeed in winning a gambit (see the single rounds).
NewFritz is a bitter disappointment for the friends of pure playing strength.
The engine must have heard this and won the Allgaier Gambit instantly. It will not happen very often, but it entertains the spectator.
Orphaned old Zappa is still spoiling the scores of Shredder and Fritz.
Yes, but now and then it makes a mess of all the other engines too.

GGT2 Rank Frequency Round 1-24

Code: Select all

                  1st  2nd  3rd  4th  5th  6th  7th  gambits
                  ------------------------------------- 
FireBird 1.2      3    5    4    0    0    0    0    12 
Stockfish 1.7.1   2    2    3    4    0    1    0    12 
Deep Rybka 3      4    1    3    2    0    2    0    12 
Naum 4.2          1    4    2    2    2    0    1    12 
Deep Fritz 12     1    0    0    1    4    4    2    12 
Deep Shredder 12  1    0    0    1    4    3    3    12 
Zappa Mexico II   0    0    0    2    2    2    6    12 
                  --------------------------------------
gambits           12   12   12   12   12   12   12
An unusual but interesting statistic. More clearly than points and Elo, the table shows FireBird's tournament strength. It's the only engine that ranks among the top three in all 12 gambits: 3x first, 5x second, 4x third! In contrast, Stockfish was 4x fourth, Rybka 2x sixth and Naum 1x last.
But Deep Rybka 3's 4 first places are not to be sneezed at.

GGT2 Total Performance Round 1-24

Code: Select all

                  Games  W   D   L   Points Perform
FireBird1.2 newSMP 144   +59 =65 -20  91.5   63%
Stockfish 1.7.1    144   +54 =59 -31  83.5   57%
DeepRybka 3        144   +48 =63 -33  79.5   55%
Naum 4.2           144   +51 =56 -37  79.0   54%
DeepFritz 12       144   +33 =55 -56  60.5   42%
DeepShredder 12    144   +32 =56 -56  60.0   41%
ZappaMexico II     144   +25 =50 -69  50.0   34%
----------------------------------------------------------
504 games
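For readers who want to re-check the Points and Perform columns, here is a minimal sketch, assuming the usual scoring of 1 point per win and 0.5 per draw. The function names are my own, and the W/D/L numbers are copied from the table above.

```python
# (wins, draws, losses) copied from the performance table above
results = {
    "FireBird 1.2": (59, 65, 20),
    "Stockfish 1.7.1": (54, 59, 31),
    "DeepRybka 3": (48, 63, 33),
    "Naum 4.2": (51, 56, 37),
    "DeepFritz 12": (33, 55, 56),
    "DeepShredder 12": (32, 56, 56),
    "ZappaMexico II": (25, 50, 69),
}


def points(wins, draws, losses):
    """Tournament points: 1 per win, 0.5 per draw (assumed standard scoring)."""
    return wins + 0.5 * draws


def score_percent(wins, draws, losses):
    """Points as a percentage of games played."""
    games = wins + draws + losses
    return 100.0 * points(wins, draws, losses) / games


for name, wdl in results.items():
    print(f"{name:16s} {sum(wdl):4d} {points(*wdl):6.1f} {score_percent(*wdl):5.1f}%")
```

The unrounded percentages agree with the Score column of the Elo table; the Perform column appears to show the same values cut down to whole percent.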
GGT2 Total Elo-Ranking Round 1-24 with CEGT-Calibration

Code: Select all

  Program         Elo    +   -   Games   Score.  Draws   CEGT
1 FireBird1.2     3193   43  42   144    63.5 %   45.1%  0000
2 Stockfish1.7.1  3158   44  44   144    58.0 %   41.0%  3158
3 DeepRybka3      3141   43  43   144    55.2 %   43.8%  3181
4 Naum4.2         3139   45  45   144    54.9 %   38.9%  3139
5 DeepFritz12     3062   45  45   144    42.0 %   38.2%  3048
6 DeepShredder12  3060   45  45   144    41.7 %   38.9%  3062
7 ZappaMexicoII   3016   47  47   144    34.7 %   34.7%  3018
------------------------------------------------------------------------
504 games; Starting Value EloStat:3110; List: CEGT 40/20,4 threads, May 2010
Rybka excluded, there is an excellent correlation between CEGT and GGT after 504 games. To interpret Rybka's deviation, let's wait for further results.

Played games GGT2: 12 gambits=24 rounds

Code: Select all

                         Games 
1  engine/round             6
1  engine/gambit           12 (double round, switched colours)
1  engine pair/12 gambits  24 (for instance, FireBird against Rybka)
1  gambit                  42 (7x6)
1  engine/12gambits       144 (12x12)
12 gambits                504 (42x12)
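The game counts in this table follow from a double round robin with 7 engines per gambit. A small sketch of that arithmetic (the function name and dictionary keys are only illustrative):

```python
def ggt_game_counts(engines, gambits):
    """Game counts for a double round robin (colours switched) per gambit."""
    per_engine_per_round = engines - 1               # one game against each opponent
    per_engine_per_gambit = 2 * per_engine_per_round  # double round
    per_gambit = engines * (engines - 1)              # every pairing twice
    return {
        "engine/round": per_engine_per_round,
        "engine/gambit": per_engine_per_gambit,
        "engine pair/all gambits": 2 * gambits,
        "gambit": per_gambit,
        "engine/all gambits": per_engine_per_gambit * gambits,
        "all gambits": per_gambit * gambits,
    }


for label, count in ggt_game_counts(engines=7, gambits=12).items():
    print(f"{label:26s} {count:5d}")
```

Calling it with `gambits=30` gives the totals for the complete tournament.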
PGN-Link:
http://www.file-upload.net/download-255 ... 2.pgn.html
Next Monday: extended report: rounds 13-24

Book
50 gambit starting positions. GGT1:Eco00 - B44. GGT2:EcoC02 - E60
Test conditions
Time Control: tournament level 40/20', 20/10', 10'+12''
System: Intel Core i7 920, oc 3600-3800 MHz, 6 GB DDR3 RAM. Vista 64
Hyperthreading off, Turbo Mode off.
Engine parameters: 3 threads. Ponder off. 1.2 GB Hash.
EGTB 3,4,5: Nalimov, TotalBases, sometimes TripleBases. Stockfish doesn't use EGTB. Bitbases are not needed. FireBird's TotalBases and RAM-resident TripleBases don't always work properly.
Fritz12-GUI: draw late, resign late/never.
beram
Posts: 1187
Joined: Wed Jan 06, 2010 3:11 pm

Re: GGT2/ rnd 1-24/12 Gambits/Total Scores

Post by beram »

Very nice and thank you Rainer,
couldn't wait till Monday so I made a first EloStat run myself:
Fire 1.2 is 52 Elo ahead of Rybka 3 in this competition at LTC (!)
If Rybka 4 is 60 Elo better than R3, as the reliable Larry Kaufman says (and the first test results also indicate), we can presumably look forward to a head-to-head race between Rybka 4 and Fire 1.3, and hopefully a new Stockfish 1.8

Kind regards, Bram

Program Elo + - Games Score Av.Op. Draws

1 FireBird1.2newSMPx64 : 2883 43 42 144 63.5 % 2786 45.1 %
2 Stockfish1.7.1x64 : 2848 44 44 144 58.0 % 2792 41.0 %
3 DeepRybka3x64 : 2831 43 43 144 55.2 % 2795 43.8 %
4 Naum4.2x64 : 2829 45 45 144 54.9 % 2795 38.9 %
5 DeepFritz12 : 2752 45 45 144 42.0 % 2808 38.2 %
6 DeepShredder12x64 : 2750 45 45 144 41.7 % 2809 38.9 %
7 ZappaMexicoIIx64 : 2706 47 47 144 34.7 % 2816 34.7 %
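
As a rough cross-check of the Elo column, one can add the Elo difference implied by the score to the average opponent rating (Av.Op.). The sketch below assumes the common logistic approximation, 400*log10(p/(1-p)); this is not necessarily the exact formula EloStat uses, so small deviations are expected.

```python
import math


def elo_diff(score):
    """Elo difference implied by a score fraction (logistic approximation)."""
    return 400.0 * math.log10(score / (1.0 - score))


# (program, score in %, average opponent Elo, listed Elo) from the table above
rows = [
    ("FireBird1.2newSMPx64", 63.5, 2786, 2883),
    ("Stockfish1.7.1x64", 58.0, 2792, 2848),
    ("DeepRybka3x64", 55.2, 2795, 2831),
    ("Naum4.2x64", 54.9, 2795, 2829),
    ("DeepFritz12", 42.0, 2808, 2752),
    ("DeepShredder12x64", 41.7, 2809, 2750),
    ("ZappaMexicoIIx64", 34.7, 2816, 2706),
]

for name, pct, av_op, listed in rows:
    estimate = av_op + elo_diff(pct / 100.0)
    print(f"{name:22s} listed {listed}  estimate {estimate:7.1f}")
```

With the numbers above, the estimates land within a couple of Elo points of the listed values.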
rainhaus
Posts: 187
Joined: Sun Feb 01, 2009 7:26 pm
Location: Germany
Full name: Rainer Neuhäusler

Re: GGT2/ rnd 1-24/12 Gambits/Total Scores

Post by rainhaus »

Thanks again for enjoying my tournament, Bram.

- What do you mean by LTC? Longer Time Control?
- You don't have to wait till Monday; you can already find an Elo table in my report.
- Of course, a head-to-head race would be much more exciting than watching a busting fish inbreeding in an aquarium :)
- If you want a formatted table, then use no tabs and no proportional font, and apply the CODE option.

Code: Select all

    Program                          Elo    +   -   Games   Score   Av.Op.  Draws

  1 FireBird1.2newSMPx64           : 2883   43  42   144    63.5 %   2786   45.1 %
  2 Stockfish1.7.1x64              : 2848   44  44   144    58.0 %   2792   41.0 %
  3 DeepRybka3x64                  : 2831   43  43   144    55.2 %   2795   43.8 %
  4 Naum4.2x64                     : 2829   45  45   144    54.9 %   2795   38.9 %
  5 DeepFritz12                    : 2752   45  45   144    42.0 %   2808   38.2 %
  6 DeepShredder12x64              : 2750   45  45   144    41.7 %   2809   38.9 %
  7 ZappaMexicoIIx64               : 2706   47  47   144    34.7 %   2816   34.7 %
Cheers
Rainer


beram
Posts: 1187
Joined: Wed Jan 06, 2010 3:11 pm

Re: GGT2/ rnd 1-24/12 Gambits/Total Scores

Post by beram »

Hi Rainer, indeed I do mean Long Time Control.
I must have overlooked the rating table, sorry.
Thx for the tip about the CODE option, I will try that in future; things look prettier then.

Are you going to try Fire 1.3 and Rybka 4 in this tournament or in the near future?

It is a shame that CEGT and CCRL won't try Fire, they probably can't stand the heat in the kitchen :)

regards Bram
rainhaus
Posts: 187
Joined: Sun Feb 01, 2009 7:26 pm
Location: Germany
Full name: Rainer Neuhäusler

Re: GGT2/ rnd 1-24/12 Gambits/Total Scores

Post by rainhaus »

beram wrote:Hi Rainer, indeed I do mean Long Time Control.
I must have overlooked the rating table, sorry.
Thx for the tip about the CODE option, I will try that in future; things look prettier then.
Are you going to try Fire 1.3 and Rybka 4 in this tournament or in the near future?
Of course I'll let them fight each other, no mercy for the calculators :mrgreen: However, I don't need to compete for the first results. I'm very interested in how the tournament, or rather the participants, will perform in the next gambits. A lot of d4-d5 and d4-Nf6 openings are coming, like the Queen's Gambit and the Indian variations, which have a more moderate character than the King's Gambit just played. It could be that Rybka will make up some points in this opening area.
beram wrote: It is a shame that CEGT and CCRL won't try Fire, they probably can't stand the heat in the kitchen :)
regards Bram

Everybody should feel free to test an engine or not, and there should be no need to justify it. And so it happens that I'm testing FireBird and others don't.

Cheers
Rainer
rainhaus
Posts: 187
Joined: Sun Feb 01, 2009 7:26 pm
Location: Germany
Full name: Rainer Neuhäusler

GGT2/ rnd 13-24/ The Single Rounds (with a few surprises)

Post by rainhaus »

After the Total Scores now a few data about the single rounds.

The Single Rounds

FireBird 1.2 SMP wins C31 Falkbeer Countergambit
Opening: C30-C39 KING'S GAMBIT
Round 13/14, Position 27
1.e4 e5 2.f4 d5


Ernst Karl Falkbeer (Brno 1819–1885 Vienna) was an Austrian chess master and journalist. Historic Elo: 2524

Code: Select all

                                  1  2  3  4  5  6  7  
1 FireBird 1.2 x64              ** ½0 ½1 10 ½1 1½ 1½   7.5/12
2 Naum 4.2 x64                  ½1 ** ½1 ½½ 0½ 11 0½   7.0/12
3 Rybka 3 x64                   ½0 ½0 ** 1½ ½½ ½½ 1½   6.0/12
4 Stockfish 1.7.1 JA x64        01 ½½ 0½ ** ½½ ½½ 10   5.5/12  33.50
5 Zappa Mexico II x64           ½0 1½ ½½ ½½ ** 0½ 10   5.5/12  33.50
6 Deep Shredder 12 x64          0½ 00 ½½ ½½ 1½ ** 1½   5.5/12  31.00
7 Deep Fritz 12                 0½ 1½ 0½ 01 01 0½ **   5.0/12
-------------------------------------------------------------
    42 games
Still everything is fine. Again a double round which separates correctly an upper group A with FireBird, Rybka, Stockfish, Naum and a lower group B with Shredder, Fritz and Zappa.

Deep Rybka 3 wins C32 Falkbeer Gambit, Charousek Variation
Opening: C30-C39 KING'S GAMBIT
Round 15/16, Position 28
1.e4 e5 2.f4 d5 3.exd5 e4 4.d3 Nf6 5.dxe4 Nxe4 6.Qe2


Rudolf Rezso Charousek (1873 Prague – 1900 Budapest), historic Elo 2734, was a Hungarian-Jewish chess player. Reuben Fine described this brilliant player as the John Keats of chess. He defeated world champion Emanuel Lasker at Nuremberg in 1896. Lasker was so impressed that he said “I shall have to play a championship match with this man some day”. However, that day never came because Charousek had a tragically short career, dying at the age of 27 from tuberculosis. Charousek is also immortalized in literature. One of the three central figures of the famous novel THE GOLEM by GUSTAV MEYRINK is the medical student and chess player Innozenz Charousek: "..standing beside me was Charousek, the collar of his thin, threadbare coat turned up; I could hear his teeth chattering. The poor student will catch his death of cold in this icy, draughty archway.."
http://en.wikipedia.org/wiki/Rudolf_Charousek
http://www.britannica.com/bps/additiona ... Der-Golem-

Code: Select all

                                  1  2  3  4  5  6  7  
1   Rybka 3 x64                   ** 1½ ½½ ½½ 11 ½½ 1½   8.0/12
2   Naum 4.2 x64                  0½ ** ½½ ½½ 1½ ½1 11   7.5/12
3   FireBird 1.2 newSMP x64       ½½ ½½ ** ½½ ½1 1½ ½½   7.0/12  39.25
4   Stockfish 1.7.1 JA x64        ½½ ½½ ½½ ** ½½ ½½ 11   7.0/12  39.00
5   Deep Shredder 12 x64          00 0½ ½0 ½½ ** ½½ ½1   4.5/12
6   Deep Fritz 12                 ½½ ½0 0½ ½½ ½½ ** 00   4.0/12  26.75
7   Zappa Mexico II x64           0½ 00 ½½ 00 ½0 11 **   4.0/12  21.25
    -----------------------------------------------------------
    42 games 
That's a perfect fit of A and B.

Deep Shredder 12 wins C37 Wild Muzio Gambit
Opening: C30-C39 KING'S GAMBIT
Round 17/18, Position 29
1.e4 e5 2.f4 exf4 3.Nf3 g5 4.Bc4 g4 5.Bxf7


Named after the chess player Muzio d'Alessandro from Naples (17th century). A typical gambit of that time, featuring a quick and uncompromising attack against the king with a piece sacrifice on f7.

Code: Select all

                              1  2  3  4  5  6  7  
1   Deep Shredder 12 x64      ** 01 01 10 11 11 01   8.0/12
2   Stockfish 1.7.1 JA x64    10 ** 10 11 01 01 1½   7.5/12
3   FireBird 1.2 newSMP x64   10 01 ** 10 01 01 11   7.0/12
4   Zappa Mexico II x64       01 00 01 ** 10 10 01   5.0/12  29.50
5   Deep Fritz 12             00 10 10 01 ** 10 01   5.0/12  29.00
6   Rybka 3 x64               00 10 10 01 01 ** 10   5.0/12  29.00
7   Naum 4.2 x64              10 0½ 00 10 10 01 **   4.5/12
------------------------------------------------------------
    42 games
Revolution 1
The Wild Muzio doesn't seem to be a good choice for the gambit player: White lost 37 games (88%). The piece sacrifice can hardly be compensated. The three engines in front are exactly those that managed to win as White (Shredder even 2x). For the first time (in 1176 games, GGT1 included) an engine of the lower group won a gambit. Another novelty: two engines from the upper group have sunk to the bottom. A crazy opening indeed. Maybe someone would like to analyse where the engines went wrong in this gambit variation!?

Deep Rybka 3 x64 wins C38 Hanstein Gambit
Opening: C30-C39 KING'S GAMBIT
Round 19/20, Position 30
1.e4 e5 2.f4 exf4 3.Nf3 g5 4.Bc4 Bg7 5.0-0


Wilhelm Hanstein (1811, Berlin–1850, Magdeburg) was a German chess player and writer. He was one of the founders and members of the influential Berliner Schachschule and one of the strongest chess players of his time.

Code: Select all

                               1  2  3  4  5  6  7  
1   Rybka 3 x64                ** 1½ 11 1½ ½1 00 01   7.5/12
2   Naum 4.2 x64               0½ ** 01 01 11 10 1½   7.0/12
3   FireBird 1.2 newSMP x64    00 10 ** ½1 ½1 10 1½   6.5/12  35.75
4   Stockfish 1.7.1 JA x64     0½ 10 ½0 ** 01 ½1 11   6.5/12  34.25
5   Deep Fritz 12              ½0 00 ½0 10 ** 11 11   6.0/12
6   Deep Shredder 12 x64       11 01 01 ½0 00 ** 10   5.5/12
7   Zappa Mexico II x64        10 0½ 0½ 00 00 01 **   3.0/12
------------------------------------------------------------
42 games
The statistics have been called to order. A normal race; upper and lower group are represented correctly.

Deep Fritz 12 wins C39 Allgaier Gambit
Opening: C30-C39 KING'S GAMBIT
Round 21/22, Position 31
1.e4 e5 2.f4 exf4 3.Nf3 g5 4.h4 g4 5.Ng5


Johann Baptist Allgaier (1763, Schussenried–1823, Vienna) was a German-Austrian chess master and theoretician. Known as the "German Philidor", he was reputed to be the strongest chess player in Europe in the Napoleonic era. It is assumed that he was temporarily one of the hidden operators of the famous TURK (Automaton Chess Player by W.v. Kempelen).

Code: Select all

                                  1  2  3  4  5  6  7  
1   Deep Fritz 12                 ** ½1 10 11 10 ½1 ½1   8.5/12
2   FireBird 1.2 newSMP x64       ½0 ** ½1 1½ ½½ ½1 11   8.0/12
3   Stockfish 1.7.1 JA x64        01 ½0 ** ½½ 10 11 11   7.5/12
4   Zappa Mexico II x64           00 0½ ½½ ** ½1 11 01   6.0/12
5   Naum 4.2 x64                  01 ½½ 01 ½0 ** ½0 10   5.0/12
6   Rybka 3 x64                   ½0 ½0 00 00 ½1 ** 01   3.5/12  19.25
7   Deep Shredder 12 x64          ½0 00 00 10 01 10 **   3.5/12  18.75
    -----------------------------------------------------------
    42 games 
Revolution 2
The King's Gambit makes it possible! Yet another winner from the lower group in quick succession, probably caused by a similarly bad opening line. And once again Naum and Rybka break down!

Deep Rybka 3 wins C40 Elephant Gambit
Opening:
Round 23/24, Position 32
1.e4 e5 2.Nf3 d5

In my study of the sources I found no hint of what the Elephant stands for. According to Wikipedia the gambit is also called Queen's Pawn Counter Gambit, Englund Counterattack or Maroczy Gambit. In German it is also called "Mittelgambit im Nachzug".

Code: Select all

                                  1  2  3  4  5  6  7  
1   Rybka 3 x64                   ** ½½ 0½ 11 11 01 11   8.5/12
2   FireBird 1.2 newSMP x64       ½½ ** 01 1½ ½1 1½ 01   7.5/12
3   Naum 4.2 x64                  1½ 10 ** 01 01 10 ½1   7.0/12
4   Deep Fritz 12                 00 0½ 10 ** ½½ 11 1½   6.0/12
5   Deep Shredder 12 x64          00 ½0 10 ½½ ** 1½ ½1   5.5/12
6   Stockfish 1.7.1 JA x64        10 0½ 01 00 0½ ** 10   4.0/12
7   Zappa Mexico II x64           00 10 ½0 0½ ½0 01 **   3.5/12
    ------------------------------------------------------------
    42 games
Groups A and B with minor flaws!

Next
GGT2 C40: Latvian, C47 Belgrade and Halloween
rainhaus
Posts: 187
Joined: Sun Feb 01, 2009 7:26 pm
Location: Germany
Full name: Rainer Neuhäusler

Re: GGT2/Total Scores/rnd 1-36/Fire leads, Rybka stagnates

Post by rainhaus »

GGT2/Total Scores, rnd 1-36/Firebird leads, Rybka stagnates

GGT2 Total Scores Round 1-36; 18 Gambits; Opening 21-38; Eco C02-C58; Nimzo,Danish,Calabrese,Lewis,Wing,Urusov,Falkbeer,Charousek,WildMuzio,Hanstein,Allgaier,Elephant, Latvian,Belgrade,Halloween,Fegatello,2x Two Knights.
new: Belgrade, Halloween, Italian, Fegatello, Two Knights

Code: Select all

                Fire       Stock      Naum      DRybka      DFritz     DShred     ZapMex     Points
                ----------------------------------------------------------------------------------
FireBird 1.2     *****     18.5-17.5  21.5-14.5  19.0-17.0  25.0-11.0  25.0-11.0  25.5-10.5  134.5
Stockfish1.7.1  17.5-18.5    *****    20.0-16.0  17.0-19.0  21.5-14.5  24.0-12.0  28.5- 7.5  128.5
Naum4.2         14.5-21.5  16.0-20.0    *****    19.5-16.5  24.0-12.0  23.5-12.5  24.0-12.0  121.5
Deep Rybka3     17.0-19.0  19.0-17.0  16.5-19.5    *****    22.5-13.5  21.5-14.5  23.0-13.0  119.5
DeepFritz12     11.0-25.0  14.5-21.5  12.0-24.0  13.5-22.5    *****    20.0-16.0  17.0-19.0   88.0 
DeepShredder12  11.0-25.0  12.0-24.0  12.5-23.5  14.5-21.5  16.0-20.0    *****    21.0-15.0   87.0
ZappaMexico II  10.5-25.5   7.5-28.5  12.0-24.0  13.0-23.0  19.0-17.0  15.0-21.0    *****     77.0
--------------------------------------------------------------------------------------------------
total 756 games
Twelve more rounds haven't changed the character of this tournament.

1 Significant jump between the best four and the last three
2 The upper group is mainly characterized by
- constant leadership of Firebird
- Rybka doesn't reach its usual ranking
- bad scoring of Naum against Firebird
- individual match balance between Firebird, Stockfish and Rybka
3 The traditional head to head by Fritz and Shredder

GGT2 Total Performance Round 1-36

Code: Select all

                   Win  Draw  Loss  Points Perform  Games
                   ------------------------------------
FireBird 1.2       82   105    29   134.5   62%     216
Stockfish 1.7.1    84    89    43   128.5   59%     216
Naum 4.2           79    85    52   121.5   56%     216
Deep Rybka 3       70    99    47   119.5   55%     216
DeepFritz 12       47    82    87    88.0   40%     216
DeepShredder 12    44    86    86    87.0   40%     216
ZappaMexico II     37    80    99    77.0   35%     216
-------------------------------------------------------
total 756 games
FireBird wins only three games more than third-placed Naum, but it loses much more rarely (only 13%) than its opponents.

Played games GGT2: 18 gambits=36 rounds

Code: Select all

                         Games 
1  engine/round             6
1  engine/gambit           12 (double round, switched colours)
1  engine pair/18 gambits  36 (for ex., FireBird against Rybka)
1  gambit                  42 (7x6)
1  engine/18gambits       216 (12x18)
18 gambits                756 (42x18)
There are still 12 gambits to play. In the end you'll get 30 gambits, 60 rounds, 1260 games, 360 games per engine and 60 games per engine pair. At the end of the tournament the error margins will be about +/- 27 Elo points. This will probably be sufficient to classify 3-4 significant engine ranks.
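
The projected error margin can be roughly reproduced with a back-of-the-envelope rule: the margin shrinks with the square root of the number of games. This is only a sketch under that assumption (and an unchanged draw rate), not the way EloStat computes its error bars; the current margins of 34 to 38 are taken from the Elo table after 216 games per engine.

```python
import math


def projected_margin(margin_now, games_now, games_later):
    """Scale an error margin by 1/sqrt(games); a rough rule of thumb only."""
    return margin_now * math.sqrt(games_now / games_later)


for margin in (34, 36, 38):
    later = projected_margin(margin, games_now=216, games_later=360)
    print(f"+/-{margin} after 216 games -> about +/-{later:.1f} after 360 games")
```

The results fall in the high twenties, which is in line with the estimate of about +/- 27 Elo.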

GGT2 Total Elo-Ranking Round 1-36 with CEGT-Calibration

Code: Select all

  Program                          Elo    +   -   Games   Score    Elo   +  -
                                   GGT                             CEGT
1 FireBird 1.2  x64              : 3187   34  33   216    62.3 %   0000
2 Stockfish 1.7.1 x64            : 3169   36  36   216    59.5 %   3159  11 11
3 Naum 4.2 x64                   : 3150   36  36   216    56.2 %   3138  14 14 
4 Deep Rybka 3 x64               : 3144   34  34   216    55.3 %   3181  10 10
5 Deep Fritz 12                  : 3056   37  37   216    40.7 %   3054  14 14
6 Deep Shredder 12 x64           : 3054   36  36   216    40.3 %   3063   9  9
7 Zappa Mexico II x64            : 3024   37  38   216    35.6 %   3018   9  9
------------------------------------------------------------------------------
756 games; Starting Value EloStat:3112; List: CEGT 40/20,4 threads, June 2010
I tried the calibration with as many engines as possible and finally got a starting Elo of 3112, which fits the CEGT ratings excellently, with the exception of Rybka. According to the table above, the ratings of Stockfish, Naum, Fritz, Shredder and Zappa show no significant differences between CEGT and GGT at the 5% error level! Please note that this holds even with the extremely narrow and demanding CEGT error bars of between 9 and 14 Elo points!
Of course you could get a better agreement for Rybka by adjusting the starting Elo. Calibration is nothing more than a simple linear transformation, which doesn't change the ranks or the Elo distances between the engines.
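
One possible way to check the CEGT/GGT comparison is sketched below. It assumes that the +/- columns are 95% margins, that both lists are independent, and that two ratings differ significantly when their gap exceeds the root-sum-square of the two margins. These are my assumptions, not necessarily the test used for the statement above. The `shift` parameter stands for a different starting Elo.

```python
import math

# (GGT Elo, GGT margin, CEGT Elo, CEGT margin) from the table above;
# where + and - differ, the larger GGT margin is used.
rows = {
    "Stockfish 1.7.1": (3169, 36, 3159, 11),
    "Naum 4.2": (3150, 36, 3138, 14),
    "Deep Rybka 3": (3144, 34, 3181, 10),
    "Deep Fritz 12": (3056, 37, 3054, 14),
    "Deep Shredder 12": (3054, 36, 3063, 9),
    "Zappa Mexico II": (3024, 38, 3018, 9),
}


def differs_significantly(elo_a, err_a, elo_b, err_b):
    """True if the gap exceeds the combined (root-sum-square) margin."""
    return abs(elo_a - elo_b) > math.hypot(err_a, err_b)


def flagged(shift=0):
    """Engines whose shifted GGT rating differs significantly from CEGT."""
    return [
        name
        for name, (ggt, ggt_err, cegt, cegt_err) in rows.items()
        if differs_significantly(ggt + shift, ggt_err, cegt, cegt_err)
    ]


print("starting Elo as in the table:", flagged())
# Shifting every GGT rating by the same constant keeps all gaps between engines.
print("shifted so that Rybka matches:", flagged(shift=3181 - 3144))
```

Under these assumptions only Rybka is flagged with the starting Elo used in the table. Shifting the list so that Rybka matches moves other engines out of agreement instead, because a constant shift leaves the distances between the engines untouched.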

GGT2 Total Elo-Ranking Round 1-36 with CCRL-Calibration

Code: Select all

  Program                          Elo    +   -   Games   Score    CCRL  +  -
1 FireBird 1.2  x64              : 3242   34  33   216    62.3 %   0000
2 Stockfish 1.7.1 x64            : 3227   36  36   216    59.5 %   3221  24 24 
3 Naum 4.2 x64                   : 3208   36  36   216    56.2 %   3184  28 28
4 Deep Rybka 3 x64               : 3202   34  34   216    55.3 %   3232  23 22
5 Deep Fritz 12                  : 3114   37  37   216    40.7 %   3087  40 40 
6 Deep Shredder 12 x64           : 3112   36  36   216    40.3 %   3131  19 19
7 Zappa Mexico II x64            : 3082   37  38   216    35.6 %   3074  13 13 
------------------------------------------------------------------------------
756 games; Starting Value EloStat:3170; List: CCRL 40/40,4 threads, June 2010
CCRL is calibrated significantly higher than CEGT. The top MP engines score far beyond human ratings. Is this really the true human/engine proportion, or should the superiority of the programs be corrected downwards? The subject was discussed again just recently in this forum.
http://talkchess.com/forum/viewtopic.php?t=35125
Starting with 3170 Elo you get the table above. The correlation between CCRL and GGT is very strong too.

PMCC=Pearson product-moment correlation coefficient

Of course you can calculate a correlation coefficient to express the degree of relationship between the rating lists. The value +1 stands for the highest positive correlation (for instance, CCRL correlated with itself) and 0 stands for no relationship at all. Perhaps you are about to see the first member-made correlation calculation in this forum, or at least the first between current rating lists?! :shock:
You get the following coefficients for 6 of the engines (without FireBird):

GGT/CEGT= 0.96
GGT/CCRL= 0.94
CEGT/CCRL= 0.98

up to 0.2 => very low; up to 0.5 => low; up to 0.7 => middle; up to 0.9 => strong; above 0.9 => very strong

In spite of the small sample (only 6 engines = 4 degrees of freedom) the coefficients are highly significant at the 1% error level. To put it in statistically correct terms:

Concerning the 6 selected engines, there is a very strong linear correlation between the three ratings. That was to be expected in view of the very close agreement of the calibrated Elo rankings.
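
The coefficients can be recomputed from the two calibration tables. The sketch below uses the plain textbook formula for the Pearson coefficient and the usual t-test with n-2 degrees of freedom; the critical value 4.604 is the two-sided 1% value of the t-distribution for 4 degrees of freedom. Function names are illustrative. Because a calibration shift does not affect a correlation, it makes no difference which of the two GGT columns is used.

```python
import math

# Ratings of the 6 engines (Stockfish, Naum, Rybka, Fritz, Shredder, Zappa)
ggt = [3169, 3150, 3144, 3056, 3054, 3024]   # GGT, CEGT-calibrated column
cegt = [3159, 3138, 3181, 3054, 3063, 3018]
ccrl = [3221, 3184, 3232, 3087, 3131, 3074]


def pearson(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


T_CRIT_1PCT_DF4 = 4.604  # two-sided 1% critical value for 4 degrees of freedom

pairs = (("GGT/CEGT", ggt, cegt), ("GGT/CCRL", ggt, ccrl), ("CEGT/CCRL", cegt, ccrl))
for label, a, b in pairs:
    r = pearson(a, b)
    t = t_statistic(r, len(a))
    print(f"{label:10s} r = {r:.2f}  t = {t:.2f}  significant at 1%: {t > T_CRIT_1PCT_DF4}")
```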

PGN-Link:
http://www.file-upload.net/download-265 ... 8.pgn.html
Next:
GGT2 21-60. Tournament finished.
Book
50 gambit starting positions. GGT1:Eco00 - B44. GGT2:EcoC02 - E60
Test conditions
Time Control: tournament level 40/20', 20/10', 10'+12''
System: Intel Core i7 920, oc 3600-3800 MHz, 6 GB DDR3 RAM. Vista 64
Hyperthreading off, Turbo Mode off.
Engine parameters: 3 threads. Ponder off. 1.2 GB Hash.
EGTB 3,4,5: Nalimov, TotalBases, sometimes TripleBases. Stockfish doesn't use EGTB. Bitbases are not needed. FireBird's TotalBases and RAM-resident TripleBases don't always work properly.
Fritz12-GUI: draw late, resign late/never.