SF13 vs FF2: Representative opening suite

gaard
Posts: 463
Joined: Mon Jun 07, 2010 3:13 am
Location: Holland, MI
Full name: Martin W

SF13 vs FF2: Representative opening suite

Post by gaard »

I received a note from AS that shallow opening books, such as those used in my other tests here: http://talkchess.com/forum3/viewtopic.php?f=6&t=76687 would not best reflect FF2's strength, and that a more representative suite would demonstrate it better (paraphrasing). I thought this was a fair criticism, so this test uses a suite made up of more than 3,000 high-level correspondence games.

For every game, Cute Chess will select a random game from the suite, follow it up to depth 8 (full moves), and play two games per round per opponent with colors reversed, just as in my other tournaments (but with different openings). Given the deeper opening book, and assuming all else is equal, I expect some compression in the results, and possibly doubles.

Threads=1
Contempt=0
Analysis Contempt=Off
Hash=128
TC=10"+0.1"

Positions: swapped
Openings: custom (
syzygy: none
CPU: i7-9750H

cutechess-cli: -resign movecount=5 score=900

ordo: -W -D

compilation: make build ARCH=x86-64-bmi2 COMP=mingw
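
For anyone wanting to reproduce something similar, the settings above could be turned into a cutechess-cli invocation roughly like the sketch below. This is not the exact command used in this test: the engine paths, the suite file name, the output file name and the number of rounds are placeholders I am assuming for illustration. Depth 8 in full moves is written as plies=16.

```python
import shlex


def build_cutechess_command(engines, openings_pgn, rounds, pgn_out="games.pgn"):
    """Build a cutechess-cli argument list from the settings listed above.

    engines: list of (name, path) tuples; the paths are hypothetical placeholders.
    openings_pgn: path to the opening suite in PGN format (placeholder).
    rounds: number of rounds; the value used in the actual test is not stated.
    """
    cmd = ["cutechess-cli", "-tournament", "round-robin"]
    for name, path in engines:
        cmd += ["-engine", f"name={name}", f"cmd={path}"]
    # Options shared by all engines: UCI protocol, 10s + 0.1s, one thread, 128 MB hash.
    cmd += [
        "-each", "proto=uci", "tc=10+0.1",
        "option.Threads=1", "option.Hash=128", "option.Contempt=0",
    ]
    # Random opening from the suite, 8 full moves = 16 plies.
    cmd += ["-openings", f"file={openings_pgn}", "format=pgn",
            "order=random", "plies=16"]
    # Two games per pairing per round, same opening with colors reversed.
    cmd += ["-games", "2", "-repeat", "-rounds", str(rounds)]
    cmd += ["-resign", "movecount=5", "score=900"]
    cmd += ["-pgnout", pgn_out]
    return cmd


if __name__ == "__main__":
    example = build_cutechess_command(
        [("Stockfish 13", "./sf13"),
         ("Fat Fritz 2 Private", "./ff2_private"),
         ("Fat Fritz 2 Public", "./ff2_public")],
        "suite.pgn",
        rounds=100,
    )
    print(shlex.join(example))
```

The function only builds and prints the command, so it can be inspected before anything is run.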

Results (369 of 9600 games finished):

   # PLAYER                        : RATING    POINTS  PLAYED    (%)
   1 Stockfish 13 MyComp           :    0.0     137.5     245   56.1%
   2 Fat Fritz 2 Private MyComp    :  -27.2     124.0     246   50.4%
   3 Fat Fritz 2 Public MyComp     :  -59.7     107.5     247   43.5%

White advantage = 33.72
Draw rate (equal opponents) = 74.28 %
Previous results:

   # PLAYER                 : RATING    POINTS  PLAYED    (%)
   1 Stockfish 13           :    0.0    1233.5    2132   57.9%
   2 Fat Fritz 2 Private    :  -27.5    1112.0    2134   52.1%
   3 Fat Fritz 2 Public     :  -85.2     854.5    2134   40.0%

White advantage = 31.27
Draw rate (equal opponents) = 80.05 %
Unless FF2 makes a big gain in the meantime, I will end this tournament at 1600 games.
carldaman
Posts: 2287
Joined: Sat Jun 02, 2012 2:13 am

Re: SF13 vs FF2: Representative opening suite

Post by carldaman »

Unless FF2 makes a big gain in the meantime, I will end this tournament at 1600 games.
Perhaps I misunderstood your reasoning, but why would you make a decision on ending or extending a tournament (or match) in progress based on how one engine is currently performing? It is more unbiased to decide the length of the match before you start the test and stick to it, unless something is clearly broken.
Modern Times
Posts: 3807
Joined: Thu Jun 07, 2012 11:02 pm

Re: SF13 vs FF2: Representative opening suite

Post by Modern Times »

Representative of what? "Representative" in itself has a different meaning to different people. Give different experts the opportunity to build a representative openings suite, and they'd all come up with something different. Anyway, in this case I suspect it will make little if any difference.

The chess960 test I ran can't be disputed as not representative, it was all 960 positions that exist. Unless of course you dismiss chess960 altogether :mrgreen:

As for the other test, Stefan Pohl's books and opening suites are well respected but I don't have a view as to whether they are "representative" or not.
gaard
Posts: 463
Joined: Mon Jun 07, 2010 3:13 am
Location: Holland, MI
Full name: Martin W

Re: SF13 vs FF2: Representative opening suite

Post by gaard »

carldaman wrote: Wed Feb 24, 2021 5:21 am
Unless FF2 makes a big gain in the meantime, I will end this tournament at 1600 games.
Perhaps I misunderstood your reasoning, but why would you make a decision on ending or extending a tournament (or match) in progress based on how one engine is currently performing? It is more unbiased to decide the length of the match before you start the test and stick to it, unless something is clearly broken.
My main interest is to show the relative strength of SF13/FF2 Private/FF2 Public. If the results are still undecided at 1600 games, I will continue the match to 3200, then 6400, and then to the max of 9600.

If SF13 has a LOS over FF2 of 99%, and the same with regards to FF2 Private and FF2 Public, then I see no reason to continue the tournament, since my intention from the beginning was only to establish the LOS of SF13/FF2 Private/FF2 Public. I should have made that clearer from the beginning, so your question was definitely relevant.
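
To make the 99% LOS criterion concrete, here is a minimal sketch using the common approximation that computes LOS from wins and losses only (draws drop out). This is not necessarily how ordo derives the LOS figures it reports, and the win/loss counts in the example are made up, not taken from this tournament.

```python
import math


def los(wins: int, losses: int) -> float:
    """Likelihood of superiority from decisive games only.

    Uses the usual normal approximation:
    LOS = 0.5 * (1 + erf((W - L) / sqrt(2 * (W + L)))).
    """
    decisive = wins + losses
    if decisive == 0:
        # No decisive games: nothing distinguishes the two engines.
        return 0.5
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * decisive)))


def should_stop(wins: int, losses: int, threshold: float = 0.99) -> bool:
    """True once the leading engine's LOS reaches the threshold."""
    return los(wins, losses) >= threshold


if __name__ == "__main__":
    # Hypothetical pairwise counts, for illustration only.
    print(los(120, 80), should_stop(120, 80))
    print(los(105, 95), should_stop(105, 95))
```

With three engines, the check would be applied to each pairing separately (SF13 vs FF2 Private, FF2 Private vs FF2 Public) at each of the planned checkpoints.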
gaard
Posts: 463
Joined: Mon Jun 07, 2010 3:13 am
Location: Holland, MI
Full name: Martin W

Re: SF13 vs FF2: Representative opening suite

Post by gaard »

Modern Times wrote: Wed Feb 24, 2021 5:28 am Representative of what? "Representative" in itself has a different meaning to different people. Give different experts the opportunity to build a representative openings suite, and they'd all come up with something different. Anyway, in this case I suspect it will make little if any difference.

The chess960 test I ran can't be disputed as not representative, it was all 960 positions that exist. Unless of course you dismiss chess960 altogether :mrgreen:

As for the other test, Stefan Pohl's books and opening suites are well respected but I don't have a view as to whether they are "representative" or not.
Good point. I meant representative as in games played in ICCF, FICGS and IECC, where:

1) Both players are rated 2500+ Elo
2) Since 2016
3) With 20 moves or more
4) Ending in 1-0, 1/2-1/2, or 0-1

For a more comprehensive explanation, see the attached suite in the link provided.
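
The four criteria above could be expressed as a filter over PGN header tags, roughly as sketched below. This is an illustration and not the script used to build the suite: it assumes the standard WhiteElo, BlackElo, Date and Result tags are present, that "since 2016" includes 2016, and that the caller supplies the number of full moves.

```python
FINISHED_RESULTS = {"1-0", "1/2-1/2", "0-1"}


def parse_elo(value: str) -> int:
    """Return the rating as an int, or 0 if the tag is missing or malformed."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return 0


def parse_year(pgn_date: str):
    """Extract the year from a PGN date such as '2017.03.12'; None if unknown."""
    year = (pgn_date or "").split(".")[0]
    return int(year) if year.isdigit() else None


def is_representative(headers: dict, full_moves: int,
                      min_elo: int = 2500, since_year: int = 2016,
                      min_moves: int = 20) -> bool:
    """Apply the four selection criteria to one game."""
    year = parse_year(headers.get("Date", ""))
    return (
        parse_elo(headers.get("WhiteElo")) >= min_elo
        and parse_elo(headers.get("BlackElo")) >= min_elo
        and year is not None and year >= since_year
        and full_moves >= min_moves
        and headers.get("Result") in FINISHED_RESULTS
    )


if __name__ == "__main__":
    # Made-up example game headers.
    game = {"WhiteElo": "2534", "BlackElo": "2511",
            "Date": "2018.05.01", "Result": "1/2-1/2"}
    print(is_representative(game, full_moves=42))
```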

Current results:

   # PLAYER                 : RATING    POINTS  PLAYED    (%)
   1 Stockfish 13           :    0.0     364.0     660   55.2%
   2 Fat Fritz 2 Private    :  -11.6     348.5     661   52.7%
   3 Fat Fritz 2 Public     :  -61.9     278.5     661   42.1%

White advantage = 34.94
Draw rate (equal opponents) = 75.22 %
Modern Times
Posts: 3807
Joined: Thu Jun 07, 2012 11:02 pm

Re: SF13 vs FF2: Representative opening suite

Post by Modern Times »

You'll probably end up with the same -11 that I got with my arguably not so representative 2871 suite :mrgreen:
gaard
Posts: 463
Joined: Mon Jun 07, 2010 3:13 am
Location: Holland, MI
Full name: Martin W

Re: SF13 vs FF2: Representative opening suite

Post by gaard »

Modern Times wrote: Wed Feb 24, 2021 5:46 am You'll probably end up with the same -11 that I got with my arguably not so representative 2871 suite :mrgreen:
Right now I have FF2 at -9.6 relative to SF13, so you may be correct. If I'm not mistaken, fishtest also showed FF2 to be -11 Elo relative to SF13, using a different, but probably compatible, opening suite.
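
For a sense of scale, an Elo gap can be converted to an expected score with the standard logistic formula, as in the sketch below. This is the textbook conversion for a single pairing, not ordo's fitting procedure, which also models white advantage and draw rate.

```python
import math


def expected_score(elo_diff: float) -> float:
    """Expected score for a player rated elo_diff above (or below) the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def elo_from_score(score: float) -> float:
    """Inverse of expected_score; score must be strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)


if __name__ == "__main__":
    # An 11 Elo deficit, as quoted for FF2 relative to SF13.
    print(expected_score(-11.0))
```

An 11 Elo deficit corresponds to an expected score only a little under 50%, which is why a lot of games are needed to separate the two.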
gaard
Posts: 463
Joined: Mon Jun 07, 2010 3:13 am
Location: Holland, MI
Full name: Martin W

Re: SF13 vs FF2: Representative opening suite

Post by gaard »

Current results:

   # PLAYER                 : RATING    POINTS  PLAYED    (%)
   1 Stockfish 13           :    0.0     526.5     932   56.5%
   2 Fat Fritz 2 Private    :  -18.7     491.0     934   52.6%
   3 Fat Fritz 2 Public     :  -74.1     382.5     934   41.0%

White advantage = 31.48
Draw rate (equal opponents) = 73.78 %
gaard
Posts: 463
Joined: Mon Jun 07, 2010 3:13 am
Location: Holland, MI
Full name: Martin W

Re: SF13 vs FF2: Representative opening suite

Post by gaard »

Current results:


   # PLAYER                 : RATING    POINTS  PLAYED    (%)
   1 Stockfish 13           :    0.0     585.0    1032   56.7%
   2 Fat Fritz 2 Private    :  -21.0     539.5    1032   52.3%
   3 Fat Fritz 2 Public     :  -74.5     423.5    1032   41.0%

White advantage = 30.13
Draw rate (equal opponents) = 72.79 %
gaard
Posts: 463
Joined: Mon Jun 07, 2010 3:13 am
Location: Holland, MI
Full name: Martin W

Re: SF13 vs FF2: Representative opening suite

Post by gaard »

Final results:

   # PLAYER                 : RATING  ERROR   POINTS  PLAYED    (%)
   1 Stockfish 13           :    0.0   ----    709.5    1248   56.9%
   2 Fat Fritz 2 Private    :  -21.2   12.1    654.0    1248   52.4%
   3 Fat Fritz 2 Public     :  -76.8   12.3    508.5    1248   40.7%

White advantage = 30.38
Draw rate (equal opponents) = 72.08 %
LOS:

                          SF13   FF2 Private   FF2 Public
Stockfish 13                 -         100.0        100.0
Fat Fritz 2 Private        0.0             -        100.0
Fat Fritz 2 Public         0.0           0.0            -
Games: