Stockfish 18 for STC Rating List

Rebel
Posts: 7493
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Stockfish 18 for STC Rating List

Post by Rebel »

Results from file sf18.pgn: 27000 games

No. Name                  Win   Draw  Loss   Score Games   %
--------------------------------------------------------------
  1 Stockfish-18        +8829 =16773 -1398 17215.5 27000 63.8%
  2 Reckless-0.90-dev-3  +307  =2114  -579  1364.0  3000 45.5%
  3 PlentyChess-7.0.37   +267  =2102  -631  1318.0  3000 43.9%
  4 Obsidian-16          +157  =1981  -862  1147.5  3000 38.2%
  5 Stockfish-15         +142  =1942  -916  1113.0  3000 37.1%
  6 Alexandria-8.1.2     +136  =1898  -966  1085.0  3000 36.2%
  7 Caissa-1.24          +107  =1777 -1116   995.5  3000 33.2%
  8 Clover-9.1            +78  =1739 -1183   947.5  3000 31.6%
  9 Viridithas-18.0.0    +110  =1641 -1249   930.5  3000 31.0%
 10 Berserk-13            +94  =1579 -1327   883.5  3000 29.4%

Total Games:   27000
White Wins:     8535 (31.6%)
Black Wins:     1692 (6.3%)
Draws:         16773 (62.1%)
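
The score percentages in the table can be turned into an approximate Elo difference with the standard logistic formula. The following is a minimal sketch, assuming the usual 400-point logistic model; the function names are illustrative, and this is not necessarily how the rating list itself computes its ratings. Because the opponents differ in strength, the result is only a rough performance figure relative to the average opponent in this gauntlet.

```python
import math

def score(wins: int, draws: int, losses: int) -> float:
    """Points scored: 1 per win, 0.5 per draw."""
    return wins + 0.5 * draws

def elo_diff(score_fraction: float) -> float:
    """Elo difference implied by a score fraction (logistic model, assumed)."""
    return -400.0 * math.log10(1.0 / score_fraction - 1.0)

# Stockfish-18 row from the table above
w, d, l = 8829, 16773, 1398
games = w + d + l
pts = score(w, d, l)
frac = pts / games
print(f"{pts:.1f} / {games} = {frac:.1%}, roughly {elo_diff(frac):+.0f} Elo")
```

The same two functions can be applied to any other row of the table by swapping in its win, draw and loss counts.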

Stockfish-17.1      :  3776
Stockfish-18        :  3804  +28
Other engines : https://rebel7775.wixsite.com/rebel/stc-rating-list
90% of coding is debugging, the other 10% is writing bugs.

Re: Stockfish 18 for STC Rating List

Post by Rebel »

Next step, clash of the giants.

Leela-0.32.1-BT4 vs Stockfish-18 , 3000 games.

# PLAYER                   :  RATING  ERROR   POINTS  PLAYED      W      D      L  D(%)
1 Leela-0.32.1-BT4         :  3820.3    5.1   1674.5    3000    614   2121    265    71
2 Stockfish-18             :  3804.8    1.8  17215.5   27000   8829  16773   1398    62
Previous match : Leela-0.32.1-BT4 vs Stockfish-17.1 , 3000 games : 55.8% for Leela.

After the first 1000 games we now get:

Results from file leela-BT4-Stockfish-18.pgn:

No. Name              Win Draw Loss Unf.  Score Games       %
-------------------------------------------------------------
  1 Stockfish-18     +145 =727 -131   *0  508.5  1003   50.7%
  2 Leela-0.32.1-BT4 +131 =727 -145   *0  494.5  1003   49.3%
Still 2000 games to go.

Re: Stockfish 18 for STC Rating List

Post by Rebel »

Results from file leela-BT4-Stockfish-18.pgn: 3000 games.

No. Name              Win Draw Loss Unf.  Score Games       %
-------------------------------------------------------------
  1 Stockfish-18     +422 =2196 -382   *0 1520.0  3000   50.7%
  2 Leela-0.32.1-BT4 +382 =2196 -422   *0 1480.0  3000   49.3%

Total Games:    3000
White Wins:      731 (24.4%)
Black Wins:       73 (2.4%)
Draws:          2196 (73.2%)
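
How much a 50.7% score over 3000 games really says can be estimated from the win/draw/loss counts. The sketch below is a rough check under stated assumptions: it treats every game as independent and uses a normal approximation for the 95% interval. Games played in opening pairs with reversed colours are not independent, so the real uncertainty may differ, and the helper names are illustrative only.

```python
import math

def match_stats(wins: int, draws: int, losses: int):
    """Return (score fraction, standard error) from a W/D/L record."""
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    return mean, math.sqrt(var / n)

def elo_diff(p: float) -> float:
    """Elo difference implied by score fraction p (logistic model, assumed)."""
    return 400.0 * math.log10(p / (1.0 - p))

# Stockfish-18 vs Leela-0.32.1-BT4, from the table above
mean, se = match_stats(422, 2196, 382)
lo, hi = mean - 1.96 * se, mean + 1.96 * se
print(f"score {mean:.1%}, Elo {elo_diff(mean):+.1f} "
      f"(95% interval {elo_diff(lo):+.1f} to {elo_diff(hi):+.1f})")
```

Under these assumptions the interval straddles zero, so the match on its own does not clearly separate the two engines.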
New top-10

 # PLAYER                   :  RATING  ERROR   POINTS  PLAYED       W      D      L  D(%)
 1 Leela-0.32.1-BT4         :  3810.5    3.1   3154.5    6000     996   4317    687    72
 2 Stockfish-18             :  3806.2    2.7  18735.5   30000    9251  18969   1780    63
 3 Reckless-0.90-dev-3      :  3776.1    1.7  23858.5   40000   11036  25645   3319    64
 4 PlentyChess-7.0.37       :  3763.3    2.0  22786.5   39000   10126  25321   3553    65
 5 Obsidian-16              :  3721.8    1.4  34803.0   68778   11691  46224  10863    67
 6 Alexandria-8.1.2         :  3695.3    2.2  32313.0   68778    9755  45116  13907    66
 7 Viridithas-19.0.1        :  3680.2    2.6  13949.5   30000    4256  19387   6357    65
 8 Caissa-1.24              :  3669.7    1.3  22181.5   50778    5700  32963  12115    65
 9 Clover-9.1               :  3655.3    1.2  28582.0   68776    6290  44584  17902    65
10 Berserk-13               :  3641.3    1.6  27303.0   68778    6319  41968  20491    61

Re: Stockfish 18 for STC Rating List

Post by Rebel »

Some games :

[pgn][Event "Best of Chess"]
[Site "King-Attacks, Sacrifices, Short Games"]
[Date "2026.02.02"]
[Round "?"]
[White "Stockfish-18"]
[Black "Leela-0.32.1-BT4"]
[Result "1-0"]
[PlyCount "60"]
[King "4392"]
[Short "4500"]
[Sac "7000"]
[Total "15892"]

1. e4 g6 2. d4 Bg7 3. Nc3 a6 4. f4 c6 5. Nf3 d5 6. e5 Bg4 7. h3 Bxf3 8.
Qxf3 Qb6 9. a3 e6 10. Bd2 Nd7 11. g4 Qc7 12. O-O-O c5 13. dxc5 Nxc5 14. Be3
Ne7 15. h4 O-O 16. Qf2 Rac8 17. h5 f6 18. f5 fxe5 19. hxg6 hxg6 20. Qh2 Kf7
21. Bc4 dxc4 22. fxg6+ Nxg6 23. Qh5 Rh8 24. Rdf1+ Bf6 25. Bh6 Rxh6 26.
Rxf6+ Kxf6 27. Qxh6 Rh8 28. Nd5+ exd5 29. Rf1+ Ke7 30. Qxg6 Ne4 1-0
[/pgn]

[pgn][Event "Best of Chess"]
[Site "King-Attacks, Sacrifices, Short Games"]
[Date "2026.02.02"]
[Round "?"]
[White "Stockfish-18"]
[Black "Leela-0.32.1-BT4"]
[Result "1-0"]
[PlyCount "84"]
[King "13152"]
[Short "0"]
[Sac "6000"]
[Total "19152"]

1. e4 c5 2. Nf3 e6 3. d4 cxd4 4. Nxd4 Nc6 5. Nc3 a6 6. Be3 Qc7 7. Qd2 Nf6
8. O-O-O Bb4 9. f3 O-O 10. g4 Ne5 11. g5 Nh5 12. Rg1 Ng6 13. Nde2 Ne5 14.
Ng3 Nxf3 15. Qf2 Nxg3 16. Rxg3 Ne5 17. Rh3 Be7 18. Kb1 b5 19. Qh4 h6 20.
Qg3 d6 21. Bd3 Bd7 22. Bf4 b4 23. Bxe5 dxe5 24. Rxh6 Qd8 25. Qh4 gxh6 26.
Qxh6 Bc5 27. Nd5 exd5 28. exd5 f5 29. g6 Rf7 30. gxf7+ Kxf7 31. Qh7+ Kf6
32. Qh6+ Kf7 33. Qh7+ Kf6 34. Rf1 Qc8 35. Qh4+ Kf7 36. Bxf5 Bxf5 37. Qg5
Ke8 38. Rxf5 Kd7 39. Qh6 Be7 40. Rxe5 Qc5 41. Qe6+ Kc7 42. Qxe7+ Qxe7 1-0
[/pgn]
Jouni
Posts: 3817
Joined: Wed Mar 08, 2006 8:15 pm
Full name: Jouni Uski

Re: Stockfish 18 for STC Rating List

Post by Jouni »

So Leela on a 1000€ GPU loses to one-core Stockfish :!: . Should the Leela project be stopped/banned for wasting power?
Jouni
Uri Blass
Posts: 11161
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Stockfish 18 for STC Rating List

Post by Uri Blass »

Jouni wrote: Tue Feb 03, 2026 12:19 pm So Leela on a 1000€ GPU loses to one-core Stockfish :!: . Should the Leela project be stopped/banned for wasting power?
By this logic you would also need to ban all the other engines. Or maybe it is the opposite, and the Stockfish project should be stopped/banned because Stockfish is too strong and has to stop so that others can be competitive?
Note that I do not think anybody needs to stop because of chess results.
RubiChess
Posts: 660
Joined: Fri Mar 30, 2018 7:20 am
Full name: Andreas Matthies

Re: Stockfish 18 for STC Rating List

Post by RubiChess »

This is probably just Jouni's way to say that the testing environment of Rebel
- limiting a CPU engine to use 6,25% of a (max) 1000€ CPU (~62,50€)
- giving Leela the same 6,25% of the CPU (or even more? did he change the default of iirc 2 threads in Leela?) and 100% of a massively in parallel working > 1000€ GPU (>1062,50€)
is crap. Even without looking at energy consumption.
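
The arithmetic behind these figures can be sketched as follows. The prices are the round numbers quoted above, not measured values, and the 16-core count is an assumption inferred from 6,25% = 1/16.

```python
CPU_PRICE_EUR = 1000.0   # upper-bound CPU price quoted above
GPU_PRICE_EUR = 1000.0   # lower-bound GPU price quoted above
CPU_CORES = 16           # assumption: 6.25% of the CPU = 1 of 16 cores

cpu_share = 1 / CPU_CORES
stockfish_budget = cpu_share * CPU_PRICE_EUR
leela_budget = cpu_share * CPU_PRICE_EUR + GPU_PRICE_EUR
ratio = leela_budget / stockfish_budget

print(f"CPU share per engine:     {cpu_share:.2%}")
print(f"Stockfish hardware share: {stockfish_budget:.2f} EUR")
print(f"Leela hardware share:     {leela_budget:.2f} EUR")
print(f"ratio:                    {ratio:.0f}x")
```

If Leela runs with more than one CPU thread, its share would be higher still.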

Never thought that I would agree with Jouni sometime.

Re: Stockfish 18 for STC Rating List

Post by Rebel »

RubiChess wrote: Tue Feb 03, 2026 7:49 pm This is probably just Jouni's way to say that the testing environment of Rebel
- limiting a CPU engine to use 6,25% of a (max) 1000€ CPU (~62,50€)
- giving Leela the same 6,25% of the CPU (or even more? did he change the default of iirc 2 threads in Leela?) and 100% of a massively in parallel working > 1000€ GPU (>1062,50€)
is crap. Even without looking at energy consumption.

Never thought that I would agree with Jouni sometime.
In your infinite wisdom, what's the right configuration of a fair GPU vs CPU match ?

I am pretty sure you won't find consensus.
jorose
Posts: 384
Joined: Thu Jan 22, 2015 3:21 pm
Location: Zurich, Switzerland
Full name: Jonathan Rosenthal

Re: Stockfish 18 for STC Rating List

Post by jorose »

I don't think there is a fair configuration. That being said, I am a bit confused: on your site you write that you are running the list with 16 cores and a 4080. So is Jouni mistaken and you are running these matches on 16 cores, or is the information on your website misleading?

Regarding matches between CPU and GPU based engines, I would also appreciate it if you were transparent in your posts about exactly which hardware and settings you are running in these matches. I shouldn't have to go to your website to find that information if you are already sharing numbers here. You are not the only person guilty of this, but as the topic arose and I respect you a lot, I feel it is a good time to point it out.
-Jonathan

Re: Stockfish 18 for STC Rating List

Post by Uri Blass »

RubiChess wrote: Tue Feb 03, 2026 7:49 pm This is probably just Jouni's way to say that the testing environment of Rebel
- limiting a CPU engine to use 6,25% of a (max) 1000€ CPU (~62,50€)
- giving Leela the same 6,25% of the CPU (or even more? did he change the default of iirc 2 threads in Leela?) and 100% of a massively in parallel working > 1000€ GPU (>1062,50€)
is crap. Even without looking at energy consumption.

Never thought that I would agree with Jouni sometime.
I have no problem with the testing as long as the tester does not hide the conditions.
Even if a tester gives all engines 100 seconds per move and Stockfish 1 second per move, I have no problem with it as long as the conditions are not hidden.

I see no problem with odds.
Leela uses material odds in games, and IMO hardware odds or time odds in computer-computer games are no problem either.