Stockfish vs. Lc0: IMHO disappointing result for Lc0


supersharp77
Posts: 1242
Joined: Sat Jul 05, 2014 7:54 am
Location: Southwest USA

Re: Stockfish vs. Lc0: IMHO disappointing result for Lc0

Post by supersharp77 »

mvanthoor wrote: Sat Jul 27, 2019 10:40 pm
Chessqueen wrote: Sat Jul 27, 2019 10:19 pm You are using inferior hardware for Lc0. The GTX 1070 is too slow compared to what Stockfish was using in your test. Either get an RTX 2080 Ti or use slower hardware for Stockfish so that the match is even :shock:
Truly? So you recommend I get a €1200 graphics card, just to be able to make Lc0 capable of defeating Stockfish 10 running on a currently 4 year old quadcore? That sounds incredibly non-economical. Maybe a GTX 1080 Ti (which is about twice as fast as the GTX 1070 if I remember correctly) would put it on par with the old 6700K, but even though it's older, that GPU still runs €350-400 in the second-hand market.

As I said: I don't game a lot. I only buy a new graphics card if there's a new game I really, really want to play in top quality, such as the Witcher 3. As long as nothing else I want to play comes along, my graphics card will be in the next computer as long as possible. (I've had my GTX 560 Ti for 6 years, and if The Witcher 3 hadn't been released, I would have still had it today.)

I *do* have a use for a lot of CPU power though (at least sometimes), so me getting a 12 or even 16 core CPU in two years or so is highly likely. I'll have to find a chess GUI in which I can set different time controls for each engine to match the CCRL settings, and then test Stockfish 11 or 12 against 10. Version 10 will be obliterated. Lc0, running on the GTX 1070, probably won't even be able to draw a single game.

If I need a €1200+ GPU to make NN engine a match for an "old" A/B engine running on a €350-500 CPU, I'll be sticking with the A/B-engines for some time to come.
With a time control that fast, you won't be getting any "high quality" chess games. It's just a waste of time, IMHO.
Try 5 min + 5 sec or 5 min + 10 sec instead, i.e. Fischer time. :) :wink:
Graham Banks
Posts: 41416
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Re: Stockfish vs. Lc0: IMHO disappointing result for Lc0

Post by Graham Banks »

mvanthoor wrote: Sat Jul 27, 2019 11:08 pm I didn't get any superior hardware for Stockfish. I buy what I need. In 2016, when I bought this computer, I bought a 6700K (for somewhere around €350) because that CPU had the best price/performance for what I wanted to do at the time. I got a GTX 1070 (for around €425 or so) because that card could run The Witcher 3 at the highest settings, and any newer games for the foreseeable future.

In a new computer, I'll obviously get a new CPU, probably a 12 or 16 core (between €350 and €500 at this point in time), but the GTX 1070 graphics card possibly stays if there isn't a newer game I want to play.

As I said, if I'd need to buy a graphics card costing €1200+ to be able to make Lc0 (or any other NN-engine) a match for a €400 CPU, then it's not worth it for me. Then I'll happily stick with the A/B CPU-based engines.
I suspect that this is the case for most.
gbanksnz at gmail.com
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: Stockfish vs. Lc0: IMHO disappointing result for Lc0

Post by mvanthoor »

Result of SF10 vs. Lc0 0.21.1 JH.T6.532:

+6 -1 =13, +88 Elo for SF10.

This puts Lc0 at 3547 - 88 = 3459 Elo, which is actually lower than the 3486 from the CCRL 40/4 list.
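
For anyone who wants to check the arithmetic, here is a minimal sketch of how a match score can be converted into an Elo difference, assuming the standard logistic Elo model. The function name `elo_diff` is only illustrative, and the 3547 rating is the CCRL 40/4 figure quoted in this thread.

```python
import math


def elo_diff(wins: int, losses: int, draws: int) -> float:
    """Elo difference implied by a match score under the logistic Elo model."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games  # fraction of available points scored
    if score <= 0.0 or score >= 1.0:
        raise ValueError("Elo difference is unbounded for a 0% or 100% score")
    return 400.0 * math.log10(score / (1.0 - score))


sf10_rating = 3547  # CCRL 40/4 rating for Stockfish 10, as quoted in the thread
diff = elo_diff(wins=6, losses=1, draws=13)
print(f"SF10 advantage: {diff:+.1f} Elo")
print(f"Implied Lc0 rating: {sf10_rating - diff:.0f}")
```

The same function applied to the earlier +4 -1 =15 result gives the roughly +52 figure mentioned in the opening post.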

It can actually be true. If my computer is much faster than the computer used for the CCRL testing, my time controls will be shorter. Therefore, Lc0's time controls will also be shorter. The point is: the faster the CPU becomes, the shorter the time controls must be to stay CCRL-compatible, and thus the weaker Lc0 will play, because the graphics card also has to play with the shorter time controls.

So if a new CPU is twice as fast as an older one, you can just cut the time control in half (40/2 instead of 40/4), and the performance of the A/B-engine will stay roughly the same. If you use the same graphics card in that newer computer (with the halved time control), the GPU loses half its thinking time as well, without the 2x speed increase. It thus effectively loses half its speed.
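
The scaling argument above can be sketched as a small calculation. This assumes the time control is scaled purely by the CPU speed ratio relative to the CCRL reference machine; the function names are made up for illustration, and the 85-second figure is the setting mentioned in the opening post.

```python
CCRL_REFERENCE_SECONDS = 4 * 60  # CCRL 40/4: 40 moves in 4 minutes


def scaled_time_control(cpu_speed_ratio: float) -> float:
    """Seconds per 40 moves on a CPU that is cpu_speed_ratio times faster than the reference."""
    return CCRL_REFERENCE_SECONDS / cpu_speed_ratio


def gpu_time_fraction(cpu_speed_ratio: float) -> float:
    """Fraction of the reference thinking time an unchanged GPU still gets."""
    return scaled_time_control(cpu_speed_ratio) / CCRL_REFERENCE_SECONDS


setups = (
    ("reference CPU", 1.0),
    ("2x faster CPU", 2.0),
    ("85-second setup", 240 / 85),  # ratio implied by the 40 moves in 85 s setting
)
for label, ratio in setups:
    print(
        f"{label}: 40 moves in {scaled_time_control(ratio):.0f} s, "
        f"GPU keeps {gpu_time_fraction(ratio):.0%} of its thinking time"
    )
```

The GPU's share of thinking time shrinks with every CPU speed-up, which is the whole problem: the time control follows the CPU, not the graphics card.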

It's thus possible that Lc0, running on a GTX 1050, competing against an A/B-engine on a very slow CPU (using CCRL-compatible time controls for the CPU) performs a lot better than running Lc0 on a GTX 1070 or even GTX 1080 Ti, competing against that same A/B-engine on a blazing fast CPU.

Therefore it's actually impossible to state "Lc0 has X Elo on GTX 1070", because that rating will become lower and lower as the CPUs running the A/B-engines become faster and faster, when using CCRL-compatible time controls.

It's basically two different computers competing. I should have known. I don't know how I could equalize this, how to find out what GTX 1070 GPU time control would be equivalent to which 6700K CPU time control.

As the ELO difference is roughly 60-90 points in favor of Stockfish, and assuming the 60-70 points per speed doubling still holds true, Stockfish on 1, maybe 2 threads would be comparable to Lc0 on a GTX 1070.
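
The "1, maybe 2 threads" estimate can be reproduced with a rough sketch. It assumes that doubling the thread count doubles the effective speed, which is optimistic since real multi-threaded scaling is sublinear, and that the Elo gain per doubling is constant. The function name `equivalent_threads` is hypothetical.

```python
def equivalent_threads(elo_gap: float, elo_per_doubling: float, threads: int = 4) -> float:
    """Thread count at which the Elo gap would vanish.

    Assumes perfect linear speed scaling with threads (an optimistic
    simplification) and a constant Elo gain per speed doubling.
    """
    doublings = elo_gap / elo_per_doubling  # speed doublings the gap is worth
    return threads / 2 ** doublings


for gap in (60, 90):
    for per_doubling in (60, 70):
        threads = equivalent_threads(gap, per_doubling)
        print(f"gap {gap} Elo, {per_doubling} Elo/doubling -> about {threads:.1f} threads")
```

With the ranges quoted above, the results fall between roughly 1.4 and 2.2 threads, which is in line with the estimate in the post.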

So indeed, if you don't own something like an RTX 2070 or up, it's probably best to just stick with a CPU-based A/B-engine for now, especially on the shorter time controls.
Last edited by mvanthoor on Sun Jul 28, 2019 2:41 am, edited 1 time in total.
Author of Rustic, an engine written in Rust.
Releases | Code | Docs | Progress | CCRL
sovaz1997
Posts: 261
Joined: Sun Nov 13, 2016 10:37 am

Re: Stockfish vs. Lc0: IMHO disappointing result for Lc0

Post by sovaz1997 »

Only 20 games, but the conclusions have already been made. Just no comment.
Zevra 2 is my chess engine. Binary, source and description here: https://github.com/sovaz1997/Zevra2
Zevra v2.5 is last version of Zevra: https://github.com/sovaz1997/Zevra2/releases
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: Stockfish vs. Lc0: IMHO disappointing result for Lc0

Post by mvanthoor »

If I'd run 100 or 1000 games, I'd roughly expect the result to be "+30 -5 =65" or "+300 -50 =650" in favor of Stockfish 10 on this computer, give or take a few points. I'm not going to put any more time into this. I would have tested more if all 20 games had been drawn, and if the games were somewhat exciting to look at, but most end in one of three ways:

- The engines reach a drawn position, and start to shuffle around until either the GUI or the Tablebase adjudicates the game.
- Three-fold repetition, after Stockfish runs out of options to prevent it.
- Lc0 makes a blunder and loses the game before move 40
Author of Rustic, an engine written in Rust.
Releases | Code | Docs | Progress | CCRL
supersharp77
Posts: 1242
Joined: Sat Jul 05, 2014 7:54 am
Location: Southwest USA

Re: Stockfish vs. Lc0: IMHO disappointing result for Lc0

Post by supersharp77 »

mvanthoor wrote: Sun Jul 28, 2019 2:43 am If I'd run 100 or 1000 games, I'd roughly expect the result to be "+30 -5 =65" or "+300 -50 =650" in favor of Stockfish 10 on this computer, give or take a few points. I'm not going to put any more time into this. I would have tested more if all 20 games had been drawn, and if the games were somewhat exciting to look at, but most end in one of three ways:

- The engines reach a drawn position, and start to shuffle around until either the GUI or the Tablebase adjudicates the game.
- Three-fold repetition, after Stockfish runs out of options to prevent it.
- Lc0 makes a blunder and loses the game before move 40
What opening book? Lc0 does not play very well in the Fritz GUI. It plays best in the ChessOK GUI; second is the Shredder GUI... 8-)
sovaz1997
Posts: 261
Joined: Sun Nov 13, 2016 10:37 am

Re: Stockfish vs. Lc0: IMHO disappointing result for Lc0

Post by sovaz1997 »

mvanthoor wrote: Sun Jul 28, 2019 2:43 am If I'd run 100 or 1000 games, I'd roughly expect the result to be "+30 -5 =65" or "+300 -50 =650" in favor of Stockfish 10 on this computer, give or take a few points. I'm not going to put any more time into this. I would have tested more if all 20 games had been drawn, and if the games were somewhat exciting to look at, but most end in one of three ways:

- The engines reach a drawn position, and start to shuffle around until either the GUI or the Tablebase adjudicates the game.
- Three-fold repetition, after Stockfish runs out of options to prevent it.
- Lc0 makes a blunder and loses the game before move 40
Cognitive biases prevent you from knowing the truth. Run a test of 1000 games and you will see the result. 20 games is not a result. It's good or bad luck.
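
To put a number on how little 20 games say, here is a sketch of a 95% error margin on the Elo difference. It uses a normal approximation on the per-game score, which is itself rough at only 20 games, and the function names are illustrative. The 1000-game line uses the hypothetical "+300 -50 =650" extrapolation from the quoted post, not a real result.

```python
import math


def score_to_elo(score: float) -> float:
    """Convert a score fraction to an Elo difference (logistic model)."""
    return 400.0 * math.log10(score / (1.0 - score))


def elo_interval(wins: int, losses: int, draws: int, z: float = 1.96):
    """Approximate confidence interval (in Elo) for a match result.

    Normal approximation on the mean per-game score; z=1.96 gives about 95%.
    """
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    mean_sq = (wins + 0.25 * draws) / games       # E[x^2] with x in {1, 0.5, 0}
    variance = max(0.0, mean_sq - score ** 2)     # per-game score variance
    stderr = math.sqrt(variance / games)
    eps = 1e-9                                    # keep bounds strictly inside (0, 1)
    low = max(eps, score - z * stderr)
    high = min(1.0 - eps, score + z * stderr)
    return score_to_elo(low), score_to_elo(high)


for label, result in (("20 games", (6, 1, 13)), ("1000 games (hypothetical)", (300, 50, 650))):
    low, high = elo_interval(*result)
    print(f"{label}: {low:+.0f} to {high:+.0f} Elo")
```

Under these assumptions the 20-game interval is well over 100 Elo wide, while the same score ratio over 1000 games narrows it to a few dozen Elo.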
Zevra 2 is my chess engine. Binary, source and description here: https://github.com/sovaz1997/Zevra2
Zevra v2.5 is last version of Zevra: https://github.com/sovaz1997/Zevra2/releases
Dann Corbit
Posts: 12538
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Stockfish vs. Lc0: IMHO disappointing result for Lc0

Post by Dann Corbit »

mvanthoor wrote: Sun Jul 28, 2019 2:43 am If I'd run 100 or 1000 games, I'd roughly expect the result to be "+30 -5 =65" or "+300 -50 =650" in favor of Stockfish 10 on this computer, give or take a few points. I'm not going to put any more time into this. I would have tested more if all 20 games had been drawn, and if the games were somewhat exciting to look at, but most end in one of three ways:

- The engines reach a drawn position, and start to shuffle around until either the GUI or the Tablebase adjudicates the game.
- Three-fold repetition, after Stockfish runs out of options to prevent it.
- Lc0 makes a blunder and loses the game before move 40
In testing a change, I have seen the proposed change produce +100 Elo after 100 games and drop back to zero after 1000.

Flip a penny twenty times and count the heads and the tails.
Then, by counting the heads and tails, tell me which is stronger, heads or tails.
Quite likely, there will be a clear winner, but if you do 10,000 flips, the relative difference will be really, really close to zero.
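
The penny experiment is easy to try in a few lines. This is just a simulation sketch with a fair coin and a fixed seed so the run is repeatable; the function names and the number of trials are arbitrary choices.

```python
import random


def heads_margin(flips: int, rng: random.Random) -> float:
    """Relative margin (heads - tails) / flips for one run of fair coin flips."""
    heads = sum(rng.random() < 0.5 for _ in range(flips))
    return (2 * heads - flips) / flips


def mean_abs_margin(flips: int, trials: int, rng: random.Random) -> float:
    """Average size of the relative margin over several independent runs."""
    return sum(abs(heads_margin(flips, rng)) for _ in range(trials)) / trials


rng = random.Random(2019)  # fixed seed, chosen arbitrarily
for flips in (20, 10_000):
    margin = mean_abs_margin(flips, trials=100, rng=rng)
    print(f"{flips} flips: typical margin {margin:.1%} of all flips")
```

With 20 flips one side typically "wins" by a visible margin even though the coin is fair; with 10,000 flips the relative margin shrinks to a fraction of a percent.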
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
MikeGL
Posts: 1010
Joined: Thu Sep 01, 2011 2:49 pm

Re: Stockfish vs. Lc0: IMHO disappointing result for Lc0

Post by MikeGL »

Lc0 does not play very well in the Fritz GUI. It plays best in the ChessOK GUI; second is the Shredder GUI... 8-)
Very good point. I think a GUI that uses the graphics card heavily can hurt the performance of NN engines (e.g. Lc0), since less GPU capacity is left for the engine. Better to run this SF vs. Lc0 match via the command line.
I told my wife that a husband is like a fine wine; he gets better with age. The next day, she locked me in the cellar.
ChickenLogic
Posts: 154
Joined: Sun Jan 20, 2019 11:23 am
Full name: kek w

Re: Stockfish vs. Lc0: IMHO disappointing result for Lc0

Post by ChickenLogic »

Chessqueen wrote: Sat Jul 27, 2019 10:19 pm
mvanthoor wrote: Sat Jul 27, 2019 10:06 pm Today I ran a short 20-game match between Stockfish 10 and Lc0. Specs of the match:

Stockfish 10 x64 BMI2 on Intel i7-6700K, 4 threads, 8GB hashtable
Lc0 0.21.3, w42850 on GTX 1070, 4 threads, everything else default.
Syzygy 5-men tablebase, 8-move Performance.bin opening book.
Adjudication by GUI on overwhelming material advantage, or by Syzygy when win/draw/loss is known in the endgame.

The result was +4 -1 =15 in favor of Stockfish 10.

To be honest, after all the hype surrounding Lc0, I find the result to be disappointing. I'd expected the result to be the reverse, to be honest. When looking into networks, I found https://www.sp-cc.de/lc0-testing.htm, and the network I used is stronger than the ones used there (+60 ELO).

I haven't looked into things such as Leela Ratio or anything yet. I'm not trying to match one engine against another on the same hardware or anything: I wanted to know: how much stronger or weaker is Lc0, running on a GTX 1070, compared to Stockfish running to the specifications of CCRL 40/4?

I ran the match at a time control of 40 moves in 85 seconds as, on my computer, that is the setting equivalent to CCRL 40/4. I wanted to know where a full-power Lc0 on a GTX 1070 would fall in the CCRL 40/4 list. Stockfish has a rating of 3547, and the result of +4 -1 =15 shows a rating advantage of +52 for Stockfish over Lc0, putting Lc0 at 3495. That is only 9 points above the rating of 3486 which Lc0 attains in the CCRL 40/4 list (albeit with a different network), despite the GTX 1070 being a much more powerful card than the GTX 1050. That seems disappointing.

Also, the games are not very interesting. Often, after 30-35 moves or so, everything has been traded down to an endgame. Also, it's often Stockfish preventing a draw by threefold repetition (because of the default contempt probably), and even so, many games ended in threefold repetition. In some games, Leela makes exceedingly weird moves, and lost game 1 in 21 moves because of a blunder. With regard to Stockfish, I can mostly understand what it's trying to do with a move, but with Lc0, I'm often left guessing. Because Lc0 "only" searches 10K nodes or so in the endgame, while Stockfish is often already into the 10+ million, Stockfish reaches the endgame database much faster. I often see Leela struggling to look beyond 12 ply or so, while Stockfish is soaring into the 40 ply range, reaching the endgame database from the late middle game.

Of course, my expectation wasn't for Lc0 to blow Stockfish out of the water with a 20-0 result, but I did expect it to win with a +2 score or so. Could/should I be using a different network (I've seen some networks that were smaller, faster, and had a higher ELO-rating than the 42850 I used)? Are my expectations wrong, and is a GTX 1070 just not powerful enough?

I don't play a lot of games. I always pick a midrange card; in this case I picked the GTX 1070 in 2016, because of The Witcher 3, but if I don't acquire a newer game that needs a lot more power, this card is likely to also be in my next computer. I do need/use a lot of CPU power for some of my tasks, so the 6700K will probably be replaced by a 12 core machine, at least. If Stockfish already wins by +4 -1, running on an old i7-6700K against Lc0 on a GTX 1070, I shudder to think how it would decimate Lc0 @ GTX 1070 when running on one of the new Zen3 CPUs with 12 or 16 cores if I should get a new computer (but not a new graphics card).
You are using inferior hardware for Lc0. The GTX 1070 is too slow compared to what Stockfish was using in your test. Either get an RTX 2080 Ti or use slower hardware for Stockfish so that the match is even :shock:
A 2080TI for merely 4 rather fast cores? Have you lost your mind? For the price of a 2080TI you can get 8 cores or more (depending if you want a mainboard + CPU for ~1100€ or a CPU for ~1100€). A 1070 should be more than enough to be about equal. Though I do not understand why Lc0 was using 4 threads... 1.5 to 2 threads per GPU is what you should use. I think that weakened Leela a bit.