Lc0 question

jp · Post by **jp** » Sun Jul 07, 2019 6:39 pm

kranium wrote: ↑Sun Jul 07, 2019 5:43 pm
jp wrote: ↑Sun Jul 07, 2019 5:11 pm Humans' best chance of distinguishing the engines is from their bad opening or endgame play, not their good middlegame play.
Engines play the openings and endgames badly?

But, playing against the strongest humans, SF and/or LC0 would not lose a single game, and probably give up very few draws, if any...
even against the most 'opening' savvy/prepared GM.

How have some humans determined these engines are playing badly? I don't get it.
Seems a bit pretentious...or some desperate need to remain relevant.

Perhaps you should read other people's posts more carefully, and then you might "get it".

The context of that discussion (which you cut out of your reply) was obviously about the claim that NN engines play more "human-like" chess than AB engines. It had nothing to do with engines playing "badly" compared with humans.

The point was about whether humans can tell AB moves from NN moves. I doubt they can, which, if true, suggests that neither type of engine has a more "human-like" style. This has been discussed in this forum before. When it was discussed before, there was an objection that we could tell because NN engines supposedly play the openings better than AB engines. There's also the objection that AB engines play the endgames better than NN engines. (Lc0 has in endgames played like a troll, and that might allow a human seeing those moves to identify the player as Lc0 and not SF.) If those objections are correct, then the human might be able to tell SF from Lc0 because of SF's relatively "bad" opening play compared with Lc0, and Lc0's relatively "bad" endgame play compared with SF.

That's why the test of humans' ability to tell AB and NN engines apart was proposed to be middlegame positions.

It had nothing to do with comparing engine strength with human strength. (It had nothing to do with playing strength at all, unless that might reveal the player's identity.) It had nothing to do with being "pretentious" or having a "desperate need". If you wish to completely misunderstand other people's posts, you might at least try to misunderstand them in a more polite way.

supersharp77 · Post by **supersharp77** » Sun Jul 07, 2019 8:31 pm

mclane wrote: ↑Sun Jul 07, 2019 4:00 pm Oh my god. Another „is LC0 stronger than Stockfish“ debate.

It’s so boring,

And how to choose the hardware. To make it possible.

Hardware. How boring.
I prefer decreasing the speed of hardware.
I don’t understand why you want faster hardware. Why?

Agree 1000%. This sort of 'hardware-based tactic' has been going on for years in this and in other forums: cores and superfast hardware, i.e. Stockfish etc. They have just replaced Stockfish with LC0 and carried on. Old Stockfish is still "super strong" even on my tablet or smartphone. A minimalist approach is best, in my opinion.

Modern Times · Post by **Modern Times** » Sun Jul 07, 2019 9:09 pm

mclane wrote: ↑Sun Jul 07, 2019 4:00 pm Oh my god. Another „is LC0 stronger than Stockfish“ debate.

Yes, and then the endless arguments as to whether the hardware is "equal" in any given match.

Just my opinion, but I think it is impossible to say that any sort of GPU and CPU hardware are "equal" performance-wise. With such radically different architectures and radically different A/B and NN engines, it is just not possible. Yes, I know about the Leela ratio, and I don't believe in it except for comparing two machines relative to each other. So use it to compare machine A (with CPU A and GPU A) with machine B (with CPU B and GPU B), yes, sure, but it is too much of a stretch for me to use it as a measure of hardware "equality".

Laskos · Post by **Laskos** » Sun Jul 07, 2019 9:34 pm

Modern Times wrote: ↑Sun Jul 07, 2019 9:09 pm
mclane wrote: ↑Sun Jul 07, 2019 4:00 pm Oh my god. Another „is LC0 stronger than Stockfish“ debate.

Yes, and then the endless arguments as to whether the hardware is "equal" in any given match.

Just my opinion, but I think it is impossible to say that any sort of GPU and CPU hardware are "equal" performance-wise. With such radically different architectures and radically different A/B and NN engines, it is just not possible. Yes, I know about the Leela ratio, and I don't believe in it except for comparing two machines relative to each other. So use it to compare machine A (with CPU A and GPU A) with machine B (with CPU B and GPU B), yes, sure, but it is too much of a stretch for me to use it as a measure of hardware "equality".

Well, this arbitrary "Leela Ratio" of about 1, give or take a factor of 2, is what we see for similarly priced (or adjusted for kWh spent) very new hardware. We simply have to adapt to two measures, CPU and GPU, instead of just one. The debates here are long and idiotic just due to the quantity of idiots here.

mclane · Post by **mclane** » Sun Jul 07, 2019 9:44 pm

I wonder why people need this race between Stockfish and Lc0.
Isn’t it enough Stockfish and Komodo and Houdini do this race??

todd · Post by **todd** » Mon Jul 08, 2019 3:03 am

kranium wrote: ↑Sun Jul 07, 2019 5:43 pm How have some humans determined these engines are playing badly?

In openings, it's pretty easy to demonstrate even if we grant an initial assumption (which we shouldn't!) that human input is of no value at all.

Engines, in games, only spend minutes per move at most.

Humans have used the same engines to analyze the same positions to much greater depth.

On top of this, humans also have done deep engine analysis of many other opening positions reachable from the root position, so that, reasoning backward, we effectively have an even higher depth assessment of the root position.

We also have access to many games (between both engines and humans) where we can see which opening moves have resulted in more success than others, and this leads us to investigate the difference by analyzing the relevant positions more deeply.

Laskos · Post by **Laskos** » Mon Jul 08, 2019 4:24 pm

lkaufman wrote: ↑Sun Jul 07, 2019 5:17 am But on the other hand, I do rather suspect that Lc0 just can't benefit much from long time limits. Have you done runs where you ran the same two versions of Stockfish and Lc0 against each other at different time limits? If so, have you noticed a consistent pattern? I would expect that SF would always do better at longer time limits, even if it still loses the match. That might mean that there is some time limit beyond which SF would win, you just haven't tested at such a long TC yet.

I had the same impression, but the facts convinced me that it is not so. I never tested anything like tournament TC, but we have TCEC and other results. At the same ratio of nodes (call it a "Leela Ratio" of about 1) between Lc0 T40 and SF_dev, I get results similar to TCEC's, but with 1,000 times fewer nodes. Roughly, at 10k Leela nodes versus 10m SF_dev nodes I get results similar to TCEC's 10m Leela nodes versus 10b SF_dev nodes. TCEC openings are also a bit weirder than my short, balanced ones, and put Leela at a slight disadvantage. So Leela T40 scales as well as SF_dev (if not a bit better) from ultra-bullet on strong hardware to tournament TC on strong hardware.
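
The node comparison above can be checked with a few lines of arithmetic. This is only a sketch using the figures quoted in the post; the variable names are illustrative, and reading "k", "m" and "b" as thousand, million and billion is an assumption. The raw node ratio computed here is not the "Leela Ratio" itself, which the post does not define.

```python
# Node budgets quoted in the post.
# Assumption: k = thousand, m = million, b = billion.
my_test = {"leela": 10_000, "sf_dev": 10_000_000}
tcec = {"leela": 10_000_000, "sf_dev": 10_000_000_000}


def sf_nodes_per_leela_node(nodes):
    """Raw Stockfish-to-Leela node ratio for one test condition."""
    return nodes["sf_dev"] / nodes["leela"]


ratio_mine = sf_nodes_per_leela_node(my_test)
ratio_tcec = sf_nodes_per_leela_node(tcec)

# How much larger the TCEC budget is than the short test, for Leela.
scale_up = tcec["leela"] / my_test["leela"]

print(f"SF/Leela node ratio, short test: {ratio_mine:.0f}")
print(f"SF/Leela node ratio, TCEC:       {ratio_tcec:.0f}")
print(f"TCEC node budget vs short test:  x{scale_up:.0f}")
```

Both conditions give the same SF-to-Leela node ratio, and the TCEC budget is 1,000 times larger, which is the sense in which the two sets of results are being compared.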

And Leela T40 does change its mind when needed, at long TC too. I saw this in a test suite I collected, WAC145 (corrected WAC300), of fairly easy tactical puzzles typical of actual games, most of them very easy for good AB engines. Leela fares terribly at short TC (1s/position), worse than weak AB engines, but improves steadily at longer TC, changing its mind in time to the correct solution. Here are the results:

4 fast i7 cores, RTX 2070 GPU

SF_dev:
1s/position: 143/145
5s/position: 145/145

Leela 42724:
1s/position: 127/145 (worse than CCRL Elo 2200 AB engines)
30s/position: 139/145 --- changed its mind 12 times to the correct solution. The number of blunders is threefold lower.
120s/position: 143/145 --- changed its mind 4 times to the correct solution. The number of blunders drops by another factor of 3.

So, at LTC, Leela T40 is no longer easy tactical prey. It blunders tactically less and less, and the MCTS search steadily improves Leela's tactics.
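
The "factor of 3" figures follow directly from the scores listed above. A minimal sketch, assuming that a "blunder" here means a position left unsolved out of the 145 (the dictionary and variable names are illustrative):

```python
TOTAL_POSITIONS = 145

# Leela 42724 results from the post: seconds per position -> positions solved.
leela_solved = {1: 127, 30: 139, 120: 143}

# Assumption: a "blunder" is a position not solved at that time control.
missed = {tc: TOTAL_POSITIONS - solved for tc, solved in leela_solved.items()}

time_controls = sorted(leela_solved)
for prev_tc, cur_tc in zip(time_controls, time_controls[1:]):
    gained = leela_solved[cur_tc] - leela_solved[prev_tc]
    factor = missed[prev_tc] / missed[cur_tc]
    print(
        f"{prev_tc}s -> {cur_tc}s: {gained} more solved, "
        f"misses {missed[prev_tc]} -> {missed[cur_tc]} (factor {factor:.1f})"
    )
```

Under that reading, the misses go from 18 to 6 to 2, so each step cuts them by a factor of 3, and the 12 and 4 extra solutions match the "changed its mind" counts.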

The positional play also improves at LTC (I can see it on my suites), but Leela is already VERY strong there even at short time controls (in my conditions MUCH stronger than any of the regular engines).

All in all, I don't think that Leela T40 has scaling issues to LTC or VLTC compared to SF or Komodo.

mwyoung · Post by **mwyoung** » Mon Jul 08, 2019 5:44 pm

Laskos wrote: ↑Mon Jul 08, 2019 4:24 pm
lkaufman wrote: ↑Sun Jul 07, 2019 5:17 am But on the other hand, I do rather suspect that Lc0 just can't benefit much from long time limits. Have you done runs where you ran the same two versions of Stockfish and Lc0 against each other at different time limits? If so, have you noticed a consistent pattern? I would expect that SF would always do better at longer time limits, even if it still loses the match. That might mean that there is some time limit beyond which SF would win, you just haven't tested at such a long TC yet.
I had the same impression, but the facts convinced me that it is not so. I never tested anything like tournament TC, but we have TCEC and other results. At the same ratio of nodes (call it a "Leela Ratio" of about 1) between Lc0 T40 and SF_dev, I get results similar to TCEC's, but with 1,000 times fewer nodes. Roughly, at 10k Leela nodes versus 10m SF_dev nodes I get results similar to TCEC's 10m Leela nodes versus 10b SF_dev nodes. TCEC openings are also a bit weirder than my short, balanced ones, and put Leela at a slight disadvantage. So Leela T40 scales as well as SF_dev (if not a bit better) from ultra-bullet on strong hardware to tournament TC on strong hardware.

And Leela T40 does change its mind when needed, at long TC too. I saw this in a test suite I collected, WAC145 (corrected WAC300), of fairly easy tactical puzzles typical of actual games, most of them very easy for good AB engines. Leela fares terribly at short TC (1s/position), worse than weak AB engines, but improves steadily at longer TC, changing its mind in time to the correct solution. Here are the results:

4 fast i7 cores, RTX 2070 GPU

SF_dev:
1s/position: 143/145
5s/position: 145/145

Leela 42724:
1s/position: 127/145 (worse than CCRL Elo 2200 AB engines)
30s/position: 139/145 --- changed its mind 12 times to the correct solution. The number of blunders is threefold lower.
120s/position: 143/145 --- changed its mind 4 times to the correct solution. The number of blunders drops by another factor of 3.

So, at LTC, Leela T40 is no longer easy tactical prey. It blunders tactically less and less, and the MCTS search steadily improves Leela's tactics.

The positional play also improves at LTC (I can see it on my suites), but Leela is already VERY strong there even at short time controls (in my conditions MUCH stronger than any of the regular engines).

All in all, I don't think that Leela T40 has scaling issues to LTC or VLTC compared to SF or Komodo.

I agree, Lc0 does benefit from longer time controls/faster hardware. And the nonsense you hear that Lc0 is only better when tested with short books is also incorrect.

Here are the current tournament results, with all players using a 35-move opening book with learning turned on and set to optimal.
Set at a CCRL(f) of 3.5h/40 moves.

As in most of my testing, regardless of players, books, and settings, Stockfish and Lc0 are the only engines fighting for the top spot.
145 of 168 games played.

[Attached image: Raiders update 3.jpg]

Ovyron · Post by **Ovyron** » Wed Jul 10, 2019 1:30 am

mclane wrote: ↑Sun Jul 07, 2019 9:44 pm Isn’t it enough Stockfish and Komodo and Houdini do this race??

What race? Stockfish won that one by a mile a long time ago. If I recall correctly, Komodo topped the rating lists only once, and for a very short while, so I'm not sure Stockfish was in danger of losing that race at any point.

mwyoung · Post by **mwyoung** » Wed Jul 10, 2019 1:44 am

Ovyron wrote: ↑Wed Jul 10, 2019 1:30 am
mclane wrote: ↑Sun Jul 07, 2019 9:44 pm Isn’t it enough Stockfish and Komodo and Houdini do this race??
What race? Stockfish won that one by a mile a long time ago. If I recall correctly, Komodo topped the rating lists only once, and for a very short while, so I'm not sure Stockfish was in danger of losing that race at any point.

The statement is irrational. Why would we want to limit progress on a game that is impossible to solve?
No, it is not enough for Stockfish, Komodo, and Houdini to do this race.

New programs and fresh ideas are always welcome.

Lc0 and other NN engines are the fresh ideas at this moment, and they show that Stockfish has not won the race....
