Quote:
If the move probabilities are supposed to single out "unclear" moves, then things could work. But I don't really see how the whole updating process would work towards identifying "unclear" moves.

Michel wrote:
Well, we will have to wait to see how good (or bad) LC0 will eventually become at tactics. I am hoping that the majority of chess tactics actually depend on fairly standard patterns and that the NN (value head and policy head) can learn to recognize those patterns. This would be similar to how humans handle tactics.
Recent experiments (by Kai and Killiani) show that the policy network of LC0 is on par with SF at depth 1 (with quiescence search). This might mean that LC0 already statically recognizes some recapture patterns. Unfortunately it may also mean that SF simply prunes too much at depth 1 to be competitive...

Laskos wrote:
I don't think SF9 at depth=1 is excessively weak, or that it misses much compared to older engines that prune less. An older test showing depth=1 results in RR games from regular openings:
And IIRC the newer results are not that different.
Code: Select all
Rank  Name                   ELO  Games  Score  Draws
   1  Komodo 8                92   1000    63%    18%
   2  Houdini 4               78   1000    61%    26%
   3  Hannibal 1.4            56   1000    58%    23%
   4  SF 14122014             43   1000    56%    20%
   5  Hiarcs 14              -22   1000    47%    16%
   6  Shredder 6PB (2002)   -302   1000    15%    14%
Finished match

Ah ok. Thanks!
LCzero sacs a knight for nothing
-
- Posts: 2272
- Joined: Mon Sep 29, 2008 1:50 am
Re: LCzero sacs a knight for nothing
Ideas=science. Simplification=engineering.
Without ideas there is nothing to simplify.
-
- Posts: 3019
- Joined: Wed Mar 08, 2006 9:57 pm
- Location: Rio de Janeiro, Brazil
Re: LCzero sacs a knight for nothing
noobpwnftw wrote:
For a hybrid approach, I have an idea: couldn't we run some MCTS threads and make use of their simulations for root move ordering? Say we can feed one GPU with 2 CPU threads; then we have them running independently, and from time to time we could reorder root moves by their eval scores scaled to a win-rate estimate. It may help with favoring moves that score a few centipawns less but are more favorable in the NN's view.

How exactly is the search working? I have been trying to glean information, but have had no luck. Is it using the tree search described in the DeepMind paper, or is it something else entirely?
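To make the quoted hybrid idea a bit more concrete, here is a minimal Python sketch of one way root moves could be reordered. Everything in it is an assumption for illustration: the logistic centipawn-to-win-rate mapping, the `scale` and `weight` values, and the function names are hypothetical and are not taken from LC0, Stockfish, or the post.

```python
def cp_to_win_rate(cp, scale=400.0):
    """Map a centipawn score to an expected score in [0, 1].

    Uses an Elo-style logistic curve; the scale of 400 is an assumed
    constant, not a value taken from any engine.
    """
    return 1.0 / (1.0 + 10.0 ** (-cp / scale))


def reorder_root_moves(ab_scores_cp, mcts_win_rates, weight=0.5):
    """Return root moves sorted best-first by a blended win-rate estimate.

    ab_scores_cp:   dict move -> alpha-beta score in centipawns
    mcts_win_rates: dict move -> win rate from MCTS simulations (may be partial)
    weight:         assumed share given to the MCTS estimate
    """
    def blended(move):
        ab = cp_to_win_rate(ab_scores_cp[move])
        mc = mcts_win_rates.get(move)
        # Moves the MCTS threads have not visited keep the A/B estimate only.
        return ab if mc is None else (1.0 - weight) * ab + weight * mc

    return sorted(ab_scores_cp, key=blended, reverse=True)


# Made-up numbers: d4 scores 5 cp less than Nf3 but the simulations like it more.
ab = {"Nf3": 30, "d4": 25, "Bxh7+": -40}
mcts = {"Nf3": 0.52, "d4": 0.58}
print(reorder_root_moves(ab, mcts))
```

With these made-up inputs d4 moves ahead of Nf3, which is the "a few centipawns less but more favorable in the NN's view" case from the quote.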
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
-
- Posts: 10297
- Joined: Thu Mar 09, 2006 12:37 am
- Location: Tel-Aviv Israel
Re: LCzero sacs a knight for nothing
noobpwnftw wrote:
I think the approach is trying to summarize simulation results, which is good at handling general cases.
Those tactical lines are isolated incidents which it can never solve, while with minimax the search will develop that line deep and fast enough to see it.
It is not like we cannot write better evaluation code, but once in a while it turns out that a simplification actually gains ELO because the search will run faster. LC0 is doing the opposite, and people seem to ignore the fact that its evaluation is just slow and blame the hardware for poor performance. Even with A0's hardware you get some 80k NPS. Convert that naively to CPU: 1 TPU ~= 10x 1080 Ti, one 1080 Ti ~= 32 CPU cores, so for A0 that's 4*10*32 = 1280 CPU cores.
Given that many CPU cores I'm sure I can get more than +100 ELO against a 64-core SF8 to get that result.

You assume that simplifications mean worse evaluation.
It is not clear, and sometimes something that is more complicated is simply not better.
I believe that the problem of LC0 is not slower evaluation but a bad search algorithm, and that with the same speed of evaluation it can be significantly stronger with a better search algorithm.
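For reference, the naive hardware conversion in the quoted post can be written out as a short Python sketch. The conversion factors are the quoted poster's rough estimates, not measurements, and the function name is made up for this example.

```python
import math

def equivalent_cpu_cores(tpus=4, gpus_per_tpu=10, cores_per_gpu=32):
    """Naive conversion of the A0 match hardware into CPU cores.

    All three factors are the rough guesses from the quoted post
    (1 TPU ~= 10x 1080 Ti, one 1080 Ti ~= 32 CPU cores).
    """
    return tpus * gpus_per_tpu * cores_per_gpu


cores = equivalent_cpu_cores()
ratio_vs_sf8 = cores / 64               # relative to the 64-core SF8 machine
doublings_vs_sf8 = math.log2(ratio_vs_sf8)

print(cores)                            # 4 * 10 * 32
print(ratio_vs_sf8, round(doublings_vs_sf8, 2))
```

Under those assumptions the A0 hardware would be worth a bit more than four doublings over the 64-core machine, which is the basis for the "+100 ELO" remark in the quote.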
-
- Posts: 10948
- Joined: Wed Jul 26, 2006 10:21 pm
- Full name: Kai Laskos
Re: LCzero sacs a knight for nothing
Laskos wrote:
Yes, some sort of list. For the ECM200.epd middlegame tactical suite (200 positions), analyzed for 20s/position. At this time control and on my hardware, LC0 performs overall (Elo-wise) comparably to GreKo 6.5, a 2330 Elo CCRL standard A/B engine, which fares much better tactically (but much worse positionally). And it seems that on this tactical middlegame suite ID124 is still the best of the nets.

George Tsavdaris wrote:
Having watched 100+ games of ID150+ and ~40 games of ID156 versus 2100-3100 CCRL Elo opponents, I see that LC0 (with those IDs, as with previous ones) completely outplays the other engines positionally in many, many cases, only to miss, in at least 80% of them, a tactical hit that costs LC0 either the win or even the draw, and it loses.
LC0 is, I dare say, on par with Stockfish dev in evaluation, but of course it is ultra weak in tactics. It is even better than Stockfish in king attacks, as I have seen: in placing its pieces to attack, not in executing the attack, since in that aspect it fails miserably due to bad tactics. The pattern recognition its NNs offer for seeing how to attack the king seems to be extremely promising.

Laskos wrote:
Yes, ID160 seems the strongest (at least in my test). Now I am checking its scaling; it seems to scale nicely from 1s/move to 4s/move compared to the similar-in-strength Jabba 1.0 (in my conditions).

Werewolf wrote:
I'd love to see your tactical results on ID 160.
Meanwhile ID160 had a good jump in self-play ELO.
My own tests are no longer totally negative, but are very mixed.
Below is the "easiest" position in my testsuite, which I've posted many times but which LCZero ID 160 still cannot get in 20 minutes.
[pgn] 1.e4 e5 2.Nf3 d6 3.Nc3 g6 4.Bc4 Bg4 5.Nxe5 [/pgn]
And yet, curiously, it gets the following position, which is MUCH harder for alpha-betas.
[pgn] 1. d4 d5 2. c4 dxc4 3. Nc3 e5 4. e3 exd4 5. exd4 Nf6 6. Bxc4 Be7 7. Nf3 Nbd7 8. Bxf7+ [/pgn]
This position was a real challenge until around 1993-1995 because dedicated machines thought 8.Ng5 was a simpler way to win (it's not) and 8.Bxf7+ requires seeing quite deeply in one line.
Yet LCZero 160 finds this in 12 seconds!

Interesting. I tested on a tactical middlegame suite, 20s/position, and ID160 indeed improved significantly compared to earlier nets, but it is still far behind a standard A/B engine of similar strength in these conditions:
Code: Select all
ID143:
ECM200
score=63/200 [averages on correct positions: depth=12.8 time=2.56 nodes=791]
ID148:
ECM200
score=67/200 [averages on correct positions: depth=11.9 time=1.84 nodes=567]
ID156:
ECM200
score=68/200 [averages on correct positions: depth=12.6 time=2.44 nodes=944]
ID160
score=75/200 [averages on correct positions: depth=13.2 time=3.15 nodes=1107]
==============================================
Compare with a similar in strength standard A/B engine:
GreKo 6.5 (2330 CCRL):
ECM200
score=143/200 [averages on correct positions: depth=7.3 time=1.91 nodes=4718200]
games at 1s/move:
Code: Select all
Games Completed = 200 of 200 (Avg game length = 102.240 sec)
Settings = Gauntlet/64MB/1000ms per move/M 9000cp for 30 moves, D 150 moves/EPD:C:\LittleBlitzer\3moves_GM_04.epd(817)
Time = 5426 sec elapsed, 0 sec remaining
1. LCZero CPU ID160 80.0/200 62-102-36 (L: m=102 t=0 i=0 a=0) (D: r=26 i=6 f=3 s=0 a=1) (tpm=953.0 d=12.46 nps=185)
2. Jabba 1.0 120.0/200 102-62-36 (L: m=62 t=0 i=0 a=0) (D: r=26 i=6 f=3 s=0 a=1) (tpm=802.5 d=9.30 nps=0)
games at 4s/move (2 doublings):
Code: Select all
Games Completed = 200 of 200 (Avg game length = 448.253 sec)
Settings = Gauntlet/64MB/4000ms per move/M 9000cp for 30 moves, D 150 moves/EPD:C:\LittleBlitzer\3moves_GM_04.epd(817)
Time = 22769 sec elapsed, 0 sec remaining
1. LCZero CPU ID160 116.5/200 96-63-41 (L: m=63 t=0 i=0 a=0) (D: r=24 i=8 f=8 s=1 a=0) (tpm=2904.2 d=14.53 nps=430)
2. Jabba 1.0 83.5/200 63-96-41 (L: m=96 t=0 i=0 a=0) (D: r=24 i=8 f=8 s=1 a=0) (tpm=3802.0 d=10.74 nps=0)
Therefore, just a factor of 4 in time control (or hardware), i.e. 2 doublings, gives a boost of 128 Elo points compared to a standard A/B engine, or 64 Elo points per doubling. One can extrapolate:
On one CPU core at 4 s/move, from this match LC0 ID160 is about 2100 CCRL Elo. A top GPU, say an Nvidia 1080 Ti, is faster by a factor of 25 compared to 1 CPU core. Tournament TC is about 40 times longer than 4 s/move. So, all in all, that is a total factor of 1000 in time and hardware, or 10 doublings. So ID160 on a top GPU at LTC would be about 2750 CCRL Elo. And on the DeepMind hardware used in the exhibition match, it would be 3100+ CCRL Elo.
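The arithmetic behind this extrapolation can be checked with a short Python sketch. It uses the standard logistic Elo formula on the two match scores above; the constant Elo gain per doubling, the 2100 baseline, and the factors of 25 and 40 are the assumptions of this post, not measured values, and the function name is made up for the example.

```python
import math

def elo_diff(score_fraction):
    """Elo difference implied by a match score (standard logistic model)."""
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


# LC0 ID160 vs Jabba 1.0, 200 games each (scores from the matches above).
elo_1s = elo_diff(80.0 / 200)        # LC0 at 1 s/move, roughly -70
elo_4s = elo_diff(116.5 / 200)       # LC0 at 4 s/move, roughly +58

gain = elo_4s - elo_1s               # relative gain over 2 doublings
per_doubling = gain / math.log2(4)

# Extrapolation: assumes the per-doubling gain stays constant (a big "if").
speedup = 25 * 40                    # GPU factor times longer time control
doublings = math.log2(speedup)       # close to 10
projected = 2100 + per_doubling * doublings

print(round(gain), round(per_doubling), round(doublings, 1), round(projected))
```

The result lands a little below 2750 because 1000 is slightly less than 10 full doublings; the whole projection stands or falls with the assumption that scaling stays linear in doublings.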
I am even beginning to suspect that they didn't release their products to the general consumer because on a normal i7 CPU and an average GPU the performance of their AlphaZero Go and Chess programs would not be that impressive, or would even be pretty lame (compared to the hype), especially in Chess.
-
- Posts: 1796
- Joined: Thu Sep 18, 2008 10:24 pm
Re: LCzero sacs a knight for nothing
Laskos wrote:
Therefore, just a factor of 4 in time control (or hardware), i.e. 2 doublings, gives a boost of 128 Elo points compared to a standard A/B engine, or 64 Elo points per doubling. One can extrapolate:
On one CPU core at 4 s/move, from this match LC0 ID160 is about 2100 CCRL Elo. A top GPU, say an Nvidia 1080 Ti, is faster by a factor of 25 compared to 1 CPU core. Tournament TC is about 40 times longer than 4 s/move. So, all in all, that is a total factor of 1000 in time and hardware, or 10 doublings. So ID160 on a top GPU at LTC would be about 2750 CCRL Elo. And on the DeepMind hardware used in the exhibition match, it would be 3100+ CCRL Elo.

Very, very interesting.
New Volta cards are due out in Q3 this year and ID 160 could therefore be CCRL 2800 elo on one of those. Allowing for progress in the meantime...it could be very strong.
I think the tactics could slowly come. But I also think it will have "holes" in irregular positions which are not thematic.
-
- Posts: 3019
- Joined: Wed Mar 08, 2006 9:57 pm
- Location: Rio de Janeiro, Brazil
Re: LCzero sacs a knight for nothing
Laskos wrote:
I am even beginning to suspect that they didn't release their products to the general consumer because on a normal i7 CPU and an average GPU the performance of their AlphaZero Go and Chess programs would not be that impressive, or would even be pretty lame (compared to the hype), especially in Chess.

I suspect they could not care less about releasing them as general consumer products, nor was that ever even on the table.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
-
- Posts: 12038
- Joined: Mon Jul 07, 2008 10:50 pm
Re: LCzero sacs a knight for nothing
Laskos wrote:
Therefore, just a factor of 4 in time control (or hardware), i.e. 2 doublings, gives a boost of 128 Elo points compared to a standard A/B engine, or 64 Elo points per doubling. One can extrapolate:
On one CPU core at 4 s/move, from this match LC0 ID160 is about 2100 CCRL Elo. A top GPU, say an Nvidia 1080 Ti, is faster by a factor of 25 compared to 1 CPU core. Tournament TC is about 40 times longer than 4 s/move. So, all in all, that is a total factor of 1000 in time and hardware, or 10 doublings. So ID160 on a top GPU at LTC would be about 2750 CCRL Elo. And on the DeepMind hardware used in the exhibition match, it would be 3100+ CCRL Elo.

So if AlphaZero's Elo is 3200, it is only 100 Elo stronger than LC0, which means LC0 will soon stall?
https://imgur.com/a/c04yc
-
- Posts: 558
- Joined: Sat Mar 25, 2006 8:27 pm
Re: LCzero sacs a knight for nothing
Laskos wrote:
Therefore, just a factor of 4 in time control (or hardware), i.e. 2 doublings, gives a boost of 128 Elo points compared to a standard A/B engine, or 64 Elo points per doubling. One can extrapolate:
On one CPU core at 4 s/move, from this match LC0 ID160 is about 2100 CCRL Elo. A top GPU, say an Nvidia 1080 Ti, is faster by a factor of 25 compared to 1 CPU core. Tournament TC is about 40 times longer than 4 s/move. So, all in all, that is a total factor of 1000 in time and hardware, or 10 doublings. So ID160 on a top GPU at LTC would be about 2750 CCRL Elo. And on the DeepMind hardware used in the exhibition match, it would be 3100+ CCRL Elo.

duncan wrote:
So if AlphaZero's Elo is 3200, it is only 100 Elo stronger than LC0, which means LC0 will soon stall?
https://imgur.com/a/c04yc

Well, first off, that is a pretty big "if". Whenever you extrapolate upwards like this, there is a real risk that your assumptions won't continue to hold.
-
- Posts: 12038
- Joined: Mon Jul 07, 2008 10:50 pm
Re: LCzero sacs a knight for nothing
Laskos wrote:
Therefore, just a factor of 4 in time control (or hardware), i.e. 2 doublings, gives a boost of 128 Elo points compared to a standard A/B engine, or 64 Elo points per doubling. One can extrapolate:
On one CPU core at 4 s/move, from this match LC0 ID160 is about 2100 CCRL Elo. A top GPU, say an Nvidia 1080 Ti, is faster by a factor of 25 compared to 1 CPU core. Tournament TC is about 40 times longer than 4 s/move. So, all in all, that is a total factor of 1000 in time and hardware, or 10 doublings. So ID160 on a top GPU at LTC would be about 2750 CCRL Elo. And on the DeepMind hardware used in the exhibition match, it would be 3100+ CCRL Elo.

duncan wrote:
So if AlphaZero's Elo is 3200, it is only 100 Elo stronger than LC0, which means LC0 will soon stall?
https://imgur.com/a/c04yc

Robert Pope wrote:
Well, first off, that is a pretty big "if". Whenever you extrapolate upwards like this, there is a real risk that your assumptions won't continue to hold.

Do you have an estimate for the Elo of AlphaZero?
-
- Posts: 52
- Joined: Sat Mar 24, 2018 4:18 pm
Re: LCzero sacs a knight for nothing
Laskos wrote:
Therefore, just a factor of 4 in time control (or hardware), i.e. 2 doublings, gives a boost of 128 Elo points compared to a standard A/B engine, or 64 Elo points per doubling. One can extrapolate:
On one CPU core at 4 s/move, from this match LC0 ID160 is about 2100 CCRL Elo. A top GPU, say an Nvidia 1080 Ti, is faster by a factor of 25 compared to 1 CPU core. Tournament TC is about 40 times longer than 4 s/move. So, all in all, that is a total factor of 1000 in time and hardware, or 10 doublings. So ID160 on a top GPU at LTC would be about 2750 CCRL Elo. And on the DeepMind hardware used in the exhibition match, it would be 3100+ CCRL Elo.

duncan wrote:
So if AlphaZero's Elo is 3200, it is only 100 Elo stronger than LC0, which means LC0 will soon stall?
https://imgur.com/a/c04yc

Robert Pope wrote:
Well, first off, that is a pretty big "if". Whenever you extrapolate upwards like this, there is a real risk that your assumptions won't continue to hold.

Exactly, and what's even more remarkable is that, according to the A0 paper (figure 2), 4x TPUs will do about 80k playouts in 1s, and at 80k playouts A0 is only 100-150 Elo weaker than at 1 min/move (5000k playouts).
Also, 1x 1080 Ti (11 TFLOPS) vs 4x TPU (180 TFLOPS) means the speed gets reduced to about 4.8k nps. Even if we assumed that the TPU is somehow more effective, flop for flop, by a factor of 4, the resulting 1080 Ti playouts would still be close to 80k per minute. Thus to me it seems quite convincing that A0 on a 1080 Ti at 1 min/move would, with good confidence, be at most 150 Elo weaker than the 4x TPU configuration (and most likely not more than 100 Elo weaker).
Since LC0 is tactically much weaker than A0, maybe it would scale better, but unless someone measures it I would be quite skeptical that the difference could be 350+ Elo (instead of something like 150-200 max). But actually having data (not extrapolations) on LC0 scaling from e.g. 1000 to 5 million playouts per move would be quite nice.
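The playout estimate in this post can be reproduced with a few lines of Python. This is only a sketch of the post's own back-of-the-envelope scaling: the TFLOPS figures, the 80k playouts per second, and the 4x efficiency penalty are the poster's assumptions, and scaling playouts linearly with TFLOPS is itself a simplification. The names are made up for the example.

```python
TPU_PLAYOUTS_PER_SEC = 80_000   # 4x TPU, as read off the A0 paper by the poster
TPU_TFLOPS = 180.0              # poster's figure for the 4x TPU setup
GPU_TFLOPS = 11.0               # poster's figure for one 1080 Ti


def gpu_playouts_per_sec(tpu_efficiency=1.0):
    """Scale playouts linearly with TFLOPS.

    tpu_efficiency > 1 means the TPU is assumed to do that many times
    more useful work per flop than the GPU.
    """
    return TPU_PLAYOUTS_PER_SEC * (GPU_TFLOPS / TPU_TFLOPS) / tpu_efficiency


naive_nps = gpu_playouts_per_sec()                      # a little under 5k
pessimistic_nps = gpu_playouts_per_sec(tpu_efficiency=4.0)
pessimistic_per_minute = pessimistic_nps * 60           # compare with 80k in 1 s on TPUs

print(round(naive_nps), round(pessimistic_nps), round(pessimistic_per_minute))
```

Even in the pessimistic case, one minute on the GPU gives a playout count of the same order as one second on the TPUs, which is the point the post is making.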