LCZero Accomplishments and Goals Thus Far

hgm
Posts: 27795
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: LCZero Accomplishments and Goals Thus Far

Post by hgm »

noobpwnftw wrote:There is a limited number of GPUs you can pack into a "single machine", and the same goes for CPUs, so per-unit performance matters, no? Without that, how do you compare the price tag?
The four TPUs in AlphaZero's hardware were just a single card fitting in a slot of the PCI bus of the PC they were running it on. So I don't see what point you are trying to make.
JJJ
Posts: 1346
Joined: Sat Apr 19, 2014 1:47 pm

Re: LCZero Accomplishments and Goals Thus Far

Post by JJJ »

Is autoresign really a good idea? At fast time controls I see Leela switching a lot between winning and losing. Leela still needs to play endings better instead of resigning too quickly.
noobpwnftw
Posts: 560
Joined: Sun Nov 08, 2015 11:10 pm

Re: LCZero Accomplishments and Goals Thus Far

Post by noobpwnftw »

hgm wrote:
noobpwnftw wrote:There is a limited number of GPUs you can pack into a "single machine", and the same goes for CPUs, so per-unit performance matters, no? Without that, how do you compare the price tag?
The four TPUs in AlphaZero's hardware were just a single card fitting in a slot of the PCI bus of the PC they were running it on. So I don't see what point you are trying to make.
From the current implementations (A0, Leela) it appears that the input pre-processing is done on the CPU side, which means the IO throughput of the PCI-E bus is also a factor.

CPUs have only a limited number of PCI-E lanes, which limits how many GPUs a program can drive efficiently in parallel.

Theoretically you can use layered bus extenders to link several hundred GPUs, but then you get poor IO performance, as on mining rigs.
Those rigs even use Celeron CPUs to drive a good number of modern GPUs, because their programs are mostly autonomous on the GPU side, which is not the case here.

When it comes to scalability, those things do matter. Say you have a machine with the maximum number of GPUs, probably set by the limiting factor above: how to go beyond that and still "scale well" is an open question.
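
The lane argument can be put as a rough back-of-the-envelope sketch. The lane count below is a hypothetical figure chosen for illustration, not the spec of any particular CPU, and `max_gpus` is a made-up helper name; it only divides the available lanes by the link width each card is given.

```python
def max_gpus(cpu_lanes, lanes_per_gpu):
    """How many GPUs fit if each one gets a link of lanes_per_gpu lanes."""
    if lanes_per_gpu <= 0:
        raise ValueError("lanes_per_gpu must be positive")
    return cpu_lanes // lanes_per_gpu


# Hypothetical CPU exposing 16 PCI-E lanes (an assumption for illustration).
HYPOTHETICAL_CPU_LANES = 16

for width in (16, 8, 1):
    count = max_gpus(HYPOTHETICAL_CPU_LANES, width)
    print(f"x{width} link per GPU: up to {count} GPU(s)")
```

Narrow links, as on mining-rig risers, buy a higher GPU count at the cost of per-card bandwidth, which only works when the GPU rarely needs to talk to the CPU.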
Last edited by noobpwnftw on Mon Apr 30, 2018 9:02 pm, edited 1 time in total.
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: LCZero Accomplishments and Goals Thus Far

Post by Daniel Shawul »

mar wrote:
Daniel Shawul wrote:Sigh... wake me up when it is 2800 Elo running on a single CPU core, which is what every other engine uses in rating lists. As far as I am concerned, it is still a 2100 Elo engine there.
Well, on my hardware Leela seems to use 2 CPU cores + GPU, so I played a short match against my engine and Leela already performed like a 2800+ engine (it was net 217), and the TC was 40/1 minute, so rather fast for Leela.
Also, people who want a strong chess entity don't care about 1 CPU.
LCZero can run on the CPU, so I've made comparisons with ID 125 using 2 threads and got about 2100 Elo on an i7 laptop. If I used the integrated Intel HD graphics I have instead, the result would be worse, I am sure, as its nps is lower there.

TCEC made a fair 44-core comparison that probably showed it scales better than other engines, but the fact remains that you have to use either massive hardware or a 1 year + 1 month time control for it on a single CPU core.
frankp
Posts: 228
Joined: Sun Mar 12, 2006 3:11 pm

Re: LCZero Accomplishments and Goals Thus Far

Post by frankp »

noobpwnftw wrote:
frankp wrote: I was just pointing out that we seem to be comparing the merits of two very different approaches on the basis of the arrangements of semiconductors they use.

(It seems that any discussion of A0/Leela turns into an A0/SF 'willy-waving' contest. I am still fascinated that Leela plays such a high standard of chess, not compared to SF of course, on my consumer-grade, two-generation-old graphics card.)
I think there is nothing wrong with the approach, and it appears to scale well because we are at the stage where people usually have 1 GPU in their computers, and doubling that is easier than doubling the 8 or so CPU cores they already have.
Yes, from a practical point of view, I guess both the ability to train Leela and its strength in use (NN size and speed) will be limited, perhaps rather quickly, by the power of affordable consumer-grade graphics cards. Well, at least in the short term. Still interesting to see how far it can go. Still a young project.
hgm
Posts: 27795
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: LCZero Accomplishments and Goals Thus Far

Post by hgm »

Indeed, the PCI bus is a lot slower than CPU-DRAM traffic. But remember that AlphaZero needed only 80knps to beat Stockfish's 70Mnps. So I guess you don't care so much if things are a factor 100 slower, if you have a factor 1000 more time.
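
For what it is worth, the arithmetic behind that "factor 1000" can be sketched like this. The two node rates are the ones quoted above; the variable names are just illustrative.

```python
# Node rates quoted in the post above.
ALPHAZERO_NPS = 80_000        # 80 knps
STOCKFISH_NPS = 70_000_000    # 70 Mnps

# How many Stockfish nodes are searched for each AlphaZero node.
ratio = STOCKFISH_NPS / ALPHAZERO_NPS

# Time budget per node in microseconds for each engine.
alphazero_us_per_node = 1e6 / ALPHAZERO_NPS
stockfish_us_per_node = 1e6 / STOCKFISH_NPS

print(f"Stockfish searches {ratio:.0f}x more nodes per second")
print(f"AlphaZero has {alphazero_us_per_node:.1f} us per node, "
      f"Stockfish {stockfish_us_per_node:.4f} us")
```

So the ratio comes out a bit under 1000, and a bus transfer that costs a factor 100 still fits inside the per-node time budget.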
FWCC
Posts: 117
Joined: Wed Aug 22, 2007 4:39 pm

Re: LCZero Accomplishments and Goals Thus Far

Post by FWCC »

So Daniel, I take it you're not too excited about the Leela project?
2800 on 1 CPU? LCZero is GPU-oriented.
Joost Buijs
Posts: 1563
Joined: Thu Jul 16, 2009 10:47 am
Location: Almere, The Netherlands

Re: LCZero Accomplishments and Goals Thus Far

Post by Joost Buijs »

hgm wrote:Again you refer to 'massive hardware'. But the hardware of AlphaZero was not really massive. Just different. AlphaZero needs fewer transistors to run at 3500 Elo than Stockfish needs to run at 3400 Elo, at 'standard' TC. It is just that the transistors have to be connected in an entirely different way.

That the AlphaZero hardware can do some things 100 times faster than equivalent (in terms of number of transistors) x64 CPUs, doesn't mean a thing. My rowing boat is infinitely faster than my Ferrari, for crossing the English Channel.
How do you know how many transistors AlphaZero needs to play its games?

The article is very vague about everything. They used 4 TPUs (gen1 or gen2?). Do they mean 4 gen2 TPU modules with 4 ASICs each?

Are you really sure about the level of 3500 Elo? They only show you 10 games it won against a crippled version of Stockfish 8, and for the rest you have to take their word for it.

They don't tell you how many 100-game matches they played or how many of them they lost. Statistically, the error margin on 100 games is rather high, and since the network always plays the same move there is a possibility that many games were repeated. Maybe the SMP randomness in Stockfish gave some variation, I don't know.
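
To put a number on "rather high", here is a minimal sketch of the error margin of a 100-game match. It assumes independent games, the usual logistic Elo formula and a normal approximation. The +28 =72 -0 input is an assumed example result, not a figure taken from this thread, and the function names are made up for illustration.

```python
import math


def elo_from_score(score):
    """Elo difference implied by an expected score strictly between 0 and 1."""
    return 400.0 * math.log10(score / (1.0 - score))


def match_elo_interval(wins, draws, losses, z=1.96):
    """Return (estimate, low, high) in Elo for a match result.

    Normal approximation; it breaks down when the score is close to 0 or 1.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of an outcome that is 1, 0.5 or 0.
    variance = (wins * 1.0 + draws * 0.25) / n - score ** 2
    half_width = z * math.sqrt(variance / n)
    return (
        elo_from_score(score),
        elo_from_score(score - half_width),
        elo_from_score(score + half_width),
    )


# Assumed example result for a 100-game match (not from this thread).
est, low, high = match_elo_interval(28, 72, 0)
print(f"estimate {est:.0f} Elo, 95% interval roughly {low:.0f} to {high:.0f}")
```

If many games are repeats, as suspected above, the games are not independent and the real margin is wider than this sketch suggests.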

Time will tell, I guess. LCZero is already past 10 million training games and Google seems to have used 44 million. At the current speed it will take another 3 months for LCZero to reach 44 million, and I wonder whether it can close the 600 Elo gap with Stockfish within those 3 months.
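
The extrapolation above amounts to the following, as a rough sketch that assumes the game rate stays constant. The 10 million, 44 million and 3 months are the figures from this post; the helper names are illustrative.

```python
def implied_rate(current_games, target_games, months):
    """Games per month needed to go from current to target in the given time."""
    return (target_games - current_games) / months


def months_to_target(current_games, target_games, games_per_month):
    """Months needed at a constant rate of games per month."""
    return (target_games - current_games) / games_per_month


CURRENT = 10_000_000   # LCZero training games so far (from the post)
TARGET = 44_000_000    # games Google seems to have used (from the post)

rate = implied_rate(CURRENT, TARGET, 3)
print(f"3 months implies about {rate / 1e6:.1f} million games per month")
print(f"check: {months_to_target(CURRENT, TARGET, rate):.1f} months at that rate")
```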
Last edited by Joost Buijs on Mon Apr 30, 2018 9:09 pm, edited 1 time in total.
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: LCZero Accomplishments and Goals Thus Far

Post by Daniel Shawul »

Albert Silver wrote:
Daniel Shawul wrote:Sigh... wake me up when it is 2800 Elo running on a single CPU core, which is what every other engine uses in rating lists. As far as I am concerned, it is still a 2100 Elo engine there.
I see so many excited people giving a hardware advantage to LCZero; CCLS, for instance, uses a GPU for LCZero and a single CPU core for the rest of the engines.
Well, to begin with, I remember when Rybka was the first engine to take advantage of the 64-bit environment when every standard OS was 32-bit. It had a big speed-up, and no one was able to do the same at first. Shredder 64-bit came out a couple of months later but with zero speedup. I don't recall people saying that it needed to run in a 32-bit environment like everyone else to be 'fair'.

The advantage you complain about is just sour grapes in my book. For one thing, if CCLS or whoever offers a GPU, then it is up to the authors to take advantage of it, not for the one who is able to use it to dumb down his machine for 'fairness'.

Leela is designed to use a GPU for best performance. It is inherent in its design. If it reaches 100 Elo better than everyone else on my computer because it alone can use the GPU to best advantage, while all others are weaker because they are only able to use the CPU, guess how much I (and everyone who analyzes with engines) will care?
If it is designed solely for the GPU, why is its performance on my Intel HD graphics (a GPU) worse than its performance on the CPU?
TCEC ran it on a 44-core CPU and got better results than a GTX 1080 can.
So it is not that it is designed for CPU or GPU, but for a high-end machine, CPU or GPU.
On the other hand, Stockfish can perform at 3200+ Elo on a mobile processor.
Werner
Posts: 2871
Joined: Wed Mar 08, 2006 10:09 pm
Location: Germany
Full name: Werner Schüle

Re: LCZero Accomplishments and Goals Thus Far

Post by Werner »

Does anybody know how to run the engine on Google Colab with the new net?
This line is no longer correct, I think:
!echo '0;XgemmBatched;128;16;128;16;
Werner