Daniel Shawul wrote: Almost everybody, even ones who used NN evals like Giraffe, cared about performance on a single core up to now. It is simply the most widely used computing hardware now.
That is why AlphaZero was called a 'paradigm change'. 20 years ago everyone had a fixed phone line at home. Now they all have cell phones...
That everybody was using the same flawed metric was only because they were all doing essentially the same thing, with minuscule differences. If I want to describe a collection of meatballs, it doesn't matter whether I use height, width or length as a metric of which is more nutritious. But now someone brings in a Frankfurter sausage, and a Hamburger, which forces me to abandon my evil ways.
Most graphics cards are integrated ones, on which LCzero performs no better than on the CPU, or worse. You are asking for hardware that is only available to gamers or people who do number crunching.
Computer Chess is number crunching. What people have now and what people will have in the future might be very different. Only catering to what they have now might very well be betting on a dead horse. Species that completely specialize in consuming a dwindling food source usually go extinct together with that food source.
Good graphics cards are not more expensive than extra CPUs and the sockets for them on the motherboard, which is what people who really care about engine performance have now. Ten years ago all CPUs were single-core, and no one cared about SMP. Now everyone has at least 4 cores, and if your engine does not support SMP it will be considered worthless, no matter how well it does in single-CPU tests.
A question: do you see an algorithmic improvement in Deep Blue's acceleration of its eval? If not, what makes it different from A0's? That GPU cards are easily accessible is irrelevant from an algorithmic point of view.
I am not very familiar with Deep Blue's algorithm, but I always thought it was just a hardware implementation of existing eval techniques (like PST, mobility, etc.). So not a different algorithm at all, perhaps even a simplified one because of hardware constraints.
If you want to compare MCTS vs alpha-beta, you use the same eval for both and see which fares better in chess/Go. Similarly, to compare evals, you use the same search, and see how well the NN eval fares against the hand-made one on the same hardware.
The latter is not a well-defined procedure, because the outcome will be completely dependent on what hardware you choose. What if this 'equal hardware' were a neural network, which you had to train to compute your hand-crafted evaluation?
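To make it concrete, the proposed procedure is roughly the C sketch below (hypothetical names, toy evals): one fixed search driver with a swappable eval callback. Note that nothing in it pins down the hardware on which the two callbacks have to run.

/* Minimal sketch (hypothetical names, toy data) of the 'same search,
 * different eval' setup: one fixed search driver, two interchangeable
 * evaluation callbacks. */
#include <stdio.h>

typedef struct { int material; int mobility; } Position;   /* toy position */
typedef int (*EvalFn)(const Position *);

static int handcrafted_eval(const Position *p) { return p->material + p->mobility; }
static int network_eval(const Position *p)     { return 2 * p->material; } /* stand-in for NN inference */

/* Stand-in for the fixed search; a real engine would recurse over moves and
 * call eval() only at the leaves, identically for both candidates. */
static int search(const Position *p, EvalFn eval) { return eval(p); }

int main(void)
{
    Position p = { 3, 5 };
    printf("hand-crafted: %d\n", search(&p, handcrafted_eval));
    printf("network     : %d\n", search(&p, network_eval));
    return 0;
}

Both runs then play the same games, and only the eval differs; but the score you get still depends entirely on what 'the same hardware' happens to be.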
This is a lot like comparing engines only available as binaries, one built for x64 PC, the other for ARM. They can of course run on both tablets and PCs, through the applicable emulator. Now how much good will this 'equal hardware' comparison do you? If you test on a phone, the ARM program will win. If you test on a PC, the Windows .exe will win.
Now an ARM and an x64 are sufficiently similar that you can hope to get your hands on the algorithms in high-level source form, compile both for ARM or PC, and conduct the test on either one of them, hoping that the results would not depend too much on whether you use ARM or x64. But if the architectures are really different, this is hopeless.

I had that in the 90s, when I was competing on a Pentium-II PC with people using a Cray YMP supercomputer. Every programming technique that made it fast on the PC slowed it down on the Cray, and vice versa. So the high-level algorithm had to be completely different. Closer to home, try to compare a magic-bitboard engine against a mailbox engine on the 'equal hardware' of an 8-bit microprocessor like the 6502 (which does not even have an 8-bit multiply instruction). Which algorithm do you think would come out as highly superior?
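To see why, consider the core of a magic-bitboard move generator, sketched below in C (the magic constant and shift are just arbitrary stand-ins, and the attack-table lookup itself is omitted). The whole trick is one 64-bit multiply plus a shift, a single instruction on x64; on a 6502, which has no multiply instruction at all, that same line turns into something like the shift-and-add loop shown, and even that flatters the 6502, which would have to do the 64-bit arithmetic 8 bits at a time.

/* The heart of a magic-bitboard generator is one line:
 *     index = ((occupied & mask) * magic) >> shift;
 * followed by an attack-table lookup (omitted here).  The shift-and-add
 * multiply below is roughly what a CPU without a multiply instruction is
 * reduced to. */
#include <stdint.h>
#include <stdio.h>

/* 64-bit multiply built from shift-and-add: up to 64 iterations of
 * test, add and shift, instead of one hardware instruction. */
static uint64_t mul64_shift_add(uint64_t a, uint64_t b)
{
    uint64_t product = 0;
    while (b) {
        if (b & 1)
            product += a;
        a <<= 1;
        b >>= 1;
    }
    return product;
}

int main(void)
{
    uint64_t occupied = 0x00000000000000FFULL;   /* toy occupancy              */
    uint64_t mask     = 0x000000000000007EULL;   /* toy relevance mask         */
    uint64_t magic    = 0x0080001020400080ULL;   /* arbitrary stand-in 'magic' */
    int      shift    = 52;

    uint64_t index = mul64_shift_add(occupied & mask, magic) >> shift;
    printf("attack-table index: %llu\n", (unsigned long long)index);
    return 0;
}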
If there is no hardware equivalence in comparing the evals, the comparison is of course meaningless, as one can just add more knowledge without worry. In fact, a NN is probably the most inefficient tool for that, though it is a more generic one.
It wastes too many FLOPs doing unnecessary multiplications for not-so-important features that a handmade eval would probably ignore. No one doubts (even without using a NN) that you can increase your Elo to your satisfaction by adding more knowledge -- the question is: can you get a better-quality eval with the same FLOPs?
But if FLOPs are nearly free, who cares? The NN does not need any branches, so from its point of view the hand-crafted eval does many unnecessary, poorly predictable branches. You get a completely lopsided comparison if you just stress one aspect of the algorithm, ignoring all other essential parts. You have to weigh in the cost of branching, caching and out-of-order execution in a realistic way.
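To illustrate the difference in instruction mix (a toy C sketch with made-up weights and features, not code from any engine): a hand-crafted term is a tangle of data-dependent branches the predictor has to guess, while a network layer is a fixed, branch-free stream of multiply-adds that a SIMD unit or GPU can pipeline without ever mispredicting.

#include <stdio.h>

#define N_FEATURES 8

/* Hand-crafted style: every term hides conditions on the position data. */
static int handcrafted_term(const int f[N_FEATURES])
{
    int score = 0;
    if (f[0] > 2)                      /* e.g. passed pawn?           */
        score += 30;
    if (f[1] && !f[2])                 /* e.g. open file, no blocker? */
        score += 15;
    else if (f[3] > f[4])              /* ...or some other case       */
        score -= 10;
    return score;
}

/* Network style: the same work for every position, no branches on the data. */
static int dense_layer(const double f[N_FEATURES])
{
    static const double w[N_FEATURES] = { 0.5, -1.2, 0.3, 0.0, 0.7, -0.4, 1.1, 0.2 };
    double acc = 0.0;
    for (int i = 0; i < N_FEATURES; i++)
        acc += w[i] * f[i];            /* pure multiply-add, fully pipelineable */
    return (int)(100.0 * acc);
}

int main(void)
{
    int    fi[N_FEATURES] = { 3, 1, 0, 2, 1, 0, 1, 1 };
    double fd[N_FEATURES] = { 3, 1, 0, 2, 1, 0, 1, 1 };
    printf("hand-crafted term: %d\n", handcrafted_term(fi));
    printf("dense layer      : %d\n", dense_layer(fd));
    return 0;
}

Which of the two is 'efficient' depends entirely on whether the hardware you run them on is better at predicting branches or at streaming multiply-adds.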