Is AlphaZero-LC0 (Leela0) A chess Engine?
Moderators: hgm, Rebel, chrisw
-
- Posts: 267
- Joined: Fri Mar 17, 2006 8:01 am
- Location: Russia
- Full name: Vladimir Medvedev
Re: Is AlphaZero-LC0 (Leela0) A chess Engine?
Sorry for the off-topic question, but could anybody give me a link to the lc0 network (weights file) that played in the TCEC-15 Superfinal? As far as I know, it is not a regular net available from the official lc0 site.
-
- Posts: 177
- Joined: Wed May 23, 2018 9:29 pm
Re: Is AlphaZero-LC0 (Leela0) A chess Engine?
-
- Posts: 1242
- Joined: Sat Jul 05, 2014 7:54 am
- Location: Southwest USA
Re: Is AlphaZero-LC0 (Leela0) A chess Engine?
Thank you... great answer, my friend. Then why are there people all over the internet and on YouTube boasting every day about how SMART and BRILLIANT LC0/Leela0/AlphaZero is/was compared to Stockfish and other chess engines (e.g. Houdini, Komodo, SugaR)?
Robert Pope wrote: ↑Tue Jun 04, 2019 7:50 pm
They are huge for one key reason: Neural nets are a very GENERAL solution for a very specific problem.
supersharp77 wrote: ↑Tue Jun 04, 2019 7:00 pm
Ok Dann, nice answer. Then why are these successful neural nets so HUGE (e.g. a 90-million-game NN), and why does speed factor so large in the performance results? Thx AR
Dann Corbit wrote: ↑Sat Feb 23, 2019 12:42 am
It does not work like that. It starts with the rules of the game.
It does self play to learn what things are good and what things are bad.
It is not memorizing board positions.
It writes out a file of numbers called a network that contains the values for different board features.
It uses this network to judge board positions.
The MCTS sampling is just a search method where it looks at future positions (alpha-beta also does this, but not by sampling).
It is a different approach. But it is still a chess engine.
Three implications that come out of that:
1. There is probably a much more effective way to encode the knowledge of chess, but WE don't KNOW it. So for now, we use a giant sausage grinder, and A LOT of meat.
2. There is NO "SMARTS" to how the information in the net is coded. The neural net doesn't have ANY LOGIC to let it generalize like that, so it needs a huge number of neurons (and games) to figure it out. Part of that is intentional, though. A neural net will figure out on its own that if it can get a head start for its pawn, the enemy king will never stop it from promoting. 100/300/300/500/900 will never be able to teach that.
3. Because it is a general solution, you need A LOT of DIFFERENT EXAMPLES to fine-tune the solution. 200,000 positions might be enough to tune a 64x64 piece-square table in a traditional engine. Tuning a 64x64x64x64 piece-square table would be a whole different can of worms.
Point #2: a thought came to my mind after all this discussion of card speeds, GPUs, TPUs, CPUs, GeForce cards, overclocking, and super systems to "maximize LC0". Here's the thought: "If someone has an old VW Beetle and then puts a Ferrari engine, suspension, and transmission in it, plus special wheels and tires and a nitro system for performance, is it still a VW, or is it now really a Ferrari? Where's the cutoff?" Thx AR
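The "64x64 piece-square table" tuning mentioned in point 3 of the quoted explanation can be sketched in a few lines. This is a toy illustration, not code from any real engine: the board encoding, bonus values, and function names are all made up for the example.

```python
# Toy "traditional" evaluation: fixed material values (100/300/300/500/900)
# plus a piece-square table bonus. A real engine tunes thousands of such
# numbers from game data; here the table is just a flat central-square bonus.

MATERIAL = {"P": 100, "N": 300, "B": 300, "R": 500, "Q": 900, "K": 0}

CENTER = {27, 28, 35, 36}  # d4, e4, d5, e5 on a 0..63 board


def pst_bonus(square: int) -> int:
    """Piece-square bonus: +10 for a central square, 0 elsewhere."""
    return 10 if square in CENTER else 0


def evaluate(pieces) -> int:
    """pieces: list of (piece_letter, square, is_white).
    Returns a score in centipawns; positive is good for White."""
    score = 0
    for piece, square, is_white in pieces:
        value = MATERIAL[piece] + pst_bonus(square)
        score += value if is_white else -value
    return score


# White knight on d4 (centralized) vs. black knight on a1 (cornered):
print(evaluate([("N", 27, True), ("N", 0, False)]))  # 310 - 300 = 10
```

The point of the quoted comparison: even this tiny table already has 64 tunable entries per piece; a table conditioned on pairs or quadruples of squares (64x64, 64x64x64x64) blows up combinatorially, which is roughly the regime a neural net operates in.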
-
- Posts: 558
- Joined: Sat Mar 25, 2006 8:27 pm
Re: Is AlphaZero-LC0 (Leela0) A chess Engine?
Its generality also makes it capable of discovering connections that human programmers would never stumble on in a million years. Imagine that human programmers start with the piece values 100/300/300/500/900. Then eventually piece-square tables are discovered. Then different piece-square tables start to be used in endgames. Then certain specific material combinations are found to be weaker than others.
All of those "discoveries" are important to help the evaluation be more accurate, but each is only a tiny sliver of the whole picture, and it gets harder and harder for programmers to find new slivers of importance as engines get more refined. The neural nets take a whole different tack, by using a massive network to try to approximate the whole "truth of chess", within the capabilities of its framework. Its understanding of the truth is poor to start, but every game that is added helps it get a more accurate gestalt.
And the brilliant part comes in because there are basically no missing "slivers" of knowledge that it lacks, only an incomplete (and improving) understanding of all slivers as a whole.
-
- Posts: 12541
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: Is AlphaZero-LC0 (Leela0) A chess Engine?
It really is rather amazing what LC0 and Alpha0 manage to do.
Consider the decades that people have spent refining alpha-beta chess search, and the enormous effort spent figuring out how to evaluate correctly.
In a few short years, the GPU approach has caught up.
That's kind of astonishing.
On the other hand, if you look at the pure millions of instructions per second produced by the GPU cards compared even to the strongest multi-core CPUs, then it is a terrible return (Elo per instruction executed).
The RTX 2080 does 20.14 TFLOPS in 16-bit FP mode.
Two of those produce 40 TFLOPS {trillion floating-point operations per second}.
The Geekbench base processor is an Intel Core i5-2520M @ 2.50 GHz:
13.80 GFlops = 2500 Geekbench points.
Here is the record score:
Nov 08, 2018, Dell Inc. PowerEdge R840, Intel Xeon Platinum 8180M 3800 MHz (112 cores), Linux 64-bit, mattl, 4700, 155050
62.02× the base processor score gives 855.876 GFlops {billion floating-point operations per second}.
So two RTX 2080 cards are 46 times faster than the fastest machine ever benched on Geekbench in terms of Flops from a CPU.
Two RTX cards help LC0 to be just stronger than Stockfish. Not at all remarkable, considering the pure compute power thrown at it.
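The back-of-the-envelope arithmetic above can be reproduced directly from the figures quoted in the post (the FLOPS numbers are the post's, not independently verified here):

```python
# Reproducing the Flops comparison from the post above.
rtx_2080_fp16_tflops = 20.14        # one card, FP16, figure quoted in the post
two_cards_tflops = 2 * rtx_2080_fp16_tflops   # the post rounds this to 40

base_gflops = 13.80                 # Geekbench base: Core i5-2520M @ 2.50 GHz
base_points = 2500
record_points = 155050              # Xeon Platinum 8180M record, per the post

ratio = record_points / base_points             # 62.02
record_gflops = ratio * base_gflops             # ~855.876 GFlops

# GPU pair vs. record CPU, in raw Flops (TFLOPS -> GFLOPS conversion):
gpu_vs_cpu = (two_cards_tflops * 1000) / record_gflops
print(round(ratio, 2), round(record_gflops, 3), round(gpu_vs_cpu, 1))
```

With the unrounded 40.28 TFLOPS the ratio comes out slightly above the post's "46 times"; the post's figure follows from rounding the GPU pair down to 40 TFLOPS first.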
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
-
- Posts: 1242
- Joined: Sat Jul 05, 2014 7:54 am
- Location: Southwest USA
Re: Is AlphaZero-LC0 (Leela0) A chess Engine?
The flaw in that logic is that we have all been told that there are "infinite numbers of possibilities in the game of chess." How large does that NN (neural net) have to be to encapsulate that infinite number of move combinations and permutations? And how fast the machine? ...Thx AR
-
- Posts: 4319
- Joined: Tue Apr 03, 2012 4:28 pm
Re: Is AlphaZero-LC0 (Leela0) A chess Engine?
Do you know the mean error between the actual output of the value head and the target train value?
-
- Posts: 565
- Joined: Thu Nov 13, 2014 12:03 pm
Re: Is AlphaZero-LC0 (Leela0) A chess Engine?
Well, duh, it's a Ferrari in a Beetle shell.
supersharp77 wrote: ↑Wed Jun 05, 2019 8:24 pm "If someone has an old VW Beetle and then puts a Ferrari engine, suspension, and transmission in it... is it still a VW, or is it now really a Ferrari? Where's the cutoff?" Thx AR
-
- Posts: 558
- Joined: Sat Mar 25, 2006 8:27 pm
Re: Is AlphaZero-LC0 (Leela0) A chess Engine?
That would be dependent on the exact net you choose to examine, as well as the specific data you are comparing against. Also, not really relevant to his question or my response.
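For anyone who wants to measure it on a specific net and dataset, the "mean error" in question is just the average over positions of the gap between the value head's output and the training target. A minimal sketch, using made-up numbers rather than real Leela data:

```python
# Hypothetical illustration: mean absolute error between a value head's
# outputs and the training targets (self-play game results in [-1, 1]).
# These numbers are invented; real values depend on the exact net and
# dataset, as noted in the reply above.

predictions = [0.31, -0.72, 0.05, 0.88, -0.10]   # value-head outputs
targets     = [1.0,  -1.0,  0.0,  1.0,   0.0]    # game results (win/draw/loss)

mae = sum(abs(p - t) for p, t in zip(predictions, targets)) / len(targets)
print(round(mae, 3))
```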
-
- Posts: 1470
- Joined: Mon Apr 23, 2018 7:54 am
Re: Is AlphaZero-LC0 (Leela0) A chess Engine?
Dann Corbit wrote: ↑Wed Jun 05, 2019 9:33 pm
Consider the decades that people have spent refining alpha-beta chess search, and the enormous effort spent figuring out how to evaluate correctly.
In a few short years, the GPU approach has caught up.
That's kind of astonishing.
Neural networks are many decades old, in fact roughly the same age as alpha-beta, so we should not think this has happened rapidly.
Dann Corbit wrote: ↑Wed Jun 05, 2019 9:33 pm
On the other hand, if you look at the pure millions of instructions per second produced by the GPU cards compared even to the strongest multi-core CPUs, then it is a terrible return (Elo per instruction executed).
So two RTX 2080 cards are 46 times faster than the fastest machine ever benched on Geekbench in terms of Flops from a CPU.
Two RTX cards help LC0 to be just stronger than Stockfish. Not at all remarkable, considering the pure compute power thrown at it.
In this sense, NN engines rely completely on brute-force computing power.
Last edited by jp on Thu Jun 06, 2019 4:42 pm, edited 4 times in total.