CCRL 40/40, 40/4 and FRC lists updated (16th February 2019)
Moderators: hgm, Rebel, chrisw
-
- Posts: 3546
- Joined: Thu Jun 07, 2012 11:02 pm
Re: CCRL 40/40, 40/4 and FRC lists updated (16th February 2019)
That is exactly how I see it, Guenther. Perhaps we need to alter the wording in our FAQ to be clearer. It was written more than a decade ago, long before GPU and NN engines existed.
-
- Posts: 6991
- Joined: Thu Aug 18, 2011 12:04 pm
Re: CCRL 40/40, 40/4 and FRC lists updated (16th February 2019)
I have said 2 times in this thread that learning during the games should not be allowed for the same reason you point out.

Guenther wrote: ↑Sat Feb 23, 2019 10:28 am
Your wish is completely unrealistic, if you think a bit about it.

Rebel wrote: ↑Sat Feb 23, 2019 10:05 am
HGM - In this thread I am trying to convince the CCRL folks (and perhaps the CEGT people as well) to review their restrictions on learning, especially when I read: positional learning not allowed. It's not about you or me. And yet as often for reasons that escape me you sooner or later make it that way with your fighting language use and the fun of the discussion goes away, as in this case.
For the CCRL folks
Your FAQ states: positional learning not allowed. And that is what LZ does, only in a much more advanced way.
The goal of CCRL and CEGT is to establish rating lists and this needs scientific conditions.
With any kind of learning 'during the games' played in the rating process it is impossible to guarantee
entities have the same state always, which would be mandatory for being useful, otherwise each single game
would add noise and this noise would sum up exponentially with each further game.
Lc0 hasn't a learned eval. She has no eval at all. She works with WDL statistics thus learned positions by self-play but in a much more advanced way, stored in a NN cell. The only knowledge Lc0 has is 1-0 | ½-½ | 0-1. (Z)ero knowledge.
To read -
https://github.com/LeelaChessZero/lc0/wiki/Why-Zero
https://github.com/LeelaChessZero/lc0/releases
If you have understood the above about zero-versions then it should be allowed for traditional engines as well.
If you have read my posts in this thread I have argued in a similar way.
90% of coding is debugging, the other 10% is writing bugs.
-
- Posts: 6991
- Joined: Thu Aug 18, 2011 12:04 pm
Re: CCRL 40/40, 40/4 and FRC lists updated (16th February 2019)
Something else, when I download that version, play a match with it, I notice she uses 2 threads. Was that the same in your testing?

Graham Banks wrote: ↑Tue Feb 19, 2019 7:46 am
That's on 1CPU only. No GPU.

Rebel wrote: ↑Tue Feb 19, 2019 6:47 am
Perhaps it's an idea to include the average NPS in the name for LZ as a sort of indication of the strength of the GPU used.

Code:
Rank  Name                       Rating
45    Lc0 0.20.1 w36089 64-bit   3022
-
- Posts: 4605
- Joined: Wed Oct 01, 2008 6:33 am
- Location: Regensburg, Germany
- Full name: Guenther Simon
Re: CCRL 40/40, 40/4 and FRC lists updated (16th February 2019)
I called it eval for laymen. It is not positions, but you don't understand it (WDL is just a part of the NN and also just recently added BTW), there is a huge difference to micro patterns (aka highly complicated interacting eval terms). A position is a real chess position...on the complete board.

Rebel wrote: ↑Sat Feb 23, 2019 12:43 pm
I have said 2 times in this thread that learning during the games should not be allowed for the same reason you point out.

Guenther wrote: ↑Sat Feb 23, 2019 10:28 am
Your wish is completely unrealistic, if you think a bit about it.

Rebel wrote: ↑Sat Feb 23, 2019 10:05 am
HGM - In this thread I am trying to convince the CCRL folks (and perhaps the CEGT people as well) to review their restrictions on learning, especially when I read: positional learning not allowed. It's not about you or me. And yet as often for reasons that escape me you sooner or later make it that way with your fighting language use and the fun of the discussion goes away, as in this case.
Last edited by Guenther on Sat Feb 23, 2019 1:16 pm, edited 1 time in total.
-
- Posts: 27790
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: CCRL 40/40, 40/4 and FRC lists updated (16th February 2019)
Well, as I explained, I think that would be a very bad idea, so you should not be surprised if I oppose you on that. 'Position learning' is just a fancy way of creating an opening book. (Well, actually any book, but only opening positions would have any chance of ever being encountered, so making books for some middle-game position is just a waste of time.)

Rebel wrote: ↑Sat Feb 23, 2019 10:05 am
HGM - In this thread I am trying to convince the CCRL folks (and perhaps the CEGT people as well) to review their restrictions on learning, especially when I read: positional learning not allowed. It's not about you or me. And yet as often for reasons that escape me you sooner or later make it that way with your fighting language use and the fun of the discussion goes away, as in this case.
For the CCRL folks
Your FAQ states: positional learning not allowed. And that is what LZ does, only in a much more advanced way.
Allowing own books for a rating list is not a bad idea per se, but to make it fair every engine should be allowed to bring its own unique book, as you will measure strength of the book just as much as strength of the engine. But most people are only interested in the strength of the engine, as it is easy enough to pick any book you want with any engine. Because they use engines for analyzing middle-game positions, not for playing games from the opening. This is the group of people CCRL intends to cater for, and they are fortunately careful to prevent their measurements from being contaminated by book quality. By forcing all engines (including LZ) to play from the same book.
For the umpteenth time: LZ does NOT do positional learning, not in a simple way and not in an advanced way. It just tunes a general evaluation function that can be applied to any position, not just to the negligible sub-set of positions it has been trained on (as a book would). Many conventional alpha-beta engines are tuned in exactly the same way ('Texel tuning'), based on the WDL value of a huge set of training examples. The problem is that you really seem to have no clue at all how Lc0 works, and as a result try to 'support' your case with totally false and nonsensical statements.
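To illustrate the Texel-tuning idea mentioned above for readers unfamiliar with it: eval weights are fitted so that a logistic mapping of the eval score matches the WDL outcomes of a set of training games. This is a minimal sketch with a made-up toy feature set and a naive local search, not any particular engine's implementation:

```python
import math

def texel_error(weights, positions):
    """Mean squared error between predicted and actual game results.
    Each position is (features, result) with result in {0, 0.5, 1}."""
    k = 1.0 / 400.0  # assumed scaling constant; engine-specific in practice
    err = 0.0
    for features, result in positions:
        score = sum(w * f for w, f in zip(weights, features))  # linear eval, centipawns
        win_prob = 1.0 / (1.0 + math.exp(-k * score))          # logistic map to [0, 1]
        err += (result - win_prob) ** 2
    return err / len(positions)

def tune(weights, positions, step=1.0, rounds=20):
    """Naive coordinate descent: nudge each weight while the error drops."""
    for _ in range(rounds):
        for i in range(len(weights)):
            for delta in (step, -step):
                trial = list(weights)
                trial[i] += delta
                if texel_error(trial, positions) < texel_error(weights, positions):
                    weights = trial
                    break
    return weights
```

Note how nothing here memorises positions: the tuned weights generalise to any position with those features, which is exactly the distinction being drawn from a book.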
-
- Posts: 182
- Joined: Sun Jun 12, 2016 5:44 pm
- Location: London
- Full name: Vincent
Re: CCRL 40/40, 40/4 and FRC lists updated (16th February 2019)
Perhaps position learning could be allowed, given conditions, such as:
- learning files must be supplied by the author, and may not change during testing
- the engine must attempt to generalise from the learning data, rather than just creating a database of positions
-
- Posts: 27790
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: CCRL 40/40, 40/4 and FRC lists updated (16th February 2019)
That would not be 'position learning', but just 'data mining' of the games played by the engine. Many engines use material tables, and engine authors can fill them by whatever method they like. It is not very useful to have the engine do it, however, as it is typically done 'off line'. What you obviously would not want (and I think everyone agrees about that) is an engine that changes its behaviour during testing. Because that would basically make the engine untestable, as you could never play more than one game with the same engine before it changes itself.
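Off-line data mining of played games, as described above, might look like the following sketch. The material-key encoding and game format here are hypothetical, purely for illustration:

```python
from collections import defaultdict

def material_key(board):
    """Collapse a position to its material signature, e.g. 'KQvK'.
    `board` is assumed to be a string of piece letters, White upper-case."""
    white = "".join(sorted(c for c in board if c.isupper()))
    black = "".join(sorted(c.upper() for c in board if c.islower()))
    return f"{white}v{black}"

def mine_games(games):
    """Accumulate win/draw/loss counts per material signature from
    (board, result) pairs, result in {1.0, 0.5, 0.0} from White's view."""
    stats = defaultdict(lambda: [0, 0, 0])  # [wins, draws, losses]
    for board, result in games:
        key = material_key(board)
        stats[key][0 if result == 1.0 else (1 if result == 0.5 else 2)] += 1
    return stats
```

The resulting statistics could then be baked into a material table once, before testing, so the engine's behaviour stays fixed during the rating games.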
-
- Posts: 41419
- Joined: Sun Feb 26, 2006 10:52 am
- Location: Auckland, NZ
Re: CCRL 40/40, 40/4 and FRC lists updated (16th February 2019)
No.

Rebel wrote: ↑Sat Feb 23, 2019 12:54 pm
Something else, when I download that version, play a match with it, I notice she uses 2 threads. Was that the same in your testing?

Graham Banks wrote: ↑Tue Feb 19, 2019 7:46 am
That's on 1CPU only. No GPU.

Rebel wrote: ↑Tue Feb 19, 2019 6:47 am
Perhaps it's an idea to include the average NPS in the name for LZ as a sort of indication of the strength of the GPU used.

Code:
Rank  Name                       Rating
45    Lc0 0.20.1 w36089 64-bit   3022
gbanksnz at gmail.com
-
- Posts: 6991
- Joined: Thu Aug 18, 2011 12:04 pm
Re: CCRL 40/40, 40/4 and FRC lists updated (16th February 2019)
Guenther wrote: ↑Sat Feb 23, 2019 12:56 pm
I called it eval for laymen.

Rebel wrote: ↑Sat Feb 23, 2019 12:43 pm
I have said 2 times in this thread that learning during the games should not be allowed for the same reason you point out.

Guenther wrote: ↑Sat Feb 23, 2019 10:28 am
Your wish is completely unrealistic, if you think a bit about it.

Rebel wrote: ↑Sat Feb 23, 2019 10:05 am
HGM - In this thread I am trying to convince the CCRL folks (and perhaps the CEGT people as well) to review their restrictions on learning, especially when I read: positional learning not allowed. It's not about you or me. And yet as often for reasons that escape me you sooner or later make it that way with your fighting language use and the fun of the discussion goes away, as in this case.
Let's not call it what it isn't.
From the AZ paper:
Starting from random play, and given no domain knowledge except the game rules, AlphaZero achieved within 24 hours a superhuman level of play in the games of chess and shogi (Japanese chess) as well as Go, and convincingly defeated a world-champion program in each case.
The AlphaZero algorithm is a more generic version of the AlphaGo Zero algorithm that was first introduced in the context of Go (29). It replaces the handcrafted knowledge and domain-specific augmentations used in traditional game-playing programs with deep neural networks and a tabula rasa reinforcement learning algorithm.
There is no chess knowledge in Zero versions, no evaluation function.
Guenther wrote: ↑Sat Feb 23, 2019 10:28 am
It is not positions, but you don't understand it (WDL is just a part of the NN and also just recently added BTW), there is a huge difference to micro patterns (aka highly complicated interacting eval terms). A position is a real chess position...on the complete board.

From the AZ paper:
Instead of an alpha-beta search with domain-specific enhancements, AlphaZero uses a general purpose Monte-Carlo tree search (MCTS) algorithm. Each search consists of a series of simulated games of self-play that traverse a tree from root s_root to leaf. Each simulation proceeds by selecting in each state s a move a with low visit count, high move probability and high value (averaged over the leaf states of simulations that selected a from s) according to the current neural network f_θ.
The search returns a vector representing a probability distribution over moves, either proportionally or greedily with respect to the visit counts at the root state.
The parameters of the deep neural network in AlphaZero are trained by self-play reinforcement learning, starting from randomly initialised parameters θ. Games are played by selecting moves for both players by MCTS, a_t ∼ π_t. At the end of the game, the terminal position sT is scored according to the rules of the game to compute the game outcome z: -1 for a loss, 0 for a draw, and +1 for a win.
Whether it's called WDL or, as DeepMind calls it, a probability distribution over moves, it's about a statistic represented by a floating-point value between 0.0 and 1.0, as can be seen in the network file.
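The selection rule quoted from the paper (prefer moves with high prior probability, high averaged value, low visit count) is usually written as a PUCT formula. Here is a rough illustrative sketch of that rule only, with made-up names and constants, not Lc0's or AlphaZero's actual code:

```python
import math

def puct_select(children, c_puct=1.5):
    """Pick the child maximizing Q + U, where U favours moves with a
    high prior probability and a low visit count (PUCT-style selection).
    `children` is a list of dicts with keys: prior, visits, value_sum."""
    total_visits = sum(ch["visits"] for ch in children)
    best, best_score = None, -float("inf")
    for ch in children:
        # Q: mean value of simulations through this move (0 if unvisited)
        q = ch["value_sum"] / ch["visits"] if ch["visits"] else 0.0
        # U: exploration bonus from the network prior and visit counts
        u = c_puct * ch["prior"] * math.sqrt(total_visits + 1) / (1 + ch["visits"])
        if q + u > best_score:
            best, best_score = ch, q + u
    return best
```

The priors and values here come from the network's outputs, which is where the floating-point statistics in [0, 1] discussed above enter the search.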
-
- Posts: 6991
- Joined: Thu Aug 18, 2011 12:04 pm
Re: CCRL 40/40, 40/4 and FRC lists updated (16th February 2019)
Time to put you on ignore. Congrats, you are the first person who accomplished that.