CCRL 40/40, 40/4 and FRC lists updated (16th February 2019)

Discussion of computer chess matches and engine tournaments.

Moderators: hgm, Rebel, chrisw

Modern Times
Posts: 3546
Joined: Thu Jun 07, 2012 11:02 pm

Re: CCRL 40/40, 40/4 and FRC lists updated (16th February 2019)

Post by Modern Times »

That is exactly how I see it, Guenther. Perhaps we need to alter the wording in our FAQ to be clearer. It was written more than a decade ago, long before GPU and NN engines existed.
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: CCRL 40/40, 40/4 and FRC lists updated (16th February 2019)

Post by Rebel »

Guenther wrote: Sat Feb 23, 2019 10:28 am
Rebel wrote: Sat Feb 23, 2019 10:05 am HGM - In this thread I am trying to convince the CCRL folks (and perhaps the CEGT people as well) to review their restrictions on learning, especially when I read: positional learning not allowed. It's not about you or me. And yet, as so often for reasons that escape me, you sooner or later make it personal with your combative language, and the fun of the discussion goes away, as in this case.

For the CCRL folks
Your FAQ states: positional learning not allowed. And that is what LZ does, only in a much more advanced way.
Your wish is completely unrealistic, if you think about it a bit.
The goal of CCRL and CEGT is to establish rating lists, and this requires scientific conditions.
With any kind of learning 'during the games' played in the rating process, it is impossible to guarantee that the engines always have the same state, which would be mandatory for the results to be useful; otherwise each single game would add noise, and this noise would compound with every further game.
I have said twice in this thread that learning during the games should not be allowed, for the same reason you point out.
Guenther wrote: Sat Feb 23, 2019 10:28 am And I don't understand why you still say LC0 learns. It does not - it has learnt (eval), but it does not learn any further and remains in its state from beginning to end of the rating games.
Lc0 doesn't have a learned eval. She has no eval at all. She works with WDL statistics - positions learned by self-play, but in a much more advanced way, stored in the NN's weights. The only knowledge Lc0 has is 1-0 | ½-½ | 0-1. (Z)ero knowledge.

Further reading:

https://github.com/LeelaChessZero/lc0/wiki/Why-Zero

https://github.com/LeelaChessZero/lc0/releases
Guenther wrote: Sat Feb 23, 2019 10:28 am No one forbids you to extract info from previous learning and add it to your program, except real moves and positions
If you have understood the above about the zero versions, then it should be allowed for traditional engines as well.
Guenther wrote: Sat Feb 23, 2019 10:28 am which could be seen as internal book.
If you have read my posts in this thread, you will see that I have argued along similar lines.
90% of coding is debugging, the other 10% is writing bugs.
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: CCRL 40/40, 40/4 and FRC lists updated (16th February 2019)

Post by Rebel »

Graham Banks wrote: Tue Feb 19, 2019 7:46 am
Rebel wrote: Tue Feb 19, 2019 6:47 am

Code: Select all

Rank  Name                     Rating 
 45   Lc0 0.20.1 w36089 64-bit  3022
Perhaps it's an idea to include the average NPS in the name for LZ as a sort of indication of the strength of the GPU used.
That's on 1CPU only. No GPU.
Something else: when I download that version and play a match with it, I notice she uses 2 threads. Was that the same in your testing?
90% of coding is debugging, the other 10% is writing bugs.
Guenther
Posts: 4605
Joined: Wed Oct 01, 2008 6:33 am
Location: Regensburg, Germany
Full name: Guenther Simon

Re: CCRL 40/40, 40/4 and FRC lists updated (16th February 2019)

Post by Guenther »

Rebel wrote: Sat Feb 23, 2019 12:43 pm
Guenther wrote: Sat Feb 23, 2019 10:28 am
Rebel wrote: Sat Feb 23, 2019 10:05 am HGM - In this thread I am trying to convince the CCRL folks (and perhaps the CEGT people as well) to review their restrictions on learning, especially when I read: positional learning not allowed. It's not about you or me. And yet, as so often for reasons that escape me, you sooner or later make it personal with your combative language, and the fun of the discussion goes away, as in this case.

For the CCRL folks
Your FAQ states: positional learning not allowed. And that is what LZ does, only in a much more advanced way.
Your wish is completely unrealistic, if you think about it a bit.
The goal of CCRL and CEGT is to establish rating lists, and this requires scientific conditions.
With any kind of learning 'during the games' played in the rating process, it is impossible to guarantee that the engines always have the same state, which would be mandatory for the results to be useful; otherwise each single game would add noise, and this noise would compound with every further game.
I have said twice in this thread that learning during the games should not be allowed, for the same reason you point out.
Guenther wrote: Sat Feb 23, 2019 10:28 am And I don't understand why you still say LC0 learns. It does not - it has learnt (eval), but it does not learn any further and remains in its state from beginning to end of the rating games.
Lc0 doesn't have a learned eval. She has no eval at all. She works with WDL statistics - positions learned by self-play, but in a much more advanced way, stored in the NN's weights. The only knowledge Lc0 has is 1-0 | ½-½ | 0-1. (Z)ero knowledge.
I called it eval for laymen. It is not positions, but you don't understand it (WDL is just one part of the NN, and was also only recently added, BTW); there is a huge difference between that and micro patterns (i.e. highly complicated interacting eval terms). A position is a real chess position... on the complete board.
Last edited by Guenther on Sat Feb 23, 2019 1:16 pm, edited 1 time in total.
https://rwbc-chess.de

trollwatch:
Chessqueen + chessica + AlexChess + Eduard + Sylwy
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: CCRL 40/40, 40/4 and FRC lists updated (16th February 2019)

Post by hgm »

Rebel wrote: Sat Feb 23, 2019 10:05 am HGM - In this thread I am trying to convince the CCRL folks (and perhaps the CEGT people as well) to review their restrictions on learning, especially when I read: positional learning not allowed. It's not about you or me. And yet, as so often for reasons that escape me, you sooner or later make it personal with your combative language, and the fun of the discussion goes away, as in this case.

For the CCRL folks
Your FAQ states: positional learning not allowed. And that is what LZ does, only in a much more advanced way.
Well, as I explained, I think that would be a very bad idea, so you should not be surprised if I oppose you on that. 'Position learning' is just a fancy way of creating an opening book. (Well, actually any book, but only opening positions would have any chance of ever being encountered, so making books for middle-game positions is just a waste of time.)
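
To make concrete what 'position learning' amounts to, here is a minimal sketch of such a learn file (hypothetical names, not any real engine's code). Note that only positions already seen in earlier games can ever produce a hit; everything else misses, which is exactly why it behaves like a book rather than a general evaluation:

Code: Select all

# Minimal sketch of a 'position learning' store: effectively a private book
# mapping exact positions to score adjustments. All names are hypothetical.
import pickle

class PositionLearnFile:
    def __init__(self, path="learn.bin"):
        self.path = path
        try:
            with open(path, "rb") as f:
                self.table = pickle.load(f)  # {zobrist_key: score_adjustment}
        except FileNotFoundError:
            self.table = {}

    def lookup(self, zobrist_key):
        # Hit only if this exact position occurred in an earlier game.
        return self.table.get(zobrist_key)

    def record(self, zobrist_key, score_adjustment):
        # Store e.g. a penalty for a position that led to a lost game.
        self.table[zobrist_key] = score_adjustment

    def save(self):
        with open(self.path, "wb") as f:
            pickle.dump(self.table, f)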

Allowing own books for a rating list is not a bad idea per se, but to make it fair every engine should be allowed to bring its own unique book, as you would be measuring the strength of the book just as much as the strength of the engine. But most people are only interested in the strength of the engine, since it is easy enough to pair any book you want with any engine, and since they use engines for analyzing middle-game positions, not for playing games from the opening. This is the group of people CCRL intends to cater for, and they are fortunately careful to prevent their measurements from being contaminated by book quality, by forcing all engines (including LZ) to play from the same book.

For the umpteenth time: LZ does NOT do positional learning, not in a simple way and not in an advanced way. It just tunes a general evaluation function that can be applied to any position, not just to the negligible sub-set of positions it has been trained on (as a book would). Many conventional alpha-beta engines are tuned in exactly the same way ('Texel tuning'), based on the WDL value of a huge set of training examples. The problem is that you really seem to have no clue at all how LC0 works, and as a result try to 'support' your case with totally false and nonsensical statements.
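
For comparison, Texel tuning in its simplest form looks roughly like this (a minimal sketch with hypothetical names, not any particular engine's code): map the evaluation to an expected score with a logistic function, then minimise the squared error against actual game results:

Code: Select all

# Hedged sketch of Texel-style tuning: fit evaluation weights to the
# win/draw/loss outcomes of a set of training positions.

def expected_score(eval_cp, k=1.13):
    # Logistic mapping from a centipawn score to an expected game score in [0, 1].
    return 1.0 / (1.0 + 10.0 ** (-k * eval_cp / 400.0))

def tuning_error(weights, positions):
    # positions: list of (features, result) with result in {1.0, 0.5, 0.0}
    err = 0.0
    for features, result in positions:
        eval_cp = sum(w * f for w, f in zip(weights, features))  # linear eval
        err += (result - expected_score(eval_cp)) ** 2
    return err / len(positions)

def tune(weights, positions, step=1.0, iterations=100):
    # Simple coordinate descent on the mean squared error.
    for _ in range(iterations):
        for i in range(len(weights)):
            base = tuning_error(weights, positions)
            weights[i] += step
            if tuning_error(weights, positions) >= base:
                weights[i] -= 2 * step
                if tuning_error(weights, positions) >= base:
                    weights[i] += step  # neither direction helps; revert
    return weights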
konsolas
Posts: 182
Joined: Sun Jun 12, 2016 5:44 pm
Location: London
Full name: Vincent

Re: CCRL 40/40, 40/4 and FRC lists updated (16th February 2019)

Post by konsolas »

Perhaps position learning could be allowed under certain conditions, such as:
  • learning files must be supplied by the author, and may not change during testing
  • the engine must attempt to generalise from the learning data, rather than just creating a database of positions
For example, engines could create a learning file based on the outcomes of the different material imbalances they have experienced, rather than on entire positions. This would generalise well to different types of positions, and would thus strengthen the analysis of middlegame positions that plain position learning has never seen before - see the sketch below.
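
A minimal sketch of this idea (hypothetical names and thresholds; one possible reading of the proposal, not a definitive implementation):

Code: Select all

# Aggregate game outcomes by material signature instead of by full
# position, so that the learned data generalises to unseen positions.
from collections import defaultdict

stats = defaultdict(lambda: [0, 0])  # material_key -> [half_points, games]

def record(material_key, result):
    # material_key: e.g. a tuple of piece-count differences (P, N, B, R, Q)
    # result: 1.0 win, 0.5 draw, 0.0 loss, from the side to move's view
    s = stats[material_key]
    s[0] += int(result * 2)   # count in half-points to stay integral
    s[1] += 1

def imbalance_bonus(material_key, min_games=30, scale_cp=100):
    s = stats[material_key]
    if s[1] < min_games:      # too small a sample: no adjustment
        return 0
    expected = s[0] / (2.0 * s[1])               # expected score in [0, 1]
    return int((expected - 0.5) * 2 * scale_cp)  # map score to centipawns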
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: CCRL 40/40, 40/4 and FRC lists updated (16th February 2019)

Post by hgm »

That would not be 'position learning', but just 'data mining' of the games played by the engine. Many engines use material tables, and engine authors can fill them by whatever method they like. It is not very useful to have the engine do it, however, as it is typically done 'off line'. What you obviously would not want (and I think everyone agrees about that) is an engine that changes its behaviour during testing, because that would basically make the engine untestable: you could never play more than one game with the same engine before it changes itself.
Graham Banks
Posts: 41419
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Re: CCRL 40/40, 40/4 and FRC lists updated (16th February 2019)

Post by Graham Banks »

Rebel wrote: Sat Feb 23, 2019 12:54 pm
Graham Banks wrote: Tue Feb 19, 2019 7:46 am
Rebel wrote: Tue Feb 19, 2019 6:47 am

Code: Select all

Rank  Name                     Rating 
 45   Lc0 0.20.1 w36089 64-bit  3022
Perhaps it's an idea to include the average NPS in the name for LZ as a sort of indication of the strength of the GPU used.
That's on 1CPU only. No GPU.
Something else: when I download that version and play a match with it, I notice she uses 2 threads. Was that the same in your testing?
No.
gbanksnz at gmail.com
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: CCRL 40/40, 40/4 and FRC lists updated (16th February 2019)

Post by Rebel »

Guenther wrote: Sat Feb 23, 2019 12:56 pm
Rebel wrote: Sat Feb 23, 2019 12:43 pm
Guenther wrote: Sat Feb 23, 2019 10:28 am
Rebel wrote: Sat Feb 23, 2019 10:05 am HGM - In this thread I am trying to convince the CCRL folks (and perhaps the CEGT people as well) to review their restrictions on learning, especially when I read: positional learning not allowed. It's not about you or me. And yet, as so often for reasons that escape me, you sooner or later make it personal with your combative language, and the fun of the discussion goes away, as in this case.

For the CCRL folks
Your FAQ states: positional learning not allowed. And that is what LZ does, only in a much more advanced way.
Your wish is completely unrealistic, if you think about it a bit.
The goal of CCRL and CEGT is to establish rating lists, and this requires scientific conditions.
With any kind of learning 'during the games' played in the rating process, it is impossible to guarantee that the engines always have the same state, which would be mandatory for the results to be useful; otherwise each single game would add noise, and this noise would compound with every further game.
I have said twice in this thread that learning during the games should not be allowed, for the same reason you point out.
Guenther wrote: Sat Feb 23, 2019 10:28 am And I don't understand why you still say LC0 learns. It does not - it has learnt (eval), but it does not learn any further and remains in its state from beginning to end of the rating games.
Lc0 doesn't have a learned eval. She has no eval at all. She works with WDL statistics - positions learned by self-play, but in a much more advanced way, stored in the NN's weights. The only knowledge Lc0 has is 1-0 | ½-½ | 0-1. (Z)ero knowledge.
I called it eval for laymen.

Let's not call it what it isn't.

From the AZ paper:

Starting from random play, and given no domain knowledge except the game rules, AlphaZero achieved within 24 hours a superhuman level of play in the games of chess and shogi (Japanese chess) as well as Go, and convincingly defeated a world-champion program in each case.

The AlphaZero algorithm is a more generic version of the AlphaGo Zero algorithm that was first introduced in the context of Go (29). It replaces the handcrafted knowledge and domain-specific augmentations used in traditional game-playing programs with deep neural networks and a tabula rasa reinforcement learning algorithm.

There is no chess knowledge in Zero versions, no evaluation function.
Guenther wrote: Sat Feb 23, 2019 10:28 am It is not positions, but you don't understand it (WDL is just one part of the NN, and was also only recently added, BTW); there is a huge difference between that and micro patterns (i.e. highly complicated interacting eval terms). A position is a real chess position... on the complete board.
From the AZ paper:

Instead of an alpha-beta search with domain-specific enhancements, AlphaZero uses a general purpose Monte-Carlo tree search (MCTS) algorithm. Each search consists of a series of simulated games of self-play that traverse a tree from root s_root to leaf. Each simulation proceeds by selecting in each state s a move a with low visit count, high move probability and high value (averaged over the leaf states of simulations that selected a from s) according to the current neural network f_θ.
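
In formula form, the selection rule just quoted is the PUCT rule; in the notation of the AlphaGo Zero paper it reads:

Code: Select all

a_t = \arg\max_a \Big( Q(s,a) + c_{\mathrm{puct}}\, P(s,a)\, \frac{\sqrt{\sum_b N(s,b)}}{1 + N(s,a)} \Big)

Here N(s,a) is the visit count of move a in state s, P(s,a) the network's prior probability for that move, Q(s,a) the mean value of the simulations that passed through it, and c_puct a constant trading exploration against exploitation.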

The search returns a vector π representing a probability distribution over moves, either proportionally or greedily with respect to the visit counts at the root state.

The parameters of the deep neural network in AlphaZero are trained by self-play reinforcement learning, starting from randomly initialised parameters θ. Games are played by selecting moves for both players by MCTS, a_t ~ π_t. At the end of the game, the terminal position s_T is scored according to the rules of the game to compute the game outcome z: -1 for a loss, 0 for a draw, and +1 for a win.
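
The objective those parameters are fitted to is the loss given in the AlphaGo Zero paper (which AlphaZero reuses):

Code: Select all

l = (z - v)^2 - \pi^{\top} \log \mathbf{p} + c\, \lVert \theta \rVert^2

where (p, v) = f_θ(s) are the network's move probabilities and value for position s, z is the game outcome, π the MCTS visit distribution, and c an L2 regularisation constant.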


Whether it's called WDL or, as DeepMind calls it, a probability distribution over moves, it's a statistic represented by floating-point values between 0.0 and 1.0, as can be seen in the network file.
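
For illustration only (a hedged sketch, not Lc0's actual code): a value-head output v in [-1, 1] and a WDL triple both reduce to an expected score in [0, 1] like this:

Code: Select all

# Hedged illustration, not Lc0's actual code: converting network outputs
# into the 0.0..1.0 expected-score statistic discussed above.

def value_to_score(v):
    # value head output v in [-1, 1] -> expected score in [0, 1]
    return (v + 1.0) / 2.0

def wdl_to_score(w, d, l):
    # win/draw/loss probabilities (summing to 1) -> expected score
    return w + 0.5 * d

print(value_to_score(0.2))          # 0.6
print(wdl_to_score(0.5, 0.3, 0.2))  # 0.65
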
90% of coding is debugging, the other 10% is writing bugs.
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: CCRL 40/40, 40/4 and FRC lists updated (16th February 2019)

Post by Rebel »

hgm wrote: Sat Feb 23, 2019 1:07 pm The problem is that you really seem to have no clue at all how LC0 works, and as a result try to 'support' your case with totally false and non-sensical statements.
Time to put you on ignore. Congrats, you are the first person to accomplish that.
90% of coding is debugging, the other 10% is writing bugs.