Re: My failed attempt to change TCEC NN clone rules
Posted: Sat Sep 14, 2019 3:15 pm
AllieStein is a clone. I won't watch the superfinal. I'm only interested in a final between Stockfish and Lczero.
Computer Chess Club
https://talkchess.com/
I'm not surprised, it's nothing more than a basement tournament run on steroids.
crem wrote: ↑Sat Sep 14, 2019 9:32 am
I wanted to bring up this topic several times already, but my drafts were too long to post.
Now, as Allie has a good chance of reaching the finals, I think it’s time to bring this topic back.
Also, I am frequently asked why I don’t bring this topic up if I don’t agree.
So, I tried to bring this topic up with TCEC administration, with no success.
Timeline:
March 10th (TCEC 15 just started, and Allie suddenly appeared there)
I contacted Anton Mihailov, the tournament director, and told him that:
1. Current TCEC clone rules are not applied correctly:
Allie+Stein is a clone according to TCEC rules (Allie is not unique relative to Lc0, and Stein is not unique compared to Leela’s weights, see also http://talkchess.com/forum3/viewtopic.php?p=792755).
2. Rules themselves are poor and have to be changed.
3. I proposed various changes and ideas for the rules (under most of them Allie+Stein would be able to participate, and under some “DeusX” could too).
4. TCEC15 is already messed up, but let’s do it right for TCEC16; we have plenty of time (3 months until the new season starts).
Anton responded:
1. This is very important, please keep writing.
2. Please keep it TOP SECRET! No one should know!
(I tried to convince them that such discussions should be public, but only got irrelevant answers, such as that I don’t know how to manage large communities)
I wrote lots of material, in different forms: one-line summary, one-paragraph summary, diagrams, very detailed description etc.
In the end it was clear that no one from TCEC had read even the one-line summary.
March 29th
(not very relevant but for completeness)
We created a Discord server, and Anton invited an undisclosed guest expert to the discussion, with whom we had an interesting but short discussion that was not very relevant to TCEC rules (because TCEC rules are up to the TCEC team, really).
March 31st
I received the last message from Anton, stating how important this is and that I should continue writing my proposals.
April
I was pinging them periodically, with no reaction.
May 1st
I deleted the conversations (a Google Doc and the Discord server). No one seemed to notice. No one contacted me after that.
In the end I wasted ~15 hours of my time drawing all the diagrams and writing explanations at different levels of detail, all to save the TCEC admins time.
From chats with them it was clear though that they didn’t even spend 5 minutes on it; they didn’t even read the 1-line summary.
Then TCEC-16 started with no changes at all.
At this point I decided that trying to convey any message to TCEC isn't worth the effort.
Some screenshots from the document:
I don't like the way these three criteria are presented as mutually independent, as if it's possible to satisfy both #1 and #3 without satisfying #2. That put aside, two engines having the exact same neural network (weights file), but different training code and game-playing code, would be considered distinct. So, it would have been quite possible to have two engines that play exactly the same, yet both be considered original, while at the same time there could be two engines with wildly different styles which would, by the same rules, be considered clones.
Definition: A neural network is a computer system modeled on the human brain and nervous system. For the purpose of TCEC a participant is considered a neural network (NN) engine if it generally requires the use of GPU and consists of at least the following 3 parts:
- The code for training the neural network
- The neural network (and weights file) itself
- The engine that executes this network
It is the parts 2 and 3 that will actually be a playing combination at TCEC. Part 1 is used in preparation.
Uniqueness: For an NN engine to be unique in the TCEC context, at least two of the three defining parts mentioned above have to be unique.
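To make the objection concrete, here is a minimal sketch of the two-of-three rule quoted above. It assumes that "unique" simply means "differs from the reference engine's corresponding part"; the rules text does not say how uniqueness is measured, and the class, function names and labels below are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical model of an NN engine as the three parts named in the rule.
# Parts are compared by label only, which is an assumption for illustration.
@dataclass(frozen=True)
class NNEngine:
    training_code: str
    network: str
    engine: str

PARTS = ("training_code", "network", "engine")

def unique_parts(candidate: NNEngine, reference: NNEngine) -> int:
    """Count how many of the three parts differ from the reference."""
    return sum(getattr(candidate, p) != getattr(reference, p) for p in PARTS)

def is_unique(candidate: NNEngine, reference: NNEngine) -> bool:
    """Two-of-three rule: at least two parts must differ."""
    return unique_parts(candidate, reference) >= 2

# Made-up entries, not real TCEC participants.
original = NNEngine("trainer_a", "shared_net", "engine_a")
same_net = NNEngine("trainer_b", "shared_net", "engine_b")
new_net_only = NNEngine("trainer_a", "other_net", "engine_a")

print(is_unique(same_net, original))      # True: identical network, still "unique"
print(is_unique(new_net_only, original))  # False: different network, still a "clone"
```

Under this reading, an entry that shares the exact weights file passes, while an entry whose only difference is a new network fails, which is the asymmetry described above.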
I’m sorry, you tested these engines with the same network? I would expect something close to 100% similarity. The fact that it’s so low is shocking.
Rebel wrote: ↑Sat Sep 14, 2019 9:25 pm
I recently tested Allie vs Lc0 on similarity, same NN's thus testing the code base.
http://rebel13.nl/html/nn-500ms.html
http://rebel13.nl/html/nn-1000ms.html
75% is pretty high.
But then NN is a whole different subject in comparison when it's about similarity.
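For readers unfamiliar with this kind of test, here is a minimal sketch of one way to read a similarity percentage. It assumes similarity is the share of test positions on which both engines choose the same move; this is not necessarily how the linked pages were computed, and the function name and move lists are made up.

```python
def move_similarity(moves_a, moves_b):
    """Percentage of positions where both engines picked the same move.

    moves_a and moves_b are the moves chosen by each engine on the same
    ordered list of test positions (an assumed setup, for illustration).
    """
    if len(moves_a) != len(moves_b):
        raise ValueError("both engines must be tested on the same positions")
    if not moves_a:
        raise ValueError("need at least one position")
    matches = sum(a == b for a, b in zip(moves_a, moves_b))
    return 100.0 * matches / len(moves_a)

# Made-up choices on four positions; the engines agree on three of them.
engine_x = ["e2e4", "g1f3", "d2d4", "c2c4"]
engine_y = ["e2e4", "g1f3", "d2d4", "b1c3"]
print(move_similarity(engine_x, engine_y))  # 75.0
```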
Ah, the run times are pretty low. You might find some variation at such low node counts. Really, plugging the same network into a PUCT algorithm is akin to plugging the same eval function into a negamax algorithm: you should get close to 100% similarity (though Allie uses a slightly different backup strategy than lc0, if I recall).
dkappe wrote: ↑Sat Sep 14, 2019 11:36 pm
I’m sorry, you tested these engines with the same network? I would expect something close to 100% similarity. The fact that it’s so low is shocking.
Rebel wrote: ↑Sat Sep 14, 2019 9:25 pm
I recently tested Allie vs Lc0 on similarity, same NN's thus testing the code base.
http://rebel13.nl/html/nn-500ms.html
http://rebel13.nl/html/nn-1000ms.html
75% is pretty high.
But then NN is a whole different subject in comparison when it's about similarity.
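To illustrate the PUCT point, here is a minimal sketch of an AlphaZero-style PUCT selection step. It is not the actual code of Lc0 or Allie; the constant `c_puct`, the dictionary layout and the numbers are assumptions for illustration only.

```python
import math

def puct_select(children, c_puct=1.5):
    """Pick the child maximising Q + U.

    Each child is a dict with a network prior "P", a visit count "N" and a
    total backed-up value "W". The value of c_puct is an assumed constant.
    """
    parent_visits = sum(ch["N"] for ch in children)

    def score(ch):
        q = ch["W"] / ch["N"] if ch["N"] > 0 else 0.0  # mean value so far
        u = c_puct * ch["P"] * math.sqrt(parent_visits) / (1 + ch["N"])  # exploration bonus
        return q + u

    return max(children, key=score)

# Made-up node statistics.
children = [
    {"move": "e2e4", "P": 0.6, "N": 10, "W": 5.5},
    {"move": "d2d4", "P": 0.3, "N": 2, "W": 1.2},
    {"move": "c2c4", "P": 0.1, "N": 0, "W": 0.0},
]
print(puct_select(children)["move"])  # d2d4
```

Both the priors P and the values that feed W come from the network, so two searches of this form running the same weights are steered by the same numbers. Differences then come from details such as the constant, the backup strategy, or how few nodes were searched.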
Any entity that reuses the Lczero engine with another set of weights, retrained, trained, RL or SL, whatever, is a straight copy of lczero with zero originality and anyone claiming original engine status doing that is simply delivering BS.
crem wrote: ↑Sun Sep 15, 2019 12:08 am
Just to clarify,
I’m not accusing the authors (of Allie+Stein) of submitting an engine which I don’t consider original enough. Authors can make an engine and reuse any portion of code; that’s GPL after all. CCCC admins for example openly say that they allow clones and that they pick engines based on entertainment value, and that’s totally fine.
And I’m also not trying to disqualify Alliestein from TCEC16, or affect the current season in any way (I agree I picked bad timing for this thread). I'm not directly anti-Allie for the next season either. I want better rules, and rules applied better.
I do accuse the TCEC admins though, not so much of not following their own rules and having poor rules, but of a lack of ANY interest in improving them.
Today after my post Anton wrote me that my concern is not forgotten and there’s tremendous work going on to fix NN clone rules (for more than 6 months already!).
Even if I believed that (which I don’t), I would still think it would be done very wrongly. I did write a lot of input, I really spent 4 hours almost every evening for more than a week coming up with pros and cons. Don’t I deserve any feedback? Shouldn’t I be included in the discussion? Shouldn’t the discussion be public after all?
No, it’s “be sure that the rules committee will review your input thoroughly and will take it into account in the new version of the rules, and please keep it TOP SECRET”. And new rules/competitors are usually published on the last day and very publicly, when it’s too late to change anything.
I don’t know who the “rules committee” is, but what they came up with last time was just silly.
I suspect last time they also got “input to review” from me, although I’m not sure.
It was during the DeusX incident. I chatted about that with Anton and told him that what ASilver did was take scripts from the Lc0 project and train the net, and that the engine itself is Lc0 too.
What was the resulting rule? You all know:
NN-based engine consists of 3 parts:
1. Neural network.
2. Engine.
3. Training script.
You need 2 of the 3 to be unique for the engine to be unique.
It’s so detached from reality! What is “training script” doing here? I was really perplexed. Who came up with that “2 of 3”? A training script is something that can be written in 1-2 hours (and I’m surprised no one did that to work around TCEC rules); it’s really a very minor piece of work compared to the other parts (the others are very ambiguous too, I’ll post about them a bit later).
If the “rules committee” is working the same way (reviewing inputs and generating rules without public discussion), the result will be just as bad again.