LCZero update

hgm · Post by **hgm** » Wed Mar 14, 2018 1:20 pm

No one has a clue how strong random play is. There just aren't enough intermediate opponents to bridge the gap between random movers and searching engines. The weak engines that seem to fit in there are all very buggy and do not behave according to the Elo model: because they crash, they have a fixed, nonzero probability of losing against any opponent, no matter how weak.

So random play could be 3000 Elo below Stockfish. Or 30,000 Elo. We just don't know.

The CCRL list in this region is a bit suspect. E.g. RAM is supposed to be a random mover. So how can it be at the level of NEG, ~300 Elo above Brutus RND? I am also pretty sure that NEG scores 99% against a random mover, much more than the Elo difference suggests. The rating of the random mover must be highly inflated because it gets free points from buggy engines higher in the list. You cannot make a sensible rating list with engines that hand out free points irrespective of opponent performance.
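To put rough numbers on the crash argument, here is a minimal sketch assuming the standard logistic Elo formula. The 2000 Elo true gap and the 5% crash rate are made-up illustration values, not measurements of any engine on the list.

```python
import math

def expected_score(elo_diff: float) -> float:
    """Expected score under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def elo_gap_from_score(score: float) -> float:
    """Rating difference implied by a score fraction (0 < score < 1)."""
    return 400.0 * math.log10(score / (1.0 - score))

def score_with_crashes(elo_diff: float, crash_prob: float) -> float:
    """Score of an engine that forfeits a fixed fraction of its games
    (crash_prob) regardless of opponent and otherwise plays normally."""
    return (1.0 - crash_prob) * expected_score(elo_diff)

# A 99% score corresponds to a gap of roughly 800 Elo.
gap_99 = elo_gap_from_score(0.99)

# Hypothetical engine: truly 2000 Elo above its opponent, crashes in 5% of games.
observed = score_with_crashes(2000.0, 0.05)
apparent_gap = elo_gap_from_score(observed)  # far smaller than 2000
```

Under these assumptions the measured gap can never exceed what a 95% score implies, however weak the opponent is, which is how a crashing engine compresses the bottom of a rating list.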

CheckersGuy · Post by **CheckersGuy** » Wed Mar 14, 2018 1:33 pm

koedem wrote:Why wouldn't random play be 5000 Elo weaker than SF? If we assume random play at -1500 Elo and SF at 3500 Elo (both seem reasonable) we get to a difference of 5000. Seems logical to me.

Then why would it be 5000 and not 6000? Assumptions don't actually help there. By running tests we could adjust the Elo scale.

CMCanavessi · Post by **CMCanavessi** » Wed Mar 14, 2018 2:02 pm

 152 Usurpator II x32                       :  1019.6      55   40    5   10    77     9   653.7    23    22.5
 153 Talvmenni 0.1 x32                      :   998.7      55   34   16    5    76    29   649.2    23    22.5
 154 StrategicDeep 1.25 x32                 :   989.6      39    3    2   34    10     5  1501.8    23    22.1
 155 Hanzo the Razor x32                    :   981.9      55   30   24    1    76    44   626.8    23    22.5
 156 MFChess 1.3 x32                        :   954.1      55   31   17    7    72    31   653.0    23    22.5
 157 Hippocampe v0.4.2 x32                  :   933.4     150   98   18   34    71    12   618.0    15    15.0
 158 Youk V1.05 x32                         :   918.2      94   38   10   46    46    11   975.0    45    42.8
 159 Zoe 0.1 x32                            :   818.4      55   28   14   13    64    25   628.8    23    22.5
 160 NSVChess 0.14 x32                      :   800.5     205  103   54   48    63    26   626.3    29    22.7
 161 Pyotr Amateur Edition v0.6 x32         :   787.7      55   26   16   13    62    29   616.2    23    22.5
 162 Dikabi v0.4209 x32                     :   740.6      55   14   34    7    56    62   633.3    23    22.5
 163 Easy Peasy 1.0 x32                     :   683.5     205  102   22   81    55    11   636.9    29    23.1
 164 Pyotr Novice Edition v2.6 x32          :   613.6      55   19   11   25    45    20   654.0    23    22.5
 165 Leela Chess Zero Gen 6 x64             :   587.8      55   18   12   25    44    22   638.3    23    22.5
 166 N.E.G. 1.2 x32                         :   532.5     205   77   24  104    43    12   652.7    29    23.6
 167 Acqua ver. 20160918 x32                :   527.7     205   82   15  108    44     7   646.2    29    23.1
 168 Ram 2.0 x32                            :   391.8     205   50   38  117    34    19   650.8    29    22.5
 169 Leela Chess Zero Gen 4 x64             :   383.9     150   43   18   89    35    12   654.6    15    15.0
 170 CPP1 0.1038 x32                        :   331.7     205   39   43  123    30    21   651.9    29    22.9
 171 LaMoSca v0.10 x32                      :   271.2     205    2   99  104    25    48   658.1    29    22.7
 172 POS v1.20 x32                          :   153.2     205   15   39  151    17    19   674.6    29    23.4
 173 EtherealRandom (8.97) x64              :    65.7      55    2    8   45    11    15   656.3    23    22.5
 174 EtherTrueRand 9.21 x64                 :    40.1     205    2   40  163    11    20   677.8    29    23.2
 175 Teki Random Mover x64                  :     0.0     205    0   36  169     9    18   675.2    29    22.7

Getting better.

jkiliani · Post by **jkiliani** » Wed Mar 14, 2018 2:25 pm

Jhoravi wrote:Hi. I explored some games on the given link http://162.217.248.187/user/GaryS But the blunderfeasted games are nowhere near 2000 elo IMO. Am I missing something?

Yes, you're missing something. The training games shown on http://162.217.248.187 are considerably more blunder-infested than full-strength games, because training uses temperature=1. This means that each move is sampled with probability proportional to its visit count instead of always picking the most-visited move, which weakens the engine a lot.
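For readers unfamiliar with the temperature setting, the difference can be sketched roughly as below. This is an illustration of the general idea only, not LCZero's actual code; the function name `select_move` and the visit counts are hypothetical.

```python
import random
from typing import Dict

def select_move(visit_counts: Dict[str, int], temperature: float,
                rng: random.Random) -> str:
    """Pick a move from search visit counts.

    temperature == 0: greedy, always the most-visited move.
    temperature == 1: sample with probability proportional to visit count.
    """
    moves = list(visit_counts)
    if temperature == 0:
        return max(moves, key=lambda m: visit_counts[m])
    # Raising counts to 1/temperature sharpens (T < 1) or flattens (T > 1)
    # the distribution; at T = 1 the weights are the raw counts.
    weights = [visit_counts[m] ** (1.0 / temperature) for m in moves]
    return rng.choices(moves, weights=weights, k=1)[0]

# Hypothetical visit counts after a search of 100 nodes.
counts = {"e2e4": 70, "d2d4": 25, "h2h4": 5}
rng = random.Random(0)

greedy = select_move(counts, 0, rng)
samples = [select_move(counts, 1.0, rng) for _ in range(10000)]
weak_move_share = samples.count("h2h4") / len(samples)
```

With these made-up counts, the greedy player never plays the weak move, while at temperature 1 it is chosen in roughly one game in twenty, every move, which adds up to many blunders over a full game.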

You're correct, however, that 2000 self-play Elo is nowhere near the same as 2000 Elo on a human scale, since 0 is defined as random play.

CMCanavessi · Post by **CMCanavessi** » Wed Mar 14, 2018 2:31 pm

hgm wrote:E.g. RAM is supposed to be a random mover

Well, I have no idea what it is "supposed" to be, but RAM 2.0 is not even close to a random mover. It searches, it evaluates, and if you match it against a random mover, it will mate it easily, with no luck involved. It can also beat POS and CPP1, and it's around Acqua in strength.

koedem · Post by **koedem** » Wed Mar 14, 2018 3:24 pm

I don't know how strong random play is; I simply replied that there's no reason to assume the initial LCZero is worse than random play.

jkiliani · Post by **jkiliani** » Wed Mar 14, 2018 3:30 pm

koedem wrote:I don't know how strong random play is; I simply replied that there's no reason to assume the initial LCZero is worse than random play.

Yes, it can't be, and it isn't. The initial LCZero was actually somewhat better than truly random play, since it did a search that often found mate in one.

Statements like "It's worse than random play" suffer from a cognitive bias where we just think something is random because we can't see the pattern yet. In reality, LCZero never was random.

Milos · Post by **Milos** » Wed Mar 14, 2018 4:36 pm

Jhoravi wrote:Hi. I explored some games on the given link http://162.217.248.187/user/GaryS But the blunderfeasted games are nowhere near 2000 elo IMO. Am I missing something?

Ozymandias wrote:He's not saying that they're at a 2,000 Elo level, they're 2,000 Elo points ahead of random play. Now the question would be, what's the Elo for a random player?

Milos wrote:Since random play is certainly not 5000 Elo weaker than SF, the only logical assumption is that initial LCZero was much weaker than random play. Authors should have maybe thought of using it for suicide chess.

CheckersGuy wrote:Impossible for lcZero to be weaker than random play

Why?

jkiliani · Post by **jkiliani** » Wed Mar 14, 2018 5:09 pm

I ran a tournament of the gen7 reinforcement learning net, against Stockfish Level 0 (which should be something like 1100-1200 Elo according to various threads I could find).

LCZero went 23 - 75 - 2 against Stockfish (L0), which would give it a conservative Elo rating around 900 by now.
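The arithmetic behind that estimate can be checked with a short sketch, assuming the usual logistic Elo formula and reading 23 - 75 - 2 as wins - losses - draws (the only reading consistent with a rating below Stockfish L0). The 1100-1200 anchor is the estimate quoted above, not a measured value.

```python
import math

def elo_diff_from_result(wins: int, losses: int, draws: int) -> float:
    """Elo difference implied by a match result under the logistic model."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    return 400.0 * math.log10(score / (1.0 - score))

# Assumed reading of the result: 23 wins, 75 losses, 2 draws for LCZero.
diff = elo_diff_from_result(23, 75, 2)

# Assumed rating range for Stockfish Level 0, taken from the post.
low_estimate = 1100 + diff
high_estimate = 1200 + diff
```

A 24% score works out to about 200 Elo below the opponent, so the 1100-1200 range for Stockfish L0 puts gen7 at roughly 900-1000, and 900 is the conservative end. With only 100 games the error bars on that number are wide.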

Michel · Post by **Michel** » Wed Mar 14, 2018 5:21 pm

jkiliani wrote:I ran a tournament of the gen7 reinforcement learning net, against Stockfish Level 0 (which should be something like 1100-1200 Elo according to various threads I could find). LCZero went 23 - 75 - 2 against Stockfish (L0), which would give it a conservative Elo rating around 900 by now.

Would it be possible to post a few games?
