The real elo of lczero is in its name

Guenther · Post by **Guenther** » Wed Apr 18, 2018 9:35 am

Daniel Shawul wrote: What is astounding is that it lost 20-0 against standard scorpio on a single core. Your GPU is getting about 1 kN/s, which is 3x slower than TCEC's, which was getting about 3-4 kN/s.

Guenther wrote: The displayed speed in kN/s for Leela is wrong in TCEC. See my other post.

CMCanavessi wrote: It is not wrong. What may be misleading is the total number of nodes, because of tree reuse. Those are not part of the real speed; it's like a cache.

Guenther wrote: If this is true, I see no way to get a reliable bench with LCZero. It can just print a fictional nps, because it does not show which nodes are reused and which are not.

OK, the code says they don't count reused nodes for nps:

Code: Select all

    // UCI requires long algebraic notation, so use_san=false
    std::string pvstring = get_pv(bh, *m_root, false);
    float feval = m_root->get_eval(color);
    // UCI-like output wants a depth and a cp, so convert winrate to a cp estimate.
    int cp = 290.680623072 * tan(3.096181612 * (feval - 0.5));
    // same for nodes to depth, assume nodes = 1.8 ^ depth.
    int depth = log(float(m_nodes)) / log(1.8);
    // To report nodes, use visits.
    //   - Only includes expanded nodes.
    //   - Includes nodes carried over from tree reuse.
    auto visits = m_root->get_visits();
    // To report nps, use m_playouts to exclude nodes added by tree reuse,
    // which is similar to a ponder hit. The user will expect to know how
    // fast nodes are being added, not how big the ponder hit was.
    myprintf_so("info depth %d nodes %d nps %0.f score cp %d time %lld pv %s\n",
             depth, visits, 1000.0 * m_playouts / (elapsed + 1),
             cp, elapsed, pvstring.c_str());
}

This means the number of newly visited nodes is roughly time/1000 * nps.

For the example below, that is 7.758 * 199 ≈ 1544 new nodes vs. 501 reused nodes (total 2045).

Code: Select all

464246 <LCZero_06ID139(0): info depth 19 nodes 2045 nps 199 score cp 17 winrate 53.46% time 7758 pv dxe4 Nxe4 Nbd2 Bf6 Bxf6 Nxf6 h3 h6 a3 d5 c4 dxc4 bxc4 Bf5
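The same arithmetic can be written as a small helper. This is only a sketch: `NodeSplit` and `split_nodes` are hypothetical names, not part of lczero. It inverts the nps expression from the snippet above, `1000.0 * m_playouts / (elapsed + 1)`, and attributes the remainder of the reported nodes to tree reuse. Because the engine prints nps rounded to an integer, the result is an estimate.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not lczero code: estimate how many of the reported
// nodes were actually searched during this move.
struct NodeSplit {
    std::int64_t searched;  // estimated new playouts for this move
    std::int64_t reused;    // remainder, attributed to tree reuse
};

NodeSplit split_nodes(std::int64_t nodes, std::int64_t time_ms, std::int64_t nps) {
    assert(nodes >= 0 && time_ms >= 0 && nps >= 0);
    // The engine prints nps = 1000 * playouts / (elapsed + 1); invert that.
    std::int64_t searched = nps * (time_ms + 1) / 1000;
    if (searched > nodes) {
        searched = nodes;  // rounding of the printed nps can overshoot
    }
    return {searched, nodes - searched};
}
```

Calling `split_nodes(2045, 7758, 199)` with the values from the info line above gives 1544 searched and 501 reused nodes, matching the hand calculation.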

CMCanavessi · Post by **CMCanavessi** » Wed Apr 18, 2018 1:18 pm

So Leela has had a serious bug since v0.5 (when the 128 networks were created) that prevented her from seeing promotions to any piece except knights. That bug was discovered only yesterday and fixed ASAP, but the regressions we were seeing in playing strength were mostly due to this stupid bug.

I wonder how it would have performed @ those tcec demo matches without the bug and with a network that was not affected by it...

Uri Blass · Post by **Uri Blass** » Wed Apr 18, 2018 1:37 pm

CMCanavessi wrote: So Leela has had a serious bug since v0.5 (when the 128 networks were created) that prevented her from seeing promotions to any piece except knights. That bug was discovered only yesterday and fixed ASAP, but the regressions we were seeing in playing strength were mostly due to this stupid bug.

I wonder how it would have performed @ those tcec demo matches without the bug and with a network that was not affected by it...

I wonder how you know this, because when I read the LCZero forum, which is supposed to include information about changes, I do not see even a single thread that mentions the underpromotion bug:

https://groups.google.com/forum/#!forum/lczero

I also do not understand whether the version that TCEC is going to use will have this bug, which would mean that it will only promote to a knight in games.

Edit: I read in the TCEC chat that the version they used, ID125, did not have the bug, so it seems that the bug has not existed for very long, because ID 125 is from 13.4.

I thought that there was no progress for 2 weeks.

CMCanavessi · Post by **CMCanavessi** » Wed Apr 18, 2018 2:48 pm

You need to separate the network and the engine, as they are independent things.

The engine has had the bug since v0.5, no matter what network you use. You can read about it here: https://github.com/glinscott/leela-chess/issues/349
It is quite an important bug and reduces the playing strength a lot, even when playing white (because it thinks that black won't promote to a queen).

Then you have the issues with the network. ID 125 is one of the first networks to come out after v0.5 was introduced, so it was mostly trained with bug-free matches. After that, the number of bugged matches started to increase, eventually filling the training window with 100% bugged matches. That's why later networks play worse.

We'll have to see, now that the bug is fixed, how long it takes the network to regain the lost strength.

Nay Lin Tun · Post by **Nay Lin Tun** » Wed Apr 18, 2018 3:31 pm

Maybe ignore all networks from 125 onward and restart training from 125.

mirek · Post by **mirek** » Wed Apr 18, 2018 3:33 pm

CMCanavessi wrote: Then you have the issues with the network. ID 125 is one of the first networks to come out after v0.5 was introduced, so it was mostly trained with bug-free matches. After that, the number of bugged matches started to increase, eventually filling the training window with 100% bugged matches. That's why later networks play worse.

We'll have to see, now that the bug is fixed, how long it takes the network to regain the lost strength.

Wouldn't it possibly be better, after everyone switches to 0.7, to just make the first new net an exact copy of ID125 and scrap all the previous bugged games?

edit: Nay Lin Tun seems to have been faster

George Tsavdaris · Post by **George Tsavdaris** » Wed Apr 18, 2018 4:01 pm

CMCanavessi wrote: Then you have the issues with the network. ID 125 is one of the first networks to come out after v0.5 was introduced, so it was mostly trained with bug-free matches. After that, the number of bugged matches started to increase, eventually filling the training window with 100% bugged matches. That's why later networks play worse.

I don't get what you are saying here.
ID125 came after the bug-free 0.5 version, so it was trained with bug-free matches.
So how did the number of bugged matches start to increase after that? Since the engine was bug-free, why did bugged matches increase?

[EDIT]
Oh, forget it, I misread. So versions 0.5 and 0.6 had that bug.
So all networks from 125 onward, until it was fixed yesterday, are full of it.

OK, so wouldn't what they suggest here be better: to start from 125 again instead of waiting for a "recovery" with the current bugged network?

CMCanavessi · Post by **CMCanavessi** » Wed Apr 18, 2018 4:03 pm

No, v0.5 _INTRODUCED_ the bug (while fixing other older bugs), which was fixed last night with v0.7

Daniel Shawul · Post by **Daniel Shawul** » Wed Apr 18, 2018 4:45 pm

It got better at 320+2, but it is still at about -180 Elo.

Code: Select all

                        scorpio-mcts-min : (+ 10 ,=  2 ,-  3)

 1.                        scorpio-mcts-min     91    183    183     15    73.3%   -90    13.3%
 2.                                  lczero    -90    183    183     15    26.7%    91    13.3%

I had to shut down my laptop, so this one has few games. Next one is 900+10 (15m+10) ...
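As a rough cross-check on the rating gap, the plain logistic Elo formula can be applied to the match score. This is a sketch under stated assumptions: `score_fraction` and `elo_diff` are hypothetical helper names, and the rating tool that produced the table above uses its own model, so its numbers (91 vs. -90) need not agree exactly with this formula.

```cpp
#include <cassert>
#include <cmath>

// Fraction of points scored, counting a draw as half a point.
double score_fraction(int wins, int draws, int losses) {
    int games = wins + draws + losses;
    assert(games > 0);
    return (wins + 0.5 * draws) / games;
}

// Elo difference implied by a score fraction under the standard logistic
// model: expected score = 1 / (1 + 10^(-diff / 400)).
double elo_diff(double score) {
    assert(score > 0.0 && score < 1.0);  // undefined for 0% or 100%
    return -400.0 * std::log10(1.0 / score - 1.0);
}
```

For +10 =2 -3, `score_fraction(10, 2, 3)` is 11/15 (73.3%), and `elo_diff` of that is about 176, in the same range as the roughly 180 Elo quoted above. With only 15 games, the ±183 margins in the table matter far more than the difference between the two estimates.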

CMCanavessi · Post by **CMCanavessi** » Wed Apr 18, 2018 4:51 pm

Daniel Shawul wrote: It got better at 320+2, but it is still at about -180 Elo.
Code: Select all
                        scorpio-mcts-min : (+ 10 ,=  2 ,-  3)

 1.                        scorpio-mcts-min     91    183    183     15    73.3%   -90    13.3%
 2.                                  lczero    -90    183    183     15    26.7%    91    13.3%
I had to shut down my laptop, so this one has few games. Next one is 900+10 (15m+10) ...

Is that the config I tested? How can we get such different results? What hardware config are you testing Leela on? How many threads for Scorpio?
