Strange Lc0 TCEC performance

crem
Posts: 177
Joined: Wed May 23, 2018 9:29 pm

Re: Strange Lc0 TCEC performance

Post by crem »

Uri Blass wrote: Wed Aug 15, 2018 10:19 am I read that lc0 changed pruning and I wonder if you used the same number that I read to be 0.604 in the TCEC chat in your tests
Also, it turned out that a dev version of lc0 was sent to TCEC instead of the release, so it also contained other changes.
The changes look fine if you look at the code, but it's possible that they have a bug (e.g. in visited_policy caching).

People on Discord reported that that version, with the same settings and a single thread, behaves differently from the release, which should not happen.
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: Strange Lc0 TCEC performance

Post by corres »

I think the scaling of an NN-based chess engine is determined decisively by two factors:
1. scaling of MCTS,
2. how big and how filled the NN is.
Supposing MCTS scales well, only a small and/or not fully filled NN itself can cause the bad scaling.
I am afraid that in the case of Leela and its derivatives, scaling is mainly determined by the issues of the NN.
chrisw
Posts: 4317
Joined: Tue Apr 03, 2012 4:28 pm

Re: Strange Lc0 TCEC performance

Post by chrisw »

Poor score indicates something is broken. Which is always likely with changes entered late.

High draw rate indicates higher than "normal" width to depth ratio. In general depth finds interesting lines and possible wins. Width defends against overlooking stuff.

More width and less depth = safe, defensive play, but dull. High draw rate. Simples.

Could be broken somewhere, could be late changes conspired to alter the search profile, could be that the scaling to more nodes tends to reflect in width rather than depth. Could be random, but the operating assumption has to be that there is a problem.
jkiliani
Posts: 143
Joined: Wed Jan 17, 2018 1:26 pm

Re: Strange Lc0 TCEC performance

Post by jkiliani »

chrisw wrote: Wed Aug 15, 2018 12:58 pm Poor score indicates something is broken. Which is always likely with entered late changes.

High draw rate indicates higher than "normal" width to depth ratio. In general depth finds interesting lines and possible wins. Width defends against overlooking stuff.

More width and less depth = safe, defensive play, but dull. High draw rate. Simples.

Could be broken somewhere, could be late changes conspired to alter the search profile, could be that the scaling to more nodes tends to reflect in width rather than depth. Could be random, but the operating assumption has to be that there is a problem.
In the context of Leela, width vs depth is determined by PUCT. After a lot of tactical blunders in the past, we're now using higher PUCT values than before, but maybe this was an overcompensation? Also, optimal values for PUCT are likely quite dependent on typical search depth; unfortunately, a CLOP run for low visit counts does not prove that the same value works well at higher visits. This type of optimisation will take a long time though, so we may have to wait for TCEC 14 to see a well-optimised Leela.
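To make the width vs depth point concrete, here is a minimal sketch of AlphaZero-style PUCT child selection. It is not Lc0's actual code: the function names are made up for illustration, and each child's value estimate is assumed to stay fixed, which a real search would not do. It only shows that a larger PUCT constant spreads visits over more children (width) instead of piling them onto the best-looking one (depth).

```python
import math

def puct_select(q, prior, visits, c_puct):
    """Return the index of the child with the highest PUCT score.

    score_i = Q_i + c_puct * P_i * sqrt(N_parent) / (1 + N_i)
    """
    total = max(sum(visits), 1)  # avoid a zero exploration term on the first visit
    scores = [
        q[i] + c_puct * prior[i] * math.sqrt(total) / (1 + visits[i])
        for i in range(len(q))
    ]
    return max(range(len(scores)), key=scores.__getitem__)

def visit_distribution(q, prior, c_puct, n_visits):
    """Simulate n_visits selections at one node, with fixed (assumed) Q values."""
    visits = [0] * len(q)
    for _ in range(n_visits):
        visits[puct_select(q, prior, visits, c_puct)] += 1
    return visits

if __name__ == "__main__":
    q = [0.1, 0.0, -0.1]      # hypothetical value estimates per move
    prior = [0.5, 0.3, 0.2]   # hypothetical policy priors per move
    for c in (0.5, 3.0):
        print(c, visit_distribution(q, prior, c, 1000))
```

With the lower constant, almost all visits should go to the first move; with the higher one, the other moves get a noticeably larger share.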
chrisw
Posts: 4317
Joined: Tue Apr 03, 2012 4:28 pm

Re: Strange Lc0 TCEC performance

Post by chrisw »

jkiliani wrote: Wed Aug 15, 2018 1:11 pm
chrisw wrote: Wed Aug 15, 2018 12:58 pm Poor score indicates something is broken. Which is always likely with entered late changes.

High draw rate indicates higher than "normal" width to depth ratio. In general depth finds interesting lines and possible wins. Width defends against overlooking stuff.

More width and less depth = safe, defensive play, but dull. High draw rate. Simples.

Could be broken somewhere, could be late changes conspired to alter the search profile, could be that the scaling to more nodes tends to reflect in width rather than depth. Could be random, but the operating assumption has to be that there is a problem.
In the context of Leela, width vs depth is determined by PUCT. After a lot of tactical blunders in the past, we're now using higher PUCT values than before, but maybe this was an overcompensation? Also, optimal values for PUCT are likely quite dependent on typical search depth, unfortunately a CLOP run for low visit counts does not prove that the same value works well at higher visits. This type of optimisation will take a long time though, we may have to wait for TCEC 14 to see a well-optimised Leela.
Yes, I wondered if that wasn't it. Did a PUCT change take place between Div3 and Div4?

It may be that PUCT needs a meta parameter that knows about the total nodes searched per move and then adjusts the search profile. Since Leela is competing with AB engines, you need to compensate at high node counts to take into account how an AB engine's profile changes with node count.

I would guess Stockfish, for example, with lots of cores and lots of time and whizzing off beyond iteration 30 or whatever, is basically getting mostly more depth; it has likely maxed out on width. Well, not entirely, but I think its width/depth ratio is going to decrease at very high iteration counts. So, under this tenuous theory based on not a lot of evidence and my intuition, LC0 needs to match that profile and give slightly more to expansion and slightly less to exploration as the iteration number gets very high.
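A rough sketch of what such a meta parameter could look like, purely as an illustration of the idea above. Nothing here is taken from Lc0: the function name, the log-based decay, and all default values are assumptions and would need tuning at long time controls.

```python
import math

def scheduled_cpuct(base_cpuct, nodes, ref_nodes=10_000, decay=0.1, floor=0.5):
    """Shrink the exploration constant as the node count per move grows.

    Up to ref_nodes the base value is used unchanged. Beyond that, the
    constant is divided by (1 + decay * log10(nodes / ref_nodes)), so each
    tenfold increase in nodes shifts the search a little from width to depth.
    The floor keeps the search from becoming purely greedy.
    All parameters are hypothetical.
    """
    if nodes <= ref_nodes:
        return base_cpuct
    scaled = base_cpuct / (1.0 + decay * math.log10(nodes / ref_nodes))
    return max(scaled, floor)

if __name__ == "__main__":
    for n in (1_000, 100_000, 10_000_000):
        print(n, round(scheduled_cpuct(3.0, n), 3))
```

Whether the constant should fall or rise with node count is exactly the open question here; the sketch only shows the mechanism, with the direction following the guess in this post.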
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Strange Lc0 TCEC performance

Post by Laskos »

jkiliani wrote: Wed Aug 15, 2018 1:11 pm
chrisw wrote: Wed Aug 15, 2018 12:58 pm Poor score indicates something is broken. Which is always likely with entered late changes.

High draw rate indicates higher than "normal" width to depth ratio. In general depth finds interesting lines and possible wins. Width defends against overlooking stuff.

More width and less depth = safe, defensive play, but dull. High draw rate. Simples.

Could be broken somewhere, could be late changes conspired to alter the search profile, could be that the scaling to more nodes tends to reflect in width rather than depth. Could be random, but the operating assumption has to be that there is a problem.
In the context of Leela, width vs depth is determined by PUCT. After a lot of tactical blunders in the past, we're now using higher PUCT values than before, but maybe this was an overcompensation? Also, optimal values for PUCT are likely quite dependent on typical search depth, unfortunately a CLOP run for low visit counts does not prove that the same value works well at higher visits. This type of optimisation will take a long time though, we may have to wait for TCEC 14 to see a well-optimised Leela.
Yes, trying to tune, say, CPUCT and FPU manually and perturbatively gave me uncontrollable variations with time control (or nodes or depth). Fitting at short TC is completely irrelevant to long TC, and at long TC I cannot do much fitting, so I abandoned any fit and just use the defaults (v16).
zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

Re: Strange Lc0 TCEC performance

Post by zullil »

So in summary, it seems that Lc0 is experiencing the "growing pains" that all engines experience.

I also wonder if a lot of folks don't really understand how staggeringly strong a well-maintained, properly-configured, traditional engine can be on top-end hardware and long time controls.
chrisw
Posts: 4317
Joined: Tue Apr 03, 2012 4:28 pm

Re: Strange Lc0 TCEC performance

Post by chrisw »

zullil wrote: Wed Aug 15, 2018 2:24 pm So in summary, it seems that Lc0 is experiencing the "growing pains" that all engines experience.

I also wonder if a lot of folks don't really understand how staggeringly strong a well-maintained, properly-configured, traditional engine can be on top-end hardware and long time controls.
No, they (LC0) have it much, much worse in the growing pains department. And yes, the "well-maintained, properly-configured" part you mention makes an enormous difference.
chrisw
Posts: 4317
Joined: Tue Apr 03, 2012 4:28 pm

Re: Strange Lc0 TCEC performance

Post by chrisw »

Laskos wrote: Wed Aug 15, 2018 2:12 pm
jkiliani wrote: Wed Aug 15, 2018 1:11 pm
chrisw wrote: Wed Aug 15, 2018 12:58 pm Poor score indicates something is broken. Which is always likely with entered late changes.

High draw rate indicates higher than "normal" width to depth ratio. In general depth finds interesting lines and possible wins. Width defends against overlooking stuff.

More width and less depth = safe, defensive play, but dull. High draw rate. Simples.

Could be broken somewhere, could be late changes conspired to alter the search profile, could be that the scaling to more nodes tends to reflect in width rather than depth. Could be random, but the operating assumption has to be that there is a problem.
In the context of Leela, width vs depth is determined by PUCT. After a lot of tactical blunders in the past, we're now using higher PUCT values than before, but maybe this was an overcompensation? Also, optimal values for PUCT are likely quite dependent on typical search depth, unfortunately a CLOP run for low visit counts does not prove that the same value works well at higher visits. This type of optimisation will take a long time though, we may have to wait for TCEC 14 to see a well-optimised Leela.
Yes, trying to fix manually, perturbatively say CPUCT and FPU, gave me incontrolable variations with time control (or nodes or depth). Doing fitting at short TC is completely irrelevant to long TC, and at long TC I cannot do much fitting, so I abandoned any fit and just use defaults (v16).
Are you assaulting PUCT via the external parameters, or modifying source and re-compiling? Changing and testing the search profile with nodecount effectively would have to be the latter. Needs code writing.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Strange Lc0 TCEC performance

Post by Laskos »

chrisw wrote: Wed Aug 15, 2018 3:00 pm
Laskos wrote: Wed Aug 15, 2018 2:12 pm
jkiliani wrote: Wed Aug 15, 2018 1:11 pm
chrisw wrote: Wed Aug 15, 2018 12:58 pm Poor score indicates something is broken. Which is always likely with entered late changes.

High draw rate indicates higher than "normal" width to depth ratio. In general depth finds interesting lines and possible wins. Width defends against overlooking stuff.

More width and less depth = safe, defensive play, but dull. High draw rate. Simples.

Could be broken somewhere, could be late changes conspired to alter the search profile, could be that the scaling to more nodes tends to reflect in width rather than depth. Could be random, but the operating assumption has to be that there is a problem.
In the context of Leela, width vs depth is determined by PUCT. After a lot of tactical blunders in the past, we're now using higher PUCT values than before, but maybe this was an overcompensation? Also, optimal values for PUCT are likely quite dependent on typical search depth, unfortunately a CLOP run for low visit counts does not prove that the same value works well at higher visits. This type of optimisation will take a long time though, we may have to wait for TCEC 14 to see a well-optimised Leela.
Yes, trying to fix manually, perturbatively say CPUCT and FPU, gave me incontrolable variations with time control (or nodes or depth). Doing fitting at short TC is completely irrelevant to long TC, and at long TC I cannot do much fitting, so I abandoned any fit and just use defaults (v16).
Are you assaulting PUCT via the external parameters, or modifying source and re-compiling? Changing and testing the search profile with nodecount effectively would have to be the latter. Needs code writing.
No, just external parameters. Even those are hard to optimize, and they heavily depend on time control (nodes, depth).