LeelaKnightOdds: BT4-6147500 vs 816730


Marcus91
Posts: 30
Joined: Sun Jul 10, 2022 11:00 am
Full name: Marco Giorgio

Post by Marcus91 »

I just want to share some results on how well these networks do at knight odds with different contempt settings.

Openings: The starting positions are the 100 best knight-odds lines taken from
https://github.com/ChrisWhittington/Che ... ind/master

Opponent: Komodo 14.1, Skill: 24, Variety: 10. It is probably rated about 2800 FIDE rapid; since humans generally do better in odds games, I think it is comparable to a human in the 2500-2600 FIDE rapid range. It is a bit stronger than Leela at 1 node in a knight-odds match.

Conditions: 1kn (1,000 nodes) per move for Leela. You might think that is too few nodes; however, I did not see a gain of more than 50 Elo with 10k nodes per move over a few hundred games, and no improvement over a few dozen games with 100k nodes per move.
You might also think that fixed nodes per move is not a good way to measure an engine's strength; however, there is no evidence that a usual TC improves the result by more than a few Elo points, and I think reproducibility across different hardware matters more in this kind of match.
Finally, you might think that equal nodes per move favours BT4, since it is about 2 times slower (on my hardware). This is true, but this is not a direct comparison between the two nets; the aim is just to see what happens at different contempt settings. In any case, doubling the nodes for 816730 should not give more than 20 Elo.

Configuration: both engines use the same configuration as LeelaKnightOdds on Lichess, which can be found on the lc0 blog:
# Knight odd configuration (from leela knightodd lichess, lc0 blog, net: 816730)
# Temperature: 0.8
# TempValueCutoff: 0.4
# TempDecayMoves: 4
# MovesLeftThreshold: 0.95
# DrawScore: 0
# Contempt: 450
# ContemptMaxValue: 2000
# WDLCalibrationElo: 3300
# WDLMaxReasonableS: 2.5
# WDLDrawRateReference: 0.65
# WDLEvalObjectivity: 0.0

The only difference for BT4 is WDLCalibrationElo, which is set to 3200.
The contempt used with 816730 was 450, but with BT4 it is now 200, as mentioned in the blog when the match vs GM Navara was scheduled.
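
For anyone who wants to reproduce the setup, here is a minimal sketch that turns the settings above into standard UCI `setoption` commands. It assumes the names in the list are accepted verbatim as UCI option names by your lc0 build, which I have not checked for every version; `BASE_OPTIONS`, `bt4_options` and `setoption_lines` are just names made up for this example.

```python
# Settings copied from the list above (net 816730).
# Assumption: these names match the UCI option names of the lc0 build in use.
BASE_OPTIONS = {
    "Temperature": 0.8,
    "TempValueCutoff": 0.4,
    "TempDecayMoves": 4,
    "MovesLeftThreshold": 0.95,
    "DrawScore": 0,
    "Contempt": 450,
    "ContemptMaxValue": 2000,
    "WDLCalibrationElo": 3300,
    "WDLMaxReasonableS": 2.5,
    "WDLDrawRateReference": 0.65,
    "WDLEvalObjectivity": 0.0,
}

# BT4 variant as described in the post: calibration Elo 3200, contempt 200.
bt4_options = {**BASE_OPTIONS, "WDLCalibrationElo": 3200, "Contempt": 200}


def setoption_lines(options):
    """Format a dict of options as UCI 'setoption' commands, one per entry."""
    return [f"setoption name {name} value {value}" for name, value in options.items()]


if __name__ == "__main__":
    for line in setoption_lines(bt4_options):
        print(line)
```

To test another contempt value, only the `Contempt` entry needs to change before the lines are sent to the engine.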

*Note: 816730 was tested on v0.30.0 and BT4 on v0.31.0-rc3.
# vs komodo 14.1 lv24 variety: 10 [W, D, L]

# BT4 C: 250 1kn: [39, 7, 54]
# BT4 C: 200 1kn: [35, 15, 50]
# 816730 C: 450 10kn: [30, 12, 58]
# BT4 C: 150 1kn: [28, 14, 58]
# 816730 C: 400 1kn: [31, 7, 62]
# 816730 C: 300 1kn: [29, 9, 62]
# 816730 C: 350 1kn: [28, 5, 67]
# 816730 C: 450 1kn: [25, 10, 65]
# BT4 C: 100 1kn: [23, 15, 62]
# BT4 C: 300 1kn: [25, 7, 68]
# 816730 C: 250 1kn: [24, 6, 70]
# 816730 C: 500 1kn: [24, 4, 72]
# 816730 C: 200 1kn: [23, 5, 72]
# 816730 C: 100 nodes: [19, 10, 71]
# 816730 C: 0 1kn: [21, 5, 74]
# BT4 C: 50 1kn: [18, 10, 72]
# 816730 C: 100 1kn: [17, 7, 76]
# BT4 C: 0 1kn: [14, 12, 74]
# 816730 C: 150 1kn: [13, 7, 80]
# 816730 C: 50 1kn: [13, 5, 82]
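
For reference, a [W, D, L] triple can be turned into an Elo performance with an error bar roughly as follows. This is only a sketch under my own assumptions: the triples are read from Leela's side, the interval is a 95% normal approximation on the mean score per game, and the function name `elo_performance` is made up for the example. It is not necessarily how any particular plot was produced.

```python
import math


def elo_performance(wins, draws, losses, z=1.96):
    """Return (elo, low, high) for a W/D/L result.

    Uses the logistic Elo formula on the mean score and a normal
    approximation (z = 1.96 for about 95%) for the error bar.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n

    # Per-game variance of the score (outcomes are 1, 0.5 or 0).
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / n
    stderr = math.sqrt(variance / n)

    def to_elo(s):
        # Clamp to avoid log of zero or division by zero at 0% or 100%.
        s = min(max(s, 1e-9), 1.0 - 1e-9)
        return -400.0 * math.log10(1.0 / s - 1.0)

    return to_elo(score), to_elo(score - z * stderr), to_elo(score + z * stderr)


if __name__ == "__main__":
    # A few rows taken from the table above.
    results = {
        "BT4 C: 250 1kn": (39, 7, 54),
        "BT4 C: 200 1kn": (35, 15, 50),
        "816730 C: 450 1kn": (25, 10, 65),
    }
    for label, wdl in results.items():
        elo, low, high = elo_performance(*wdl)
        print(f"{label}: {elo:+.0f} Elo (95% interval {low:+.0f} to {high:+.0f})")
```

With only 100 games per setting the intervals come out wide, so neighbouring rows of the table overlap heavily.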

Elo performance and error bars: (plot not preserved)

Final thoughts: Although the error bars are quite wide with only 100 games per setting, it seems fairly clear that the contempt used by the Leela team is ideal for playing under these conditions (after all, they're not amateurs!). Furthermore, it appears that the BT4 network generally performs better at knight odds.