New 11258 distilled networks released.

dkappe · Post by **dkappe** » Mon Feb 11, 2019 2:53 am

48x5 and 120x10 https://github.com/dkappe/leela-chess-w ... d-Networks

The smaller end may work well on Raspberry Pi, the higher end is intended for CPU.

Graham Banks · Post by **Graham Banks** » Mon Feb 11, 2019 3:00 am

dkappe wrote: ↑Mon Feb 11, 2019 2:53 am 48x5 and 120x10 https://github.com/dkappe/leela-chess-w ... d-Networks

The smaller end may work well on Raspberry Pi, the higher end is intended for CPU.

Thanks.

Eduard · Post by **Eduard** » Mon Feb 11, 2019 5:49 am

Thank you! On my slow PC, the 48x5 runs very well, and is not bad. Surprisingly good results in the analysis mode.

jp · Post by jp » Thu Feb 14, 2019 1:46 am

Eduard wrote: ↑Mon Feb 11, 2019 5:49 am Surprisingly good results in the analysis mode.

Can you say more about what are "surprisingly good results"?

Eduard · Post by **Eduard** » Thu Feb 14, 2019 2:50 am

jp wrote: ↑Thu Feb 14, 2019 1:46 am
Eduard wrote: ↑Mon Feb 11, 2019 5:49 am Surprisingly good results in the analysis mode.
Can you say more about what are "surprisingly good results"?

Here I have posted some analyses of other distilled networks:

forum3/viewtopic.php?f=2&t=69820

The 48x5 net is very good for its size. On my 2 x 2.4 GHz i3 it runs at 1-2 kN/s.

This gives an Lc0 Ratio of about 1 against Rybka 4.1 with 1 core. Of course this is not the same as with a big network. But if there is ever an Lc0 for Android, this little NN would be fantastic!

I am currently running a match on my PC against Rybka 4.1 x64 (1 core). The time control is 30+10. Rybka leads 3-2.

But Lc0 (48x5) is clearly better in the opening and middlegame! I see that the increment is too small for Lc0. After about 60 moves, Lc0 has to play the rest of the game on the increment alone.

We will see how it stands after 20 games. Then I will post some games and analyses.

jp · Post by jp » Sat Mar 30, 2019 10:07 pm

Eduard wrote: ↑Thu Feb 14, 2019 2:50 am Here I have posted some analyses of other distilled networks:

forum3/viewtopic.php?f=2&t=69820

Thanks. I'll take a look.

Spill_The_Tea · Post by **Spill_The_Tea** » Fri Apr 12, 2019 9:01 pm

dkappe wrote: ↑Mon Feb 11, 2019 2:53 am 48x5 and 120x10 https://github.com/dkappe/leela-chess-w ... d-Networks

The smaller end may work well on Raspberry Pi, the higher end is intended for CPU.

Can you define distilled networks?
Does this mean you are testing networks for best performance on CPU only (OpenBLAS backend)?

Raphexon · Post by **Raphexon** » Fri Apr 12, 2019 10:41 pm

Does anyone know how fast the 48x5 is with an RTX 2080ti or similar?

Also wth happened?

MM50 (Mephisto) port vs Lc0.
Both on an i5-4460, single thread
TC was 30+0.3 or 20+0.2 (I forgot)

[pgn]
[Event "?"]
[Date "2019.04.12"]
[Round "?"]
[White "MM50-UCI"]
[Black "lc0"]
[Result "1/2-1/2"]
[WhiteElo "?"]
[BlackElo "?"]
[Variant "Standard"]
[TimeControl "-"]
[ECO "B91"]
[Opening "Sicilian Defense: Najdorf Variation, Zagreb (Fianchetto) Variation"]
[Termination "Normal"]
[Annotator "lichess.org"]

1. e4 c5 2. Nf3 d6 3. d4 cxd4 4. Nxd4 Nf6 5. Nc3 a6 6. g3 { B91 Sicilian Defense: Najdorf Variation, Zagreb (Fianchetto) Variation } e5
7. Nde2 Be6 8. f4 Nbd7 9. f5 Bc4 10. b3 Bb5 11. Nxb5 axb5 12. Bg5 d5
13. exd5 h6 14. Be3 Bb4+ 15. c3 Bc5 16. Bg1 Bxg1 17. Nxg1 Qb6 18. Qd3 O-O
19. Qxb5 Qe3+ 20. Ne2 Nc5 21. Bg2 Nd3+ 22. Kd1 Ng4 23. Rf1 Nxh2 24. Rg1 Ng4
25. Bf1 e4 26. Kc2 Qf2 27. Qxb7 Rfb8 28. Qe7 Re8 29. Qd7 Nde5 30. Qc7 Rac8
31. Qd6 Nf3 32. Rh1 Rcd8 33. Qa6 Rxd5 34. Qc6 Red8 35. Kb2 e3 36. Qa6 Rd2+
37. Ka3 R8d5 38. Bh3 Nge5 39. Bf1 Ng4 40. Bh3 Ngh2 41. Nf4 Rc5 42. Nd3 Qxg3
43. Nxc5 e2 44. Ne4 Qf4 45. Qc8+ Kh7 46. Nxd2 Qxd2 47. Qc7 Kg8 48. Qb8+ Kh7
49. Qe8 h5 50. Rhc1 Ng4 51. Bxg4 e1=Q 52. Rxe1 hxg4 53. Rh1+ Nh2 54. Qb8 g3
55. Qxg3 Qd6+ 56. Qxd6 f6 57. Rxh2+ Kg8 58. Rb1 Kf7 59. Ra1 Ke8 60. Rb1 Kf7
61. Ra1 Kg8 62. Rb1 Kf7 { The game is a draw. } 1/2-1/2
[/pgn]

Raphexon · Post by **Raphexon** » Fri Apr 12, 2019 11:39 pm

Net I used was 16x2 btw.

Ferdy · Post by **Ferdy** » Sat Apr 13, 2019 6:11 pm

dkappe wrote: ↑Mon Feb 11, 2019 2:53 am 48x5 and 120x10 https://github.com/dkappe/leela-chess-w ... d-Networks

The smaller end may work well on Raspberry Pi, the higher end is intended for CPU.

Thanks for creating these nets. I am trying to find the best net for my old CPU (i7-2600K, 3.4 GHz), so I will be using a BLAS backend.

I found 64x6 to be the best so far. Rating list refs are anchored on engines with CCRL 40/4 ratings. Will be adding some tests from time to time.

TC 3'+2s

   # PLAYER                             :  RATING  POINTS  PLAYED   (%)
   1 GreKo 2018.08 2720                 :  2720.0    11.0      20    55
   2 Lc0 v0.21.1 w11258-64x6-se blas    :  2695.6    40.5      70    58
   3 Wyldchess 1.51 2630                :  2630.0     6.0      20    30
   4 Glass 2.0 2610                     :  2610.0     5.5      10    55
   5 Floyd 0.9 2580                     :  2580.0     7.0      20    35

White advantage = -15.79
Draw rate (equal opponents) = 22.15 %
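
As a rough sanity check on the anchored rating above, the standard logistic Elo expectation can be applied to each anchor and summed over the games played. This is only a sketch: the per-opponent game counts are inferred from the PLAYED column on the assumption that each anchor played only Lc0, and the rating tool's own model (which also fits white advantage and draw rate) may differ from this plain formula. The names `expected_score` and `ANCHORS` are illustrative.

```python
# Ratings and game counts are read from the rating table above.
# Assumption: every anchor engine played only against Lc0.
LC0_RATING = 2695.6
ANCHORS = [
    ("GreKo 2018.08", 2720.0, 20),
    ("Wyldchess 1.51", 2630.0, 20),
    ("Glass 2.0", 2610.0, 10),
    ("Floyd 0.9", 2580.0, 20),
]


def expected_score(rating, opponent_rating):
    """Standard logistic Elo expectation for one game (a draw counts 0.5)."""
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - rating) / 400.0))


expected_points = 0.0
for name, anchor_rating, games in ANCHORS:
    per_game = expected_score(LC0_RATING, anchor_rating)
    expected_points += per_game * games
    print(f"{name:<16} expected Lc0 score per game: {per_game:.3f}")

print(f"Expected Lc0 points over 70 games: {expected_points:.1f} (actual: 40.5)")
```

Under these assumptions the expected total lands close to the 40.5 points actually scored, which is what one would hope for from a rating fitted to these results.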

Saw your 11258 tests at https://github.com/dkappe/leela-chess-w ... tournament
64x6 is not really far behind.

I ran a multiple linear regression on that data to approximate the rating, with filters and blocks as the independent variables.

Data table source:
https://github.com/dkappe/leela-chess-w ... tournament

    Filters  Blocks  Rating
0       112      12    2988
1       120       9    2981
2       104       9    2977
3        80       7    2977
4       112      10    2966
5        64       6    2966
6       128       9    2958
7       120      10    2946
8        96       8    2945
9       128      10    2937
10       48       5    2909
11       32       4    2787
12       24       3    2703
13       16       2    2408

Same data sorted by Filters and Blocks in descending order.

    Filters  Blocks  Rating
9       128      10    2937
6       128       9    2958
7       120      10    2946
1       120       9    2981
0       112      12    2988
4       112      10    2966
2       104       9    2977
8        96       8    2945
3        80       7    2977
5        64       6    2966
10       48       5    2909
11       32       4    2787
12       24       3    2703
13       16       2    2408

[Plot: Rating vs Filters]

[Plot: Rating vs Blocks]

Then I used sklearn linear regression to get the intercept and coefficients.

sklearn multiple linear regression results:
Intercept: 2584.2790502045655
Coefficients: [ 0.69001166 33.18384124]

It comes up with the formula

Rating = 2584 + (Filters x 0.69) + (Blocks x 33.18)
Independent variables: Filters and Blocks
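
As a cross-check, the same ordinary least-squares fit can be reproduced without sklearn by solving the normal equations directly. This is a minimal sketch, not the method used for the numbers above: `solve_linear_system` and `fit_ols` are hypothetical helper names, and the data lists are copied from the table earlier in this post.

```python
# Data copied from the 11258 focused tournament table above.
FILTERS = [112, 120, 104, 80, 112, 64, 128, 120, 96, 128, 48, 32, 24, 16]
BLOCKS = [12, 9, 9, 7, 10, 6, 9, 10, 8, 10, 5, 4, 3, 2]
RATING = [2988, 2981, 2977, 2977, 2966, 2966, 2958, 2946, 2945, 2937,
          2909, 2787, 2703, 2408]


def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(filters, blocks, ratings):
    """Return [intercept, filters coefficient, blocks coefficient]."""
    rows = [(1.0, f, b) for f, b in zip(filters, blocks)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ratings)) for i in range(3)]
    return solve_linear_system(xtx, xty)


intercept, coef_filters, coef_blocks = fit_ols(FILTERS, BLOCKS, RATING)
print(f"Rating = {intercept:.0f} + (Filters x {coef_filters:.2f})"
      f" + (Blocks x {coef_blocks:.2f})")
```

The normal equations are fine for a table this small; for larger or more strongly correlated data a library least-squares solver would be the safer choice.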

Sample prediction calculation.

Sample prediction #1:
Filters = 120
Blocks = 8
Rating prediction: 2933

Sample prediction #2:
Filters = 64
Blocks = 8
Rating prediction: 2894

Not perfect, but perhaps prediction accuracy will increase as more data becomes available.
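
To put a number on "not perfect", the fitted formula can be evaluated against the ratings it was fitted on. The sketch below uses the intercept and coefficients exactly as printed in the sklearn output above and the rows of the data table; `predict_rating` and `NETS` are illustrative names, and this is an in-sample check only, not a test of how well the formula predicts new nets.

```python
# Intercept and coefficients as printed in the sklearn results above.
INTERCEPT = 2584.2790502045655
COEF_FILTERS = 0.69001166
COEF_BLOCKS = 33.18384124

# (filters, blocks, measured rating) rows from the data table above.
NETS = [
    (112, 12, 2988), (120, 9, 2981), (104, 9, 2977), (80, 7, 2977),
    (112, 10, 2966), (64, 6, 2966), (128, 9, 2958), (120, 10, 2946),
    (96, 8, 2945), (128, 10, 2937), (48, 5, 2909), (32, 4, 2787),
    (24, 3, 2703), (16, 2, 2408),
]


def predict_rating(filters, blocks):
    """Apply the fitted linear formula to one network size."""
    return INTERCEPT + COEF_FILTERS * filters + COEF_BLOCKS * blocks


# Residual = measured rating minus predicted rating.
residuals = [(f, b, r - predict_rating(f, b)) for f, b, r in NETS]
for f, b, res in residuals:
    print(f"{f:>3}x{b:<2} residual: {res:+7.1f}")

worst = max(residuals, key=lambda row: abs(row[2]))
rmse = (sum(res ** 2 for _, _, res in residuals) / len(residuals)) ** 0.5
print(f"Largest miss: {worst[0]}x{worst[1]} ({worst[2]:+.1f} Elo)")
print(f"In-sample RMSE: {rmse:.1f} Elo")
```

The smallest net, 16x2, comes out as the largest outlier, which hints that rating is not linear in filters and blocks at the small end of the range.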

Python source:

# -*- coding: utf-8 -*-
"""
mlr.py

Lc0 Distilled Networks Multiple Linear Regression

Credits:
    https://datatofish.com/multiple-linear-regression-python/
    https://github.com/dkappe/leela-chess-weights/wiki/Distilled-Networks
    
"""


from pandas import DataFrame
from sklearn import linear_model
import matplotlib.pyplot as plt


Lc0DistilledData = {
        'Filters': [112,120,104,80,112,64,128,120,96,128,48,32,24,16],
        'Blocks': [12,9,9,7,10,6,9,10,8,10,5,4,3,2],
        'Rating': [2988,2981,2977,2977,2966,2966,2958,2946,2945,2937,
                   2909,2787,2703,2408]
        }
 

def main():
    
    df = DataFrame(Lc0DistilledData, columns=['Filters','Blocks','Rating'])
    
    print('Data table source:')
    print('https://github.com/dkappe/leela-chess-weights/wiki/Distilled-Networks#11258-focused-tournament')
    print(df)
    
    print('\nSorted by Filters and Blocks descending:')
    # Sort on both keys in one call; chaining two separate sorts does not
    # reliably keep the Blocks order within equal Filters.
    sorted_df = df.sort_values(by=['Filters', 'Blocks'], ascending=False)
    print(sorted_df)
     
    plt.figure(num=None, figsize=(6, 3), dpi=80)
    plt.scatter(df['Filters'], df['Rating'], color='red')
    plt.title('Rating Vs Filters', fontsize=14)
    plt.xlabel('Filters', fontsize=14)
    plt.ylabel('Rating', fontsize=14)
    plt.grid(True)
    plt.show()
     
    plt.figure(num=None, figsize=(6, 3), dpi=80)
    plt.scatter(df['Blocks'], df['Rating'], color='green')
    plt.title('Rating Vs Blocks', fontsize=14)
    plt.xlabel('Blocks', fontsize=14)
    plt.ylabel('Rating', fontsize=14)
    plt.grid(True) 
    plt.show()
    
    # Multiple regression: Filters and Blocks as features, Rating as target
    X = df[['Filters','Blocks']]
    Y = df['Rating']
    
    # Using sklearn multiple linear regression
    regr = linear_model.LinearRegression()
    regr.fit(X, Y)
    
    print('sklearn multiple linear regression results:')    
    print('Intercept: {}'.format(regr.intercept_))
    print('Coefficients: {}'.format(regr.coef_))
    
    print('\nFormula:')
    print('Rating = {:0.0f} + (Filters x {:0.2f}) + (Blocks x {:0.2f})'.\
          format(regr.intercept_, regr.coef_[0], regr.coef_[1]))
    print('Independent variables: {} and {}'.format('Filters', 'Blocks'))
    
    # Prediction samples: (filters, blocks) pairs to evaluate
    samples = [(120, 8), (64, 8)]
    for i, (new_filters, new_blocks) in enumerate(samples, start=1):
        # Query with the same column names used for fitting
        query = DataFrame([[new_filters, new_blocks]],
                          columns=['Filters', 'Blocks'])
        print('\nSample prediction #{}:'.format(i))
        print('Filters = {}'.format(new_filters))
        print('Blocks = {}'.format(new_blocks))
        # predict() returns an array; take its single element
        print('Rating prediction: {0:0.0f}'.format(regr.predict(query)[0]))
    

if __name__ == '__main__':
    main()
