New 11258 distilled networks released.

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Harvey Williamson, bob

dkappe
Posts: 130
Joined: Tue Aug 21, 2018 5:52 pm
Full name: Dietrich Kappe

New 11258 distilled networks released.

Post by dkappe » Mon Feb 11, 2019 1:53 am

48x5 and 120x10 https://github.com/dkappe/leela-chess-w ... d-Networks

The smaller end may work well on Raspberry Pi, the higher end is intended for CPU.
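
For anyone who wants to try one of these nets from a script rather than a GUI, here is a minimal sketch using python-chess, assuming lc0 is installed and on the PATH; the weights filename below is only a placeholder for whichever distilled net you downloaded, and "WeightsFile" and "Backend" are the usual lc0 UCI options.

Code: Select all

# Minimal python-chess sketch: run a distilled net on the CPU (BLAS) backend.
# Assumptions: lc0 is on PATH, python-chess is installed, and
# "48x5-distilled.pb.gz" is a placeholder for the downloaded weights file.
import chess
import chess.engine

engine = chess.engine.SimpleEngine.popen_uci("lc0")
engine.configure({
    "WeightsFile": "48x5-distilled.pb.gz",  # placeholder path
    "Backend": "blas",                      # CPU backend
})

board = chess.Board()
info = engine.analyse(board, chess.engine.Limit(nodes=10000))
print("score:", info["score"], "nodes:", info.get("nodes"))
engine.quit()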

User avatar
Graham Banks
Posts: 32222
Joined: Sun Feb 26, 2006 9:52 am
Location: Auckland, NZ

Re: New 11258 distilled networks released.

Post by Graham Banks » Mon Feb 11, 2019 2:00 am

dkappe wrote:
Mon Feb 11, 2019 1:53 am
48x5 and 120x10 https://github.com/dkappe/leela-chess-w ... d-Networks

The smaller end may work well on Raspberry Pi, the higher end is intended for CPU.
Thanks.
My email addresses:
gbanksnz at gmail.com
gbanksnz at yahoo.co.nz

User avatar
Eduard
Posts: 161
Joined: Fri Oct 26, 2018 10:58 pm
Location: Germany
Full name: Eduard Nemeth
Contact:

Re: New 11258 distilled networks released.

Post by Eduard » Mon Feb 11, 2019 4:49 am

Thank you! On my slow PC the 48x5 runs very well, and its play is not bad. Surprisingly good results in the analysis mode.

jp
Posts: 393
Joined: Mon Apr 23, 2018 5:54 am

Re: New 11258 distilled networks released.

Post by jp » Thu Feb 14, 2019 12:46 am

Eduard wrote:
Mon Feb 11, 2019 4:49 am
Surprisingly good results in the analysis mode.
Can you say more about what these "surprisingly good results" are?

User avatar
Eduard
Posts: 161
Joined: Fri Oct 26, 2018 10:58 pm
Location: Germany
Full name: Eduard Nemeth
Contact:

Re: New 11258 distilled networks released.

Post by Eduard » Thu Feb 14, 2019 1:50 am

jp wrote:
Thu Feb 14, 2019 12:46 am
Eduard wrote:
Mon Feb 11, 2019 4:49 am
Surprisingly good results in the analysis mode.
Can you say more about what these "surprisingly good results" are?
Here I have posted some analyses of other distilled networks:

forum3/viewtopic.php?f=2&t=69820

The 48x5 net is very good for its size. On my 2x2.4 GHz i3 it reaches 1-2 kn/s.

This gives an Lc0 ratio of about 1 against Rybka 4.1 with 1 core. Of course this is not the same as with a big network, but if there were ever an Lc0 for Android, this little NN would be fantastic!

A match against Rybka 4.1 x64 (1 core) is currently running on my PC. Time control is 30+10. Rybka leads 3-2.

But Lc0 (48x5) is clearly better in the opening and middlegame! I see that the increment is too little for Lc0: it uses up its main time after about 60 moves and then has to play the rest on the increment alone. :|

We will see how things stand after 20 games. Then I will post some games and analyses.

jp
Posts: 393
Joined: Mon Apr 23, 2018 5:54 am

Re: New 11258 distilled networks released.

Post by jp » Sat Mar 30, 2019 9:07 pm

Eduard wrote:
Thu Feb 14, 2019 1:50 am
Here I have posted some analyses of other distilled networks:

forum3/viewtopic.php?f=2&t=69820
Thanks. I'll take a look.

Spill_The_Tea
Posts: 10
Joined: Mon Dec 17, 2018 2:33 am
Full name: Jase de Lace

Re: New 11258 distilled networks released.

Post by Spill_The_Tea » Fri Apr 12, 2019 7:01 pm

dkappe wrote:
Mon Feb 11, 2019 1:53 am
48x5 and 120x10 https://github.com/dkappe/leela-chess-w ... d-Networks

The smaller end may work well on Raspberry Pi, the higher end is intended for CPU.
Can you define "distilled networks"?
Does this mean you are testing networks for best performance on CPU only (OpenBLAS backend)?

Raphexon
Posts: 32
Joined: Sun Mar 17, 2019 11:00 am
Full name: Henk Drost

Re: New 11258 distilled networks released.

Post by Raphexon » Fri Apr 12, 2019 8:41 pm

Does anyone know how fast the 48x5 is with an RTX 2080ti or similar?

Also wth happened?

MM50 (Mephisto) port vs Lc0.
Both on an i5-4460, single thread.
TC was 30+0.3 or 20+0.2 (I forgot).


Raphexon
Posts: 32
Joined: Sun Mar 17, 2019 11:00 am
Full name: Henk Drost

Re: New 11258 distilled networks released.

Post by Raphexon » Fri Apr 12, 2019 9:39 pm

The net I used was 16x2, btw. :oops:

Ferdy
Posts: 3790
Joined: Sun Aug 10, 2008 1:15 pm
Location: Philippines

Re: New 11258 distilled networks released.

Post by Ferdy » Sat Apr 13, 2019 4:11 pm

dkappe wrote:
Mon Feb 11, 2019 1:53 am
48x5 and 120x10 https://github.com/dkappe/leela-chess-w ... d-Networks

The smaller end may work well on Raspberry Pi, the higher end is intended for CPU.
Thanks for creating these nets. I am trying to find the best net for my old CPU (i7-2600K 3.4 GHz), so I will be using a BLAS backend.

I found 64x6 to be the best so far. The rating list references are anchored on engines with CCRL 40/4 ratings. I will be adding more tests from time to time.

Code: Select all

TC 3'+2s

   # PLAYER                             :  RATING  POINTS  PLAYED   (%)
   1 GreKo 2018.08 2720                 :  2720.0    11.0      20    55
   2 Lc0 v0.21.1 w11258-64x6-se blas    :  2695.6    40.5      70    58
   3 Wyldchess 1.51 2630                :  2630.0     6.0      20    30
   4 Glass 2.0 2610                     :  2610.0     5.5      10    55
   5 Floyd 0.9 2580                     :  2580.0     7.0      20    35

White advantage = -15.79
Draw rate (equal opponents) = 22.15 %
I saw your 11258 tests at https://github.com/dkappe/leela-chess-w ... tournament
64x6 is not really far behind.


I ran a multiple linear regression on that data to see whether the rating can be approximated from filters and blocks as independent variables.

Data table source:
https://github.com/dkappe/leela-chess-w ... tournament

Code: Select all

    Filters  Blocks  Rating
0       112      12    2988
1       120       9    2981
2       104       9    2977
3        80       7    2977
4       112      10    2966
5        64       6    2966
6       128       9    2958
7       120      10    2946
8        96       8    2945
9       128      10    2937
10       48       5    2909
11       32       4    2787
12       24       3    2703
13       16       2    2408

Same data sorted by Filters and Blocks in descending orders.

Code: Select all

    Filters  Blocks  Rating
9       128      10    2937
6       128       9    2958
7       120      10    2946
1       120       9    2981
0       112      12    2988
4       112      10    2966
2       104       9    2977
8        96       8    2945
3        80       7    2977
5        64       6    2966
10       48       5    2909
11       32       4    2787
12       24       3    2703
13       16       2    2408

Plot of Rating vs Filters: [image]

Plot of Rating vs Blocks: [image]

Then I use sklearn linear regression to get the intercept and coefficients.

Code: Select all

sklearn multiple linear regression results:
Intercept: 2584.2790502045655
Coefficients: [ 0.69001166 33.18384124]
It comes up with the formula

Code: Select all

Rating = 2584 + (Filters x 0.69) + (Blocks x 33.18)
Independent variables: Filters and Blocks

Sample prediction calculation.

Code: Select all

Sample prediction #1:
Filters = 120
Blocks = 8
Rating prediction: 2933

Sample prediction #2:
Filters = 64
Blocks = 8
Rating prediction: 2894
Not perfect, but perhaps prediction accuracy will increase as more data becomes available.
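
For reference, the two sample predictions can be reproduced by hand from the rounded formula; a quick check using the intercept and coefficients printed above:

Code: Select all

# Hand-check of the rounded regression formula against the sample predictions.
intercept = 2584.28      # regr.intercept_ (rounded)
coef_filters = 0.69      # regr.coef_[0] (rounded)
coef_blocks = 33.18      # regr.coef_[1] (rounded)

def predict(filters, blocks):
    return intercept + coef_filters * filters + coef_blocks * blocks

print(round(predict(120, 8)))  # 2933, matches sample prediction #1
print(round(predict(64, 8)))   # 2894, matches sample prediction #2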

Python source:

Code: Select all

# -*- coding: utf-8 -*-
"""
mlr.py

Lc0 Distilled Networks Multiple Linear Regression

Credits:
    https://datatofish.com/multiple-linear-regression-python/
    https://github.com/dkappe/leela-chess-weights/wiki/Distilled-Networks
    
"""


from pandas import DataFrame
from sklearn import linear_model
import matplotlib.pyplot as plt


Lc0DistilledData = {
        'Filters': [112,120,104,80,112,64,128,120,96,128,48,32,24,16],
        'Blocks': [12,9,9,7,10,6,9,10,8,10,5,4,3,2],
        'Rating': [2988,2981,2977,2977,2966,2966,2958,2946,2945,2937,
                   2909,2787,2703,2408]
        }
 

def main():
    
    df = DataFrame(Lc0DistilledData, columns=['Filters','Blocks','Rating'])
    
    print('Data table source:')
    print('https://github.com/dkappe/leela-chess-weights/wiki/Distilled-Networks#11258-focused-tournament')
    print(df)
    
    print('\nSorted by Filters and Blocks descending:')
    # Sort by both columns at once so ties in Filters keep Blocks in descending order.
    sorted_df = df.sort_values(by=['Filters', 'Blocks'], ascending=False)
    print(sorted_df)
     
    plt.figure(num=None, figsize=(6, 3), dpi=80)
    plt.scatter(df['Filters'], df['Rating'], color='red')
    plt.title('Rating Vs Filters', fontsize=14)
    plt.xlabel('Filters', fontsize=14)
    plt.ylabel('Rating', fontsize=14)
    plt.grid(True)
    plt.show()
     
    plt.figure(num=None, figsize=(6, 3), dpi=80)
    plt.scatter(df['Blocks'], df['Rating'], color='green')
    plt.title('Rating Vs Blocks', fontsize=14)
    plt.xlabel('Blocks', fontsize=14)
    plt.ylabel('Rating', fontsize=14)
    plt.grid(True) 
    plt.show()
    
    # Mul reg
    X = df[['Filters','Blocks']]
    Y = df['Rating']
    
    # Using sklearn multiple linear regression
    regr = linear_model.LinearRegression()
    regr.fit(X, Y)
    
    print('sklearn multiple linear regression results:')    
    print('Intercept: {}'.format(regr.intercept_))
    print('Coefficients: {}'.format(regr.coef_))
    
    print('\nFormula:')
    print('Rating = {:0.0f} + (Filters x {:0.2f}) + (Blocks x {:0.2f})'.\
          format(regr.intercept_, regr.coef_[0], regr.coef_[1]))
    print('Independent variables: {} and {}'.format('Filters', 'Blocks'))
    
    # Prediction samples
    print('\nSample prediction #1:')
    new_filters = 120
    new_blocks = 8
    print('Filters = {}'.format(new_filters))
    print('Blocks = {}'.format(new_blocks))
    print('Rating prediction: {0:0.0f}'.\
          format(regr.predict([[new_filters, new_blocks]])[0]))
    
    print('\nSample prediction #2:')
    new_filters = 64
    new_blocks = 8
    print('Filters = {}'.format(new_filters))
    print('Blocks = {}'.format(new_blocks))
    print('Rating prediction: {0:0.0f}'.\
          format(regr.predict([[new_filters, new_blocks]])[0]))
    

if __name__ == '__main__':
    main()
 
