48x5 and 120x10 https://github.com/dkappe/leela-chess-w ... d-Networks
The smaller end may work well on Raspberry Pi, the higher end is intended for CPU.
New 11258 distilled networks released.
-
- Posts: 1631
- Joined: Tue Aug 21, 2018 7:52 pm
- Full name: Dietrich Kappe
New 11258 distilled networks released.
Fat Titz by Stockfish, the engine with the bodaciously big net. Remember: size matters. If you want to learn more about this engine just google for "Fat Titz".
-
- Posts: 41461
- Joined: Sun Feb 26, 2006 10:52 am
- Location: Auckland, NZ
Re: New 11258 distilled networks released.
Thanks.
dkappe wrote: ↑Mon Feb 11, 2019 2:53 am
48x5 and 120x10 https://github.com/dkappe/leela-chess-w ... d-Networks
The smaller end may work well on Raspberry Pi, the higher end is intended for CPU.
gbanksnz at gmail.com
-
- Posts: 1439
- Joined: Sat Oct 27, 2018 12:58 am
- Location: Germany
- Full name: N.N.
Re: New 11258 distilled networks released.
Thank you! On my slow PC the 48x5 net runs very well and plays decently. It gives surprisingly good results in analysis mode.
-
- Posts: 1439
- Joined: Sat Oct 27, 2018 12:58 am
- Location: Germany
- Full name: N.N.
Re: New 11258 distilled networks released.
Here I have posted some analyses of other distilled networks:
forum3/viewtopic.php?f=2&t=69820
The 48x5 net is very good for its size. On my 2x2.4 GHz i3 it reaches 1-2 kN/s.
This gives an Lc0 ratio of about 1 against Rybka 4.1 on 1 core. Of course this is not the same as with a big network, but if there is ever an Lc0 for Android, this little NN would be fantastic!
I am currently running a match on my PC against Rybka 4.1 x64 (1 core). The time control is 30+10. Rybka leads 3-2.
But Lc0 (48x5) is clearly better in the opening and middlegame! I see that the bonus time (increment) is too small for Lc0: after about 60 moves it has to play the rest of the game on the bonus time alone.
We will see how it stands after 20 games. Then I will post some games and analyses.
-
- Posts: 1470
- Joined: Mon Apr 23, 2018 7:54 am
Re: New 11258 distilled networks released.
Thanks. I'll take a look.
Eduard wrote: ↑Thu Feb 14, 2019 2:50 am
Here I have posted some analyses of other distilled networks:
forum3/viewtopic.php?f=2&t=69820
-
- Posts: 24
- Joined: Mon Dec 17, 2018 3:33 am
- Full name: Jase de Lace
Re: New 11258 distilled networks released.
Can you define distilled networks?
dkappe wrote: ↑Mon Feb 11, 2019 2:53 am
48x5 and 120x10 https://github.com/dkappe/leela-chess-w ... d-Networks
The smaller end may work well on Raspberry Pi, the higher end is intended for CPU.
Does this mean you are testing networks for best performance on CPU only (OpenBLAS backend)?
-
- Posts: 476
- Joined: Sun Mar 17, 2019 12:00 pm
- Full name: Henk Drost
Re: New 11258 distilled networks released.
Does anyone know how fast the 48x5 is with an RTX 2080ti or similar?
Also wth happened?
MM50 (Mephisto) port vs Lc0.
Both on an i5-4460, single thread.
TC was 30+0.3 or 20+0.2 (I forgot)
[pgn]
[Event "?"]
[Date "2019.04.12"]
[Round "?"]
[White "MM50-UCI"]
[Black "lc0"]
[Result "1/2-1/2"]
[WhiteElo "?"]
[BlackElo "?"]
[Variant "Standard"]
[TimeControl "-"]
[ECO "B91"]
[Opening "Sicilian Defense: Najdorf Variation, Zagreb (Fianchetto) Variation"]
[Termination "Normal"]
[Annotator "lichess.org"]

1. e4 c5 2. Nf3 d6 3. d4 cxd4 4. Nxd4 Nf6 5. Nc3 a6 6. g3 { B91 Sicilian Defense: Najdorf Variation, Zagreb (Fianchetto) Variation } e5 7. Nde2 Be6 8. f4 Nbd7 9. f5 Bc4 10. b3 Bb5 11. Nxb5 axb5 12. Bg5 d5 13. exd5 h6 14. Be3 Bb4+ 15. c3 Bc5 16. Bg1 Bxg1 17. Nxg1 Qb6 18. Qd3 O-O 19. Qxb5 Qe3+ 20. Ne2 Nc5 21. Bg2 Nd3+ 22. Kd1 Ng4 23. Rf1 Nxh2 24. Rg1 Ng4 25. Bf1 e4 26. Kc2 Qf2 27. Qxb7 Rfb8 28. Qe7 Re8 29. Qd7 Nde5 30. Qc7 Rac8 31. Qd6 Nf3 32. Rh1 Rcd8 33. Qa6 Rxd5 34. Qc6 Red8 35. Kb2 e3 36. Qa6 Rd2+ 37. Ka3 R8d5 38. Bh3 Nge5 39. Bf1 Ng4 40. Bh3 Ngh2 41. Nf4 Rc5 42. Nd3 Qxg3 43. Nxc5 e2 44. Ne4 Qf4 45. Qc8+ Kh7 46. Nxd2 Qxd2 47. Qc7 Kg8 48. Qb8+ Kh7 49. Qe8 h5 50. Rhc1 Ng4 51. Bxg4 e1=Q 52. Rxe1 hxg4 53. Rh1+ Nh2 54. Qb8 g3 55. Qxg3 Qd6+ 56. Qxd6 f6 57. Rxh2+ Kg8 58. Rb1 Kf7 59. Ra1 Ke8 60. Rb1 Kf7 61. Ra1 Kg8 62. Rb1 Kf7 { The game is a draw. } 1/2-1/2
[/pgn]
-
- Posts: 476
- Joined: Sun Mar 17, 2019 12:00 pm
- Full name: Henk Drost
Re: New 11258 distilled networks released.
Net I used was 16x2 btw.
-
- Posts: 4833
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: New 11258 distilled networks released.
Thanks for creating these nets. I am trying to find the best net for my old CPU (i7-2600K, 3.4 GHz), so I will be using the blas backend.
dkappe wrote: ↑Mon Feb 11, 2019 2:53 am
48x5 and 120x10 https://github.com/dkappe/leela-chess-w ... d-Networks
The smaller end may work well on Raspberry Pi, the higher end is intended for CPU.
I found 64x6 to be the best so far. The reference engines in the rating list are anchored at their CCRL 40/4 ratings. I will add more tests from time to time.
Code:
TC 3'+2s

   # PLAYER                             :  RATING  POINTS  PLAYED  (%)
   1 GreKo 2018.08 2720                 :  2720.0    11.0      20   55
   2 Lc0 v0.21.1 w11258-64x6-se blas    :  2695.6    40.5      70   58
   3 Wyldchess 1.51 2630                :  2630.0     6.0      20   30
   4 Glass 2.0 2610                     :  2610.0     5.5      10   55
   5 Floyd 0.9 2580                     :  2580.0     7.0      20   35

White advantage = -15.79
Draw rate (equal opponents) = 22.15 %
64x6 is not really far behind.
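As a quick sanity check on the rating list above, here is a minimal sketch that recomputes the (%) column. It assumes that (%) is simply points divided by games played, rounded to a whole number, and that Lc0 played in every game; the table does not state either, so treat both as assumptions. The helper name `score_percent` is made up for this illustration.

```python
# Rows copied from the rating list: (player, points, games played, listed %).
rows = [
    ("GreKo 2018.08", 11.0, 20, 55),
    ("Lc0 v0.21.1 w11258-64x6-se blas", 40.5, 70, 58),
    ("Wyldchess 1.51", 6.0, 20, 30),
    ("Glass 2.0", 5.5, 10, 55),
    ("Floyd 0.9", 7.0, 20, 35),
]


def score_percent(points, played):
    """Score as a percentage of games played, rounded to a whole number (assumed definition)."""
    return round(100.0 * points / played)


for player, points, played, listed in rows:
    print(f"{player:34s} {score_percent(points, played):3d}%  (table: {listed}%)")

# Assumption: Lc0 was a player in every game, so its game count
# should equal the total number of games played by the other engines.
opponent_games = sum(played for player, _, played, _ in rows
                     if not player.startswith("Lc0"))
print("Games played by Lc0's opponents:", opponent_games)
```

Under those assumptions the recomputed percentages match the listed ones, and the opponents' games add up to the 70 games listed for Lc0.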
I ran a multiple linear regression on the tournament data below to see whether a net's rating can be approximated from its number of filters and blocks as independent variables.
Data table source:
https://github.com/dkappe/leela-chess-w ... tournament
Code:
    Filters  Blocks  Rating
0       112      12    2988
1       120       9    2981
2       104       9    2977
3        80       7    2977
4       112      10    2966
5        64       6    2966
6       128       9    2958
7       120      10    2946
8        96       8    2945
9       128      10    2937
10       48       5    2909
11       32       4    2787
12       24       3    2703
13       16       2    2408
Same data sorted by Filters and Blocks in descending order.
Code:
    Filters  Blocks  Rating
9       128      10    2937
6       128       9    2958
7       120      10    2946
1       120       9    2981
0       112      12    2988
4       112      10    2966
2       104       9    2977
8        96       8    2945
3        80       7    2977
5        64       6    2966
10       48       5    2909
11       32       4    2787
12       24       3    2703
13       16       2    2408
[Plot: Rating vs Filters]
[Plot: Rating vs Blocks]
Then I used sklearn's LinearRegression to get the intercept and coefficients.
Code:
sklearn multiple linear regression results:
Intercept: 2584.2790502045655
Coefficients: [ 0.69001166 33.18384124]
Code:
Rating = 2584 + (Filters x 0.69) + (Blocks x 33.18)
Independent variables: Filters and Blocks
Sample prediction calculation.
Code:
Sample prediction #1:
Filters = 120
Blocks = 8
Rating prediction: 2933
Sample prediction #2:
Filters = 64
Blocks = 8
Rating prediction: 2894
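The two sample predictions can be reproduced by hand from the fitted formula, without sklearn. The sketch below just plugs the printed intercept and coefficients into the linear formula; the function name `predict_rating` is made up for this illustration.

```python
# Intercept and coefficients copied from the sklearn output above.
INTERCEPT = 2584.2790502045655
COEF_FILTERS = 0.69001166
COEF_BLOCKS = 33.18384124


def predict_rating(filters, blocks):
    """Rating = intercept + filters * coef_filters + blocks * coef_blocks."""
    return INTERCEPT + COEF_FILTERS * filters + COEF_BLOCKS * blocks


for filters, blocks in [(120, 8), (64, 8)]:
    print(f"{filters}x{blocks}: {predict_rating(filters, blocks):.0f}")
# prints:
# 120x8: 2933
# 64x8: 2894
```

Read directly off the coefficients, one extra block is worth about 33 rating points in this fit and one extra filter about 0.7. That is a property of a straight-line fit to 14 data points, so it is probably best not to extrapolate far outside the tested sizes.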
Python source:
Code:
# -*- coding: utf-8 -*-
"""
mlr.py

Lc0 Distilled Networks Multiple Linear Regression

Credits:
https://datatofish.com/multiple-linear-regression-python/
https://github.com/dkappe/leela-chess-weights/wiki/Distilled-Networks
"""

from pandas import DataFrame
from sklearn import linear_model
import matplotlib.pyplot as plt

# Filters, blocks and tournament rating of each distilled net
Lc0DistilledData = {
    'Filters': [112,120,104,80,112,64,128,120,96,128,48,32,24,16],
    'Blocks': [12,9,9,7,10,6,9,10,8,10,5,4,3,2],
    'Rating': [2988,2981,2977,2977,2966,2966,2958,2946,2945,2937,
               2909,2787,2703,2408]
}


def main():
    df = DataFrame(Lc0DistilledData, columns=['Filters','Blocks','Rating'])
    print('Data table source:')
    print('https://github.com/dkappe/leela-chess-weights/wiki/Distilled-Networks#11258-focused-tournament')
    print(df)

    # Sort on both keys in one call so that ties in Filters are ordered by Blocks
    print('\nSorted by Filters and Blocks descending:')
    sorted_df = df.sort_values(by=['Filters', 'Blocks'], ascending=False)
    print(sorted_df)

    # Scatter plot of rating against number of filters
    plt.figure(num=None, figsize=(6, 3), dpi=80)
    plt.scatter(df['Filters'], df['Rating'], color='red')
    plt.title('Rating Vs Filters', fontsize=14)
    plt.xlabel('Filters', fontsize=14)
    plt.ylabel('Rating', fontsize=14)
    plt.grid(True)
    plt.show()

    # Scatter plot of rating against number of blocks
    plt.figure(num=None, figsize=(6, 3), dpi=80)
    plt.scatter(df['Blocks'], df['Rating'], color='green')
    plt.title('Rating Vs Blocks', fontsize=14)
    plt.xlabel('Blocks', fontsize=14)
    plt.ylabel('Rating', fontsize=14)
    plt.grid(True)
    plt.show()

    # Multiple linear regression: Rating as a function of Filters and Blocks
    X = df[['Filters','Blocks']]
    Y = df['Rating']

    regr = linear_model.LinearRegression()
    regr.fit(X, Y)

    print('sklearn multiple linear regression results:')
    print('Intercept: {}'.format(regr.intercept_))
    print('Coefficients: {}'.format(regr.coef_))

    print('\nFormula:')
    print('Rating = {:0.0f} + (Filters x {:0.2f}) + (Blocks x {:0.2f})'.format(
        regr.intercept_, regr.coef_[0], regr.coef_[1]))
    print('Independent variables: {} and {}'.format('Filters', 'Blocks'))

    # Prediction samples. predict() returns an array, so take its single
    # element instead of calling float() on the whole array.
    print('\nSample prediction #1:')
    new_filters = 120
    new_blocks = 8
    print('Filters = {}'.format(new_filters))
    print('Blocks = {}'.format(new_blocks))
    new_x = DataFrame([[new_filters, new_blocks]], columns=['Filters','Blocks'])
    print('Rating prediction: {0:0.0f}'.format(regr.predict(new_x)[0]))

    print('\nSample prediction #2:')
    new_filters = 64
    new_blocks = 8
    print('Filters = {}'.format(new_filters))
    print('Blocks = {}'.format(new_blocks))
    new_x = DataFrame([[new_filters, new_blocks]], columns=['Filters','Blocks'])
    print('Rating prediction: {0:0.0f}'.format(regr.predict(new_x)[0]))


if __name__ == '__main__':
    main()
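To get a feel for how well a straight-line model fits this data, here is a dependency-free cross-check. It is a sketch, not the method used in the script above: it solves the same least-squares problem with the closed-form normal equations for two predictors, then prints the residual for each net and the R² of the fit. The function name `fit_two_predictors` and the variable names are made up for this illustration.

```python
# Same data as in mlr.py above.
filters = [112, 120, 104, 80, 112, 64, 128, 120, 96, 128, 48, 32, 24, 16]
blocks = [12, 9, 9, 7, 10, 6, 9, 10, 8, 10, 5, 4, 3, 2]
ratings = [2988, 2981, 2977, 2977, 2966, 2966, 2958, 2946, 2945, 2937,
           2909, 2787, 2703, 2408]


def mean(values):
    return sum(values) / len(values)


def fit_two_predictors(x1, x2, y):
    """Least squares for y = a + b1*x1 + b2*x2 via the centered normal equations."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    # Solve the 2x2 system [[s11, s12], [s12, s22]] * [b1, b2] = [s1y, s2y].
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    a = my - b1 * m1 - b2 * m2
    return a, b1, b2


intercept, coef_filters, coef_blocks = fit_two_predictors(filters, blocks, ratings)
predicted = [intercept + coef_filters * f + coef_blocks * b
             for f, b in zip(filters, blocks)]
residuals = [r - p for r, p in zip(ratings, predicted)]

# R^2 = 1 - (residual sum of squares) / (total sum of squares)
ss_res = sum(e * e for e in residuals)
ss_tot = sum((r - mean(ratings)) ** 2 for r in ratings)
r_squared = 1.0 - ss_res / ss_tot

print(f"Intercept: {intercept:.4f}")
print(f"Coefficients: filters={coef_filters:.6f}, blocks={coef_blocks:.6f}")
print(f"R^2: {r_squared:.3f}")
for f, b, r, e in zip(filters, blocks, ratings, residuals):
    print(f"{f:4d}x{b:<3d} rating={r}  residual={e:+.0f}")
```

The intercept and coefficients should agree with the sklearn output, since both solve the same least-squares problem. The per-net residuals show where the linear formula over- or under-estimates the measured rating.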