Lc0 OpenCL benchmark with 128x10 network
-
- Posts: 29
- Joined: Tue Dec 13, 2016 10:36 am
Re: Lc0 OpenCL benchmark with 128x10 network
Asus ROG G73S
CPU :
I7 2630QM 2.0GHz
16 GB RAM
8CU
OpenCL 1.2 capability
GPU :
Nvidia Geforce GTX 460M
4CU
505 nodes / second
-
- Posts: 247
- Joined: Tue Apr 13, 2010 10:41 am
Re: Lc0 OpenCL benchmark with 128x10 network
Thanks for the benchmarks guys .. gimme more
And thanks for the link to the AMD Cedar. In the meantime I was informed that this is a media center.
Interesting combo: Leela Chess Zero on a slow Atom D525 paired with AMD Radeon. But hey .. it runs Netflix.
Value for Asus ROG G73S was updated to 545 nps via PM.
Code: Select all
NPS GPU (OpenCL) System OS
==================================================================================
8754 Nvidia GTX 1070 Win10
705 AMD Firepro M4000 HP EliteBook 8570w, i7-3740QM Win10
595 Intel 6100 MacBook Air 13" 2015, i5-5250U macOS 13.6
545 Nvidia Geforce GTX 460M Asus ROG G73S, i7-2630QM Win10
437 Intel HD 520 Dell Latitude E5570, i5-6200U Win10
412 Intel HD 4400 Sony Vaio Ultrabook 13", i5-4200U Win10
353 Nvidia GT 650M MacBook Pro 15" 2012, i7-3615QM macOS 12.6
260 Intel HD 4000 Lenovo Thinkpad T430, i7-3520M Win10
155 Intel HD 505 Acer Spin 1, Pentium N4200 Win10 1903
74 ATI Radeon HD 5430M Arctic MediaCenter MC001, Atom D525 Win10
11 Intel HD Medion E1232T, Celeron N2807 Win10 1903
Hope we're not just the biological boot loader for digital super intelligence. Unfortunately, that is increasingly probable - Elon Musk
-
- Posts: 2559
- Joined: Fri Nov 26, 2010 2:00 pm
- Location: Czech Republic
- Full name: Martin Sedlak
Re: Lc0 OpenCL benchmark with 128x10 network
GTX 1080, Win10:
Code: Select all
Benchmark final time 5.18833s calculating 10703.1 nodes per second.
GTX 1050 Ti, Win10 laptop:
Code: Select all
Benchmark final time 5.48895s calculating 3986.38 nodes per second.
Intel HD 630, Win10 laptop:
Code: Select all
Benchmark final time 7.49431s calculating 573.902 nodes per second.
Martin Sedlak
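For anyone collecting these results by script rather than by hand, here is a minimal sketch that pulls the time and nps out of the final summary line quoted in the posts above. The function name `parse_final_line` and the regular expression are my own and not part of lc0; the pattern is simply fitted to the lines shown in this thread.

```python
import re

# Pattern fitted to the summary line quoted in this thread; it is an
# assumption that other lc0 versions print the same wording.
FINAL_RE = re.compile(
    r"Benchmark final time (?P<secs>[\d.]+)s "
    r"calculating (?P<nps>[\d.]+) nodes per second"
)

def parse_final_line(line):
    """Return (seconds, nps) as floats, or None if the line does not match."""
    m = FINAL_RE.search(line)
    if m is None:
        return None
    return float(m.group("secs")), float(m.group("nps"))

# The three results reported in the post above.
results = {
    "GTX 1080": "Benchmark final time 5.18833s calculating 10703.1 nodes per second.",
    "GTX 1050 Ti": "Benchmark final time 5.48895s calculating 3986.38 nodes per second.",
    "Intel HD 630": "Benchmark final time 7.49431s calculating 573.902 nodes per second.",
}

for gpu, line in results.items():
    secs, nps = parse_final_line(line)
    print(f"{gpu}: {nps} nps over {secs} s")
```

Lines that are not the final summary (for example `bestmove e2e4`) return `None`, so the function can be run over a whole log.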
-
- Posts: 1056
- Joined: Fri Mar 10, 2006 6:07 am
- Location: Basque Country (Spain)
Re: Lc0 OpenCL benchmark with 128x10 network
GTX 750 Ti, Win10, AMD FX-8300:
Code: Select all
Benchmark final time 5.66103s calculating 2493.19 nodes per second.
-
- Posts: 247
- Joined: Tue Apr 13, 2010 10:41 am
Re: Lc0 OpenCL benchmark with 128x10 network
Can't new Intel and AMD GPUs compete under OpenCL?
Code: Select all
NPS GPU (OpenCL) System OS
==================================================================================
10703 Nvidia GTX 1080 Desktop Win10
8754 Nvidia GTX 1070 Desktop Win10
3986 Nvidia GTX 1050 Ti Laptop Win10
2493 Nvidia GTX 750 Ti Desktop, AMD FX-8300 Win10
705 AMD Firepro M4000 HP EliteBook 8570w, i7-3740QM Win10
595 Intel 6100 MacBook Air 13" 2015, i5-5250U macOS 13.6
573 Intel HD 630 Laptop Win10
545 Nvidia GTX 460M Asus ROG G73S, i7-2630QM Win10
437 Intel HD 520 Dell Latitude E5570, i5-6200U Win10
412 Intel HD 4400 Sony Vaio Ultrabook 13", i5-4200U Win10
353 Nvidia GT 650M MacBook Pro 15" 2012, i7-3615QM macOS 12.6
260 Intel HD 4000 Lenovo Thinkpad T430, i7-3520M Win10
155 Intel HD 505 Acer Spin 1, Pentium N4200 Win10
74 ATI Radeon HD 5430M Arctic MediaCenter MC001, Atom D525 Win10
11 Intel HD Medion E1232T, Celeron N2807 Win10
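A small sketch of how a ranking like the one above could be kept in order as new results come in. The helper `add_result` is hypothetical. It drops the fractional part of the reported nps, which matches how the table lists 573.902 as 573, and keeps the list sorted from fastest to slowest.

```python
def add_result(board, nps, gpu):
    """Add one result and keep the board sorted from fastest to slowest.

    The nps value is truncated to an integer, as the table in this thread
    appears to do (573.902 is listed as 573).
    """
    board.append((int(nps), gpu))
    board.sort(key=lambda entry: entry[0], reverse=True)
    return board

# A few rows from the earlier table, then the newly reported results.
board = [
    (8754, "Nvidia GTX 1070"),
    (705, "AMD Firepro M4000"),
    (545, "Nvidia GTX 460M"),
]
add_result(board, 10703.1, "Nvidia GTX 1080")
add_result(board, 2493.19, "Nvidia GTX 750 Ti")
add_result(board, 573.902, "Intel HD 630")

for nps, gpu in board:
    print(f"{nps:>6}  {gpu}")
```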
-
- Posts: 3550
- Joined: Thu Jun 07, 2012 11:02 pm
Re: Lc0 OpenCL benchmark with 128x10 network
GTX 1050
AMD FX-8350
Windows 10
NPS 3,579
C:\temp4>lc0.exe benchmark
_
| _ | |
|_ |_ |_| v0.22.0 built Aug 5 2019
Found pb network file: ./weights_run2_56215.pb.gz
Creating backend [opencl]...
OpenCL, maximum batch size set to 16.
Initializing OpenCL.
Detected 1 OpenCL platforms.
Platform version: OpenCL 1.2 CUDA 10.2.120
Platform profile: FULL_PROFILE
Platform name: NVIDIA CUDA
Platform vendor: NVIDIA Corporation
Device ID: 0
Device name: GeForce GTX 1050
Device type: GPU
Device vendor: NVIDIA Corporation
Device driver: 430.86
Device speed: 1518 MHZ
Device cores: 5 CU
Device score: 1112
Selected platform: NVIDIA CUDA
Selected device: GeForce GTX 1050
with OpenCL 1.2 capability.
Loaded existing SGEMM tuning for batch size 16.
Wavefront/Warp size: 32
Max workgroup size: 1024
Max workgroup dimensions: 1024 1024 64
Benchmark time 101ms, 2 nodes, 19 nps, move e2e4
Benchmark time 130ms, 4 nodes, 30 nps, move e2e4
Benchmark time 146ms, 9 nodes, 61 nps, move e2e4
Benchmark time 161ms, 18 nodes, 111 nps, move e2e4
Benchmark time 181ms, 29 nodes, 160 nps, move e2e4
Benchmark time 189ms, 33 nodes, 174 nps, move e2e4
Benchmark time 205ms, 45 nodes, 219 nps, move e2e4
Benchmark time 229ms, 68 nodes, 296 nps, move e2e4
Benchmark time 255ms, 78 nodes, 305 nps, move e2e4
Benchmark time 277ms, 102 nodes, 368 nps, move e2e4
Benchmark time 299ms, 138 nodes, 461 nps, move e2e4
Benchmark time 322ms, 173 nodes, 537 nps, move e2e4
Benchmark time 344ms, 223 nodes, 648 nps, move e2e4
Benchmark time 379ms, 275 nodes, 725 nps, move e2e4
Benchmark time 406ms, 313 nodes, 770 nps, move e2e4
Benchmark time 409ms, 355 nodes, 867 nps, move e2e4
Benchmark time 427ms, 378 nodes, 885 nps, move e2e4
Benchmark time 450ms, 470 nodes, 1044 nps, move e2e4
Benchmark time 483ms, 569 nodes, 1178 nps, move e2e4
Benchmark time 534ms, 730 nodes, 1367 nps, move e2e4
Benchmark time 559ms, 825 nodes, 1475 nps, move e2e4
Benchmark time 580ms, 941 nodes, 1622 nps, move e2e4
Benchmark time 670ms, 1212 nodes, 1808 nps, move e2e4
Benchmark time 701ms, 1357 nodes, 1935 nps, move e2e4
Benchmark time 805ms, 1698 nodes, 2109 nps, move e2e4
Benchmark time 1052ms, 2783 nodes, 2645 nps, move e2e4
Benchmark time 1080ms, 2882 nodes, 2668 nps, move e2e4
Benchmark time 1101ms, 2922 nodes, 2653 nps, move e2e4
Benchmark time 1124ms, 2959 nodes, 2632 nps, move e2e4
Benchmark time 1144ms, 2982 nodes, 2606 nps, move e2e4
Benchmark time 1194ms, 3243 nodes, 2716 nps, move e2e4
Benchmark time 1239ms, 3531 nodes, 2849 nps, move e2e4
Benchmark time 1411ms, 4178 nodes, 2961 nps, move e2e4
Benchmark time 1536ms, 4661 nodes, 3034 nps, move e2e4
Benchmark time 1851ms, 5684 nodes, 3070 nps, move e2e4
Benchmark time 1852ms, 5782 nodes, 3122 nps, move e2e4
Benchmark time 1916ms, 6024 nodes, 3144 nps, move e2e4
Benchmark time 3175ms, 10822 nodes, 3408 nps, move e2e4
Benchmark time 3586ms, 12274 nodes, 3422 nps, move e2e4
Benchmark time 3820ms, 13107 nodes, 3431 nps, move e2e4
Benchmark time 5509ms, 19776 nodes, 3589 nps, move e2e4
bestmove e2e4
Benchmark final time 5.55779s calculating 3579.48 nodes per second.
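The nps column in the progress lines of this log looks like plain nodes divided by elapsed time, with the fraction dropped. Below is a quick check on three lines copied from the log above. The helper `check_line` is my own, and the truncation rule is an observation from this v0.22.0 output, not something taken from the lc0 source; other versions may count time differently.

```python
import re

PROGRESS_RE = re.compile(r"Benchmark time (\d+)ms, (\d+) nodes, (\d+) nps")

def check_line(line):
    """Return (reported_nps, recomputed_nps) for one progress line."""
    ms, nodes, reported = map(int, PROGRESS_RE.search(line).groups())
    # nodes per second from milliseconds, fractional part dropped
    recomputed = nodes * 1000 // ms
    return reported, recomputed

# Three lines copied from the GTX 1050 log above.
sample = [
    "Benchmark time 1052ms, 2783 nodes, 2645 nps, move e2e4",
    "Benchmark time 3175ms, 10822 nodes, 3408 nps, move e2e4",
    "Benchmark time 5509ms, 19776 nodes, 3589 nps, move e2e4",
]

for line in sample:
    reported, recomputed = check_line(line)
    print(reported, recomputed, "match" if reported == recomputed else "differ")
```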
-
- Posts: 247
- Joined: Tue Apr 13, 2010 10:41 am
Re: Lc0 OpenCL benchmark with 128x10 network
Code: Select all
NPS GPU (OpenCL) System OS
===================================================================================
10703 Nvidia GTX 1080 Desktop Win10
9061 Nvidia Tesla T4 Google Colab (*) Linux <-new
8754 Nvidia GTX 1070 Desktop Win10
3986 Nvidia GTX 1050 Ti Laptop Win10
3579 Nvidia GTX 1050 Desktop, AMD FX-8350 Win10
2493 Nvidia GTX 750 Ti Desktop, AMD FX-8300 Win10
705 AMD Firepro M4000 HP EliteBook 8570w, i7-3740QM Win10
595 Intel 6100 MacBook Air 13" 2015, i5-5250U macOS 13.6
573 Intel HD 630 Laptop Win10
545 Nvidia GTX 460M Asus ROG G73S, i7-2630QM Win10
437 Intel HD 520 Dell Latitude E5570, i5-6200U Win10
412 Intel HD 4400 Sony Vaio Ultrabook 13", i5-4200U Win10
353 Nvidia GT 650M MacBook Pro 15" 2012, i7-3615QM macOS 12.6
260 Intel HD 4000 Lenovo Thinkpad T430, i7-3520M Win10
155 Intel HD 505 Acer Spin 1, Pentium N4200 Win10
74 ATI Radeon HD 5430M Arctic MediaCenter MC001, Atom D525 Win10
11 Intel HD Medion E1232T, Celeron N2807 Win10
backend = opencl
Code: Select all
size net nps
----------------------
128x10 56215 9061.2
256x20 42850 1043.7
320x24 60260 1013.9
Code: Select all
size net nps
----------------------
128x10 56215 22502.7
256x20 42850 4026.5
320x24 60260 2189.9
Code: Select all
size net nps
----------------------
128x10 56215 73823.8
256x20 42850 11956.2
320x24 60260 5956.6
Code: Select all
_
| _ | |
|_ |_ |_| v0.22.0 built Aug 16 2019
Loading weights file from: 56215
Creating backend [opencl]...
OpenCL, maximum batch size set to 16.
Initializing OpenCL.
Detected 1 OpenCL platforms.
Platform version: OpenCL 1.2 CUDA 10.0.211
Platform profile: FULL_PROFILE
Platform name: NVIDIA CUDA
Platform vendor: NVIDIA Corporation
Device ID: 0
Device name: Tesla T4
Device type: GPU
Device vendor: NVIDIA Corporation
Device driver: 410.79
Device speed: 1590 MHZ
Device cores: 40 CU
Device score: 1112
Selected platform: NVIDIA CUDA
Selected device: Tesla T4
with OpenCL 1.2 capability.
Loaded existing SGEMM tuning for batch size 16.
Wavefront/Warp size: 32
Max workgroup size: 1024
Max workgroup dimensions: 1024 1024 64
Benchmark time 88ms, 2 nodes, 22 nps, move e2e4
Benchmark time 100ms, 5 nodes, 50 nps, move e2e4
Benchmark time 112ms, 9 nodes, 80 nps, move e2e4
Benchmark time 122ms, 18 nodes, 147 nps, move e2e4
Benchmark time 134ms, 29 nodes, 216 nps, move e2e4
Benchmark time 145ms, 33 nodes, 227 nps, move e2e4
Benchmark time 150ms, 45 nodes, 300 nps, move e2e4
Benchmark time 164ms, 59 nodes, 359 nps, move e2e4
Benchmark time 178ms, 73 nodes, 410 nps, move e2e4
Benchmark time 194ms, 102 nodes, 525 nps, move e2e4
Benchmark time 208ms, 135 nodes, 649 nps, move e2e4
Benchmark time 229ms, 173 nodes, 755 nps, move e2e4
Benchmark time 245ms, 218 nodes, 889 nps, move e2e4
Benchmark time 261ms, 267 nodes, 1022 nps, move e2e4
Benchmark time 291ms, 301 nodes, 1034 nps, move e2e4
Benchmark time 291ms, 339 nodes, 1164 nps, move e2e4
Benchmark time 312ms, 399 nodes, 1278 nps, move e2e4
Benchmark time 323ms, 477 nodes, 1476 nps, move e2e4
Benchmark time 331ms, 552 nodes, 1667 nps, move e2e4
Benchmark time 344ms, 665 nodes, 1933 nps, move e2e4
Benchmark time 358ms, 781 nodes, 2181 nps, move e2e4
Benchmark time 369ms, 889 nodes, 2409 nps, move e2e4
Benchmark time 405ms, 1155 nodes, 2851 nps, move e2e4
Benchmark time 430ms, 1318 nodes, 3065 nps, move e2e4
Benchmark time 478ms, 1762 nodes, 3686 nps, move e2e4
Benchmark time 499ms, 1890 nodes, 3787 nps, move e2e4
Benchmark time 564ms, 2502 nodes, 4436 nps, move e2e4
Benchmark time 581ms, 2701 nodes, 4648 nps, move e2e4
Benchmark time 614ms, 2910 nodes, 4739 nps, move e2e4
Benchmark time 636ms, 3091 nodes, 4860 nps, move e2e4
Benchmark time 663ms, 3382 nodes, 5101 nps, move e2e4
Benchmark time 684ms, 3536 nodes, 5169 nps, move e2e4
Benchmark time 707ms, 3784 nodes, 5352 nps, move e2e4
Benchmark time 737ms, 4053 nodes, 5499 nps, move e2e4
Benchmark time 896ms, 5448 nodes, 6080 nps, move e2e4
Benchmark time 951ms, 5904 nodes, 6208 nps, move e2e4
Benchmark time 1675ms, 12273 nodes, 7327 nps, move e2e4
Benchmark time 1770ms, 13118 nodes, 7411 nps, move e2e4
Benchmark time 1823ms, 13714 nodes, 7522 nps, move e2e4
Benchmark time 4173ms, 36470 nodes, 8739 nps, move e2e4
Benchmark time 5104ms, 45864 nodes, 8985 nps, move e2e4
Benchmark time 5302ms, 47989 nodes, 9051 nps, move e2e4
bestmove e2e4
Benchmark final time 5.31629s calculating 9061.2 nodes per second.
-
- Posts: 52
- Joined: Fri Jan 29, 2010 2:01 pm
- Location: Ivrea, Italy
Re: Lc0 OpenCL benchmark with 128x10 network
Intel HD Graphics 620
Win10
HP EliteBook 850 G4
NPS 487.4
Code: Select all
C:\Chess\lc0> .\lc0.exe benchmark
_
| _ | |
|_ |_ |_| v0.22.0 built Aug 5 2019
Found pb network file: C:\Chess\lc0/56215.pb.gz
Creating backend [opencl]...
OpenCL, maximum batch size set to 16.
Initializing OpenCL.
Detected 1 OpenCL platforms.
Platform version: OpenCL 2.1
Platform profile: FULL_PROFILE
Platform name: Intel(R) OpenCL
Platform vendor: Intel(R) Corporation
Device ID: 0
Device name: Intel(R) HD Graphics 620
Device type: GPU
Device vendor: Intel(R) Corporation
Device driver: 25.20.100.6472
Device speed: 1050 MHZ
Device cores: 24 CU
Device score: 621
Device ID: 1
Device name: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz
Device type: CPU
Device vendor: Intel(R) Corporation
Device driver: 7.6.0.0
Device speed: 2700 MHZ
Device cores: 4 CU
Device score: 521
Selected platform: Intel(R) OpenCL
Selected device: Intel(R) HD Graphics 620
with OpenCL 2.1 capability.
Loaded existing SGEMM tuning for batch size 16.
Wavefront/Warp size: 8
Max workgroup size: 256
Max workgroup dimensions: 256 256 256
Benchmark time 90ms, 17 nodes, 188 nps, move b1c3
Benchmark time 222ms, 20 nodes, 90 nps, move f2f3
Benchmark time 274ms, 75 nodes, 273 nps, move b1c3
Benchmark time 377ms, 102 nodes, 270 nps, move b1c3
Benchmark time 644ms, 147 nodes, 228 nps, move g2g3
Benchmark time 659ms, 152 nodes, 230 nps, move g2g3
Benchmark time 781ms, 183 nodes, 234 nps, move e2e4
Benchmark time 880ms, 313 nodes, 355 nps, move a2a3
Benchmark time 1197ms, 494 nodes, 412 nps, move d2d4
Benchmark time 1298ms, 502 nodes, 386 nps, move d2d4
Benchmark time 1538ms, 582 nodes, 378 nps, move d2d4
Benchmark time 6565ms, 3032 nodes, 461 nps, move d2d4
Benchmark time 6626ms, 3280 nodes, 495 nps, move d2d4
Benchmark time 7474ms, 3627 nodes, 485 nps, move c2c3
Benchmark time 7862ms, 3721 nodes, 473 nps, move c2c3
Benchmark time 8862ms, 4266 nodes, 481 nps, move c2c3
bestmove c2c3
Benchmark final time 9.16905s calculating 487.401 nodes per second.
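This log lists two OpenCL devices (the HD Graphics 620 with score 621 and the CPU with score 521) and the GPU is the one selected. Here is a sketch that parses the device blocks and picks the highest score. That "highest score wins" is how lc0 chooses is an assumption on my part; it is only consistent with what this log shows. `parse_devices` is a hypothetical helper.

```python
def parse_devices(log_text):
    """Collect the 'Device ...' fields of each device block into a dict."""
    devices = []
    for raw in log_text.splitlines():
        line = raw.strip()
        if line.startswith("Device ID:"):
            # A new device block starts here.
            devices.append({"id": line.split(":", 1)[1].strip()})
        elif line.startswith("Device ") and devices and ":" in line:
            key, value = line.split(":", 1)
            devices[-1][key[len("Device "):].strip()] = value.strip()
    return devices

# Excerpt copied from the log above.
log_excerpt = """
Device ID: 0
Device name: Intel(R) HD Graphics 620
Device type: GPU
Device cores: 24 CU
Device score: 621
Device ID: 1
Device name: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz
Device type: CPU
Device cores: 4 CU
Device score: 521
Selected platform: Intel(R) OpenCL
Selected device: Intel(R) HD Graphics 620
"""

devices = parse_devices(log_excerpt)
# Assumed rule: the device with the highest score is the one selected.
best = max(devices, key=lambda d: int(d["score"]))
print(best["name"], best["score"])
```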
-
- Posts: 247
- Joined: Tue Apr 13, 2010 10:41 am
Re: Lc0 OpenCL benchmark with 128x10 network
Nvidia Tesla K80, the slower companion of Nvidia Tesla T4 from Google Colab.
(*) Nvidia Tesla T4 & K80 @ Google Colab
backend = cudnn
./lc0/build/lc0 benchmark -w 56215 -b opencl
Code: Select all
NPS GPU (OpenCL) System OS
===================================================================================
10703 Nvidia GTX 1080 Desktop Win10
9150 Nvidia Tesla T4 Google Colab (*) Linux
8754 Nvidia GTX 1070 Desktop Win10
4829 Nvidia Tesla K80 Google Colab (*) Linux <-new
3986 Nvidia GTX 1050 Ti Laptop Win10
3579 Nvidia GTX 1050 Desktop, AMD FX-8350 Win10
2493 Nvidia GTX 750 Ti Desktop, AMD FX-8300 Win10
705 AMD Firepro M4000 HP EliteBook 8570w, i7-3740QM Win10
595 Intel 6100 MacBook Air 13" 2015, i5-5250U macOS 13.6
573 Intel HD 630 Laptop Win10
545 Nvidia GTX 460M Asus ROG G73S, i7-2630QM Win10
487 Intel HD 620 HP EliteBook 850 G4 Win10
437 Intel HD 520 Dell Latitude E5570, i5-6200U Win10
412 Intel HD 4400 Sony Vaio Ultrabook 13", i5-4200U Win10
353 Nvidia GT 650M MacBook Pro 15" 2012, i7-3615QM macOS 12.6
260 Intel HD 4000 Lenovo Thinkpad T430, i7-3520M Win10
155 Intel HD 505 Acer Spin 1, Pentium N4200 Win10
74 ATI Radeon HD 5430M Arctic MediaCenter MC001, Atom D525 Win10
11 Intel HD Medion E1232T, Celeron N2807 Win10
Benchmark of Nvidia Tesla K80 with 2 backends (K80 doesn't support the faster cudnn-fp16) for 3 different weights sizes:
backend = opencl
Code: Select all
size net nps
----------------------
128x10 56215 4829.4
256x20 42850 584.0
320x24 60260 493.9
backend = cudnn
Code: Select all
size net nps
----------------------
128x10 56215 6888.3
256x20 42850 1296.3
320x24 60260 787.8
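To put the two K80 tables side by side, here is a short sketch that computes how much faster the second backend is per network size. It assumes the first table is the opencl run and the second is the cudnn run, as the "backend = opencl" and "backend = cudnn" labels in the post suggest. The dictionary layout and the `speedups` helper are mine.

```python
# nps per network size, copied from the two K80 tables above.
# Assumption: first table = opencl, second table = cudnn.
opencl = {"128x10": 4829.4, "256x20": 584.0, "320x24": 493.9}
cudnn = {"128x10": 6888.3, "256x20": 1296.3, "320x24": 787.8}

def speedups(base, other):
    """Return other/base for every network size present in base."""
    return {size: other[size] / base[size] for size in base}

ratios = speedups(opencl, cudnn)
for size, ratio in ratios.items():
    print(f"{size}: cudnn is {ratio:.2f}x opencl")
```

On these numbers the gap is largest for the 256x20 net, at a bit over 2x.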
Code: Select all
_
| _ | |
|_ |_ |_| v0.22.0 built Aug 19 2019
Loading weights file from: 56215
Creating backend [opencl]...
OpenCL, maximum batch size set to 16.
Initializing OpenCL.
Detected 1 OpenCL platforms.
Platform version: OpenCL 1.2 CUDA 10.0.211
Platform profile: FULL_PROFILE
Platform name: NVIDIA CUDA
Platform vendor: NVIDIA Corporation
Device ID: 0
Device name: Tesla K80
Device type: GPU
Device vendor: NVIDIA Corporation
Device driver: 410.79
Device speed: 823 MHZ
Device cores: 13 CU
Device score: 1112
Selected platform: NVIDIA CUDA
Selected device: Tesla K80
with OpenCL 1.2 capability.
Loaded existing SGEMM tuning for batch size 16.
Wavefront/Warp size: 32
Max workgroup size: 1024
Max workgroup dimensions: 1024 1024 64
Benchmark time 67ms, 2 nodes, 29 nps, move e2e4
Benchmark time 96ms, 3 nodes, 31 nps, move e2e4
Benchmark time 122ms, 6 nodes, 49 nps, move e2e4
Benchmark time 122ms, 9 nodes, 73 nps, move e2e4
Benchmark time 136ms, 18 nodes, 132 nps, move e2e4
Benchmark time 153ms, 29 nodes, 189 nps, move e2e4
Benchmark time 166ms, 33 nodes, 198 nps, move e2e4
Benchmark time 174ms, 45 nodes, 258 nps, move e2e4
Benchmark time 191ms, 59 nodes, 308 nps, move e2e4
Benchmark time 209ms, 73 nodes, 349 nps, move e2e4
Benchmark time 228ms, 102 nodes, 447 nps, move e2e4
Benchmark time 248ms, 135 nodes, 544 nps, move e2e4
Benchmark time 269ms, 171 nodes, 635 nps, move e2e4
Benchmark time 287ms, 199 nodes, 693 nps, move e2e4
Benchmark time 310ms, 246 nodes, 793 nps, move e2e4
Benchmark time 354ms, 306 nodes, 864 nps, move e2e4
Benchmark time 354ms, 336 nodes, 949 nps, move e2e4
Benchmark time 381ms, 389 nodes, 1020 nps, move e2e4
Benchmark time 397ms, 478 nodes, 1204 nps, move e2e4
Benchmark time 419ms, 545 nodes, 1300 nps, move e2e4
Benchmark time 455ms, 663 nodes, 1457 nps, move e2e4
Benchmark time 484ms, 780 nodes, 1611 nps, move e2e4
Benchmark time 550ms, 1065 nodes, 1936 nps, move e2e4
Benchmark time 615ms, 1316 nodes, 2139 nps, move e2e4
Benchmark time 695ms, 1737 nodes, 2499 nps, move e2e4
Benchmark time 734ms, 1890 nodes, 2574 nps, move e2e4
Benchmark time 775ms, 2076 nodes, 2678 nps, move e2e4
Benchmark time 800ms, 2249 nodes, 2811 nps, move e2e4
Benchmark time 830ms, 2439 nodes, 2938 nps, move e2e4
Benchmark time 896ms, 2773 nodes, 3094 nps, move e2e4
Benchmark time 927ms, 2927 nodes, 3157 nps, move e2e4
Benchmark time 969ms, 3123 nodes, 3222 nps, move e2e4
Benchmark time 1004ms, 3261 nodes, 3248 nps, move e2e4
Benchmark time 1045ms, 3486 nodes, 3335 nps, move e2e4
Benchmark time 1321ms, 4965 nodes, 3758 nps, move e2e4
Benchmark time 1484ms, 5804 nodes, 3911 nps, move e2e4
Benchmark time 2703ms, 11878 nodes, 4394 nps, move e2e4
Benchmark time 2976ms, 13207 nodes, 4437 nps, move e2e4
Benchmark time 5299ms, 25396 nodes, 4792 nps, move e2e4
Benchmark time 5411ms, 25751 nodes, 4759 nps, move e2e4
bestmove e2e4
Benchmark final time 5.42483s calculating 4829.46 nodes per second.
-
- Posts: 4889
- Joined: Thu Mar 09, 2006 6:34 am
- Location: Pen Argyl, Pennsylvania
Re: Lc0 OpenCL benchmark with 128x10 network
I know this was for OpenCL benchmarks, but just for kicks I ran it with an RTX 2060 Super (cudnn-fp16).
Code: Select all
michaelb7@Threadripper-32:~/cluster.mfb$ lc0 benchmark
_
| _ | |
|_ |_ |_| v0.23.2+git.c8d9095 built Jan 9 2020
Found pb network file: ./128x10-se-distill-ccrl-11248.pb.gz
Creating backend [cudnn-auto]...
Switching to [cudnn-fp16]...
CUDA Runtime version: 10.2.0
Cudnn version: 7.6.5
Latest version of CUDA supported by the driver: 10.2.0
GPU: GeForce RTX 2060 SUPER
GPU memory: 7.7923 Gb
GPU clock frequency: 1695 MHz
GPU compute capability: 7.5
Benchmark time 25ms, 7 nodes, 1000 nps, move d2d4
Benchmark time 26ms, 19 nodes, 2375 nps, move g2g3
Benchmark time 27ms, 35 nodes, 3888 nps, move g2g3
Benchmark time 29ms, 59 nodes, 5363 nps, move e2e3
Benchmark time 30ms, 73 nodes, 5615 nps, move g2g3
Benchmark time 32ms, 100 nodes, 7142 nps, move g2g3
Benchmark time 33ms, 111 nodes, 7400 nps, move e2e4
Benchmark time 34ms, 120 nodes, 7500 nps, move e2e4
Benchmark time 35ms, 148 nodes, 8705 nps, move g1f3
Benchmark time 36ms, 169 nodes, 9388 nps, move g1f3
Benchmark time 40ms, 259 nodes, 11772 nps, move c2c4
Benchmark time 41ms, 300 nodes, 13043 nps, move g1f3
Benchmark time 42ms, 325 nodes, 13541 nps, move g1f3
Benchmark time 44ms, 411 nodes, 15807 nps, move c2c4
Benchmark time 47ms, 515 nodes, 17758 nps, move g1f3
Benchmark time 56ms, 771 nodes, 20289 nps, move g1f3
Benchmark time 60ms, 1023 nodes, 23790 nps, move g1f3
Benchmark time 84ms, 1525 nodes, 23106 nps, move g1f3
Benchmark time 88ms, 1744 nodes, 24914 nps, move g1f3
Benchmark time 93ms, 2051 nodes, 27346 nps, move d2d4
Benchmark time 95ms, 2142 nodes, 27818 nps, move d2d4
Benchmark time 106ms, 2602 nodes, 29568 nps, move d2d4
Benchmark time 123ms, 3359 nodes, 31688 nps, move d2d4
Benchmark time 216ms, 9451 nodes, 47732 nps, move d2d4
Benchmark time 223ms, 10023 nodes, 48892 nps, move d2d4
Benchmark time 243ms, 11017 nodes, 48747 nps, move d2d4
Benchmark time 249ms, 11604 nodes, 50233 nps, move d2d4
Benchmark time 337ms, 18663 nodes, 58504 nps, move d2d4
Benchmark time 501ms, 34510 nodes, 71449 nps, move d2d4
Benchmark time 663ms, 48414 nodes, 75060 nps, move d2d4
Benchmark time 742ms, 55440 nodes, 76574 nps, move d2d4
Benchmark time 1224ms, 112779 nodes, 93437 nps, move d2d4
Benchmark time 1246ms, 115673 nodes, 94196 nps, move d2d4
Benchmark time 1518ms, 149134 nodes, 99422 nps, move d2d4
Benchmark time 2005ms, 211803 nodes, 106594 nps, move d2d4
Benchmark time 2274ms, 246094 nodes, 109084 nps, move d2d4
Benchmark time 3475ms, 394188 nodes, 114026 nps, move d2d4
Benchmark time 7000ms, 795262 nodes, 113901 nps, move d2d4
Benchmark time 10000ms, 1106784 nodes, 110877 nps, move d2d4
bestmove d2d4
Benchmark final time 10.0043s calculating 110656 nodes per second.