Help with Scorpio 3.0.8

Dann Corbit
Posts: 12792
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Help with Scorpio 3.0.8

Post by Dann Corbit »

Right now GPU is at 0% and CPU goes between 7% and 60% in a sawtooth pattern.
After this batch finishes, I will change to half.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
Dann Corbit
Posts: 12792
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Help with Scorpio 3.0.8

Post by Dann Corbit »

The last job sent 2306 games to the server, but it no longer echoes the games per second.
Daniel Shawul
Posts: 4186
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Help with Scorpio 3.0.8

Post by Daniel Shawul »

Dann Corbit wrote: Sat Jun 06, 2020 12:54 am
Right now GPU is at 0% and CPU goes between 7% and 60% in a sawtooth pattern.
After this batch finishes, I will change to half.
That is quite surprising to me. On my local client I see 80% GPU utilization, and on a vast.ai RTX 2080 Ti I am seeing 70%,
both measured with nvidia-smi.

Code: Select all


Fri Jun  5 16:57:36 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.64       Driver Version: 440.64       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 207...  Off  | 00000000:07:00.0  On |                  N/A |
| 61%   81C    P2   214W / 215W |   1230MiB /  7973MiB |     80%      Default |
+-------------------------------+----------------------+----------------------+
Windows also has "nvidia-smi.exe", so you may want to run it from the command line and see how much GPU utilization it reports.
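
For anyone who wants to log these numbers rather than read the table by eye, here is a minimal sketch that asks nvidia-smi for CSV output and parses it. It assumes nvidia-smi is on the PATH and that the driver reports plain integers for utilization and memory; the function names `parse_gpu_csv` and `read_gpus` are made up for this example and are not part of Scorpio or the training client.

```python
import csv
import io
import subprocess

# Fields requested from nvidia-smi; all four are standard --query-gpu properties.
QUERY = "index,name,utilization.gpu,memory.used"


def parse_gpu_csv(text):
    """Parse the output of
    nvidia-smi --query-gpu=<QUERY> --format=csv,noheader,nounits
    into a list of dicts, one per GPU."""
    gpus = []
    for rec in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(rec) != 4:
            continue  # skip blank or unexpected lines
        index, name, util, mem = (field.strip() for field in rec)
        gpus.append(
            {
                "index": int(index),
                "name": name,
                "util_pct": int(util),  # GPU-Util column, in percent
                "mem_mib": int(mem),    # memory in use, in MiB
            }
        )
    return gpus


def read_gpus():
    """Run nvidia-smi once and return the parsed per-GPU readings."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_csv(result.stdout)
```

Calling `read_gpus()` every second or so while a batch is running gives a utilization trace that can be compared against what the Windows task manager shows.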

The HALF precision is not going to increase GPU utilization though.
Dann Corbit
Posts: 12792
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Help with Scorpio 3.0.8

Post by Dann Corbit »

Code: Select all

C:\>C:\Windows\System32\DriverStore\FileRepository\nv_dispsi.inf_amd64_86e7a5db5f94798e\nvidia-smi
Fri Jun 05 16:05:37 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 442.19       Driver Version: 442.19       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208... WDDM  | 00000000:01:00.0  On |                  N/A |
| 84%   83C    P2   127W / 250W |   3201MiB /  8192MiB |     41%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 208... WDDM  | 00000000:4E:00.0 Off |                  N/A |
| 69%   75C    P2    81W / 250W |    790MiB /  8192MiB |     23%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0       464    C+G   ...6)\Google\Chrome\Application\chrome.exe N/A      |
|    0      1596    C+G   ...osoft.LockApp_cw5n1h2txyewy\LockApp.exe N/A      |
|    0      2908    C+G   ...6.102.0_x64__kzf8qxf38zg5c\SkypeApp.exe N/A      |
|    0      3012    C+G   ...dows.Cortana_cw5n1h2txyewy\SearchUI.exe N/A      |
|    0      8512      C   ...-master\Scorpio\bin\Windows\scorpio.exe N/A      |
|    0      9676    C+G   ...t_cw5n1h2txyewy\ShellExperienceHost.exe N/A      |
|    0     10268    C+G   ...x64__8wekyb3d8bbwe\Microsoft.Photos.exe N/A      |
|    0     12388    C+G   ...hell.Experiences.TextInput.InputApp.exe N/A      |
|    0     13608    C+G   Insufficient Permissions                   N/A      |
|    0     14804      C   ...-master\Scorpio\bin\Windows\scorpio.exe N/A      |
|    0     15136    C+G   C:\Windows\explorer.exe                    N/A      |
|    0     15192    C+G   ...1.93.0_x64__8wekyb3d8bbwe\YourPhone.exe N/A      |
|    1     14804      C   ...-master\Scorpio\bin\Windows\scorpio.exe N/A      |
+-----------------------------------------------------------------------------+
Dann Corbit
Posts: 12792
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Help with Scorpio 3.0.8

Post by Dann Corbit »

I had a whole bunch of scorpio processes in memory (maybe 10 or 11).
I killed them all and will restart the batch.
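
A quick way to spot leftover engine processes before starting a batch is to parse the output of the Windows `tasklist` command. The sketch below is only an illustration: it assumes the engine's image name is `scorpio.exe` (as in the nvidia-smi process list above), and the helper names `find_pids` and `running_scorpio_pids` are hypothetical.

```python
import csv
import io
import subprocess


def find_pids(tasklist_csv, image_name="scorpio.exe"):
    """Return the PIDs of every process whose image name matches.

    `tasklist_csv` is the text printed by `tasklist /FO CSV /NH`, where each
    line looks like: "image name","pid","session name","session#","mem usage"
    """
    pids = []
    for rec in csv.reader(io.StringIO(tasklist_csv)):
        if len(rec) >= 2 and rec[0].lower() == image_name.lower():
            pids.append(int(rec[1]))
    return pids


def running_scorpio_pids():
    """Run tasklist (Windows only) and return the PIDs of scorpio.exe."""
    result = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True,
        text=True,
        check=True,
    )
    return find_pids(result.stdout)
```

If `running_scorpio_pids()` returns more than the one process you expect, the extras can be ended from the command line with `taskkill /F /IM scorpio.exe`.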
Daniel Shawul
Posts: 4186
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Help with Scorpio 3.0.8

Post by Daniel Shawul »

Ok now that is better and makes more sense!
Your GPU utilization is still not great, but it is using 41% of your first GPU and 23% of your second GPU.
I don't know what the task manager graphs for GPU / CPU utilization measure.
Dann Corbit
Posts: 12792
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Help with Scorpio 3.0.8

Post by Dann Corbit »

With one Scorpio in RAM, it seems to be behaving better.
The Windows system monitor is lying, I think. Looks to me like 83% of the GPU bandwidth is in use.

Code: Select all

C:\>C:\Windows\System32\DriverStore\FileRepository\nv_dispsi.inf_amd64_86e7a5db5f94798e\nvidia-smi
Fri Jun 05 16:13:36 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 442.19       Driver Version: 442.19       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208... WDDM  | 00000000:01:00.0  On |                  N/A |
| 86%   86C    P2   155W / 250W |   1954MiB /  8192MiB |     54%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 208... WDDM  | 00000000:4E:00.0 Off |                  N/A |
| 85%   86C    P2   172W / 250W |    802MiB /  8192MiB |     39%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0       464    C+G   ...6)\Google\Chrome\Application\chrome.exe N/A      |
|    0      1596    C+G   ...osoft.LockApp_cw5n1h2txyewy\LockApp.exe N/A      |
|    0      1712      C   ...-master\Scorpio\bin\Windows\scorpio.exe N/A      |
|    0      2908    C+G   ...6.102.0_x64__kzf8qxf38zg5c\SkypeApp.exe N/A      |
|    0      3012    C+G   ...dows.Cortana_cw5n1h2txyewy\SearchUI.exe N/A      |
|    0      9676    C+G   ...t_cw5n1h2txyewy\ShellExperienceHost.exe N/A      |
|    0     10268    C+G   ...x64__8wekyb3d8bbwe\Microsoft.Photos.exe N/A      |
|    0     12388    C+G   ...hell.Experiences.TextInput.InputApp.exe N/A      |
|    0     13608    C+G   Insufficient Permissions                   N/A      |
|    0     15136    C+G   C:\Windows\explorer.exe                    N/A      |
|    0     15192    C+G   ...1.93.0_x64__8wekyb3d8bbwe\YourPhone.exe N/A      |
|    1      1712      C   ...-master\Scorpio\bin\Windows\scorpio.exe N/A      |
+-----------------------------------------------------------------------------+

C:\>
Daniel Shawul
Posts: 4186
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Help with Scorpio 3.0.8

Post by Daniel Shawul »

Ok, that is even better, and I think if you were able to use delay=0, GPU utilization would go up to 70%.
But if delay=1 works stably, there is no need to mess around with it.
Don't forget to kill off background processes in the future :)

Btw, the client should have printed games per min, but I haven't checked on Windows.
What does your client output look like?
Mine is like this:

Code: Select all

0: Skipping net download.
0: Executing job : /home/daniel/egbbSource/nn-train/nn-dist/scripts/job-client.sh
[1993073] Games 512: + 211 - 262 = 39
[1993073] generated 512 games in 7.92 min : Rate 64.62 games/min
0: Finished executing job!                     
0: Sending 609 games to server
0: Executing job : /home/daniel/egbbSource/nn-train/nn-dist/scripts/job-client.sh
[1994769] Games 512: + 235 - 250 = 27
[1994769] generated 512 games in 8.00 min : Rate 64.03 games/min
0: Finished executing job!                     
0: Sending 512 games to server
0: Executing job : /home/daniel/egbbSource/nn-train/nn-dist/scripts/job-client.sh
[1996462] Games 512: + 232 - 234 = 46
[1996462] generated 512 games in 8.20 min : Rate 62.42 games/min
0: Finished executing job!                     
0: Sending 512 games to server
0: Executing job : /home/daniel/egbbSource/nn-train/nn-dist/scripts/job-client.sh
[1998476] Games 512: + 241 - 249 = 22
[1998476] generated 512 games in 7.84 min : Rate 65.27 games/min
Edit: Maybe since you do longer runs, the client reconnects to the server every time (you see the welcome message repeatedly) and that messes up the output.
That is one issue we need to resolve.
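
If the rate line does get lost among reconnect messages, the throughput can still be recovered from a saved client log. This is a small sketch that assumes the log contains lines in the same "generated N games in M min" form as the output above; the function name `summarize` is made up for illustration.

```python
import re

# Matches lines such as:
# [1993073] generated 512 games in 7.92 min : Rate 64.62 games/min
GENERATED_RE = re.compile(r"generated (\d+) games in ([\d.]+) min")


def summarize(log_text):
    """Return (total_games, total_minutes, games_per_minute) for a client log."""
    games = 0
    minutes = 0.0
    for match in GENERATED_RE.finditer(log_text):
        games += int(match.group(1))
        minutes += float(match.group(2))
    # Overall rate is total games over total time, not the mean of per-job rates.
    rate = games / minutes if minutes else 0.0
    return games, minutes, rate
```

Run over the four jobs shown above, this gives 2048 games in 31.96 minutes, or roughly 64 games/min overall.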
Last edited by Daniel Shawul on Sat Jun 06, 2020 1:23 am, edited 1 time in total.
Dann Corbit
Posts: 12792
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Help with Scorpio 3.0.8

Post by Dann Corbit »

I have a new name for this version:
Scorpio-Seismic:
Daniel Shawul
Posts: 4186
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Help with Scorpio 3.0.8

Post by Daniel Shawul »

Dann Corbit wrote: Sat Jun 06, 2020 1:22 am
I have a new name for this version:
Scorpio-Seismic:
This happens because when the GPU is computing, the CPU threads are sitting idle, and vice versa.
The compute for the threads is synchronized by the batch size, so the sawtooth pattern makes sense.
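
To make the alternation concrete, here is a toy model of that cycle. It is a deliberate simplification and not how Scorpio schedules its work: it assumes the CPU threads and the GPU strictly take turns, with a fixed CPU time to fill a batch and a fixed GPU time to evaluate it. The names `duty_cycle` and `timeline` and all the numbers are illustrative only.

```python
def duty_cycle(cpu_ms, gpu_ms):
    """Fraction of each cycle that the CPU and the GPU are busy,
    assuming they strictly alternate (CPU fills a batch, GPU evaluates it)."""
    period = cpu_ms + gpu_ms
    return {"cpu": cpu_ms / period, "gpu": gpu_ms / period}


def timeline(cpu_ms, gpu_ms, total_ms, step_ms=1):
    """Sample the toy model every `step_ms` milliseconds.

    Returns a list of (time_ms, cpu_busy, gpu_busy) tuples. Exactly one of
    the two flags is True at any sample, which is what produces the
    alternating pattern once a monitor averages over short windows."""
    period = cpu_ms + gpu_ms
    samples = []
    for t in range(0, total_ms, step_ms):
        cpu_busy = (t % period) < cpu_ms
        samples.append((t, cpu_busy, not cpu_busy))
    return samples
```

In this model, a batch that takes 6 ms to fill on the CPU and 4 ms to evaluate on the GPU caps GPU utilization at 40%, however fast the card is. Shortening the CPU phase relative to the GPU phase is the only way to raise that ceiling.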