Help with installing Leela (2 Geforce RTX 2080 TI graphics cards)

Metaphysician
Posts: 26
Joined: Wed Feb 20, 2019 10:46 pm
Full name: Neil Kulick

Post by Metaphysician »

Hello. I am wondering if I could get some further advice from the experts here.

I have just installed 2 Geforce RTX 2080 Ti graphics cards on my workstation, which has two Xeon processors and a total of 24 cores. As far as I can tell, the graphics cards, which were installed by a friend who is an expert at these things, are operating properly; Windows task manager says they are.

After flailing around a bit, I seem to have succeeded in downloading and installing Leela. (I installed the CUDA version on my Windows 10 machine, because I want Leela to take advantage of my graphics processors.)

My problem is that Leela seems to be running much more slowly than it should. I am running it in the Chessbase interface as a UCI engine. After about a minute of letting Leela analyze the starting position, I am getting a depth of 18 and am at 37 kN/s. (Now, a few minutes later, I am at 26 kN/s.) Lower in the engine information panel I saw 6306 kN after a short while; now, some minutes later, it says 17448 kN, at a depth of 19.

When I open Task Manager, I see that one of my two graphics cards is working at about 5% or less, and the other at 0.

What am I doing wrong? Leela has so many parameters to play with (backend, backendOptions . . .), and I have no idea what values to input. Naively, I thought that the program would fly on my machine with my two new graphics cards, but obviously I don't know what I am doing.

Can anyone please help? I saw a reference somewhere to testing with T40, but I don't know what that means.

Thank you.
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Help with installing Leela (2 Geforce RTX 2080 TI graphics cards)

Post by mwyoung »

Metaphysician wrote: Mon Mar 25, 2019 2:48 am [...]
I am not an expert on getting 2 cards working, as I only test with one 2080 Ti.
But let's get one card working correctly first. Then getting 2 cards to work will just be a matter of setting up the backend options correctly.

It sounds like you have one card working. If you have both cards installed, remove one of them to give the other card more room for airflow.

The reason is that I think you may be cooking the card, and that would explain your slowdown. Your 37 kN/s is about right, but slowing down to 26 kN/s from the starting position is not normal. It is not something I have ever observed.

What version of Lc0?
What Neural network are you using?

"When I open Task Manager, I see that one of my two graphics cards is working at about 5% or less, and the other at 0."

The 5% in Task Manager is normal. To see the correct readings you need to download MSI Afterburner. But this is telling me that one card is running Lc0 and one card is not, and that is because you have not set up your backend options.

After downloading MSI Afterburner, run your test again from the start position and let me know the temperature and GPU utilization. Use the default settings for Lc0, and change only the backend to the fp16 one (cudnn-fp16).
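For anyone who prefers a script to a GUI tool, the same two readings (temperature and GPU utilization) can also be pulled from nvidia-smi, the command-line utility that ships with the NVIDIA driver. This is a swapped-in alternative to MSI Afterburner, not what the post above recommends, and it is only a minimal sketch: it assumes nvidia-smi is on the PATH, and the function names are made up for illustration.

```python
import subprocess

# Fields requested from nvidia-smi; the order must match the parsing below.
QUERY_FIELDS = "index,temperature.gpu,utilization.gpu,power.draw"


def _to_float(text):
    """Convert a CSV field to float; return None for values like '[N/A]'."""
    try:
        return float(text)
    except ValueError:
        return None


def parse_gpu_csv(text):
    """Parse 'nvidia-smi --format=csv,noheader,nounits' output into dicts."""
    readings = []
    for line in text.strip().splitlines():
        index, temp, util, power = [field.strip() for field in line.split(",")]
        readings.append({
            "index": int(index),
            "temp_c": _to_float(temp),
            "util_pct": _to_float(util),
            "power_w": _to_float(power),
        })
    return readings


def read_gpus():
    """Run nvidia-smi once and return one reading per GPU (assumes it is on PATH)."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_csv(result.stdout)
```

Calling read_gpus() every few seconds while Lc0 analyzes the start position should show whether both cards are loaded and whether temperature climbs as the nps falls.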
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
smatovic
Posts: 2645
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: Help with installing Leela (2 Geforce RTX 2080 TI graphics cards)

Post by smatovic »

Take a look at this sheet for different LC0 GPU parameters and the resulting nps:

https://docs.google.com/spreadsheets/d/ ... CjBILe6uA/

I doubt that LC0 will utilize both GPUs by default. You will need to tell it to use both, preferably with the cudnn-fp16 backend for RTX cards, 2 threads per GPU, some NN cache, and an appropriate batch size.

Once you have both GPUs running with LC0, if an nps drop occurs or the GPUs are not under full load, it could be a thermal issue:

viewtopic.php?f=2&t=70097

--
Srdja
Albert Silver
Posts: 3019
Joined: Wed Mar 08, 2006 9:57 pm
Location: Rio de Janeiro, Brazil

Re: Help with installing Leela (2 Geforce RTX 2080 TI graphics cards)

Post by Albert Silver »

Metaphysician wrote: Mon Mar 25, 2019 2:48 am [...]
In the engine options/properties:

1) Set the backend to Roundrobin
2) Set the backend options to "(backend=cudnn-fp16,gpu=0),(backend=cudnn-fp16,gpu=1)" (yes, include the quotes)
3) Set the threads to 3 (not more)
4) Set the nncache to 5000000 (five million). It can be ten million if you have the RAM for it.

Save and enjoy
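The backend options string in step 2 follows a regular pattern, one parenthesized entry per GPU, so it can be generated rather than typed by hand. The sketch below is only an illustration: the function names and dictionary keys are hypothetical labels, not Lc0 or ChessBase identifiers. The string is built without surrounding quotes, since a later post in this thread reports that the engine hung in ChessBase when the quotes were included.

```python
def roundrobin_backend_options(num_gpus, backend="cudnn-fp16"):
    """Build the per-GPU backend options string, one entry per GPU index."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return ",".join(f"(backend={backend},gpu={i})" for i in range(num_gpus))


def engine_settings(num_gpus, threads=3, nncache=5_000_000):
    """Collect the four values from the steps above (keys are illustrative labels)."""
    return {
        "backend": "roundrobin",
        "backend options": roundrobin_backend_options(num_gpus),
        "threads": threads,
        "nncache": nncache,
    }


# Print the values for a two-card machine so they can be copied into the GUI.
for name, value in engine_settings(2).items():
    print(f"{name}: {value}")
```

For two cards this yields the same string as in step 2, minus the quotes.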
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
Hugo
Posts: 782
Joined: Tue Dec 01, 2009 11:10 am

Re: Help with installing Leela (2 Geforce RTX 2080 TI graphics cards)

Post by Hugo »

Task Manager is not the best tool for checking the cards.
I use GPU-Z, which gives a lot of useful information.

C.K.
Metaphysician
Posts: 26
Joined: Wed Feb 20, 2019 10:46 pm
Full name: Neil Kulick

Re: Help with installing Leela (2 Geforce RTX 2080 TI graphics cards)

Post by Metaphysician »

Albert and others who have replied, thank you so much for your help.

Albert, I configured Leela as you suggested, limiting nncache to 5 million (I have 64 GB of RAM). But the engine hangs when I attempt to use it to analyze. Do you have any idea what I'm doing wrong?

Thanks again.

Meta
Metaphysician
Posts: 26
Joined: Wed Feb 20, 2019 10:46 pm
Full name: Neil Kulick

Re: Help with installing Leela (2 Geforce RTX 2080 TI graphics cards)

Post by Metaphysician »

Albert Silver wrote: Tue Mar 26, 2019 9:26 pm [...]
Albert, as I mentioned in a previous post a few minutes ago, when I followed your instructions to the letter, Leela hung. But now I'm trying them without the quotes around the string that you suggested I input for backend options, and the engine is running, though I am getting just 23 or 24 kN/s. That is only slightly higher than on my laptop, which has just one graphics card, not two powerful Geforce RTX 2080 Tis. So I think I am still doing something wrong. Do you have any thoughts?

Many thanks for all your help.

Meta
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Help with installing Leela (2 Geforce RTX 2080 TI graphics cards)

Post by mwyoung »

Metaphysician wrote: Wed Mar 27, 2019 8:04 pm [...]
You need to download MSI Afterburner and see what temperature and GPU % your graphics cards are running at. If the engines are working, this sounds like a thermal issue. Remember, two RTX 2080 Tis will draw over 600 watts when running Lc0.
Hugo
Posts: 782
Joined: Tue Dec 01, 2009 11:10 am

Re: Help with installing Leela (2 Geforce RTX 2080 TI graphics cards)

Post by Hugo »

Here is the lc0.config file I am using for two RTX cards (RTX 2070 + RTX 2060).
My machine is a simple quad-core with 8 threads at 4 GHz and 32 GB RAM.

--backend=roundrobin
--backend-opts=(backend=cudnn-fp16,gpu=0),(backend=cudnn-fp16,gpu=1)
--threads=4
--nncache=15000000
--syzygy-paths=E:\syzygy-5;E:\6-wdl;E:\6-dtz

With this setup, I get 65900 nps in console mode with the command go nodes 5000000.
Total power consumption of the machine is 520 watts.
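To compare runs like this one, it helps to pull the numbers out of the engine's console output instead of reading them off the screen. The sketch below parses a standard UCI info line and does the arithmetic on the figures quoted above. The function names are hypothetical, and the sample line in the usage note is made up for illustration; it is not real Lc0 output.

```python
def parse_uci_info(line):
    """Extract integer fields (depth, seldepth, time, nodes, nps) from a UCI 'info' line."""
    tokens = line.split()
    if not tokens or tokens[0] != "info":
        return {}
    wanted = {"depth", "seldepth", "time", "nodes", "nps"}
    fields = {}
    # Walk over adjacent (keyword, value) token pairs; keep the first match per keyword.
    for key, value in zip(tokens[1:], tokens[2:]):
        if key in wanted and value.isdigit():
            fields.setdefault(key, int(value))
    return fields


def seconds_for_nodes(nodes, nps):
    """Time a fixed-node search takes at a given average speed."""
    return nodes / nps


def nps_per_watt(nps, watts):
    """Search speed per watt of total machine power draw."""
    return nps / watts
```

For example, parse_uci_info("info depth 18 time 60000 nodes 2220000 nps 37000 pv e2e4") returns the depth, time, nodes and nps as integers. With the figures above, seconds_for_nodes(5_000_000, 65_900) comes to roughly 76 seconds, and nps_per_watt(65_900, 520) to about 127 nps per watt.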

C.K.

By the way, I uninstalled all tweaking tools. The cards don't need them. GPU load has been better since then, and both cards run at 1900+ MHz.