towforce wrote: ↑Wed May 24, 2023 8:51 pm
To be honest, I'm confused by the idea that NNs are sparse: are you telling me that most of the weights in LC0's NN are zero?
He is saying that if someone could train a strong Lc0-type network that is sparse, then one could run it somewhat efficiently on a CPU. But this might be completely unfeasible (I have no idea).
The NNUE networks, or at least some of them, are relatively sparse, and some implementations use/have used that to speed up the evaluation.
It might be that the NNs are sparse, but I would find that surprising. I have a lot to say about this, but for now I'm going to "super simplify", saving my stream of thought for later. Here's my one sentence summary:
"A chess NN requires a lot of knowledge, and there's almost no knowledge in a zero weight."
If anyone has an NN chess program on their computer, would you mind taking a look at the weights file and getting a rough estimate of what percentage of the weights are zero, please?
Human chess is partly about tactics and strategy, but mostly about memory
syzygy wrote: ↑Thu May 25, 2023 10:04 pm
It seems unlikely that NNUE is the best we can do on CPU, but I don't know how to do it better.
I wouldn't be surprised if NNUE is the best we can do on CPU. I don't see how a neural network could be evaluated any more efficiently than with incremental updates.
HalfKAv2_hm probably isn't the perfect NNUE architecture though.
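To make the incremental-update point concrete, here is a minimal toy sketch of the idea (not the actual Stockfish/NNUE code or data layout - the feature count, layer width and function names are all invented for illustration): the first-layer accumulator is just the sum of weight columns for the currently active features, and since a move only toggles a handful of features, you subtract and add a couple of columns instead of recomputing the whole layer.

Code:
import numpy as np

NUM_FEATURES = 40960   # hypothetical (king square, piece, square) style feature space
ACC_SIZE = 256         # hypothetical width of the first hidden layer

rng = np.random.default_rng(0)
W1 = rng.standard_normal((NUM_FEATURES, ACC_SIZE)).astype(np.float32)

def full_refresh(active_features):
    # Recompute the accumulator from scratch: cost ~ #active_features * ACC_SIZE.
    acc = np.zeros(ACC_SIZE, dtype=np.float32)
    for f in active_features:
        acc += W1[f]
    return acc

def incremental_update(acc, removed, added):
    # After a move only a few features change, so just subtract the columns
    # that disappeared and add the ones that appeared.
    for f in removed:
        acc -= W1[f]
    for f in added:
        acc += W1[f]
    return acc

acc = full_refresh([100, 2000, 31337])                        # e.g. at the root
acc = incremental_update(acc, removed=[2000], added=[2048])   # e.g. after a quiet move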
towforce wrote: ↑Wed May 24, 2023 8:51 pm
To be honest, I'm confused by the idea that NNs are sparse: are you telling me that most of the weights in LC0's NN are zero?
He is saying that if someone could train a strong Lc0-type network that is sparse, then one could run it somewhat efficiently on a CPU. But this might be completely unfeasible (I have no idea).
The NNUE networks, or at least some of them, are relatively sparse, and some implementations use/have used that to speed up the evaluation.
It might be that the NNs are sparse, but I would find that surprising. I have a lot to say about this, but for now I'm going to "super simplify", saving my stream of thought for later. Here's my one sentence summary:
"A chess NN requires a lot of knowledge, and there's almost no knowledge in a zero weight."
If anyone has an NN chess program on their computer, would you mind taking a look at the weights file and getting a rough estimate of what percentage of the weights are zero, please?
Yes, we know what is a sparse matrix, thank you.
You’re just proving, also in your original question, that you read the big words but have no clue as to the underlying concepts. All mouth and BS.
Maybe go do some homework and try and work out how your original question demonstrates the lack of understanding?
Good find! Always impressed by your deep knowledge of graphics cards!
...thread seems to degenerate?
Don't worry - that's just standard boilerplate ChrisW rudeness towards myself - nothing to see here.
I would still like for someone who has an NN chess program on their device to look at the weights file and tell me roughly what proportion of the weights are equal to (or maybe close to) zero (or tell me where I can download a chess NN weight file so that I can take a look for myself). I am still confused as to how it can contain the enormous amount of knowledge it needs if most of the weights have a zero (or near zero) value.
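For what it's worth, here is roughly how such a count could be done, assuming the weights have already been dumped to a flat binary array - the real network files (Lc0's .pb.gz protobufs, Stockfish's .nnue files) have their own formats and would need a proper parser first, and the file name and zero threshold below are just placeholders:

Code:
import numpy as np

# Assumes the network weights were exported as raw float32 values.
w = np.fromfile("weights.bin", dtype=np.float32)

exactly_zero = np.count_nonzero(w == 0.0) / w.size
near_zero = np.count_nonzero(np.abs(w) < 1e-3) / w.size   # threshold is arbitrary

print(f"{w.size} weights")
print(f"exactly zero: {100 * exactly_zero:.2f}%")
print(f"|w| < 1e-3:   {100 * near_zero:.2f}%")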
Human chess is partly about tactics and strategy, but mostly about memory
So... when talking about a "sparse" NN, there are two valid, but separate, definitions:
1. Most of the weights are near zero
2. Not all nodes in one layer of the NN are connected to all the nodes in the next layer
Of course, (1) can be used to achieve (2) - prune the connections whose weights are near zero - in which case there's an opportunity for compressing the NN, and code libraries exist to accomplish this.
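As a toy illustration of how (1) turns into (2), here is a small sketch (the sizes and pruning threshold are made up) that zeroes out near-zero weights and then stores only the surviving connections in a standard sparse-matrix format:

Code:
import numpy as np
from scipy.sparse import csr_matrix

rng = np.random.default_rng(0)
W = rng.standard_normal((512, 512)).astype(np.float32)
W[np.abs(W) < 1.5] = 0.0          # prune: most weights become exactly zero

W_sparse = csr_matrix(W)          # keep only the surviving connections
print(f"kept {W_sparse.nnz} of {W.size} weights "
      f"({100 * W_sparse.nnz / W.size:.1f}%)")

x = rng.standard_normal(512).astype(np.float32)
y_dense = W @ x                    # full matrix-vector product
y_sparse = W_sparse @ x            # only touches the non-zero entries
assert np.allclose(y_dense, y_sparse, atol=1e-3)

The sparse representation is both smaller to store and cheaper to multiply with, which is the opportunity those libraries exploit.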
Human chess is partly about tactics and strategy, but mostly about memory
towforce wrote: ↑Fri May 26, 2023 11:29 am
...
Don't worry - that's just standard boilerplate ChrisW rudeness towards myself - nothing to see here.
...
Well, I have to admit that sometimes you are just too lazy to look up things by yourself
smatovic wrote: ↑Fri May 26, 2023 12:19 pm
Well, I have to admit that sometimes you are just too lazy to look up things by yourself
Of course I looked it up - but as stated in my previous post, it turns out there are two definitions. I already knew, extremely well, that a sparse matrix is one in which most of the values are zero, and looking up a sparse NN yielded the same definition.
Speaking of laziness, you didn't notice that the definition in the link to which ChrisW responded incorrectly states that a sparse NN is one in which most of the weights are zero. Looks as though this alleged "laziness" is not unique to myself: you and Mr Boilerplate have also demonstrated it here!
Human chess is partly about tactics and strategy, but mostly about memory