towforce wrote: ↑Wed May 24, 2023 8:51 pm
To be honest, I'm confused by the idea that NNs are sparse: are you telling me that most of the weights in LC0's NN are zero?
He is saying that if someone could train a strong Lc0-type network that is sparse, then one could run it somewhat efficiently on a CPU. But this might be completely unfeasible (I have no idea).
The NNUE networks, or at least some of them, are relatively sparse, and some implementations use/have used that to speed up the evaluation.
It might be that the NNs are sparse, but I would find that surprising. I have a lot to say about this, but for now I'm going to "super simplify", saving my stream of thought for later. Here's my one sentence summary:
"A chess NN requires a lot of knowledge, and there's almost no knowledge in a zero weight."
If anyone has an NN chess program on their computer, would you mind taking a look at the weights file and getting a rough estimate of what percentage of the weights are zero, please?
Human chess is partly about tactics and strategy, but mostly about memory
syzygy wrote: ↑Thu May 25, 2023 10:04 pm
It seems unlikely that NNUE is the best we can do on CPU, but I don't know how to do it better.
I wouldn't be surprised if NNUE is the best we can do on CPU. I don't see how a neural network could be evaluated any more efficiently than with incremental updates.
HalfKAv2_hm probably isn't the perfect NNUE architecture though.
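To make the incremental-update point concrete, here is a minimal toy sketch of the idea (not the actual Stockfish/NNUE code or data layout - the feature count, layer width and function names are all invented for illustration): the first-layer accumulator is just the sum of weight columns for the currently active features, and since a move only toggles a handful of features, you subtract and add a couple of columns instead of recomputing the whole layer.

Code:
import numpy as np

NUM_FEATURES = 40960   # hypothetical (king square, piece, square) style feature space
ACC_SIZE = 256         # hypothetical width of the first hidden layer

rng = np.random.default_rng(0)
W1 = rng.standard_normal((NUM_FEATURES, ACC_SIZE)).astype(np.float32)

def full_refresh(active_features):
    # Recompute the accumulator from scratch: cost ~ #active_features * ACC_SIZE.
    acc = np.zeros(ACC_SIZE, dtype=np.float32)
    for f in active_features:
        acc += W1[f]
    return acc

def incremental_update(acc, removed, added):
    # After a move only a few features change, so just subtract the columns
    # that disappeared and add the ones that appeared.
    for f in removed:
        acc -= W1[f]
    for f in added:
        acc += W1[f]
    return acc

acc = full_refresh([100, 2000, 31337])                        # e.g. at the root
acc = incremental_update(acc, removed=[2000], added=[2048])   # e.g. after a quiet move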
towforce wrote: ↑Wed May 24, 2023 8:51 pm
To be honest, I'm confused by the idea that NNs are sparse: are you telling me that most of the weights in LC0's NN are zero?
He is saying that if someone could train a strong Lc0-type network that is sparse, then one could run it somewhat efficiently on a CPU. But this might be completely unfeasible (I have no idea).
The NNUE networks, or at least some of them, are relatively sparse, and some implementations use/have used that to speed up the evaluation.
It might be that the NNs are sparse, but I would find that surprising. I have a lot to say about this, but for now I'm going to "super simplify", saving my stream of thought for later. Here's my one sentence summary:
"A chess NN requires a lot of knowledge, and there's almost no knowledge in a zero weight."
If anyone has an NN chess program on their computer, would you mind taking a look at the weights file and getting a rough estimate of what percentage of the weights are zero, please?
Yes, we know what is a sparse matrix, thank you.
You’re just proving, also in your original question, that you read the big words but have no clue as to the underlying concepts. All mouth and BS.
Maybe go do some homework and try and work out how your original question demonstrates the lack of understanding?
Good find! Always impressed by your deep knowledge of graphics cards!
...thread seems to degenerate?
Don't worry - that's just standard boilerplate ChrisW rudeness towards myself - nothing to see here.
I would still like for someone who has an NN chess program on their device to look at the weights file and tell me roughly what proportion of the weights are equal to (or maybe close to) zero (or tell me where I can download a chess NN weight file so that I can take a look for myself). I am still confused as to how it can contain the enormous amount of knowledge it needs if most of the weights have a zero (or near zero) value.
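For what it's worth, here is roughly how such a count could be done, assuming the weights have already been dumped to a flat binary array - the real network files (Lc0's .pb.gz protobufs, Stockfish's .nnue files) have their own formats and would need a proper parser first, and the file name and zero threshold below are just placeholders:

Code:
import numpy as np

# Assumes the network weights were exported as raw float32 values.
w = np.fromfile("weights.bin", dtype=np.float32)

exactly_zero = np.count_nonzero(w == 0.0) / w.size
near_zero = np.count_nonzero(np.abs(w) < 1e-3) / w.size   # threshold is arbitrary

print(f"{w.size} weights")
print(f"exactly zero: {100 * exactly_zero:.2f}%")
print(f"|w| < 1e-3:   {100 * near_zero:.2f}%")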
Human chess is partly about tactics and strategy, but mostly about memory
So... when talking about a "sparse" NN, there are two valid, but separate, definitions:
1. Most of the weights are near zero
2. Not all nodes in one layer of the NN are connected to all the nodes in the next layer
Of course, (1) can be used to achieve (2) - prune the connections whose weights are near zero - in which case there's an opportunity for compressing the NN, and code libraries exist to accomplish this.
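As a toy illustration of how (1) turns into (2), here is a small sketch (the sizes and pruning threshold are made up) that zeroes out near-zero weights and then stores only the surviving connections in a standard sparse-matrix format:

Code:
import numpy as np
from scipy.sparse import csr_matrix

rng = np.random.default_rng(0)
W = rng.standard_normal((512, 512)).astype(np.float32)
W[np.abs(W) < 1.5] = 0.0          # prune: most weights become exactly zero

W_sparse = csr_matrix(W)          # keep only the surviving connections
print(f"kept {W_sparse.nnz} of {W.size} weights "
      f"({100 * W_sparse.nnz / W.size:.1f}%)")

x = rng.standard_normal(512).astype(np.float32)
y_dense = W @ x                    # full matrix-vector product
y_sparse = W_sparse @ x            # only touches the non-zero entries
assert np.allclose(y_dense, y_sparse, atol=1e-3)

The sparse representation is both smaller to store and cheaper to multiply with, which is the opportunity those libraries exploit.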
Human chess is partly about tactics and strategy, but mostly about memory
towforce wrote: ↑Fri May 26, 2023 11:29 am
...
Don't worry - that's just standard boilerplate ChrisW rudeness towards myself - nothing to see here.
...
Well, I have to admit that sometimes you are just too lazy to look up things by yourself
smatovic wrote: ↑Fri May 26, 2023 12:19 pm
Well, I have to admit that sometimes you are just too lazy to look up things by yourself
Of course I looked it up - but as stated in my previous post, it turns out there are two definitions. I already knew, extremely well, that a sparse matrix is one in which most of the values are zero, and looking up a sparse NN yielded the same definition.
Speaking of laziness, you didn't notice that the definition in the link to which ChrisW responded incorrectly states that a sparse NN is one in which most of the weights are zero. Looks as though this alleged "laziness" is not unique to myself: you and Mr Boilerplate have also demonstrated it here!
Human chess is partly about tactics and strategy, but mostly about memory