The Next Big Thing in Computer Chess?

Discussion of anything and everything relating to chess playing software and machines.

Moderator: Ras

User avatar
towforce
Posts: 12347
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK
Full name: Graham Laight

Re: The Next Big Thing in Computer Chess?

Post by towforce »

Quick thought: a graphics card (or other matrix multiplication device - e.g. a tensor core) will probably still be quicker at multiplying a sparse matrix than a CPU: it is still doing its multiplications in parallel. It's just that when the matrix is sparse, the CPU's disadvantage won't be so great.
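
To make the zero-skipping idea concrete, here is a minimal sparse matrix-vector product in CSR form (an illustrative sketch only, not tied to any particular engine or GPU library). The work is proportional to the number of nonzeros, which is why sparsity narrows the CPU's gap, even though a massively parallel device can still churn through the dense version quickly.

Code: Select all

#include <cstdio>
#include <vector>

// Minimal CSR (compressed sparse row) matrix-vector product.
// The inner loop only touches nonzero entries, so the work is
// proportional to the nonzero count rather than rows * cols.
struct CsrMatrix {
    int rows;
    std::vector<int> row_ptr;     // size rows+1: where each row starts in col_idx/vals
    std::vector<int> col_idx;     // column index of each nonzero
    std::vector<double> vals;     // value of each nonzero
};

std::vector<double> spmv(const CsrMatrix& a, const std::vector<double>& x) {
    std::vector<double> y(a.rows, 0.0);
    for (int r = 0; r < a.rows; ++r)
        for (int k = a.row_ptr[r]; k < a.row_ptr[r + 1]; ++k)
            y[r] += a.vals[k] * x[a.col_idx[k]];
    return y;
}

int main() {
    // 3x3 matrix with only 4 nonzeros:
    //  [ 2 0 0 ]
    //  [ 0 0 1 ]
    //  [ 4 0 3 ]
    CsrMatrix a{3, {0, 1, 2, 4}, {0, 2, 0, 2}, {2.0, 1.0, 4.0, 3.0}};
    std::vector<double> x{1.0, 1.0, 1.0};
    for (double v : spmv(a, x)) std::printf("%g ", v);   // prints: 2 1 7
    std::printf("\n");
    return 0;
}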
Human chess is partly about tactics and strategy, but mostly about memory
chrisw
Posts: 4624
Joined: Tue Apr 03, 2012 4:28 pm
Location: Midi-Pyrénées
Full name: Christopher Whittington

Re: The Next Big Thing in Computer Chess?

Post by chrisw »

towforce wrote: Fri May 26, 2023 12:40 pm
smatovic wrote: Fri May 26, 2023 12:19 pmWell, I have to admit that sometimes you are just too lazy to look up things by yourself :)

Of course I looked it up - but as stated in my previous post, it turns out there are two definitions. I already knew, extremely well, that a sparse matrix is one in which most of the values are zero, and looking up a sparse NN yielded the same definition.

Speaking of laziness, you didn't notice that the definition in the link to which ChrisW responded incorrectly states that a sparse NN is one in which most of the weights are zero. Looks as though this alleged "laziness" is not unique to me: you and Mr Boilerplate have also demonstrated it here! :wink:
FFS. Wrong, wrong, wrong and wrong.

syzygy wrote: "The NNUE networks, or at least some of them, are relatively sparse, and some implementations use/have used that to speed up the evaluation."

you wrote:
"It might be that the NNs are sparse, but I would find that surprising. I have a lot to say about this, but for now I'm going to "super simplify", saving my stream of thought for later. Here's my one sentence summary:

"A chess NN requires a lot of knowledge, and there's almost no knowledge in a zero weight."

If anyone has an NN chess program on their computer, would you mind taking a look at the weights file and getting a rough estimate of what percentage of the weights are zero, please?"


Before you spread around any more of your streams of ungrounded thoughts and fill this space up with yet more misinformation...

Typical NNUEs (to which we are referring) have around 64x64x12 = 49152 inputs.
For any one chess position, no more than 32 of those inputs will be set; the rest will be zero.

Sparsity refers to the density of the set inputs (32 out of 49152 = 0.065 percent), and has nothing to do with the weights.

For an accumulator of 1024 neurons, there are 1024 x 64 x 64 x 12 = 50M weights, all of which will be set to some value or other, positive or negative. For any one position, very few of those weights are actually being "used" to compute the NNUE eval, but that's entirely due to the sparsity of the inputs.
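
For anyone who wants to see what that looks like in code, here is a stripped-down sketch of a first-layer ("accumulator") refresh and incremental update. Illustrative only: real implementations such as Stockfish's use different layouts, quantisation and SIMD, and the names below are made up.

Code: Select all

#include <array>
#include <cstdint>
#include <vector>

// Illustrative sizes only; they match the numbers quoted above.
constexpr int kInputs  = 64 * 64 * 12;   // (king square, piece square, piece type) features
constexpr int kAccSize = 1024;           // width of the first layer ("accumulator")

// One column of first-layer weights per input feature (~50M int16 values).
std::vector<std::array<int16_t, kAccSize>> first_layer_weights(kInputs);
std::array<int16_t, kAccSize> first_layer_bias{};

// Full refresh: only the (at most 32) active features contribute, so the cost
// is roughly 32 * 1024 additions, not 49152 * 1024 multiplications. That is
// the input sparsity being talked about; every weight still has a trained value.
void refresh(std::array<int32_t, kAccSize>& acc, const std::vector<int>& active_features) {
    for (int i = 0; i < kAccSize; ++i) acc[i] = first_layer_bias[i];
    for (int f : active_features)
        for (int i = 0; i < kAccSize; ++i)
            acc[i] += first_layer_weights[f][i];
}

// Incremental update after a move: subtract the features a move removes,
// add the ones it creates. This is the "efficiently updatable" part of NNUE.
void update(std::array<int32_t, kAccSize>& acc,
            const std::vector<int>& removed, const std::vector<int>& added) {
    for (int f : removed)
        for (int i = 0; i < kAccSize; ++i)
            acc[i] -= first_layer_weights[f][i];
    for (int f : added)
        for (int i = 0; i < kAccSize; ++i)
            acc[i] += first_layer_weights[f][i];
}

int main() {
    std::array<int32_t, kAccSize> acc{};
    refresh(acc, {0, 100, 49151});   // pretend these are the active features
    update(acc, {100}, {200});       // pretend a move swapped one feature for another
    return 0;
}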

Your ramblings about "what percentage of the weights are zero" as being something to do with sparsity indicate you've learnt some big words (NNUE, weights, sparsity), but you're putting them together in a way that indicates you have no idea of actually how they are put together. All talk and no doing makes for BS production. Your hallmark. Very tedious it is too. Pomposity level in inverse proportion to actual knowledge.
User avatar
towforce
Posts: 12347
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK
Full name: Graham Laight

Re: The Next Big Thing in Computer Chess?

Post by towforce »

chrisw wrote: Fri May 26, 2023 2:49 pm FFS. Wrong, wrong, wrong and wrong.

syzygy wrote: "The NNUE networks, or at least some of them, are relatively sparse, and some implementations use/have used that to speed up the evaluation."

you wrote:
"It might be that the NNs are sparse, but I would find that surprising. I have a lot to say about this, but for now I'm going to "super simplify", saving my stream of thought for later. Here's my one sentence summary:

"A chess NN requires a lot of knowledge, and there's almost no knowledge in a zero weight."

If anyone has an NN chess program on their computer, would you mind taking a look at the weights file and getting a rough estimate of what percentage of the weights are zero, please?"


Before you spread around any more of your streams of ungrounded thoughts and fill this space up with yet more misinformation...

Typical NNUEs (to which we are referring) have around 64x64x12 = 49152 inputs.
For any one chess position, no more than 32 of those inputs will be set; the rest will be zero.

Sparsity refers to the density of the set inputs (32 out of 49152 = 0.065 percent), and has nothing to do with the weights.

For an accumulator of 1024 neurons, there are 1024 x 64 x 64 x 12 = 50M weights, all of which will be set to some value or other, positive or negative. For any one position, very few of those weights are actually being "used" to compute the NNUE eval, but that's entirely due to the sparsity of the inputs.

Your ramblings about "what percentage of the weights are zero" as being something to do with sparsity indicate you've learnt some big words (NNUE, weights, sparsity), but you're putting them together in a way that indicates you have no idea of actually how they are put together. All talk and no doing makes for BS production. Your hallmark. Very tedious it is too. Pomposity level in inverse proportion to actual knowledge.

Ignoring the boilerplate (as I always do :) ), you are wrong.

You are probably right about how syzygy intended it, but in ML, inputs which are mostly zero values are called "sparse data", not sparse NNs. I have already given correct definitions of sparse NNs.

As a result of my openness and willingness to be challenged, I have learned some interesting information, so for me, a big win! 8-)

You're apparently upset because my taking the meaning of expressions as they were written, and not as the writer intended, has "filled this space up" (your words). Neuroticism is, at heart, over-reacting to things emotionally. You know what to do to reduce that issue! :)
Human chess is partly about tactics and strategy, but mostly about memory
chrisw
Posts: 4624
Joined: Tue Apr 03, 2012 4:28 pm
Location: Midi-Pyrénées
Full name: Christopher Whittington

Re: The Next Big Thing in Computer Chess?

Post by chrisw »

towforce wrote: Fri May 26, 2023 3:06 pm
chrisw wrote: Fri May 26, 2023 2:49 pm FFS. Wrong, wrong, wrong and wrong.

syzygy wrote: "The NNUE networks, or at least some of them, are relatively sparse, and some implementations use/have used that to speed up the evaluation."

you wrote:
"It might be that the NNs are sparse, but I would find that surprising. I have a lot to say about this, but for now I'm going to "super simplify", saving my stream of thought for later. Here's my one sentence summary:

"A chess NN requires a lot of knowledge, and there's almost no knowledge in a zero weight."

If anyone has an NN chess program on their computer, would you mind taking a look at the weights file and getting a rough estimate of what percentage of the weights are zero, please?"


Before you spread around any more of your streams of ungrounded thoughts and fill this space up with yet more misinformation...

Typical NNUEs (to which we are referring) have around 64x64x12 = 49152 inputs.
For any one chess position, no more than 32 of those inputs will be set; the rest will be zero.

Sparsity refers to the density of the set inputs (32 out of 49152 = 0.065 percent), and has nothing to do with the weights.

For an accumulator of 1024 neurons, there are 1024 x 64 x 64 x 12 = 50M weights, all of which will be set to some value or other, positive or negative. For any one position, very few of those weights are actually being "used" to compute the NNUE eval, but that's entirely due to the sparsity of the inputs.

Your ramblings about "what percentage of the weights are zero" as being something to do with sparsity indicate you've learnt some big words (NNUE, weights, sparsity), but you're putting them together in a way that indicates you have no idea of actually how they are put together. All talk and no doing makes for BS production. Your hallmark. Very tedious it is too. Pomposity level in inverse proportion to actual knowledge.

Ignoring the boilerplate (as I always do :) ), you are wrong.

You are probably right about how syzygy intended it, but in ML, inputs which are mostly zero values are called "sparse data", not sparse NNs. I have already given correct definitions of sparse NNs.

As a result of my openness and willingness to be challenged, I have learned some interesting information, so for me, a big win! 8-)

You're apparently upset because my taking the meaning of expressions as they were written, and not as the writer intended, has "filled this space up" (your words). Neuroticism is, at heart, over-reacting to things emotionally. You know what to do to reduce that issue! :)
No doubt you'll carry on trying to BS your way out. Sparseness here, in computer chess and in particular with NNUEs, refers to the inputs. It certainly does NOT refer to the weights (which is where you outed yourself).

Try the main documentation of NNUE

https://github.com/glinscott/nnue-pytor ... cs/nnue.md

sparse is referred to 47 times, always with relation to inputs, and never in relation to weights. The weights are never "sparse". "The ratio of zero weights" which you want someone else to report (it's always someone else, isn't it? Do you ever actually DO anything?) has absolutely nothing to do with anything, and certainly nothing to do with what your buzzword-spinning brain is suggesting.

Please try to attain even a rudimentary knowledge of a subject before opening your mouth. Dunning-Kruger is amusing, but not every day and not when delivered with your level of pomposity. Have a nice day.
User avatar
towforce
Posts: 12347
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK
Full name: Graham Laight

Re: The Next Big Thing in Computer Chess?

Post by towforce »

chrisw wrote: Fri May 26, 2023 3:48 pm Try the main documentation of NNUE

https://github.com/glinscott/nnue-pytor ... cs/nnue.md

sparse is referred to 47 times, always with relation to inputs, and never in relation to weights...

Minor point: each time the word "sparse" (or "sparsity") is used in that document, it is made very clear what is being referred to, e.g. "sparse input" or "blocked sparse output" (the latter contradicting Chris's claim, quoted above, that sparsity is "always with relation to inputs"). It never refers to a sparse network, a sparse NN or anything like that.

However, I would urge you not to focus on that, and instead just read the article linked in the quoted text above about NNUE. It's REALLY good!

After reading it, I can see that I wasn't entirely wrong, when reading about sparse networks, to be reminded of what I know about sparse matrices in integer/mixed-integer optimisation solvers (see earlier in the thread, when the discussion about sparsity started), even if it turns out not to be very relevant in this case (except, of course, for the truism that at the heart of any artificial (or natural) intelligence lies an optimisation problem).

I repeat: this is a big win for me. The value of what I've learned far outweighs the cost of reading a bit of boilerplate ChrisW rudeness (which is actually no cost at all to me: I'm the opposite of neurotic emotional overreaction - I actually don't mind people being rude to me).
Human chess is partly about tactics and strategy, but mostly about memory
Magnum
Posts: 195
Joined: Thu Feb 04, 2021 10:24 pm
Full name: Arnold Magnum

Re: The Next Big Thing in Computer Chess?

Post by Magnum »

The Next Big Thing in Computer Chess?

An 18 or 20-inch MacBook with an M3 ULTRA and ARMv9.
User avatar
towforce
Posts: 12347
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK
Full name: Graham Laight

Re: The Next Big Thing in Computer Chess?

Post by towforce »

Dann Corbit wrote: Wed Apr 12, 2023 4:13 pm The next big thing will be when the GPUs and CPUs transparently share memory resources so that we do not have to copy to and from GPU memory.
Suddenly, engines like LC0 will become unbeatable.

It's not just the copy time that we save; it is a whole new programming paradigm.

How about... enabling the GPU to run programs independently?
Human chess is partly about tactics and strategy, but mostly about memory
User avatar
towforce
Posts: 12347
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK
Full name: Graham Laight

Re: The Next Big Thing in Computer Chess?

Post by towforce »

CornfedForever wrote: Thu Apr 13, 2023 4:59 am As this is a non-engine-specific question... and it is an inevitability... the answer is: an 8-man tablebase.

Not if an algorithm that plays perfect chess comes first! :twisted:
Human chess is partly about tactics and strategy, but mostly about memory
smatovic
Posts: 3224
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: The Next Big Thing in Computer Chess?

Post by smatovic »

Larry Kaufman mentioned in another post 97% draws between SF 16 and SF 15 at a 2"+1' TC in a 620-game match with standard opening and 2 threads; he estimated 99% draws for a Rapid TC. How much Elo is still to gain on CCRL Blitz? Time's running, finish line in sight.
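
To put a very rough number on that, here is a back-of-the-envelope sketch (the 2%/1% win/loss split is hypothetical, purely for illustration) of how such a draw rate compresses the measurable Elo difference:

Code: Select all

#include <cmath>
#include <cstdio>

// Elo difference implied by an expected score, from the usual logistic model:
// score = 1 / (1 + 10^(-diff/400))  =>  diff = 400 * log10(score / (1 - score))
double elo_diff(double score) {
    return 400.0 * std::log10(score / (1.0 - score));
}

int main() {
    // Hypothetical split, for illustration only: 97% draws, 2% wins, 1% losses.
    double score = 0.02 * 1.0 + 0.97 * 0.5 + 0.01 * 0.0;                   // = 0.505
    std::printf("score %.3f -> about %.1f Elo\n", score, elo_diff(score)); // ~3.5 Elo
    return 0;
}

With 97% of the games drawn, even a 2:1 edge in the decisive games is worth only a few Elo, which is the point about the finish line.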
--
Srdja
Uri Blass
Posts: 10790
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: The Next Big Thing in Computer Chess?

Post by Uri Blass »

smatovic wrote: Thu Aug 10, 2023 4:33 pm Larry Kaufman mentioned in another post 97% draws between SF 16 and SF 15 at a 2"+1' TC in a 620-game match with standard opening and 2 threads; he estimated 99% draws for a Rapid TC. How much Elo is still to gain on CCRL Blitz? Time's running, finish line in sight.
--
Srdja
Engines need to use selective search in order to cause the opponent to fall into designed traps, and that is not the way they work today.

If you want to find out what is possible to achieve, you need to build an anti-Stockfish engine whose target is beating Stockfish when you get more time.

Anti-Stockfish with white is going to work in the following way:
For white, search every possible legal move, but for black, do not search every possible legal move; simply calculate the move that Stockfish is going to play and prune the rest of the moves.

After part of your time (for example half of the target time, though maybe a different percentage is optimal) you stop calculating Stockfish's moves, because calculating Stockfish's move is too expensive, but you remember the Stockfish moves that you already calculated in order to prune the rest of the moves in your search.

In this way the engine may prefer lines in which Stockfish makes mistakes, so there is a bigger probability of winning. The question is what percentage of wins you can get against Stockfish at a 2''+1' TC in this way (I guess clearly more than by using an unequal time control).
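
A rough skeleton of what this asymmetric search could look like is below. All the helpers, including the predict_opponent_move oracle that stands in for asking Stockfish for its move, are hypothetical stubs; this is a sketch of the idea, not a working engine.

Code: Select all

#include <cstdint>
#include <unordered_map>
#include <vector>

// Sketch of the asymmetric "anti-engine" search described above.
// The stubs below exist only so the sketch compiles; a real engine would
// supply proper move generation, evaluation, and an (expensive) oracle that
// asks the target engine for its move in the given position.
struct Position { uint64_t key = 0; /* ... board state ... */ };
using Move = int;

std::vector<Move> legal_moves(const Position&) { return {}; }   // stub
Position make_move(const Position& p, Move) { return p; }       // stub
int evaluate(const Position&) { return 0; }                     // stub
Move predict_opponent_move(const Position&) { return 0; }       // stub oracle

std::unordered_map<uint64_t, Move> predicted;   // predictions remembered from earlier
bool prediction_budget_spent = false;           // set once the oracle time is used up

int search(const Position& pos, int depth, bool our_side) {
    if (depth == 0) return evaluate(pos);

    if (!our_side) {
        // Opponent to move: prune to the single move the target engine is
        // predicted to play, if a prediction is cached or still affordable.
        auto it = predicted.find(pos.key);
        if (it == predicted.end() && !prediction_budget_spent)
            it = predicted.emplace(pos.key, predict_opponent_move(pos)).first;
        if (it != predicted.end())
            return -search(make_move(pos, it->second), depth - 1, true);
        // Otherwise fall through and search all replies as usual.
    }

    // Our side (or no prediction available): full-width search over every move.
    int best = -1000000;
    for (Move m : legal_moves(pos)) {
        int score = -search(make_move(pos, m), depth - 1, !our_side);
        if (score > best) best = score;
    }
    return best;
}

int main() {
    Position root;
    search(root, 4, true);   // search a few plies from the root, our side to move
    return 0;
}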