Stockfish Great Speed Improvements but which Version?

Magnum
Posts: 195
Joined: Thu Feb 04, 2021 10:24 pm
Full name: Arnold Magnum

Stockfish Great Speed Improvements but which Version?

Post by Magnum »

Thanks for the last 3 versions: no Elo gain, but speed improvements. https://abrok.eu/stockfish/

Question:
Which version do we need for Windows 11 ARM?
Which one for macOS on M1, M1 Pro, M1 Max, M1 Ultra, M2, M2 Pro, M2 Max, M2 Ultra, and M3 chips?
Which one for ARMv8?
Which one for ARMv8 NEON?
Which one for ARMv8 pop-neon?

https://github.com/official-stockfish/S ... h/releases

https://github.com/official-stockfish/S ... rom-source
smatovic
Posts: 3312
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: Stockfish Great Speed Improvements but which Version?

Post by smatovic »

Homebrew for macOS?

Code:

brew install stockfish

Code:

brew install --HEAD stockfish

https://formulae.brew.sh/formula/stockfish#default

AFAIK, all of the ARM chips you mention have NEON SIMD units, and as far as I can see, the SF NNUE code base has one section for NEON.

See:
https://github.com/official-stockfish/S ... efile#L320
and:
https://github.com/official-stockfish/S ... mmon.h#L44

So, short answer: compile it yourself for your native arch, via Homebrew for example.
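
A minimal sketch of the "compile for native arch" advice, assuming a POSIX shell and a Stockfish source checkout. `pick_arch` is a hypothetical helper, not part of Stockfish; the ARCH names it returns (apple-silicon, armv8) are the ones mentioned in this thread. Windows on ARM is not covered.

```shell
#!/bin/sh
# pick_arch OS MACHINE
# Hypothetical helper: maps the output of `uname -s` and `uname -m`
# to a Stockfish Makefile ARCH value. Only ARM cases are handled.
pick_arch() {
    os="$1"       # e.g. Darwin, Linux
    machine="$2"  # e.g. arm64, aarch64, x86_64
    case "$os/$machine" in
        Darwin/arm64)        echo "apple-silicon" ;;
        */arm64 | */aarch64) echo "armv8" ;;
        *)                   echo "unknown" ;;
    esac
}

# Typical use inside the Stockfish src directory (not executed here):
#   make -j profile-build ARCH="$(pick_arch "$(uname -s)" "$(uname -m)")"
```

If the helper prints "unknown", the ARCH value has to be chosen by hand from the list the Makefile offers.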

--
Srdja
Magnum
Posts: 195
Joined: Thu Feb 04, 2021 10:24 pm
Full name: Arnold Magnum

Re: Stockfish Great Speed Improvements but which Version?

Post by Magnum »

smatovic wrote: Tue Aug 08, 2023 6:54 pm Homebrew for macOS? ...

Author: AndrovT
Date: Sun Aug 6 21:22:37 2023 +0200
Timestamp: 1691349757

Implement AffineTransformSparseInput for armv8

Implements AffineTransformSparseInput layer for the NNUE evaluation
for the armv8 and armv8-dotprod architectures. We measured some nice
speed improvements via 10 runs of our benchmark:

armv8, Cortex-X1 : 18.5% speed-up
armv8, Cortex-A76 : 13.2% speed-up
armv8-dotprod, Cortex-X1 : 27.1% speed-up
armv8-dotprod, Cortex-A76 : 12.1% speed-up
armv8, Cortex-A72, Raspberry Pi 4 : 8.2% speed-up (thanks Torom!)

closes https://github.com/official-stockfish/S ... /pull/4719
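
The percentages above are presumably relative changes in benchmark speed. A small sketch of that arithmetic, assuming the speed-up is defined as (after - before) / before in nodes per second. `speedup_pct` is a hypothetical helper and the input numbers are made up, not measurements.

```shell
#!/bin/sh
# speedup_pct BEFORE AFTER
# Hypothetical helper: prints the relative speed-up in percent,
# given nodes/second before and after a change.
speedup_pct() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (after - before) / before * 100 }'
}

# Made-up example values, not taken from the commit:
speedup_pct 1000000 1185000
```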

Is it also possible to improve the speed of ARMv8 pop-neon on Apple's M1 CPUs? And does anybody have an idea how to do it?
smatovic
Posts: 3312
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: Stockfish Great Speed Improvements but which Version?

Post by smatovic »

From the SF Makefile:

Code:

ifeq ($(ARCH),apple-silicon)
	arch = arm64
	prefetch = yes
	popcnt = yes
	neon = yes
	dotprod = yes
	arm_version = 8
endif

"dotprod = yes" -> I assume that if you compile from source (Homebrew) for Apple M-series, the build will use the dot-product optimization, but I am not into the details.

--
Srdja
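
For ARMv8 Linux devices, unlike the apple-silicon target above, the choice between armv8 and armv8-dotprod has to be made by hand. A rough sketch, assuming the kernel reports the dot-product extension as the `asimddp` flag in the Features line of /proc/cpuinfo. `arch_for_features` is a hypothetical helper, not part of Stockfish.

```shell
#!/bin/sh
# arch_for_features "FLAG FLAG ..."
# Hypothetical helper: takes a space-separated CPU feature list and
# prints armv8-dotprod if the asimddp flag is present, else armv8.
arch_for_features() {
    case " $1 " in
        *" asimddp "*) echo "armv8-dotprod" ;;
        *)             echo "armv8" ;;
    esac
}

# Typical use on an ARMv8 Linux board (not executed here):
#   arch_for_features "$(grep -m1 '^Features' /proc/cpuinfo | cut -d: -f2)"
```

The surrounding spaces in the pattern keep a flag such as `asimd` from matching `asimddp` by accident.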
Uri
Posts: 505
Joined: Thu Dec 27, 2007 9:34 pm

Re: Stockfish Great Speed Improvements but which Version?

Post by Uri »

What if physics really is what prevents us from creating better and faster chess engines?

I mean, Moore's law is coming to an end, and this means that we cannot create even smaller computer components because of quantum mechanical effects.

Also we know that nothing can move faster than the speed of light, not even the information inside our computers.

So what if the laws of physics are really what currently prevents programmers from creating even better chess engines?

Take note that I'm a physicist and not a computer programmer so what I say all comes from the point of view of physics.
smatovic
Posts: 3312
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: Stockfish Great Speed Improvements but which Version?

Post by smatovic »

Uri wrote: Wed Aug 09, 2023 7:00 pm ...
I still see some room for development, in both hardware and software:

The Next Big Thing in Computer Chess?
https://talkchess.com/forum3/viewtopic.php?f=2&t=81858

But it seems we are approaching "death by draw" in Western chess.

--
Srdja
syzygy
Posts: 5703
Joined: Tue Feb 28, 2012 11:56 pm

Re: Stockfish Great Speed Improvements but which Version?

Post by syzygy »

Uri wrote: Wed Aug 09, 2023 7:00 pm What if physics really is what prevents us from creating better and faster chess engines? ...
The end of Moore's law has nothing to do with whether software can be improved. It cannot prevent programmers from creating a better chess engine for the same hardware. It can only prevent chip makers from producing chips (CPU, GPU, whatever) that run the same software faster.

At the moment I believe there is still room for improving lithography, and there is certainly room for improving hardware architectures, in particular now that AI-type calculations have become very important for chess engines.

Given a particular hardware platform, there is of course an upper limit to engine strength. There are only finitely many programs that can run on a particular computer, so there is no infinite sequence of ever stronger and stronger chess engines (because there is no infinite sequence of different programs to begin with). I don't think we will ever (before the heat death of the universe) find the optimal chess engine, but it will be harder and harder to find significant improvements.

Even if hardware designers reach the limits of physics, we can still easily simulate the strength of a chess engine running at double speed, namely by giving it double the time. (Hmmm, and it seems to me we could still build machines that use relativistic time dilation to get more work done in units of earth time... Might not be suitable for blitz chess, though.)
Magnum
Posts: 195
Joined: Thu Feb 04, 2021 10:24 pm
Full name: Arnold Magnum

Re: Stockfish Great Speed Improvements but which Version?

Post by Magnum »

smatovic wrote: Wed Aug 09, 2023 5:47 pm From the SF Makefile: ...
Thanks. Actually, it looks like developers can improve the speed of Stockfish on Apple M1, M2, and M3 devices a lot.
Any ideas how to speed up Stockfish further on Apple or ARM devices?

All three of these builds have the same speed:
make -j profile-build ARCH=apple-silicon COMP=clang CXX=clang++

CXXFLAGS="-mcpu=apple-m1" make -j profile-build ARCH=apple-silicon COMP=clang CXX=clang++

CXXFLAGS="-march=native" make -j profile-build ARCH=apple-silicon COMP=clang CXX=clang++

What could be improved?
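
One way to check that builds like the three above really perform alike is to compare the "Nodes/second" line from Stockfish's `bench` command. A minimal sketch, assuming the bench summary has the shape shown in the sample text; `extract_nps` is a hypothetical helper and the sample numbers are made up.

```shell
#!/bin/sh
# extract_nps
# Hypothetical helper: reads bench output on stdin and prints only the
# nodes/second figure from the "Nodes/second" summary line.
extract_nps() {
    awk -F: '/^Nodes\/second/ { gsub(/[^0-9]/, "", $2); print $2 }'
}

# Sample text in the shape of a bench summary; the numbers are made up.
sample='Total time (ms) : 2000
Nodes searched  : 3000000
Nodes/second    : 1500000'

printf '%s\n' "$sample" | extract_nps

# Typical use with a real binary (not executed here). The summary is
# assumed to go to stderr, hence the redirection:
#   ./stockfish bench 2>&1 | extract_nps
```

Running each build several times and comparing the figures gives a fairer picture than a single run, since individual bench runs vary.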
smatovic
Posts: 3312
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: Stockfish Great Speed Improvements but which Version?

Post by smatovic »

Magnum wrote: Thu Aug 10, 2023 7:53 am ....
What could be improved?
My take: you will need a new approach to make use of all the compute power available in Apple M-series silicon, CPU+SIMD+GPU+TPU via unified memory. Currently SF uses CPU+SIMD, and Lc0 uses CPU+GPU.

--
Srdja
Ras
Posts: 2701
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: Stockfish Great Speed Improvements but which Version?

Post by Ras »

Magnum wrote: Thu Aug 10, 2023 7:53 am Actually it looks like developers can improve the speed of Stockfish on Apple M1, M2, M3 devices a lot.
M1 and M2 have already been improved software-wise. They are simply mediocre devices for chess, that's the ugly truth. Even an ROG Ally handheld console is better.

M3 might become a different story if Apple decides to use ARMv9 along with its scalable vector extensions instead of the ARMv8 of M1/M2. However, it remains to be seen what the exact implementation would be and how useful it would be, because "scalable" means exactly that: a range of possible implementations.
Rasmus Althoff
https://www.ct800.net